Learning to generalize from sparse and underspecified rewards, ICML, 2019.
TD or not TD: Analyzing the role of temporal differencing in deep reinforcement learning, 2018.
Learning to act using real-time dynamic programming, Artif. Intell., vol.72, pp.81-138, 1995.
Some necessary conditions for a master chess program, Proceedings of the 3rd International Joint Conference on Artificial Intelligence, IJCAI'73, pp.77-85, 1973.
Bertsekas. Dynamic Programming and Optimal Control, vol.I, 2005.
Bertsekas. Dynamic Programming and Optimal Control, vol.II, ISBN 9781886529304, 2007.
of Optimization and neural computation series, Athena Scientific, vol.3, 1996.
Stochastic shortest path problems under weak conditions, 2013.
Optimal classification trees, Mach. Learn., vol.106, issue.7, pp.1039-1082, 2017.
Classification and regression trees, 1984.
A survey of Monte Carlo tree search methods, IEEE Trans. Comput. Intellig. and AI in Games, vol.4, issue.1, pp.1-43, 2012.
Generalized rapid action value estimation, Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI'15, pp.754-760, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01436522
Elements of Information Theory, Series in Telecommunications and Signal Processing, 2006.
Integrating state representation learning into deep reinforcement learning, IEEE Robotics and Automation Letters, vol.3, issue.3, pp.1394-1401, 2018.
Decomposition techniques for planning in stochastic domains, Proceedings of the 14th International Joint Conference on Artificial Intelligence, vol.2, pp.1121-1127, 1995.
Cyberchondria: Parsing health anxiety from online behavior, Psychosomatics, pp.390-400, 2016.
Combining online and offline knowledge in UCT, Proceedings of the 24th International Conference on Machine Learning, ICML '07, pp.273-280, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00164003
Monte-Carlo tree search and rapid action value estimation in computer Go, Artificial Intelligence, vol.175, pp.1856-1875, 2011.
A formal basis for the heuristic determination of minimum cost paths, IEEE Transactions on Systems Science and Cybernetics, vol.4, issue.2, pp.100-107, 1968.
Actor-critic reinforcement learning with energy-based policies, Proceedings of the Tenth European Workshop on Reinforcement Learning, pp.45-58, 2013.
Playing 20 question game with policy-based reinforcement learning. CoRR, abs/1808.07645, p.89, 2018.
Planning and acting in partially observable stochastic domains, Artificial Intelligence, vol.101, issue.1, pp.99-134, 1998.
Context-aware symptom checking for disease diagnosis using hierarchical reinforcement learning, AAAI, 2018.
Bandit based Monte-Carlo planning, Proceedings of the 17th European Conference on Machine Learning, ECML'06, pp.282-293, 2006.
The Human Phenotype Ontology in 2017, Nucleic Acids Research, 2017.
Actor-critic algorithms, NIPS, 1999.
Depth-first iterative-deepening: An optimal admissible tree search, Artif. Intell., vol.27, issue.1, pp.97-109, 1985.
A dynamic adaptive questionnaire for improved disease diagnostics, Advances in Intelligent Data Analysis XVI, pp.162-172, 2017.
Clinical diagnostics in human genetics with semantic similarity searches in ontologies, The American Journal of Human Genetics, vol.85, issue.4, pp.457-464, 2009.
GraphMDP: A New Decomposition Tool for Solving Markov Decision Processes, International Journal on Artificial Intelligence Tools, vol.10, issue.3, pp.325-343, 2001.
URL : https://hal.archives-ouvertes.fr/inria-00100821
Map Partitioning to Approximate an Exploration Strategy in Mobile Robotics. Multiagent and Grid Systems, An International Journal of Cloud Computing, vol.8, issue.3, pp.275-288, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00971653
Sorting out symptoms: design and evaluation of the 'babylon check' automated triage system, 2016.
Playing Atari with deep reinforcement learning, 2013.
Asynchronous methods for deep reinforcement learning, ICML, 2016.
Algorithms for inverse reinforcement learning, Proceedings of the Seventeenth International Conference on Machine Learning, ICML '00, pp.663-670, 2000.
Flexible decomposition algorithms for weakly coupled Markov decision problems, Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, UAI'98, pp.422-430, 1998.
Refuel: Exploring sparse features in deep reinforcement learning for fast disease diagnosis, Advances in Neural Information Processing Systems 31, pp.7333-7342, 2018.
Induction of decision trees, Mach. Learn., vol.1, issue.1, pp.81-106, 1986.
A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis. CoRR, abs/1806.10698, 2018.
Evaluation of symptom checkers for self diagnosis and triage: audit study, British Medical Journal, p.351, 2015.
Improving exploration in UCT using local manifolds, AAAI, 2015.
Introduction to Reinforcement Learning, 2018.
Policy gradient methods for reinforcement learning with function approximation, NIPS, 1999.
Algorithms for Reinforcement Learning. Morgan and Claypool Publishers, ISBN 9781608454921, 2010.
Inquire and diagnose: Neural symptom checking ensemble using deep reinforcement learning, 2016.
On subspace decompositions of finite horizon dynamic programming problems, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC), pp.1890-1895, 2012.
Algebraic Decompositions of DP Problems with Linear Dynamics. arXiv e-prints, 2014.
Potential-based shaping and Q-value initialization are equivalent. CoRR, abs/1106.5267, 2011.
Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, vol.8, pp.229-256, 1992.
Memory-augmented Monte Carlo tree search, Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence.
Collective reasoning under uncertainty and inconsistency. Doctoral thesis, 2014.
Synthetic population generation without a sample, Transportation Science, vol.47, pp.266-279, 2013.
A maximum entropy approach to natural language processing, Comput. Linguist., vol.22, issue.1, pp.39-71, 1996.
Discrete multivariate analysis: Theory and practice, 1975.
The Bayesian basis of common sense medical diagnosis, AAAI, 1983.
Integrating expert knowledge with data in Bayesian networks, Expert Syst. Appl., vol.56, pp.197-208, 2016.
Information projections revisited, IEEE Trans. Information Theory, vol.49, issue.6, pp.1474-1490, 2003.
Mémoire sur la probabilité des causes par les évènements. Mémoires de mathématique et de physique présentés à l'Académie royale des sciences par divers sçavans et lus dans les assemblées, 1774.
On a least squares adjustment of a sampled frequency table when the expected marginal totals are known, Ann. Math. Statist., vol.11, issue.4, 1940.
Bayesian Data Analysis, 2004.
Bayesian reinforcement learning: A survey. CoRR, abs/1609.04436, 2016.
Learning Bayesian networks: The combination of knowledge and statistical data, Machine Learning, vol.20, issue.3, pp.197-243, 1995.
Inverse reinforcement learning with simultaneous estimation of rewards and dynamics, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, vol.51, pp.9-11, 2016.
Uncertain reasoning using maximum entropy inference, Proceedings of the First Conference on Uncertainty in Artificial Intelligence, UAI'85, pp.21-27, 1985.
Contingency tables with given marginals, Biometrika, vol.55, pp.179-188, 1968.
Information theory and statistical mechanics, Phys. Rev., vol.106, issue.4, pp.620-630, 1957.
A survey of methods used in probabilistic expert systems for knowledge integration, Knowl.-Based Syst., vol.3, issue.1, pp.7-12, 1990.
Probabilistic Graphical Models: Principles and Techniques - Adaptive Computation and Machine Learning, ISBN 9780262013192, 2009.
Concentration inequalities for the empirical distribution, 2018.
A polynomial time algorithm for finding Bayesian probabilities from marginal constraints. CoRR, abs/1304.1104, 2013.
Jeffreys centroids: A closed-form expression for positive histograms and a guaranteed tight approximation for frequency histograms, IEEE Signal Processing Letters, vol.20, pp.657-660, 2013.
Sided and symmetrized Bregman centroids, IEEE Transactions on Information Theory, vol.55, p.123, 2009.
Probabilistic reasoning in intelligent systems - networks of plausible inference, Morgan Kaufmann series in representation and reasoning, 1989.
Relative entropy, probabilistic inference and AI. CoRR, abs/1304.3423, 2013.
Mastering the game of Go with deep neural networks and tree search, Nature, vol.529, pp.484-489, 2016.
Mastering the game of Go without human knowledge, Nature, vol.550, pp.354-359, 2017.
Bayesian analysis in expert systems, Statist. Sci., vol.8, issue.3, pp.219-247, 1993.
Probabilistic inverse reinforcement learning in unknown environments, 2013.
Iterative methods for concave programming, Studies in Linear and Nonlinear Programming, pp.154-165, 1958.
The centroid of the symmetrical Kullback-Leibler distance, IEEE Signal Processing Letters, vol.9, pp.96-99, 2002.
An empirical study of Bayesian network parameter learning with monotonic influence constraints, Decision Support Systems, vol.87, pp.69-79, 2016.
PR-OWL: A framework for probabilistic ontologies, Proceedings of the 2006 Conference on Formal Ontology in Information Systems: Proceedings of the Fourth International Conference (FOIS 2006), pp.237-249, 2006.
Representation of the signs in the biomedical ontologies for the help to the diagnosis. Theses, Université Rennes 1, 2013.
Using information content to evaluate semantic similarity in a taxonomy, Proceedings of the 14th International Joint Conference on Artificial Intelligence, vol.1, pp.448-453, 1995.
VACTERL/VATER association, Orphanet J Rare Dis, 2011.
Ontology and medical diagnosis, Inform Health Soc Care, vol.4, 2012.