Sutton present a strategic research plan based on the premise that a genuine understanding of intelligence is imminent and—when it is achieved—will be the greatest scientific prize in human history. To contribute to this achievement and share in its glory will require laser-like focus on its essential challenges; identifying those, however provisionally, is the objective of the Alberta Plan for AI research.
Sutton present a strategic research plan based on the premise that a genuine understanding of intelligence is imminent and—when it is achieved—will be the greatest scientific prize in human history. To contribute to this achievement and share in its glory will require laser-like focus on its essential challenges; identifying those, however provisionally, is the objective of the Alberta Plan for AI research. The overall setting is the familiar one common to many fields (reinforcement learning, psychology, control theory, economics, neuroscience, and operations research): a computationally-limited agent interacts with a vastly more complex environment to maximize reward.
The agent’s machinery is divided into four parts: 1) that which maintains the agent’s situational state (perception), 2) that which maps state to action (policy), 3) that which maps state to expected future reward (value function), and 4) that which maps imagined states and actions to next states (transition model) and enables planning. The Alberta Plan extends this common view to include feature-based subtasks and temporally extended options to solve them; the policy and the value function each become multiple, one each for each of the subtasks and the main task. The setting is then potentially complete and the focus shifts to finding the right abstractions, in state (features) and time (options), and to planning efficiency. The Alberta Plan incorporates continual learning and meta-learning into all of its 12 steps, and expends no effort trying to capture domain knowledge.
Richard S. Sutton is a Canada CIFAR AI Chair and a Distinguished Fellow of CIFAR’s Learning in Machines & Brains program. He is the Chief Scientific Advisor of Amii, a Distinguished Research Scientist at DeepMind and a Professor at the University of Alberta’s Department of Computing Science. Sutton is one of the pioneers of reinforcement learning, an approach to artificial and natural intelligence that emphasizes learning and planning from sample experience, and a field in which he continues to lead the world. He is most interested in understanding what it means to be intelligent, to predict and influence the world, to learn, perceive, act, and think. He seeks to identify general computational principles underlying what we mean by intelligence and goal-directed behaviour. Over his career, he has made a number of significant contributions to the field, including the theory of temporal-difference learning, the actor-critic (policy gradient) class of algorithms, the Dyna architecture (integrating learning, planning and reacting), the Horde architecture, and gradient and emphatic temporal-difference algorithms. Sutton seeks to extend reinforcement learning ideas to an empirically grounded approach to knowledge representation based on prediction.
Its program consists of a one-hour lecture followed by a discussion. The lecture is based on an (internationally) exceptional or remarkable achievement of the lecturer, presented in a way which is comprehensible and interesting to a broad computer science community. The lectures are in English.
The seminar is organized by the organizational committee consisting of Roman Barták (Charles University, Faculty of Mathematics and Physics), Jaroslav Hlinka (Czech Academy of Sciences, Computer Science Institute), Michal Chytil, Pavel Kordík (CTU in Prague, Faculty of Information Technologies), Michal Koucký (Charles University, Faculty of Mathematics and Physics), Jan Kybic (CTU in Prague, Faculty of Electrical Engineering), Michal Pěchouček (CTU in Prague, Faculty of Electrical Engineering), Jiří Sgall (Charles University, Faculty of Mathematics and Physics), Vojtěch Svátek (University of Economics, Faculty of Informatics and Statistics), Michal Šorel (Czech Academy of Sciences, Institute of Information Theory and Automation), Tomáš Werner (CTU in Prague, Faculty of Electrical Engineering), and Filip Železný (CTU in Prague, Faculty of Electrical Engineering)
The idea to organize this seminar emerged in discussions of the representatives of several research institutes on how to avoid the undesired fragmentation of the Czech computer science community.