The course covers how to formulate and solve decision-making problems in uncertain environments using the POMDPs.jl ecosystem of Julia packages. Topics include sequential decision-making frameworks, namely Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs); running simulations; online and offline solution methods, such as value iteration, Q-learning, SARSA, and Monte Carlo tree search; reinforcement learning; deep reinforcement learning, including proximal policy optimization (PPO), deep Q-networks (DQN), and actor-critic methods; imitation learning through behavior cloning of expert demonstrations; state estimation through particle filtering; belief updating; alpha vectors; approximate methods, including grid interpolation for local approximation value iteration; and black-box stress testing to validate autonomous systems. The course is intended for a wide audience; no prior MDP or POMDP knowledge is expected.
Robert Moss is a computer science Ph.D. student at Stanford University studying algorithms to validate safety-critical autonomous systems. He holds an M.S. in computer science from Stanford, where his research received the best computer science master’s thesis award and his teaching earned the Centennial TA Award. He earned his B.S. in computer science, with a minor in physics, from the Wentworth Institute of Technology in Boston, MA. Robert was an associate research staff member at MIT Lincoln Laboratory, where he was on the team that designed, developed, and validated the next-generation aircraft collision avoidance system for commercial aircraft, unmanned vehicles, and rotorcraft. He was also a research engineer at the NASA Ames Research Center, developing decision support tools for the VIPER autonomous lunar rover mission searching for water deposits on the Moon. Robert is a member of the Stanford Intelligent Systems Laboratory and part of the Stanford Center for AI Safety, where he conducts research on methods for efficient risk assessment of autonomous vehicles in simulation using reinforcement learning, deep learning, and stochastic optimization.