Awesome Reinforcement Learning
Published: 2019-06-28

A curated list of resources dedicated to reinforcement learning.

We have pages for other topics: , , 

Maintainers: , 

We are looking for more contributors and maintainers!

Contributing

Please feel free to 

Codes

  • Codes for examples and exercises in Richard Sutton and Andrew Barto's Reinforcement Learning: An Introduction
  • Simulation code for Reinforcement Learning Control Problems
  •  (standard interface for RL) and 
  •  - Python-Based Reinforcement learning, Artificial intelligence, and Neural network
  •  - Value-Function-Based Reinforcement Learning Framework for Education and Research
  •  - Machine learning framework for problems in Reinforcement Learning in python
  •  - Java based Reinforcement Learning framework
  •  - Platform implementing Q-Learning and other RL algorithms
  •  - Bayesian reinforcement learning library and toolkit
  •  - A deep Q-learning demonstration using Google TensorFlow

Theory

Lectures

  • [UCL]  by David Silver
  • [UC Berkeley] CS188 Artificial Intelligence by Pieter Abbeel
  • [Udacity (Georgia Tech.)] 
  • [Stanford]  by Andrew Ng

Books

  • Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction  
  • Csaba Szepesvari, Algorithms for Reinforcement Learning 
  • David Poole and Alan Mackworth, Artificial Intelligence: Foundations of Computational Agents 
  • Dimitri P. Bertsekas and John N. Tsitsiklis, Neuro-Dynamic Programming  
  • Mykel J. Kochenderfer, Decision Making Under Uncertainty: Theory and Application 

Surveys

  • Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore, Reinforcement Learning: A Survey, JAIR, 1996. 
  • S. S. Keerthi and B. Ravindran, A Tutorial Survey of Reinforcement Learning, Sadhana, 1994. 
  • Matthew E. Taylor, Peter Stone, Transfer Learning for Reinforcement Learning Domains: A Survey, JMLR, 2009. 
  • Jens Kober, J. Andrew Bagnell, Jan Peters, Reinforcement Learning in Robotics: A Survey, IJRR, 2013. 
  • Michael L. Littman, "Reinforcement learning improves behaviour from evaluative feedback." Nature 521.7553 (2015): 445-451. 
  • Marc P. Deisenroth, Gerhard Neumann, Jan Peters, A Survey on Policy Search for Robotics, Foundations and Trends in Robotics, 2014. 

Papers / Thesis

  • Foundational Papers

    • Marvin Minsky, Steps toward Artificial Intelligence, Proceedings of the IRE, 1961. 
      • discusses issues in RL such as the "credit assignment problem"
    • Ian H. Witten, An Adaptive Optimal Controller for Discrete-Time Markov Environments, Information and Control, 1977. 
      • earliest publication on temporal-difference (TD) learning rule.
  • Methods

    • Dynamic Programming (DP):
      • Christopher J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, 1989. 
    • Monte Carlo:
      • Andrew Barto, Michael Duff, Monte Carlo Matrix Inversion and Reinforcement Learning, NIPS, 1994. 
      • Satinder P. Singh, Richard S. Sutton, Reinforcement Learning with Replacing Eligibility Traces, Machine Learning, 1996. 
    • Temporal-Difference:
      • Richard S. Sutton, Learning to predict by the methods of temporal differences. Machine Learning 3: 9-44, 1988.
    • Q-Learning (Off-policy TD algorithm):
      • Chris Watkins, Learning from Delayed Rewards, Cambridge, 1989. 
    • Sarsa (On-policy TD algorithm):
      • G.A. Rummery, M. Niranjan, On-line Q-learning using connectionist systems, Technical Report, Cambridge Univ., 1994. 
      • Richard S. Sutton, Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, NIPS, 1996. 
    • R-Learning (learning of relative values)
      • Andrew Schwartz, A Reinforcement Learning Method for Maximizing Undiscounted Rewards, ICML, 1993.
    • Function Approximation methods (Least-Squares Temporal Difference, Least-Squares Policy Iteration)
      • Steven J. Bradtke, Andrew G. Barto, Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, 1996. 
      • Michail G. Lagoudakis, Ronald Parr, Model-Free Least Squares Policy Iteration, NIPS, 2001.  
    • Policy Search / Policy Gradient
      • Richard Sutton, David McAllester, Satinder Singh, Yishay Mansour, Policy Gradient Methods for Reinforcement Learning with Function Approximation, NIPS, 1999. 
      • Jan Peters, Sethu Vijayakumar, Stefan Schaal, Natural Actor-Critic, ECML, 2005. 
      • Jens Kober, Jan Peters, Policy Search for Motor Primitives in Robotics, NIPS, 2009. 
      • Jan Peters, Katharina Mulling, Yasemin Altun, Relative Entropy Policy Search, AAAI, 2010. 
      • Freek Stulp, Olivier Sigaud, Path Integral Policy Improvement with Covariance Matrix Adaptation, ICML, 2012.
      • Nate Kohl, Peter Stone, Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion, ICRA, 2004.
      • Marc Deisenroth, Carl Rasmussen, PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML, 2011. 
      • Scott Kuindersma, Roderic Grupen, Andrew Barto, Learning Dynamic Arm Motions for Postural Recovery, Humanoids, 2011. 
    • Hierarchical RL
      • Richard Sutton, Doina Precup, Satinder Singh, Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, Artificial Intelligence, 1999. 
      • George Konidaris, Andrew Barto, Building Portable Options: Skill Transfer in Reinforcement Learning, IJCAI, 2007.
    • Deep Learning + Reinforcement Learning (A sample of recent works on DL+RL)
      • V. Mnih, et al., Human-level Control through Deep Reinforcement Learning, Nature, 2015. 
      • Xiaoxiao Guo, Satinder Singh, Honglak Lee, Richard Lewis, Xiaoshi Wang, Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, NIPS, 2014. 
      • Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel, End-to-End Training of Deep Visuomotor Policies. ArXiv, 16 Oct 2015. 
      • Tom Schaul, John Quan, Ioannis Antonoglou, David Silver, Prioritized Experience Replay, ArXiv, 18 Nov 2015.
      • Hado van Hasselt, Arthur Guez, David Silver, Deep Reinforcement Learning with Double Q-Learning, ArXiv, 22 Sep 2015. 
      • Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu, Asynchronous Methods for Deep Reinforcement Learning, ArXiv, 4 Feb 2016.
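
As a rough illustration of the distinction the Methods list draws between Q-Learning (off-policy TD) and Sarsa (on-policy TD), the two tabular updates can be sketched as follows. This is a minimal sketch, not code from any of the cited papers: the function names, the list-of-lists table `Q`, and the default step size `alpha` and discount `gamma` are assumptions chosen for illustration.

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


if __name__ == "__main__":
    # Hypothetical 2-state, 2-action value tables, for illustration only.
    Q_off = [[0.0, 0.0], [1.0, 2.0]]
    Q_on = [[0.0, 0.0], [1.0, 2.0]]
    print(q_learning_update(Q_off, s=0, a=0, r=1.0, s_next=1))
    print(sarsa_update(Q_on, s=0, a=0, r=1.0, s_next=1, a_next=0))
```

The only difference is the bootstrap term: Q-Learning uses the maximum over next-state action values regardless of what the agent does next, while Sarsa uses the value of the action the behaviour policy actually selected.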

Applications

Game Playing

  • Traditional Games

    • Backgammon - "TD-Gammon" game play using TD(λ) (Tesauro, ACM 1995) 
    • Chess - "KnightCap" program using TD(λ) (Baxter, arXiv 1999) 
    • Chess - Giraffe: Using deep reinforcement learning to play chess (Lai, arXiv 2015) 
  • Computer Games

    • Human-level Control through Deep Reinforcement Learning (Mnih, Nature 2015)   
    • MarI/O - learning to play Mario with evolutionary reinforcement learning using artificial neural networks (Stanley, Evolutionary Computation 2002) 
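
The TD(λ) rule named in the TD-Gammon and KnightCap entries above can be sketched in tabular form with accumulating eligibility traces. This is a simplified illustration under stated assumptions, not the method of either program (both used function approximation rather than a table); the function name, the `(state, reward, next_state)` episode format, and the default `alpha`, `gamma`, and `lam` values are hypothetical.

```python
def td_lambda_episode(V, episode, alpha=0.1, gamma=1.0, lam=0.8):
    """Run tabular TD(lambda) with accumulating traces over one episode.

    V: dict mapping state -> value estimate (updated in place).
    episode: list of (state, reward, next_state); next_state is None
             when the transition ends the episode.
    """
    traces = {s: 0.0 for s in V}
    for s, r, s_next in episode:
        v_next = 0.0 if s_next is None else V[s_next]
        delta = r + gamma * v_next - V[s]   # one-step TD error
        traces[s] += 1.0                    # mark the visited state
        for x in V:
            V[x] += alpha * delta * traces[x]  # credit recently visited states
            traces[x] *= gamma * lam           # decay every trace
    return V


if __name__ == "__main__":
    # Hypothetical two-state chain A -> B -> terminal with reward 1 at the end.
    values = {"A": 0.0, "B": 0.0}
    td_lambda_episode(values, [("A", 0.0, "B"), ("B", 1.0, None)])
    print(values)
```

With `lam=0` this reduces to one-step TD, where only the most recently visited state is updated; larger `lam` spreads each TD error further back along the trajectory.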

Robotics

  • Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion (Kohl, ICRA 2004) 
  • Robot Motor Skill Coordination with EM-based Reinforcement Learning (Kormushev, IROS 2010)  
  • Generalized Model Learning for Reinforcement Learning on a Humanoid Robot (Hester, ICRA 2010)  
  • Autonomous Skill Acquisition on a Mobile Manipulator (Konidaris, AAAI 2011)  
  • PILCO: A Model-Based and Data-Efficient Approach to Policy Search (Deisenroth, ICML 2011) 
  • Incremental Semantically Grounded Learning from Demonstration (Niekum, RSS 2013) 
  • Efficient Reinforcement Learning for Robots using Informative Simulated Priors (Cutler, ICRA 2015)  

Control

  • An Application of Reinforcement Learning to Aerobatic Helicopter Flight (Abbeel, NIPS 2006)  
  • Autonomous helicopter control using Reinforcement Learning Policy Search Methods (Bagnell, ICRA 2001) 

Operations Research

  • Scaling Average-reward Reinforcement Learning for Product Delivery (Proper, AAAI 2004) 
  • Cross Channel Optimized Marketing by Reinforcement Learning (Abe, KDD 2004) 

Human Computer Interaction

  • Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (Singh, JAIR 2002)

Tutorials / Websites

  • Mance Harmon and Stephanie Harmon, 
  • C. Igel, M.A. Riedmiller, et al., Reinforcement Learning in a Nutshell, ESANN, 2007. 
  • UNSW - 
  • Scholarpedia articles on: 
  • Repository with useful 
  • UC Berkeley - CS 294: Deep Reinforcement Learning, Fall 2015 (John Schulman, Pieter Abbeel) 
  •  by Travis DeWolf
  •  - Atari 2600 games environment for developing AI agents
  •  by Andrej Karpathy

Online Demos

  •  - A deep Q learning demonstration using ConvNetJS
  •  - A deep Q-learning demonstration using Google TensorFlow
  •  - A reinforcement learning demo using reinforcejs by Andrej Karpathy

Reposted from: http://fsbql.baihongyu.com/
