policy-gradient · GitHub Topics

Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/

deep-reinforcement-learning reinforcement-learning dqn ppo a3c q-learning sarsa imitation-learning policy-gradient ddpg double-dqn dueling-dqn td3

Jupyter Notebook 12.05 k

1 month ago

MorvanZhou / Reinforcement-learning-with-tensorflow

#Computer Science# Simple Reinforcement learning tutorials, 莫烦Python (Morvan Python) Chinese-language AI tutorials

reinforcement-learning tutorial q-learning sarsa sarsa-lambda deep-q-network a3c ddpg policy-gradient dqn double-dqn dueling-dqn deep-deterministic-policy-gradient actor-critic Tensorflow proximal-policy-optimization ppo machine-learning

Python 9.27 k

1 year ago

thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.

PyTorch policy-gradient dqn double-dqn a2c ddpg ppo td3 sac imitation-learning mujoco atari rl cql

Python 8.67 k

14 days ago

sweetice / Deep-reinforcement-learning-with-pytorch

#Algorithm Practice# PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

policy-gradient PyTorch actor-critic-algorithm alphago deep-reinforcement-learning a2c dqn sarsa ppo a3c resnet algorithm deep-learning reinforce actor-critic sac td3

Python 4.4 k

2 years ago

rlcode / reinforcement-learning

#Computer Science# Minimal and Clean Reinforcement Learning Examples

reinforcement-learning deep-learning deep-reinforcement-learning machine-learning policy-gradient deep-q-network dqn actor-critic a3c

Python 3.56 k

2 years ago

nikhilbarhate99 / PPO-PyTorch

#Computer Science# Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

pytorch-implmention PyTorch pytorch-tutorial proximal-policy-optimization reinforcement-learning-algorithms deep-reinforcement-learning ppo policy-gradient deep-learning reinforcement-learning

Python 2.13 k

1 year ago

kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

PyTorch reinforcement-learning deep-reinforcement-learning benchmark policy-gradient dqn ppo sac a2c a3c

Python 1.29 k

6 months ago

Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

reinforcement-learning policy-gradient pytorch-rl proximal-policy-optimization ppo PyTorch a2c Generative Adversarial Network deep-reinforcement-learning

Python 1.23 k

4 years ago

Kismuz / btgym

#Time-Series Database# Scalable, event-driven, deep-learning-friendly backtesting library

reinforcement-learning deep-reinforcement-learning gym-environment openai-gym backtesting-trading-strategies algorithmic-trading-library time-series a3c Tensorflow unreal advantage-actor-critic policy-gradient statistical-arbitrage Hacktoberfest

Python 1.01 k

4 years ago

sudharsan13296 / Hands-On-Reinforcement-Learning-With-Python

Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow

reinforcement-learning deep-reinforcement-learning sarsa q-learning deep-q-network deep-learning deep-deterministic-policy-gradient double-dqn dueling-dqn ppo markov-decision-processes policy-gradient openai-gym

Jupyter Notebook 856

5 years ago

yaserkl / RLSeq2Seq

#Natural Language Processing# Deep Reinforcement Learning For Sequence to Sequence Models

reinforcement-learning actor-critic policy-gradient natural-language-processing

Python 767

2 years ago

omerbsezer / Reinforcement_learning_tutorial_with_demo

#Computer Science# Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, QLearning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Papers,...

reinforcement-learning tutorial machine-learning q-learning sarsa policy-gradient deep-reinforcement-learning imitation-learning meta-learning actor-critic pomdps dynamic-programming a3c

Jupyter Notebook 760

7 years ago