vwxyzjn / ppo-implementation-details

Stars: 834
Forks: 115



This repository has been indexed but not yet edited. For the project introduction and usage guide, read the README on GitHub.



About

The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization

iclr-blog-track.github.io
Created: 2022-01-14

Made in China: No

Last modified: 2024-03-23T04:47:28Z


Languages

  • Python 81.3%
  • Shell 18.7%

Other open-source projects by vwxyzjn

cleanrl
@vwxyzjn

#Computer Science# High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

wandb, reinforcement-learning, PyTorch, Python, gym
Python 7.77 k
2 months ago
gym-microrts-paper
@vwxyzjn

The source code for the gym-microrts paper.

Python 37
3 years ago
envpool-cleanrl
@vwxyzjn

Python 7
3 years ago
vectorized-value-methods
@vwxyzjn

[WIP] Vectorized architecture for value-based methods such as DQN and DDPG

Python 3
3 years ago

You may also be interested in

cleanrl
@vwxyzjn

#Computer Science# High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

wandb, reinforcement-learning, PyTorch, Python, gym
Python 7.77 k
2 months ago
stable-baselines3
@DLR-RM

#Computer Science# PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

reinforcement-learning, reinforcement-learning-algorithms, machine learning, gym, openai
Python 11.43 k
3 days ago
spinningup
OpenAI @openai

An educational resource to help anyone learn deep reinforcement learning.

Python 11.18 k
1 year ago
baselines
OpenAI @openai

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Python 16.41 k
1 year ago
PPO-PyTorch
@nikhilbarhate99

#Computer Science# Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

pytorch-implmention, PyTorch, pytorch-tutorial, proximal-policy-optimization, reinforcement-learning-algorithms
Python 2.16 k
1 year ago
rl-baselines3-zoo
@DLR-RM

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

rl, reinforcement-learning, stable-baselines, openai, gym
Python 2.54 k
1 month ago
Open-Sora
@hpcaitech

Open-Sora: a fully open-source, efficient reproduction of a Sora-like video generation solution

Python 27.11 k
4 months ago
grok-1
@xai-org

Open-source release of the Grok-1 large model

Python 50.48 k
1 year ago
transformer-debugger
OpenAI @openai

Python 4.09 k
1 year ago
agents
@tensorflow • Google

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

reinforcement-learning, Tensorflow, contextual-bandits
Python 2.95 k
3 months ago
humanoid-bench
@carlosferrazza

Python 622
3 months ago
stable-baselines
@Stable-Baselines-Team

#Computer Science# Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms

rl, reinforcement-learning, openai, openai-gym, gym
Python 302
2 years ago
reward-bench
AI2 @allenai

RewardBench: the first evaluation tool for reward models.

rlhf
Python 628
3 months ago
awesome-RLHF
@opendilab

#Computer Science# A curated list of reinforcement learning with human feedback resources (continually updated)

deep learning, deep-reinforcement-learning, human-feedback, reinforcement-learning, rlhf
4.12 k
2 months ago
ray
ray-project @ray-project

#Large Language Models# Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

ray, distributed, parallel, machine learning, reinforcement-learning
Python 38.73 k
7 hours ago
web-check
@Lissy93

🕵️‍♂️ All-in-one OSINT tool for analysing any website

OSINT, privacy, security, sysadmin
TypeScript 26.29 k
1 month ago
DRL-code-pytorch
@Lizhi-sjtu

Concise pytorch implements of DRL algorithms, including REINFORCE, A2C, DQN, PPO(discrete and continuous), DDPG, TD3, SAC.

ddpg-pytorch, dqn-pytorch, PyTorch
Python 1.37 k
2 years ago
LlamaGym
@KhoomeiK

Fine-tune LLM agents with online reinforcement learning

Python 689
1 year ago
OpenHands
@All-Hands-AI

#Large Language Models# 🙌 OpenHands: Code Less, Make More

agent, artificial intelligence, large language models, ChatGPT, claude-ai
Python 63.01 k
8 hours ago
retina
Microsoft @microsoft

eBPF distributed networking observability tool for Kubernetes

eBPF, Kubernetes, Network, observability
Go 3.03 k
4 days ago