GitHub Chinese Community

vwxyzjn / ppo-implementation-details

This repository has been indexed but not yet edited. For the project introduction and usage instructions, read the README on GitHub.

About

The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"

iclr-blog-track.github.io

Created

2022-01-14

Developed in China

No

Last modified

2024-03-23T04:47:28Z

Languages

Python 81.3%
Shell 18.7%

Other open-source projects by vwxyzjn

#Computer Science# High-quality single-file implementations of deep reinforcement learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

wandb reinforcement-learning PyTorch Python gym

Python 7.77k

2 months ago

gym-microrts-paper

The source code for the gym-microrts paper.

Python 37

3 years ago

envpool-cleanrl

Python 7

3 years ago

vectorized-value-methods

[WIP] Vectorized architecture for value-based methods such as DQN and DDPG

Python 3

3 years ago

You may also be interested in

#Computer Science# High-quality single-file implementations of deep reinforcement learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

wandb reinforcement-learning PyTorch Python gym

Python 7.77k

2 months ago

DLR-RM/stable-baselines3

stable-baselines3

#Computer Science# PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

reinforcement-learning reinforcement-learning-algorithms machine-learning gym openai

Python 11.43k

3 days ago

An educational resource to help anyone learn deep reinforcement learning.

Python 11.18k

1 year ago

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Python 16.41k

1 year ago

@nikhilbarhate99

#Computer Science# Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

pytorch-implmention PyTorch pytorch-tutorial proximal-policy-optimization reinforcement-learning-algorithms

Python 2.16k

1 year ago

DLR-RM/rl-baselines3-zoo

rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

rl reinforcement-learning stable-baselines openai gym

Python 2.54k

1 month ago

Open-Sora: a fully open-source, efficient reproduction of Sora-like video generation

Python 27.11k

4 months ago

Open-source release of the Grok-1 large model

Python 50.48k

1 year ago

transformer-debugger

Python 4.09k

1 year ago

@tensorflow • Google

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

reinforcement-learning Tensorflow contextual-bandits

Python 2.95k

3 months ago

@carlosferrazza

Python 622

3 months ago

stable-baselines

@Stable-Baselines-Team

#Computer Science# Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms

rl reinforcement-learning openai openai-gym gym

Python 302

2 years ago

RewardBench: the first evaluation tool for reward models.

Python 628

3 months ago

#Computer Science# A curated list of reinforcement learning with human feedback resources (continually updated)

deep-learning deep-reinforcement-learning human-feedback reinforcement-learning rlhf

4.12k

2 months ago

ray-project (@ray-project)

#Large Language Models# Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

ray distributed parallel machine-learning reinforcement-learning

Python 38.73k

7 hours ago

Lissy93/web-check

🕵️‍♂️ All-in-one OSINT tool for analysing any website

OSINT privacy-security sysadmin

TypeScript 26.29k

1 month ago

DRL-code-pytorch

Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, SAC.

ddpg-pytorch dqn-pytorch PyTorch

Python 1.37k

2 years ago

Fine-tune LLM agents with online reinforcement learning

Python 689

1 year ago

#Large Language Models# 🙌 OpenHands: Code Less, Make More

agent artificial-intelligence large-language-models ChatGPT claude-ai

Python 63.01k

8 hours ago

Microsoft (@microsoft)

eBPF distributed networking observability tool for Kubernetes

eBPF Kubernetes Network observability

Go 3.03k

4 days ago