# self-play

https://static.github-zh.com/github_avatars/suragnair?size=40
Jupyter Notebook 4.24 k
8 months ago
https://static.github-zh.com/github_avatars/opendilab?size=40

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1.43 k
4 days ago
https://static.github-zh.com/github_avatars/opendilab?size=40

#Computer Science# An artificial intelligence platform for StarCraft II with large-scale distributed training and grand-master agents.

Python 1.29 k
6 months ago
https://static.github-zh.com/github_avatars/uclaml?size=40

#Computer Science# The official implementation of Self-Play Fine-Tuning (SPIN)

Python 1.2 k
1 year ago
https://static.github-zh.com/github_avatars/uclaml?size=40

#Computer Science# The official implementation of Self-Play Preference Optimization (SPPO)

Python 580
8 months ago
https://static.github-zh.com/github_avatars/inspirai?size=40
Python 353
3 years ago
https://static.github-zh.com/github_avatars/ChuaCheowHuan?size=40

A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.

Jupyter Notebook 150
20 days ago
https://static.github-zh.com/github_avatars/spiral-rl?size=40

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Python 143
14 days ago
https://static.github-zh.com/github_avatars/blanyal?size=40

#Computer Science# AlphaZero implementation for Othello, Connect-Four and Tic-Tac-Toe based on "Mastering the game of Go without human knowledge" and "Mastering Chess and Shogi by Self-Play with a General Reinforcement ...

Python 90
7 years ago
https://static.github-zh.com/github_avatars/seungeunrho?size=40

The exact code used by team "liveinparis" in the Kaggle football competition, where it ranked 6th of 1141

Python 57
5 years ago
https://static.github-zh.com/github_avatars/cestpasphoto?size=40

A very fast implementation of AlphaZero, applied to games like Splendor, Santorini, The Little Prince, … Browser version available

Python 53
4 days ago
https://static.github-zh.com/github_avatars/Sebastian-Schuchmann?size=40
C# 37
3 years ago
https://static.github-zh.com/github_avatars/tobiasemrich?size=40

AI agents for the Bavarian card game Schafkopf trained with reinforcement learning

Python 37
6 days ago
https://static.github-zh.com/github_avatars/ShibiHe?size=40

An implementation of the paper "Model-Free Episodic Control"

Python 36
6 years ago