mcts · GitHub Topics

#Large language models# A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

chain-of-thought Code large-language-models math mcts openai-o1 strawberry reinforcement-learning

6.81 k

8 days ago

#Computer science# A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Tensorflow PyTorch Keras gobang alpha-zero alphago-zero alphago reinforcement-learning self-play mcts monte-carlo-tree-search deep-learning alphazero neural-network

Jupyter Notebook 4.2 k

7 months ago

junxiaosong / AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

alphazero mcts alphago-zero gobang monte-carlo-tree-search alphago reinforcement-learning rl board-game self-learning PyTorch Tensorflow

Python 3.5 k

1 year ago

werner-duvaud / muzero-general

#Computer science# MuZero

muzero reinforcement-learning alphazero PyTorch Python self-learning monte-carlo-tree-search deep-learning deep-reinforcement-learning neural-network rl tensorboard gym mcts alphago machine-learning

Python 2.68 k

1 year ago

opendilab / LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

alphazero atari continuous-control monte-carlo-tree-search muzero PyTorch reinforcement-learning mcts board-game gym self-play

Python 1.42 k

1 day ago

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

benchmark mcts o1 prm reasoning rl o3

Python 1.21 k

2 months ago

yaotingwangofficial / Awesome-MCoT

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

chain-of-thought cot deepseek-r1 instruction-tuning large-vision-language-model multimodal multimodal-chain-of-thought multimodal-large-language-models openai-o1 reasoning survey mcts

732

17 days ago

chauvinSimon / My_Bibliography_for_Research_on_Autonomous_Driving

Personal notes about scientific and research works on "Decision-Making for Autonomous Driving"

reinforcement-learning inverse-reinforcement-learning planning model-based-reinforcement-learning decision-making game-theory mcts prediction bibliography carla imitation-learning end-to-end interaction risk-assessment

460

5 years ago

s-casci / tinyzero

Easily train AlphaZero-like agents on any environment you want!

alphazero mcts reinforcement-learning

Python 430

2 years ago

hrpan / tetris_mcts

#Computer science# MCTS project for Tetris

reinforcement-learning mcts tetris deep-learning game tetris-bots

Python 348

10 months ago

dylandjian / SuperGo

#Computer science# A student implementation of AlphaGo Zero

alphago-zero alphago reinforcement-learning PyTorch mcts Python machine-learning

Python 280

7 years ago

QueensGambit / CrazyAra

#Computer science# A Deep Learning UCI-Chess Variant Engine written in C++ & Python 🦜

Python chess-engine deep-learning artificial-intelligence convolutional-neural-network mcts alphazero mxnet gluon Open Source machine-learning lichess alphago

Jupyter Notebook 267

2 days ago

DataCanvasIO / Hypernets

A General Automated Machine Learning framework to simplify the development of End-to-end AutoML toolkits in specific domains.

neural-architecture-search hyperparameter-optimization hyperparameter-tuning evolutionary-algorithms monte-carlo-tree-search automl autodl reinforcement-learning mcts nas Keras

Python 265

3 months ago

vgarciasc / mcts-viz

Visualization of MCTS algorithm applied to Tic-tac-toe.

mcts visualization p5js

JavaScript 250

4 years ago

sungyubkim / Deep_RL_with_pytorch

A PyTorch tutorial for DRL (Deep Reinforcement Learning)

deep-reinforcement-learning PyTorch dqn a2c ppo soft-actor-critic mcts

Jupyter Notebook 218

2 years ago

initial-h / AlphaZero_Gomoku_MPI

#Algorithm practice# An asynchronous/parallel implementation of the AlphaGo Zero algorithm for Gomoku

alphazero parallel Tensorflow alphago mcts tensorlayer tree-search algorithm deep-reinforcement-learning

Python 209

5 months ago

thuxugang / doudizhu

AI for Dou Dizhu (the "Fight the Landlord" card game)

artificial-intelligence collectible-card-game dqn reinforcement-learning doudizhu mcts

Python 184

7 years ago

kaesve / muzero

#Computer science# A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and pit both algorithms against each other, and investigate the reliability of learned MuZero MDP models.

muzero alphazero reinforcement-learning Tensorflow tensorflow2 mcts tf2 deep-learning deep-reinforcement-learning

Jupyter Notebook 160

4 years ago

zjeffer / chess-deep-rl

#Computer science# Research project: create a chess engine using Deep Reinforcement Learning

chess alphazero reinforcement-learning artificial-intelligence neural-network mcts deep-learning machine-learning deep-reinforcement-learning neural-networks chess-engine

Jupyter Notebook 143

1 year ago

PuYuuu / vehicle-interaction-decision-making

Decision-making for multiple vehicles at an intersection, based on level-k game theory and MCTS

mcts game-theory

C++ 131

6 months ago