exploration-exploitation · GitHub Topics

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

reinforcement-learning multiagent-reinforcement-learning self-play imitation-learning inverse-reinforcement-learning exploration-exploitation distributed-system Python impala smac atari mujoco r2d2 reinforcement-learning-algorithms pytorch-rl model-based-reinforcement-learning

Python 3.5 k

2 days ago

wzhe06 / Reco-papers

#Computer Science# Classic papers and resources on recommendation

recommender-system deep-learning machine-learning recommendation exploration-exploitation reinforcement-learning

Python 3.39 k

5 years ago

tigerneil / awesome-deep-rl

For deep RL and the future of AI.

deep-reinforcement-learning reinforcement-learning game artificial-general-intelligence exploration-exploitation multiagent-reinforcement-learning planning icml aaai agi iclr

HTML 1.48 k

1 year ago

imsheridan / DeepRec

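#Computer Science# A collection of classic and cutting-edge industry papers and resources on recommendation and advertising / Must-read Papers on Recommendation System and CTR Prediction

deep-learning recommendation-system recommendation reinforcement-learning exploration-exploitation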

1.01 k

2 years ago

david-cortes / contextualbandits

Python implementations of contextual bandits algorithms

contextual-bandits reinforcement-learning exploration-exploitation

Python 795

1 month ago

opendilab / awesome-exploration-rl

#Awesome# A curated list of awesome exploration RL resources (continually updated)

exploration-exploitation reinforcement-learning Awesome Lists exploration exploratory reinforcement-learning-algorithms

531

1 month ago

YaoYao1995 / MEEE

Code to reproduce the experiments in Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation (MEEE).

reinforcement-learning exploration-exploitation model-based-reinforcement-learning

Python 460

2 years ago

TianhongDai / self-imitation-learning-pytorch

This is the pytorch implementation of ICML 2018 paper - Self-Imitation Learning.

reinforcement-learning-algorithms exploration-exploitation a2c atari-games

Python 66

7 years ago

holarissun / RewardShifting

Code for NeurIPS 2022 paper Exploiting Reward Shifting in Value-Based Deep RL

ensemble exploration-exploitation offline-reinforcement-learning reinforcement-learning deep-q-network ensemble-learning

Python 29

2 years ago

stratisMarkou / sample-efficient-bayesian-rl

Source for the sample efficient tabular RL submission to the 2019 NIPS workshop on Biological and Artificial RL

reinforcement-learning bayesian-methods bayesian-inference q-learning exploration-exploitation exploration reproducible-research

Jupyter Notebook 24

3 years ago

hmishfaq / LSAC

The official code release for "Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning", ICLR 2025

exploration-exploitation policy-gradient reinforcement-learning soft-actor-critic

Python 10

2 months ago

gokceuludogan / interactive-music-recommendation

Personalized and Interactive Music Recommendation with Bandit approach

exploration-exploitation

Jupyter Notebook 10

6 years ago

Amshra267 / Thompson-Greedy-Comparison-for-MultiArmed-Bandits

Repository Containing Comparison of two methods for dealing with Exploration-Exploitation dilemma for MultiArmed Bandits

exploration-exploitation

Python 10

4 years ago

kkm24132 / ReinforcementLearning

Focuses on Reinforcement Learning related concepts, use cases, and learning approaches

reinforcement-learning exploration-exploitation sarsa q-learning policy-gradient

Jupyter Notebook 8

2 months ago

mbhenaff / neural-e3

#Computer Science#

deep-reinforcement-learning exploration-exploitation deep-learning

Python 7

6 years ago

kakaobrain / leco

Official implementation of LECO (NeurIPS'22)

exploration-exploitation reinforcement-learning

Python 7

2 years ago

hmishfaq / LMC-LSVI

The official code release for Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo, ICLR 2024.

exploration-exploitation reinforcement-learning

Python 7

1 year ago

baturaysaglam / DISCOVER

Deep Intrinsically Motivated Exploration in Continuous Control

actor-critic deep-reinforcement-learning exploration-exploitation

Python 5

1 year ago

guptav96 / bandit-algorithms

A short implementation of bandit algorithms - ETC, UCB, MOSS and KL-UCB

reinforcement-learning exploration-exploitation

Python 5

3 years ago

panxulab / LSVI-ASE

The official code release for "More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling", Reinforcement Learning Conference (RLC) 2024

exploration-exploitation reinforcement-learning

Python 4

1 year ago