Sign up to participate on [AIcrowd]!

We are holding a competition on sample-efficient reinforcement learning using human priors. Standard methods require months to years of game time to attain human performance in complex games such as Go and StarCraft. In our competition, participants develop a system to obtain a diamond in Minecraft using only four days of training time.

The MineRL Diamond competition offers a set of Gym environments paired with human demonstrations, giving participants what they need to tackle the difficult Minecraft task sample-efficiently. This year we have two tracks:

  1. Research Track - Continues the challenge from last year, in which the action and observation spaces are vectorized and obfuscated to prevent participants from using domain knowledge to solve the ObtainDiamond task.
  2. Intro Track - Removes the obfuscation and allows any creative solution to the task, whether entirely scripted, entirely learned, or a hybrid approach.

Sample snippets of the dataset.

Top Submissions

Competition Overview

All submissions are made through AIcrowd. There you can find the detailed rules as well as the leaderboard. You can find the baselines on GitHub.

Round 1

  1. Participants train their agents to play Minecraft. During the round, they submit trained models for evaluation to determine leaderboard ranks.
  2. At the end of the round, participants submit source code. The models at the top of the leaderboard are re-trained (from scratch) for four days to compute the final score used for ranking.
  3. Twenty participants move on to the second round: 15 from the main track and 5 from the data-only track.

Round 2

  1. Participants may submit code up to four times. Each submission is trained for four days to compute its score. The final ranking is based on each participant's best submission.
  2. The top participants will present their work at a workshop at NeurIPS 2021.
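
The Round 2 ranking rule above (at most four submissions per participant, ranked by each participant's best score) can be sketched as follows. This is only an illustration: the function name, the input format, and the handling of extra submissions and ties are assumptions, not part of the official rules.

```python
from collections import defaultdict

MAX_SUBMISSIONS = 4  # stated in the Round 2 rules


def rank_round2(submissions):
    """Rank participants by their best score.

    `submissions` is an iterable of (participant, score) pairs in
    submission order. Submissions beyond the fourth are ignored here,
    which is an assumption about how the limit would be enforced.
    """
    counts = defaultdict(int)
    best = {}
    for participant, score in submissions:
        if counts[participant] >= MAX_SUBMISSIONS:
            continue  # over the limit: skip (assumed behavior)
        counts[participant] += 1
        if participant not in best or score > best[participant]:
            best[participant] = score
    # Highest best score first; ties keep first-seen order (assumption).
    return sorted(best.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    example = [("team_a", 10.0), ("team_b", 30.0), ("team_a", 50.0)]
    print(rank_round2(example))
```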

The Task: Obtain Diamond in Minecraft

Minecraft is a 3D, first-person, open-world game centered around the gathering of resources and creation of structures and items. These structures and items have prerequisite tools and materials required for their creation. As a result, many items require the completion of a series of natural subtasks.
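
To make the idea of prerequisite subtasks concrete, the sketch below encodes a simplified item dependency graph and derives one valid order in which the items could be obtained. The graph is an assumption based on standard Minecraft crafting rules (fuel and item quantities are omitted), and the item names and the `subtask_order` helper are hypothetical, not part of the competition API.

```python
# Simplified prerequisite graph: item -> items needed first.
# Assumed from standard Minecraft recipes; quantities and fuel omitted.
PREREQUISITES = {
    "log": [],
    "planks": ["log"],
    "stick": ["planks"],
    "crafting_table": ["planks"],
    "wooden_pickaxe": ["planks", "stick", "crafting_table"],
    "cobblestone": ["wooden_pickaxe"],
    "stone_pickaxe": ["cobblestone", "stick", "crafting_table"],
    "furnace": ["cobblestone", "crafting_table"],
    "iron_ore": ["stone_pickaxe"],
    "iron_ingot": ["iron_ore", "furnace"],
    "iron_pickaxe": ["iron_ingot", "stick", "crafting_table"],
    "diamond": ["iron_pickaxe"],
}


def subtask_order(goal, prerequisites=PREREQUISITES):
    """Return the items needed for `goal`, prerequisites first."""
    order = []
    visited = set()

    def visit(item):
        if item in visited:
            return
        visited.add(item)
        for dependency in prerequisites[item]:
            visit(dependency)
        order.append(item)  # appended only after all its prerequisites

    visit(goal)
    return order


if __name__ == "__main__":
    print(subtask_order("diamond"))
```

Calling `subtask_order("stone_pickaxe")` returns only the seven items on the path to a stone pickaxe, which shows how each intermediate goal induces its own chain of subtasks.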

The procedurally generated world is composed of discrete blocks that allow modification. Over the course of gameplay, players change their surroundings by gathering resources and constructing structures.

In this competition, the goal is to obtain a diamond. The agent begins in a random starting location without any items, and receives rewards for obtaining items which are prerequisites for diamond.
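
As a rough illustration of rewards for prerequisite items, the sketch below pays a reward the first time each milestone item appears in the agent's inventory. The reward values, the item names, and the once-per-item rule are assumptions chosen for illustration; the authoritative values are in the official environment documentation and rules.

```python
# Illustrative milestone rewards (assumed values, not taken from this page).
MILESTONE_REWARDS = {
    "log": 1,
    "planks": 2,
    "stick": 4,
    "crafting_table": 4,
    "wooden_pickaxe": 8,
    "cobblestone": 16,
    "furnace": 32,
    "stone_pickaxe": 32,
    "iron_ore": 64,
    "iron_ingot": 128,
    "iron_pickaxe": 256,
    "diamond": 1024,
}


def milestone_rewards(inventory_trace):
    """Return the reward for each step of an episode.

    `inventory_trace` is a list of dicts mapping item name to count,
    one dict per step. Each milestone is rewarded once, the first time
    its count is positive (assumed rule).
    """
    rewarded = set()
    rewards = []
    for inventory in inventory_trace:
        step_reward = 0
        for item, count in inventory.items():
            if count > 0 and item in MILESTONE_REWARDS and item not in rewarded:
                rewarded.add(item)
                step_reward += MILESTONE_REWARDS[item]
        rewards.append(step_reward)
    return rewards


if __name__ == "__main__":
    trace = [{}, {"log": 1}, {"log": 2}, {"log": 1, "planks": 4}]
    print(milestone_rewards(trace))
```

Because later milestones are only reachable after earlier ones, an agent that stalls partway still receives a learning signal from the items it did obtain.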

Figure: The stages of obtaining a diamond. Panels: Wood Pickaxe; Mine Stone and Create Stone Pickaxe; Iron Ore; Smelt Iron and Create Iron Pickaxe; Search Mine.


Top-ranking teams in Round 2 will receive rewards from our sponsors. Details will be announced as we finalize agreements.


The organizing team consists of:

The advisory committee consists of:


If you have any questions, please feel free to contact us:


NeurIPS 2020 Competition: The MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors

William H. Guss, Mario Ynocente Castro, Sam Devlin, Brandon Houghton, Noboru Sean Kuno, Crissman Loomis, Keisuke Nakata, Stephanie Milani, Sharada Mohanty, Ruslan Salakhutdinov, Shinya Shiroshita, John Schulman, Nicholay Topin, Avinash Ummadisingu, Oriol Vinyals

NeurIPS 2020 Competition Track


The MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors

William H. Guss, Cayden Codel, Katja Hofmann, Brandon Houghton, Noboru Kuno, Stephanie Milani, Sharada Mohanty, Diego Perez Liebana, Ruslan Salakhutdinov, Nicholay Topin, Manuela Veloso, Phillip Wang

NeurIPS 2019 Competition Track
