The MineRL Dataset

The MineRL-v0 dataset is collected from client-side, re-simulated Minecraft demonstration data. If you would like to contribute, please see the data-collection guide.

Sampling The Dataset

Now that your agent can act in the environment, we should show it how to leverage human demonstrations.

To get started, let’s ensure the data has been downloaded.

# Unix, Linux
MINERL_DATA_ROOT="your/local/path" python3 -m minerl.data.download

# Windows (PowerShell)
$env:MINERL_DATA_ROOT="your/local/path"; python3 -m minerl.data.download

Or we can download a single experiment:

# Unix, Linux
MINERL_DATA_ROOT="your/local/path" python3 -m minerl.data.download "MineRLObtainDiamond-v0"

# Windows (PowerShell)
$env:MINERL_DATA_ROOT="your/local/path"; python3 -m minerl.data.download "MineRLObtainDiamond-v0"

For a complete list of published experiments, check out the environment documentation. You can also download the data from your Python scripts.
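
One way to trigger the download from a script, without depending on a particular Python function signature, is to launch the command-line module in a subprocess. The sketch below assumes the downloader module is minerl.data.download and that it accepts an optional experiment name as a positional argument, as in the commands above; build_download_call and run_download are hypothetical helper names, not part of minerl.

```python
import os
import subprocess
import sys


def build_download_call(data_root, experiment=None):
    """Return (argv, env) for launching the MineRL downloader.

    Assumption: the downloader is the `minerl.data.download` module and
    takes an optional experiment name as a positional argument.
    """
    argv = [sys.executable, "-m", "minerl.data.download"]
    if experiment is not None:
        argv.append(experiment)
    # Copy the current environment and point MINERL_DATA_ROOT at the target.
    env = dict(os.environ)
    env["MINERL_DATA_ROOT"] = str(data_root)
    return argv, env


def run_download(data_root, experiment=None):
    """Run the downloader; raises CalledProcessError if it fails."""
    argv, env = build_download_call(data_root, experiment)
    subprocess.run(argv, env=env, check=True)
```

Calling run_download("your/local/path", "MineRLObtainDiamond-v0") would then mirror the single-experiment command; check the options supported by your installed minerl version before relying on it.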

Now we can build the dataset for MineRLObtainDiamond-v0:

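import minerl

# Build the data pipeline for this experiment (reads MINERL_DATA_ROOT).
data = minerl.data.make('MineRLObtainDiamond-v0')

for current_state, action, reward, next_state, done \
    in data.batch_iter(
        batch_size=1, num_epochs=1, seq_len=32):

    # Print the POV at the first step of the sequence.
    print(current_state['pov'][0])

    # Print the final reward of the sequence.
    print(reward[-1])

    # Check if the final (next_state) is terminal.
    print(done[-1])

    # ... do something with the data.
    print("At the end of trajectories the length "
          "can be < max_sequence_len", len(reward))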

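The last message in the loop reflects how fixed-length sequences are cut from a trajectory: the final sequence holds whatever steps remain. The standalone sketch below illustrates this on a plain list of rewards; chunk_trajectory is a hypothetical helper for illustration, not the minerl implementation.

```python
def chunk_trajectory(steps, seq_len):
    """Yield consecutive slices of at most seq_len steps."""
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    for start in range(0, len(steps), seq_len):
        yield steps[start:start + seq_len]


rewards = [0.0] * 70  # a toy 70-step trajectory
lengths = [len(chunk) for chunk in chunk_trajectory(rewards, seq_len=32)]
print(lengths)  # [32, 32, 6]
```

Code that assumes every sequence has exactly seq_len steps should therefore check the length or pad the final chunk.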

The minerl package uses environment variables to locate the data directory. For portability, please define MINERL_DATA_ROOT as /your/local/path/ in your system environment variables.
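
A script can also fail early with a readable message when the variable is missing. The sketch below shows one way to do that; resolve_data_root is a hypothetical helper, not part of the minerl package.

```python
import os
from pathlib import Path


def resolve_data_root(environ=None):
    """Return the data directory named by MINERL_DATA_ROOT.

    Raises RuntimeError if the variable is unset or empty.
    """
    environ = os.environ if environ is None else environ
    root = environ.get("MINERL_DATA_ROOT")
    if not root:
        raise RuntimeError(
            "MINERL_DATA_ROOT is not set; define it before loading data.")
    return Path(root).expanduser()
```

Passing a dictionary as environ makes the helper easy to exercise in tests without touching the real environment.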

Vectorized Obfuscation Environments

With the 2020 MineRL competition, we introduced vectorized obfuscated environments, which abstract the agent's non-visual state information and action space into continuous vector spaces. For example, in MineRLObtainDiamondVectorObf-v0, rather than being provided a dictionary of observations and actions (which could be used to illegally hard-code a meta-policy based on the observation), the action space and non-visual observation space are provided as unlabeled 64-dimensional vectors.

These vectors serve as wrappers of their base environment and were produced by training two auto-encoders, one over the actions and one over the observations in the MineRL dataset. Uniform samples of both the action and observation spaces were also used in training, to ensure that random actions and observations stay within the original space when unwrapped.


In basic environments, env.action_space.sample() uniformly samples the action space. However, in vectorized obfuscated environments, action_space.sample() does NOT provide a uniform sampling of the underlying wrapped space, but rather a uniform sampling of the vector space. This biases exploration and can be mitigated by using human actions rather than random samples. See the k-means tutorial for an example.
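
To make the mitigation concrete, the sketch below clusters a pool of demonstration action vectors with a plain k-means (Lloyd's algorithm, here with a deterministic farthest-point initialisation) and then explores by choosing among the cluster centres instead of drawing uniform random vectors. It is a minimal pure-Python illustration on toy 2-dimensional data, not the k-means tutorial's code; kmeans, human_actions, and the toy values are hypothetical, and real action vectors would be 64-dimensional.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Return k centroids for points (lists of floats) via Lloyd's algorithm."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    # Farthest-point initialisation: start from the first point, then
    # repeatedly add the point farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(farthest))
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(dim) / len(members)
                                for dim in zip(*members)]
    return centroids


# Toy stand-in for action vectors taken from human demonstrations.
human_actions = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
                 [1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]
centroids = kmeans(human_actions, k=2)

# Explore by choosing among actions close to what humans actually did.
action = random.Random(1).choice(centroids)
```

Because every centroid is an average of observed human actions, the sampled action stays in the region of the vector space that demonstrations cover, which is the property uniform sampling of the vector space lacks.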

Moderate Human Demonstrations

MineRL-v0 uses community-driven demonstrations to help researchers develop sample-efficient techniques. Some of these demonstrations are less than optimal, and some may feature anomalies, server errors, or adversarial behavior.

Using the MineRL viewer, you can help curate this dataset by viewing these demonstrations and reporting bad streams. To report one, submit an issue on GitHub with the following information:

  1. The stream name of the stream to be reviewed

  2. The reason the stream or segment needs to be modified

  3. The sample / frame number(s) (shown at the bottom of the viewer)