Get the data!
Then get the dataset by running the following in
To get started with the data, check out the data sampling tutorial!
William H Guss, Brandon Houghton, Nicholay Topin, Phillip Wang, Cayden Codel, Manuela Veloso, Ruslan Salakhutdinov
Twenty-Eighth International Joint Conference on Artificial Intelligence
We collected human observations from a set of four main task families, each described below. Throughout all tasks, the agent has access to the same set of actions and observations as a human player. All tasks have a time limit, which is part of the observation. With the exception of “Navigate,” all tasks center around obtaining specific items and have sparse rewards (+1 only for obtaining the required items). The task families, in order of estimated difficulty (easiest to hardest):
In this task, the agent must move to a goal location. This represents a basic primitive used in many tasks throughout Minecraft. In addition to standard observations, the agent has access to a “compass” observation, which points to a set location 64 meters from the start location. The goal has a small random horizontal offset from this location and may be slightly below surface level. A unique block marks the goal location, so the agent must find the final goal by searching based on local visual features.
We present two variants of this task:
- Normal navigate: set in a random biome
- Extreme hills navigate: set in the “extreme hills” biome, requiring the agent to climb and bypass steep terrain.
In both cases, the agent is given a sparse reward (+100 upon reaching the goal, at which point the episode terminates). We also support a dense, reward-shaped version of Navigate, in which the agent receives a reward every tick proportional to how much closer it gets to the target (or a negative reward for moving farther away).
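The two reward schemes can be sketched as follows. This is a minimal illustration, not the environment's actual implementation: the function names, the use of Euclidean distance, the unit scaling of the shaping term, and the choice to keep the +100 goal bonus in the dense variant are all assumptions.

```python
import math

GOAL_REWARD = 100.0  # sparse reward stated in the task description


def distance(a, b):
    """Euclidean distance between two (x, y, z) positions (assumed metric)."""
    return math.dist(a, b)


def navigate_reward(prev_pos, cur_pos, goal_pos, reached_goal, dense=False):
    """Hypothetical per-tick reward for Navigate.

    Sparse: +100 only on the tick the goal is reached.
    Dense: additionally, the change in distance to the goal since the
    previous tick (positive when closer, negative when farther).
    """
    reward = GOAL_REWARD if reached_goal else 0.0
    if dense:
        reward += distance(prev_pos, goal_pos) - distance(cur_pos, goal_pos)
    return reward
```

Under these assumptions, an agent that steps one meter toward a goal 64 meters away receives 0 in the sparse variant and 1.0 in the dense variant; stepping one meter away yields -1.0 in the dense variant.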
In Treechop, the agent must collect as much wood as possible. This replicates a common scenario in Minecraft: wood is a key resource and is needed to craft a large number of items in the game.
The agent begins in a forest biome (near many trees) and with an iron axe for cutting trees. The agent is given +1 reward for obtaining each unit of wood, and the episode terminates once the agent obtains 64 units.
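The reward and termination rule described above can be sketched as a single step function. The function name and signature are hypothetical, and whether wood collected beyond the 64th unit on the final tick is still rewarded is not stated in the text; this sketch does not cap it.

```python
WOOD_TARGET = 64  # episode ends once this many units have been obtained


def treechop_step(wood_collected, wood_gained):
    """Hypothetical Treechop bookkeeping for one tick.

    Returns (reward, new_total, done): +1 per unit of wood gained this tick,
    and done becomes True once the running total reaches WOOD_TARGET.
    """
    reward = float(wood_gained)  # +1 per unit of wood
    new_total = wood_collected + wood_gained
    done = new_total >= WOOD_TARGET
    return reward, new_total, done
```

For example, calling `treechop_step(63, 1)` yields a reward of 1.0 and marks the episode as done, while `treechop_step(10, 0)` yields no reward and the episode continues.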
We include a number of related tasks which require the agent to obtain a more complex item. The agent begins in a random starting location without any items, matching the normal starting conditions for human players in Minecraft. Each task variant requires the agent to obtain one instance of a separate item, from a set of frequently used items:
- Iron pickaxe: a frequently used tool required for obtaining important raw materials
- Diamond: an item central to high-level Minecraft play, with a lot of gameplay centering around its discovery
- Cooked meat: cooked meat from a cow, chicken, sheep, or pig, which is necessary for survival in Minecraft. In this task, the agent is given a specific kind of meat to obtain
- Bed: made out of dye, wool, and wood, an item that is also vital to Minecraft survival. In this task, the agent is given a specific color of bed to create
Together, these items represent what a player would need to be able to survive and obtain access to further areas of the game.
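Since these tasks reward the agent only for obtaining the required item (+1), the reward signal could be sketched as a check on the agent's inventory between ticks. The inventory representation (a dict from item name to count), the item name strings, and the function name are illustrative assumptions rather than the environment's actual interface.

```python
def obtain_reward(prev_inventory, inventory, target_item):
    """Hypothetical sparse reward for an item-obtaining task.

    Returns 1.0 on the tick the target item first appears in the
    inventory, and 0.0 on every other tick.
    """
    had_item = prev_inventory.get(target_item, 0) > 0
    has_item = inventory.get(target_item, 0) > 0
    return 1.0 if has_item and not had_item else 0.0
```

Under this sketch, intermediate items such as sticks or planks contribute nothing to the reward, which is what makes the signal sparse: the agent sees a non-zero reward only once the full crafting chain is complete.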
In addition to data on specific, designed tasks, we provide data in "Survival." This is the standard open-ended game mode used by most players. Starting from a random location without any items, players formulate their own high-level goals and obtain items to complete these goals.
Since gameplay involves navigation and obtaining specific items, this data could also be used to train agents attempting to complete the other, structured tasks. There is no known reward function, and one must be extracted from examples of human play. Additionally, Survival is a multi-player setting where players may work cooperatively or play competitively.