For this assignment, you will be creating agents that can traverse a maze in Project Malmo (Minecraft).
Now that we know some general tree-searching algorithms (if you’re unclear on them, ask and/or go back and check the resources I’ve given, for example the slides), we’re going to implement an agent that can find a goal block in a Minecraft maze.
We’ll be using Project Malmo for this assignment: https://github.com/Microsoft/malmo
Once the Malmo environment has completely loaded (see the appendix for instructions), you can run your model using the normal commands.
The MazeSim code below can be run (after your Minecraft server has already been loaded) using the normal python MazeSim.py
The objective of this assignment is to make an agent that can use both Breadth-First Search and A* Search. To complete both, you’ll need to decide how to implement the data structures needed to represent the tree and search it (e.g., for the frontier set).
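As one possible starting point for the frontier set, here is a minimal sketch (not a required design) of the two queue disciplines involved: a first-in, first-out queue for Breadth-First Search and a priority queue ordered by f = g + h for A*. The (row, col) state tuples, the manhattan helper, and the choice of Manhattan distance as the heuristic are assumptions made for illustration.

```python
from collections import deque
import heapq
import itertools


def manhattan(a, b):
    """Manhattan distance between two (row, col) cells (an assumed heuristic)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Breadth-First Search: the oldest frontier entry is expanded first.
bfs_frontier = deque()
bfs_frontier.append((0, 0))
bfs_frontier.append((0, 1))
first_bfs = bfs_frontier.popleft()

# A*: the entry with the lowest f = g + h is expanded first.
goal = (0, 2)
tie_breaker = itertools.count()  # keeps heap entries comparable when f ties
astar_frontier = []
for state, g in [((1, 0), 1), ((0, 1), 1)]:
    f = g + manhattan(state, goal)
    heapq.heappush(astar_frontier, (f, next(tie_breaker), state))
first_astar = heapq.heappop(astar_frontier)[2]
```

Both structures come from the Python standard library; any representation with the same ordering behavior would work just as well.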
Your agent will be given a 2D list that represents the initial state of the agent, the goal state, possible places in the grid to move, and places to which your agent cannot move (the representation for each of these is described in the code documentation).
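To make "places to move" versus "places to which your agent cannot move" concrete, here is a small sketch of expanding one cell into its accessible neighbors. It assumes the integer codes described in the __fill_grid docstring of MazeSim (0 for invalid blocks, 1 for path, 2 for start, 3 for goal); the open_neighbors helper and the example grid are hypothetical.

```python
BLOCKED = 0  # per MazeSim's __fill_grid docstring: 0 = invalid, 1 = path, 2 = start, 3 = goal


def open_neighbors(grid, cell):
    """Return the in-bounds, non-blocked cells one step away from `cell`."""
    r, c = cell
    result = []
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # no diagonal moves
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[nr]) and grid[nr][nc] != BLOCKED:
            result.append((nr, nc))
    return result


example_grid = [
    [2, 1, 0],
    [0, 1, 0],
    [0, 1, 3],
]
neighbors = open_neighbors(example_grid, (1, 1))
```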
Normally Malmo (as it’s set up with this file/maze) would give you a flattened list of strings that represent where each object is in a 2D grid. For example, the list that would correspond to the figure below would be: ["0","1","2","3","4","5","6","7","8"]. To go from "0" to "3" would be moving north 1 block, or "movenorth 1". Thus, it would look like the figure below:
To help out with this, I parse the list and give you a 2D grid that should prove a bit more natural for your path planning algorithm: [["0","1","2"], ["3","4","5"], ["6","7","8"]]. I’ve also simplified things by changing the strings to integers that can easily be parsed as initial, goal, accessible, and inaccessible states (where the latter is a point on the grid where the agent cannot go).
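The flat-to-2D idea can be sketched in a few lines. This only illustrates the example above: the to_grid helper is hypothetical, and the __fill_grid method in MazeSim uses its own (reversed) index mapping rather than this one.

```python
def to_grid(flat, width):
    """Split a flat list into rows of `width` items (hypothetical helper)."""
    if width <= 0 or len(flat) % width != 0:
        raise ValueError("flat list length must be a multiple of width")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


flat = ["0", "1", "2", "3", "4", "5", "6", "7", "8"]
grid = to_grid(flat, 3)

# Moving north from "0" lands on "3": same column, next row.
row, col = 0, 0
north_of_start = grid[row + 1][col]
```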
For this assignment you’ll need to complete the MazeAgent class (MazeAgent.py) so that it can get the correct movements to the goal using the get_path method. You should also complete the __plan_path_breadth and __plan_path_astar methods. Both should work within the get_path method, but when you turn in your MazeAgent.py file, only one of them needs to be used within get_path (basically, the file you turn in should be such that I can run the MazeSim.py file without errors).
In addition to the above file, you also should turn in a README file that includes the following:
Also, remember that if your agent is not working as you expect, you can just terminate the agent script and keep the server running (this way you should not have to restart the server). You can also exit via “Save and Quit to Title”, but if you do this before stopping the agent/simulation script, the server may not respond the next time and may have to be restarted.
Grade item | Points |
---|---|
Breadth-First & A* implemented | 5 pts |
Well documented (including readme) | 3 pts |
Below is the starter code for your maze agent. If you add functions/change naming & functions, make sure you document changes and your agent still works!
For our MazeAgent class, we are just going to have a few attributes:
__grid holds our actual representation of the map.
__goal_state holds the representation of our goal. I deliberately did not specify its format so that you can figure out how you want to represent the problem, but the MazeSim code has constants & numbers to tell us what current state we are in (take a look at that code afterwards).
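Since the goal representation is left open, one option is a (row, col) tuple found by scanning the grid. This is only a sketch under that assumption: the codes 2 (start) and 3 (goal) come from the __fill_grid docstring in MazeSim, while the find_cell helper and the example grid are hypothetical.

```python
START, GOAL = 2, 3  # cell codes documented in MazeSim's __fill_grid docstring


def find_cell(grid, code):
    """Return (row, col) of the first cell equal to `code`, or None if absent."""
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value == code:
                return (r, c)
    return None


example_grid = [
    [0, 2, 0],
    [0, 1, 1],
    [0, 0, 3],
]
start_state = find_cell(example_grid, START)
goal_state = find_cell(example_grid, GOAL)
```

Tuples are hashable, so states in this form can go straight into a set or dictionary when you track explored cells.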
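class MazeAgent(object):
    '''
    Agent that uses path planning algorithm to figure out path to take to reach goal
    Built for Malmo discrete environment and to use Malmo discrete movements
    '''

    def __init__(self, grid=None, search_type="astar"):
        '''
        Arguments
            grid -- (optional) a 2D list that represents the map
            search_type -- (optional) which search get_path should use; the
                parameter name is an assumption, added because MazeSim.py
                constructs the agent as MazeAgent(None, "astar")
        '''
        self.__frontier_set = None
        self.__explored_set = None
        self.__goal_state = None
        self.__grid = grid
        self.__search_type = search_type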
Normal accessors & mutators....not much to see here ;-)
    def get_eset(self):
        return self.__explored_set

    def get_fset(self):
        return self.__frontier_set

    def get_goal(self):
        return self.__goal_state

    def set_grid(self, grid):
        self.__grid = grid
You should place your functionality into the __plan methods and use (one of) them within get_path
    def __plan_path_breadth(self):
        '''Breadth-First tree search'''
        pass

    def __plan_path_astar(self):
        '''A* tree search'''
        pass

    def get_path(self):
        '''should return list of strings where each string gives movement command
        (these should be in order; note that MazeSim takes moves with
        movements.pop(), so the last item in the list is executed first)
        Example:
         ["movenorth 1", "movesouth 1", "moveeast 1", "movewest 1"]
         (these are also the only four commands that can be used, you
         cannot move diagonally)
        On a 2D grid (list), "move north" would move us
        from, say, [0][0] to [1][0]
        '''
        pass
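Whatever search you use, its result eventually has to become the command strings that get_path returns. Here is a minimal sketch of that last step, assuming your search produces a list of (row, col) cells from start to goal. The path_to_commands helper is hypothetical. The north/south rows of the table follow the get_path docstring; the east/west rows are an assumption that you should verify in your own maze. The list is reversed because MazeSim takes moves with movements.pop().

```python
# (row change, col change) -> Malmo discrete movement command.
# North/south follow the get_path docstring; east/west are an ASSUMPTION
# and should be checked against your own maze before you rely on them.
STEP_TO_COMMAND = {
    (1, 0): "movenorth 1",
    (-1, 0): "movesouth 1",
    (0, -1): "moveeast 1",
    (0, 1): "movewest 1",
}


def path_to_commands(cells):
    """Convert consecutive (row, col) cells into commands for MazeSim.

    The result is reversed because MazeSim calls list.pop(), which
    removes the LAST element of the list first.
    """
    commands = []
    for (r0, c0), (r1, c1) in zip(cells, cells[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in STEP_TO_COMMAND:
            raise ValueError("cells %r and %r are not adjacent" % ((r0, c0), (r1, c1)))
        commands.append(STEP_TO_COMMAND[step])
    commands.reverse()
    return commands


moves = path_to_commands([(0, 0), (1, 0), (1, 1)])
first_executed = moves.pop()  # the same call MazeSim makes
```

Keeping the direction table in one place means that if east and west turn out to be swapped in your maze, only two entries need to change.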
The MazeSim class creates the environment and runs your agent, MazeAgent.
Use this code to run your agent. You should expect that when I run your agent, I will use this class.
Thus, don't use a modified version of this for your own testing!
Most of this code is just setup and things you won't need to worry about. However, below the full code, I highlight where your code affects the running of the simulation.
'''
Modified from Maze python example by Chris Dancy @ Bucknell University for
AI & Cognitive Science course
'''
import os, random, argparse, sys, time, json, errno
import malmoenv
from MazeAgent import MazeAgent

class MazeSim():
    MAP_SIZE = 60
    MS_PER_TICK = 50
    FLOOR_BLOCK = "grass"
    GAP_BLOCK = "stone"
    PATH_BLOCK = "sandstone"
    START_BLOCK = "emerald_block"
    GOAL_BLOCK = "gold_block"
    #Canvas params
    CANVAS_BORDER = 20
    CANVAS_WIDTH = 400
    CANVAS_HEIGHT = CANVAS_BORDER + ((CANVAS_WIDTH - CANVAS_BORDER))
    CANVAS_SCALEX = (CANVAS_WIDTH-CANVAS_BORDER)/MAP_SIZE
    CANVAS_SCALEY = (CANVAS_HEIGHT-CANVAS_BORDER)/MAP_SIZE
    CANVAS_ORGX = -MAP_SIZE/CANVAS_SCALEX
    CANVAS_ORGY = -MAP_SIZE/CANVAS_SCALEY
    DEFAULT_MAZE = '''
<MazeDecorator>
<SizeAndPosition length="''' + str(MAP_SIZE-1) + '''"\
width="''' + str(MAP_SIZE-1) + '''" \
yOrigin="225" zOrigin="0" height="180"/>
<GapProbability variance="0.4">0.5</GapProbability>
<Seed>15</Seed>
<MaterialSeed>random</MaterialSeed>
<AllowDiagonalMovement>false</AllowDiagonalMovement>
<StartBlock fixedToEdge="true" type="emerald_block"/>
<EndBlock fixedToEdge="true" type="''' + GOAL_BLOCK + '''" height="12"/>
<PathBlock type="''' + PATH_BLOCK + '''" colour="WHITE ORANGE MAGENTA LIGHT_BLUE YELLOW LIME PINK GRAY SILVER CYAN PURPLE BLUE BROWN GREEN RED BLACK" height="1"/>
<FloorBlock type="''' + FLOOR_BLOCK + '''"/>
<GapBlock type="'''+ GAP_BLOCK + '''" height="2"/>
<AddQuitProducer description="finished maze"/>
</MazeDecorator>
'''
    def __init__(self, maze_str=None, agent=None):
        if (not(maze_str is None)):
            self.__maze_str = maze_str
        else:
            self.__maze_str = MazeSim.DEFAULT_MAZE
        self.__maze_grid = [["Empty" for x in range(MazeSim.MAP_SIZE)] \
                            for x in range(MazeSim.MAP_SIZE)]
        self.agent = agent

    def get_mission_xml(self):
        return '''<?xml version="1.0" encoding="UTF-8" ?>
<Mission xmlns="http://ProjectMalmo.microsoft.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<About>
<Summary>Run the maze!</Summary>
</About>
<ModSettings>
<MsPerTick>''' + str(MazeSim.MS_PER_TICK) + '''</MsPerTick>
</ModSettings>
<ServerSection>
<ServerInitialConditions>
<AllowSpawning>false</AllowSpawning>
</ServerInitialConditions>
<ServerHandlers>
<FlatWorldGenerator generatorString="3;7,220*1,5*3,2;3;,biome_1" />
''' + self.__maze_str + '''
<ServerQuitFromTimeUp timeLimitMs="45000"/>
<ServerQuitWhenAnyAgentFinishes />
</ServerHandlers>
</ServerSection>
<AgentSection mode="Survival">
<Name>A* Smart Guy</Name>
<AgentStart>
<Placement x="10" y="228" z="1"/>
</AgentStart>
<AgentHandlers>
<VideoProducer want_depth="false">
<Width>640</Width>
<Height>480</Height>
</VideoProducer>
<ObservationFromGrid>
<Grid name="World" absoluteCoords="true">
<min x="0" y="226" z="0"/>
<max x="''' + str(MazeSim.MAP_SIZE-1) + \
'''" y="226" z="''' + str(MazeSim.MAP_SIZE-1) + '''"/>
</Grid>
</ObservationFromGrid>
<DiscreteMovementCommands />
</AgentHandlers>
</AgentSection>
</Mission>'''
    def __fill_grid(self, observations):
        '''
        Converts observation string grid (which is flat) into 2D list of observations
        This simplifies the list so that the initial location is marked with a 2,
         goal location is marked with a 3,
         all invalid locations/blocks are marked with a 0,
         and all valid moves/blocks are marked with a 1
        Arguments:
            observations -- list of strings in order by a flattened grid
        '''
        flat_grid_max = len(observations)
        grid_max = len(self.__maze_grid)
        for i in range(flat_grid_max):
            # Index from the end of the flat list, so the grid is filled in reverse
            curr_row = ((flat_grid_max-1)-i)//grid_max
            curr_col = ((flat_grid_max-1)-i)%grid_max
            self.__maze_grid[curr_row][curr_col] = self.conv_obs_str(observations[i])

    def conv_obs_str(self, obs_str):
        '''
        Converts a given object string to the numerical representation for our
         grid according to a few simple tests
        '''
        if (obs_str == MazeSim.GOAL_BLOCK):
            return 3
        elif (obs_str == MazeSim.START_BLOCK):
            return 2
        elif (obs_str == MazeSim.PATH_BLOCK):
            return 1
        else:
            return 0

    def create_actions(self):
        '''Returns list of actions that make up agent action space (discrete movements)
        '''
        actions = [0] * 5
        actions[0] = "movenorth 1"
        actions[1] = "moveeast 1"
        actions[2] = "movesouth 1"
        actions[3] = "movewest 1"
        actions[4] = "move 0"
        return (actions)
    def run_sim(self, exp_role, num_episodes, port1, serv1, serv2, exp_id, epi, rsync):
        '''Code to actually run simulation
        '''
        validate = True
        movements = None
        env = malmoenv.make()
        env.init(self.get_mission_xml(),
                 port1, server=serv1,
                 server2=serv2, port2=(port1 + exp_role),
                 role=exp_role,
                 exp_uid=exp_id,
                 episode=epi,
                 resync=rsync,
                 action_space = malmoenv.ActionSpace(self.create_actions()))
        max_num_steps = 1000
        for r in range(num_episodes):
            print("Reset [" + str(exp_role) + "] " + str(r) )
            movements = None
            max_retries = 3
            env.reset()
            num_steps = 0
            sim_done = False
            total_reward = 0
            total_commands = 0
            # Action 4 is "move 0": stand still until an observation arrives
            (obs, reward, sim_done, info) = env.step(4)
            while not sim_done:
                num_steps += 1
                if (info is None or len(info) == 0):
                    (obs, reward, sim_done, info) = env.step(4)
                elif (movements is None):
                    info_json = json.loads(info)
                    self.__fill_grid(info_json["World"])
                    self.agent.set_grid(self.__maze_grid)
                    #You need to make it so this works! :-)
                    if (movements is None):
                        movements = self.agent.get_path()
                        print(movements)
                        print(len(movements))
                else:
                    try:
                        #Moves are presented in reverse order (last move 1st)
                        next_move = movements.pop()
                        (obs, reward, sim_done, info) = env.step(env.action_space.actions.index(next_move))
                    except RuntimeError as e:
                        print("Issue with command/action: ",e)
                        pass
                time.sleep(0.05)
            #print "Mission has stopped."
            time.sleep(0.5) # Give mod a little time to get back to dormant state.

#Change the MazeAgent as needed, but that should be the only part of the code that
# you need to change
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='malmoenv test')
    parser.add_argument('--port', type=int, default=9000, help='the mission server port')
    parser.add_argument('--server', type=str, default='127.0.0.1', help='the mission server DNS or IP address')
    parser.add_argument('--server2', type=str, default=None, help="(Multi-agent) role N's server DNS or IP")
    parser.add_argument('--port2', type=int, default=9000, help="(Multi-agent) role N's mission port")
    parser.add_argument('--episodes', type=int, default=10, help='the number of resets to perform - default is 10')
    parser.add_argument('--episode', type=int, default=0, help='the start episode - default is 0')
    parser.add_argument('--resync', type=int, default=0, help='exit and re-sync on every N - default 0 meaning never')
    parser.add_argument('--experimentUniqueId', type=str, default='test1', help="the experiment's unique id.")
    args = parser.parse_args()
    if args.server2 is None:
        args.server2 = args.server
    #smart_guy = MazeAgent()
    smart_guy = MazeAgent(None, "astar")
    smart_guy_sim = MazeSim(agent=smart_guy)
    smart_guy_sim.run_sim(0, args.episodes, args.port, args.server, args.server2,
                          args.experimentUniqueId, args.episode, args.resync)
    print(len(smart_guy.get_eset()))
    print(len(smart_guy.get_fset()))
So, your agent is going to affect the running of the simulation in a simple way:
Within the main loop, we will use your agent (self.agent) to get the path that we should follow!
In the MazeAgent starter code, I explain that this should be a list of actions (and I explain what those actions should look like there as well).
Of course, we'll also initialize your agent, which should be contained in MazeAgent, in the code at the bottom of the file.
In the sections that follow, you'll find installation instructions for Malmo, the simulation code (I'd suggest calling it MobSim.py), and some Neural Net starter code.
The Malmo/Minecraft software
This is the software that you'll need to run a Minecraft server on your machine. I've provided a zip on Google Classroom for these files. You'll want to place each of the folders in the zip into the base directory of the simulation file that you'll be running.
Check your Java version:
java -version
If you see java 8 or java 1.8, you should be fine.
Create a virtual environment called malmoEnv
username$ python3 -m venv malmoEnv
On Mac/Linux:
Activate environment called malmoEnv
username$ source malmoEnv/bin/activate
To deactivate any virtual environment you are in
(Do not do this until you want to not use your virtual environment anymore)
$ deactivate
On Windows:
Activate environment called malmoEnv
malmoEnv\Scripts\activate.bat
To deactivate any virtual environment you are in
(Do not do this until you want to not use your venv)
deactivate
Once the environment is active, your prompt should show its name, for example:
(malmoEnv) Labcomputer:project user1234$
(If you run which python in terminal or where python in cmd, you should see the path to your virtual environment listed first)
Install the required packages:
(malmoEnv) username$ python -m pip install gym lxml numpy pillow
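As an extra check alongside which python / where python, Python itself can report whether it is running inside a virtual environment. This is a small sketch that relies only on the standard library; the in_virtualenv helper name is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when Python is running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints False after you have activated malmoEnv, the packages are probably being installed for a different Python than the one you expect.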
Congrats, you should have all you need installed! Now let's move on to getting the files you need.
Open a terminal in the directory where the Minecraft & Schemas folders are located (the folders contained in the zip), then start the Minecraft server.
On Windows:
Minecraft\launchClient.bat -port 9000 -env
On Mac/Linux:
username$ sh Minecraft/launchClient.sh -port 9000 -env
On a lab computer:
username$ module load java/1.8
username$ sh Minecraft/launchClient.sh -port 9000 -env
If you are on a lab computer, you will need to run module load java/1.8 before starting the Minecraft server each time in the same terminal.
Once the server has loaded, run the simulation:
username$ python MalmoSim.py
Hint: ctrl+c ctrl+c to restart/rebuild the Minecraft server!!!