Creating heuristic based agents for Project Malmo (PARTNER REQUIRED)

For this assignment, you will be creating agents that can traverse a maze in Project Malmo (Minecraft).

Now that we know some general tree-search algorithms (if you’re unclear on them, ask and/or go back and check the resources I’ve given you, such as the slides), we’re going to implement an agent that can find a goal block in a Minecraft maze.

We’ll be using Project Malmo for this assignment: https://github.com/Microsoft/malmo

Getting Malmo set up

Malmo on lab computers

You'll just have to run a few commands to have Project Malmo available while working on a lab machine.

Malmo on your own computers

You can follow the instructions provided in the official Project Malmo repository under Getting Started.

There was a time when a virtual environment was pretty key to make sure things aren't too messed up for you when using your own machine (because jeez your machines can have a bunch of random things and settings on them). This is not as much the case anymore. I still like to do it, but it shouldn't be needed unless you start running into 5 bagillion library errors.

Running the Minecraft server and your agent

Once you have downloaded and unzipped Malmo, you can run the Minecraft server by opening up a terminal, navigating to [MALMO_DIR]/Minecraft, and running the launchClient script: sh launchClient.sh

Minecraft is waiting for your agent!

Then (once the Malmo environment has completely loaded), you can run your model using the normal commands. There are several examples in the PythonExamples folder.

Make sure that you copy the MalmoPython.x file from the PythonExamples folder to the directory that your Sim & Agent files are located in!

The MazeSim code below can be run (after your Minecraft server has already been loaded) using the usual command: python3 MazeSim.py

Objective

The objective of this assignment is to make an Agent that can use both Breadth-First Search and A* Search. To complete both, you’ll need to decide how to implement the data structures needed to represent the tree and search it (e.g., for the frontier set).
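As a starting point for thinking about the frontier, here is a minimal sketch of two standard-library containers that fit the two searches. It assumes cells are stored as (row, col) tuples and that A* entries are (f, cell) pairs; both are illustrative choices, not requirements of the assignment.

```python
from collections import deque
import heapq

# Breadth-first search: a FIFO queue, so the oldest entry is expanded first.
bfs_frontier = deque()
bfs_frontier.append((0, 0))         # cells as (row, col) tuples (an assumption)
bfs_frontier.append((0, 1))
first_bfs = bfs_frontier.popleft()  # removes the first cell that was added

# A*: a priority queue ordered by f = g + h, so the lowest estimate is expanded first.
astar_frontier = []
heapq.heappush(astar_frontier, (7, (0, 1)))  # (f, cell) pairs (an assumption)
heapq.heappush(astar_frontier, (5, (1, 0)))
first_astar = heapq.heappop(astar_frontier)  # removes the entry with the smallest f

# A set gives fast membership checks for cells that were already explored.
explored = {first_bfs, first_astar[1]}
```

Whatever you choose, the key difference is the order in which entries come back out: insertion order for breadth-first, lowest estimated cost for A*.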

Your agent will be given a 2D list that represents the initial state of the agent, the goal state, possible places in the grid to move, and places to which your agent cannot move (the representations for each of these are included in the code documentation).
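To make the grid a bit more concrete, here is a small sketch of locating the start and goal and listing the cells you could step to from a given cell. The numeric codes come from the __fill_grid docstring in MazeSim (0 = cannot move there, 1 = can move there, 2 = start, 3 = goal); the helper names find_cell and neighbors and the 3x3 demo grid are made up for illustration.

```python
START, GOAL, BLOCKED = 2, 3, 0  # codes described in MazeSim's __fill_grid docstring

def find_cell(grid, value):
    """Return the (row, col) of the first cell equal to value, or None."""
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == value:
                return (r, c)
    return None

def neighbors(grid, cell):
    """Return in-bounds, non-blocked cells one step away (no diagonals)."""
    r, c = cell
    result = []
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[nr]) and grid[nr][nc] != BLOCKED:
            result.append((nr, nc))
    return result

# Hypothetical 3x3 grid, only for this example.
demo = [[2, 1, 0],
        [0, 1, 0],
        [0, 1, 3]]
start = find_cell(demo, START)
goal = find_cell(demo, GOAL)
```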

A bit about the 2D list/grid

Normally Malmo (as it’s set up with this file/maze) would give you a flattened list of strings that represents where each object is in a 2D grid. For example, the list that would correspond to the figure below would be: [“0”,“1”,“2”,“3”,“4”,“5”,“6”,“7”,“8”]. Going from “0” to “3” would be moving north 1 block, or “movenorth 1”. Thus, it would look like the figure below:

[Figure: Malmo Normal Grid]

To help out with this, I parse the list and give you a 2D grid that should prove a bit more natural for your path planning algorithm: [[“0”,“1”,“2”], [“3”,“4”,“5”], [“6”,“7”,“8”]]. I’ve also simplified things by changing the strings to integers that can easily be parsed as initial, goal, accessible, and inaccessible states (where the last is a point on the grid where the agent cannot go).
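The reshaping idea can be sketched in a few lines. This is a simplified illustration only: the real conversion lives in MazeSim's __fill_grid, which also walks the flat list in reverse index order and maps block names to integers. The function name to_rows is hypothetical.

```python
def to_rows(flat, width):
    """Split a flat list into consecutive rows of the given width."""
    return [flat[i:i + width] for i in range(0, len(flat), width)]

flat = ["0", "1", "2", "3", "4", "5", "6", "7", "8"]
grid = to_rows(flat, 3)
# "3" sits one row past "0" in the same column, matching the "movenorth 1" example.
cell_after_north = grid[1][0]
```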

Requirements (what in the hell do I turn in and how should it look?)

For this assignment you’ll need to complete the MazeAgent.py class so that it can get the correct movements to the goal using the get_path method. You should also complete the __plan_path_breadth and __plan_path_astar methods. Both should work within the get_path method, but when you turn in your MazeAgent.py file, only one of them must actually be used within the get_path method (basically, the file you turn in should be such that I can run the MazeSim.py file without errors).

In addition to the above file, you also should turn in a README file that includes the following:

Also, remember that if your agent is not working as you expect, you can just terminate the agent script and keep the server running (this way you should not have to restart the server). You can exit with “Save and Quit to Title”, but if you do this before stopping the agent/simulation script, the server may not respond the next time and may have to be restarted.

Grading

Grade item                            Points
Breadth-First implemented             3 pts
A* implemented                        5 pts
Reasonable design principles used     1 pt
Well documented (including README)    1 pt

The MazeAgent (starter) code

Below is the starter code for your maze agent. If you add functions/change naming & functions, make sure you document changes and your agent still works!

For our Maze Agent class, we are just going to have a few attributes:

__grid holds our actual representation of the map.

__goal_state holds the representation of our goal. I deliberately did not specify what this looks like so that you can figure out how you want to represent the problem, but the MazeSim code has constants & numbers that tell us which state (start, goal, open, or blocked) a grid cell is in (take a look at that code afterward).

class MazeAgent(object):
    '''
    Agent that uses path planning algorithm to figure out path to take to reach goal
    Built for Malmo discrete environment and to use Malmo discrete movements
    '''

    def __init__(self, grid=None):
        '''
        Arguments
            grid -- (optional) a 2D list that represents the map
        '''
        self.__frontier_set = None
        self.__explored_set = None
        self.__goal_state = None
        self.__grid = grid

Normal accessors & mutators....not much to see here ;-)

    def get_eset(self):
        return self.__explored_set

    def get_fset(self):
        return self.__frontier_set

    def get_goal(self):
        return self.__goal_state

    def set_grid(self, grid):
        self.__grid = grid
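One Python detail worth knowing here: attributes and methods whose names start with two underscores are name-mangled, so they are reachable by that name only from inside the class body. That is why the accessors above exist. A tiny sketch (the class Demo and attribute __secret are made up for illustration):

```python
class Demo(object):
    def __init__(self):
        self.__secret = 42        # actually stored as _Demo__secret

    def get_secret(self):
        return self.__secret      # works: we are inside the class body

d = Demo()
inside = d.get_secret()

# Outside the class, the unmangled name does not exist on the object.
try:
    d.__secret
    outside_ok = True
except AttributeError:
    outside_ok = False
```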

You should place your functionality into the __plan methods and use (one of) them within get_path


    def __plan_path_breadth(self):
        '''Breadth-First tree search'''
        pass

    def __plan_path_astar(self):
        '''A* tree search'''
        pass
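For A*, you'll need a heuristic. The sketch below shows one common choice for grids where only four-direction movement is allowed, Manhattan distance. The assignment does not require this particular heuristic, and the names manhattan and f_cost (and the (row, col) tuple representation) are just illustrative.

```python
def manhattan(cell, goal):
    """Number of row steps plus column steps between two (row, col) cells."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

def f_cost(g, cell, goal):
    """A* estimate: cost so far (g) plus the heuristic estimate to the goal (h)."""
    return g + manhattan(cell, goal)

h_example = manhattan((0, 0), (2, 3))
f_example = f_cost(4, (1, 1), (2, 3))
```

With unit-cost moves and no diagonals, this estimate never exceeds the true remaining number of moves, which is the property A* needs from a heuristic to return a shortest path.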

    def get_path(self):
        '''should return list of strings where each string gives movement command
            (these should be in order)
            Example:
             ["movenorth 1", "movesouth 1", "moveeast 1", "movewest 1"]
             (these are also the only four commands that can be used, you
             cannot move diagonally)
             On a 2D grid (list), "move north" would move us
             from, say, [0][0] to [1][0]
        '''
        pass
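Once your search has produced a sequence of cells, it still has to be turned into command strings. Here is a minimal sketch, assuming the path is a list of (row, col) tuples. North and south follow the get_path docstring ([0][0] to [1][0] is north). The east/west column directions are NOT stated in the text; the mapping below is an assumption (based on __fill_grid filling the grid in reverse order), so verify it in Minecraft and swap the two entries if your agent walks the wrong way. The names STEP_TO_COMMAND and path_to_commands are hypothetical.

```python
# (row change, col change) -> Malmo discrete movement command
STEP_TO_COMMAND = {
    (1, 0): "movenorth 1",    # from the get_path docstring
    (-1, 0): "movesouth 1",   # opposite of north
    (0, -1): "moveeast 1",    # ASSUMPTION: verify in Minecraft
    (0, 1): "movewest 1",     # ASSUMPTION: verify in Minecraft
}

def path_to_commands(path):
    """Convert consecutive (row, col) cells into movement command strings."""
    commands = []
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        commands.append(STEP_TO_COMMAND[(r1 - r0, c1 - c0)])
    return commands

cells = [(0, 0), (1, 0), (1, 1)]
commands = path_to_commands(cells)
```

Also check the order MazeSim expects: it sends commands with movements.pop(), which takes from the end of the list.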

The MazeSim class

The MazeSim class creates the environment and runs your agent (MazeAgent).

Use this code to run your agent. You should expect that when I run your agent, I will use this class.

Thus, don't use a modified version of this for your own testing!

Most of this code is just setup and things you won't need to worry about. However, below the full code, I highlight where your code affects the running of the simulation.

# ------------------------------------------------------------------------------------------------
# Copyright (c) 2016 Microsoft Corporation
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
# associated documentation files (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge, publish, distribute,
# sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all copies or
# substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
# NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# ------------------------------------------------------------------------------------------------

'''
 Modified from Maze python example by Chris Dancy @ Bucknell University for
 AI & Cognitive Science course
'''

import MalmoPython
import os
import random
import sys
import time
import json
import errno
from MazeAgent import MazeAgent

class MazeSim(object):
    MAP_SIZE = 60
    MS_PER_TICK = 50

    FLOOR_BLOCK = "grass"
    GAP_BLOCK = "stone"
    PATH_BLOCK = "sandstone"
    START_BLOCK = "emerald_block"
    GOAL_BLOCK = "gold_block"

    #Canvas params
    CANVAS_BORDER = 20
    CANVAS_WIDTH = 400
    CANVAS_HEIGHT = CANVAS_BORDER + ((CANVAS_WIDTH - CANVAS_BORDER))
    CANVAS_SCALEX = (CANVAS_WIDTH-CANVAS_BORDER)/MAP_SIZE
    CANVAS_SCALEY = (CANVAS_HEIGHT-CANVAS_BORDER)/MAP_SIZE
    CANVAS_ORGX = -MAP_SIZE/CANVAS_SCALEX
    CANVAS_ORGY = -MAP_SIZE/CANVAS_SCALEY

    DEFAULT_MAZE = '''
        <MazeDecorator>
            <SizeAndPosition length="''' + str(MAP_SIZE-1) + '''"\
                width="''' + str(MAP_SIZE-1) + '''" \
                yOrigin="225" zOrigin="0" height="180"/>
            <GapProbability variance="0.4">0.5</GapProbability>
            <Seed>15</Seed>
            <MaterialSeed>random</MaterialSeed>
            <AllowDiagonalMovement>false</AllowDiagonalMovement>
            <StartBlock fixedToEdge="true" type="emerald_block"/>
            <EndBlock fixedToEdge="true" type="''' + GOAL_BLOCK + '''" height="12"/>
            <PathBlock type="''' + PATH_BLOCK + '''" colour="WHITE ORANGE MAGENTA LIGHT_BLUE YELLOW LIME PINK GRAY SILVER CYAN PURPLE BLUE BROWN GREEN RED BLACK" height="1"/>
            <FloorBlock type="''' + FLOOR_BLOCK + '''"/>
            <GapBlock type="'''+ GAP_BLOCK + '''" height="2"/>
            <AddQuitProducer description="finished maze"/>
        </MazeDecorator>
    '''

    def __init__(self, maze_str=None, agent=None):
        if (not(maze_str is None)):
            self.__maze_str = maze_str
        else:
            self.__maze_str = MazeSim.DEFAULT_MAZE

        self.__maze_grid = [["Empty" for x in range(MazeSim.MAP_SIZE)] \
                            for x in range(MazeSim.MAP_SIZE)]
        self.agent = agent

    def get_mission_xml(self):
        return '''<?xml version="1.0" encoding="UTF-8" ?>
        <Mission xmlns="http://ProjectMalmo.microsoft.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <About>
                <Summary>Run the maze!</Summary>
            </About>

            <ModSettings>
                <MsPerTick>''' + str(MazeSim.MS_PER_TICK) + '''</MsPerTick>
            </ModSettings>

            <ServerSection>
                <ServerInitialConditions>
                    <AllowSpawning>false</AllowSpawning>
                </ServerInitialConditions>
                <ServerHandlers>
                    <FlatWorldGenerator generatorString="3;7,220*1,5*3,2;3;,biome_1" />
                    ''' + self.__maze_str + '''
                    <ServerQuitFromTimeUp timeLimitMs="45000"/>
                    <ServerQuitWhenAnyAgentFinishes />
                </ServerHandlers>
            </ServerSection>

            <AgentSection mode="Survival">
                <Name>A* Smart Guy</Name>
                <AgentStart>
                    <Placement x="1" y="81" z="1"/>
                </AgentStart>
                <AgentHandlers>
                    <ObservationFromGrid>
                        <Grid name="World" absoluteCoords="true">
                            <min x="0" y="226" z="0"/>
                            <max x="''' + str(MazeSim.MAP_SIZE-1) + \
                            '''" y="226" z="''' + str(MazeSim.MAP_SIZE-1) + '''"/>
                        </Grid>
                    </ObservationFromGrid>
                    <DiscreteMovementCommands />
                </AgentHandlers>
            </AgentSection>

        </Mission>'''

    def __fill_grid(self, observations):
        '''
        Converts observation string grid (which is flat) into 2D list of observations
        This simplifies the list so that the initial location is marked with a 2,
        goal location is marked with a 3,
         all invalid locations/blocks are marked with a 0,
         and all valid moves/blocks are marked with a 1
        Arguments:
            observations -- list of strings in order by a flattened grid
        '''
        flat_grid_max = len(observations)
        grid_max = len(self.__maze_grid)
        for i in range(flat_grid_max):
            curr_row = ((flat_grid_max-1)-i)//grid_max
            curr_col = ((flat_grid_max-1)-i)%grid_max
            self.__maze_grid[curr_row][curr_col] = self.conv_obs_str(observations[i])

    def conv_obs_str(self, obs_str):
        '''
        Converts a given object string to the numerical representation for our
         grid according to a few simple tests
        '''
        if (obs_str == MazeSim.GOAL_BLOCK):
            return 3
        elif (obs_str == MazeSim.START_BLOCK):
            return 2
        elif (obs_str == MazeSim.PATH_BLOCK):
            return 1
        else:
            return 0


    def run_sim(self, num_reps=1):
        validate = True

        agent_host = MalmoPython.AgentHost()
        try:
            agent_host.parse( sys.argv )
        except RuntimeError as e:
            print('ERROR:',e)
            print(agent_host.getUsage())
            exit(1)
        if agent_host.receivedArgument("help"):
            print(agent_host.getUsage())
            exit(0)

        agent_host.setObservationsPolicy(MalmoPython.ObservationsPolicy.LATEST_OBSERVATION_ONLY)

        recordingsDirectory="MazeRecordings"

        try:
            os.makedirs(recordingsDirectory)
        except OSError as exception:
            if exception.errno != errno.EEXIST: # ignore error if already existed
                raise

        # Set up a recording
        my_mission_record = MalmoPython.MissionRecordSpec()
        my_mission_record.recordRewards()
        my_mission_record.recordObservations()

        for iRepeat in range(num_reps):
            my_mission_record.setDestination(recordingsDirectory + "//" + "Mission_" + str(iRepeat) + ".tgz")
            my_mission = MalmoPython.MissionSpec(self.get_mission_xml(),validate)

            max_retries = 3
            for retry in range(max_retries):
                try:
                    agent_host.startMission( my_mission, my_mission_record )
                    break
                except RuntimeError as e:
                    if retry == max_retries - 1:
                        print("Error starting mission:",e)
                        exit(1)
                    else:
                        time.sleep(2)

            print("Waiting for the mission to start")
            world_state = agent_host.getWorldState()
            while not world_state.has_mission_begun:
                sys.stdout.write(".")
                time.sleep(0.1)
                world_state = agent_host.getWorldState()
                if len(world_state.errors):
                    for error in world_state.errors:
                        print("Error:",error.text)
                        exit()

            movements = None

            i = 0
            # main loop:
            while world_state.is_mission_running:
                if world_state.number_of_observations_since_last_state > 0:
                    #i += 1
                    msg = world_state.observations[-1].text
                    obs = json.loads(msg)
                    self.__fill_grid(obs["World"])

                    self.agent.set_grid(self.__maze_grid)

                    #You need to make it so this works! :-)
                    if (movements is None):
                        movements = self.agent.get_path()
                        print()
                        print(len(movements))

                    #i = i % len(movements)
                    try:
                        #Moves are presented in reverse order (last move 1st)
                        agent_host.sendCommand( movements.pop() )
                        #for comm in movements:
                        #    agent_host.sendCommand( comm )
                    except RuntimeError as e:
                        print("Failed to send command:",e)
                        pass

                world_state = agent_host.getWorldState()

            #print "Mission has stopped."
            time.sleep(0.5) # Give mod a little time to get back to dormant state.


#Change the MazeAgent as needed, but that should be the only part of the code that
# you need to change
global movements
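#The starter MazeAgent.__init__ only accepts a grid; pass extra arguments
# (e.g., MazeAgent(None, "astar")) only if you add them to your constructor
smart_guy = MazeAgent()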
smart_guy_sim = MazeSim(agent=smart_guy)
smart_guy_sim.run_sim()
print(len(smart_guy.get_eset()))
print(len(smart_guy.get_fset()))

So, your agent is going to affect the running of the simulation in a simple way: within the main loop, we will use your agent (self.agent) to get the path that we should follow! In the MazeAgent starter code, I explain that this should be a list of actions (and I explain what those actions should look like there as well). Of course, we'll also initialize your agent (the MazeAgent class) in the code at the bottom of the file.

# main loop:
while world_state.is_mission_running:
    if world_state.number_of_observations_since_last_state > 0:
        #i += 1
        msg = world_state.observations[-1].text
        obs = json.loads(msg)
        self.__fill_grid(obs["World"])

        self.agent.set_grid(self.__maze_grid)

        #You need to make it so this works! :-)
        if (movements is None):
            movements = self.agent.get_path()
            print()
            print(len(movements))

        #i = i % len(movements)
        try:
            #Moves are presented in reverse order (last move 1st)
            agent_host.sendCommand( movements.pop() )
            #for comm in movements:
            #    agent_host.sendCommand( comm )
        except RuntimeError as e:
            print("Failed to send command:",e)
            pass

    world_state = agent_host.getWorldState()

#print "Mission has stopped."
time.sleep(0.5) # Give mod a little time to get back to dormant state.


#Change the MazeAgent as needed, but that should be the only part of the code that
# you need to change
global movements
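#The starter MazeAgent.__init__ only accepts a grid; pass extra arguments
# (e.g., MazeAgent(None, "astar")) only if you add them to your constructor
smart_guy = MazeAgent()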
smart_guy_sim = MazeSim(agent=smart_guy)
smart_guy_sim.run_sim()
print(len(smart_guy.get_eset()))
print(len(smart_guy.get_fset()))
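One thing to watch in the main loop: commands are sent with movements.pop(), and list.pop() with no argument removes the LAST element. The small sketch below shows the resulting send order without needing Malmo at all. The command list is made up, and the while guard is my addition for the demo (the loop in MazeSim does not check for an empty list).

```python
# Hypothetical list as it might come back from get_path()
movements = ["moveeast 1", "movenorth 1", "movenorth 1"]

sent = []
while movements:                   # stop once every command has been consumed
    sent.append(movements.pop())   # same call MazeSim uses; takes from the end
```

So the command at the end of the list is executed first. Keep that in mind when deciding what order your get_path should return its commands in.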