Frozen Lake environment

Familiarize yourself with the Frozen Lake environment

  1. In a new cell in the notebook, initialize a FrozenLakeEnv object and call the render() method:

    # FrozenLakeEnv lives in gym's toy_text module; import it if you haven't already:
    # from gym.envs.toy_text.frozen_lake import FrozenLakeEnv
    env = FrozenLakeEnv()
    env.render()
    

    You should see an ANSI representation of the game board and current state.

    The ‘S’ in the top left corner is for ‘start’ and the ‘G’ in the bottom right corner is the ‘goal’. ‘F’ and ‘H’ represent ‘frozen’ and ‘hole’ respectively. The idea of the game is to navigate to the goal without falling into a hole in the ice. This would be trivial, except that the ice is slippery, which makes your moves non-deterministic (a sketch after these steps demonstrates the effect).

    The most important concepts for interacting with the environment are state, action and reward. As long as the board is considered fixed (later you will use a dynamic board), state is just the location of the cursor. The environment object stores this in the s attribute.

    >>> env.s
    0

    State increases as you move to the right and down. The states of the whole board look like this:

        0   1   2   3
        4   5   6   7
        8   9  10  11
       12  13  14  15

    An action is a decision an agent makes about what to do next. In this game, there are four possible actions: left, down, right and up. These four actions are represented by the integers 0, 1, 2, and 3 respectively.

    The last important concept for the environment is the reward. In this game, the reward is 1.0 for reaching the goal and 0.0 for all other steps. During training, the goal of the agent will be to find a policy that maximizes the reward.

  2. Turn slippery mode off and take a few steps with deterministic actions to get a feel for the API. In a new cell, create a new environment with slippery mode off and render it:

    env = FrozenLakeEnv(is_slippery=False)
    env.render()
    
  3. In another new cell, define constants for the directions, execute an action with the step() method and re-render the board:

    LEFT = 0
    DOWN = 1
    RIGHT = 2
    UP = 3
    
    env.step(DOWN)
    env.render()
    

    The cursor moves down one row to a new frozen cell.

  4. Do this a few times with different actions. Notice that the game finishes if you land in an ‘H’ or ‘G’ cell. The environment will keep accepting actions, but they won’t have any effect. You can call the reset() method to start over. The sketches after this list show how to capture the values step() returns and how slippery mode changes where an action actually takes you.
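
    The step() method does more than move the cursor: it also returns the information an agent needs for learning. Below is a minimal sketch of capturing those values, assuming the classic gym API in which step() returns a 4-tuple of (observation, reward, done, info), the default 4 × 4 board, and the direction constants defined above:

    env = FrozenLakeEnv(is_slippery=False)
    env.reset()

    # One safe path to the goal on the default board.
    for action in [RIGHT, RIGHT, DOWN, DOWN, DOWN, RIGHT]:
        state, reward, done, info = env.step(action)
        print(state, reward, done)

    # Every step returns reward 0.0 until the last one, which lands on ‘G’
    # with reward 1.0 and done=True. Once done is True, further actions have
    # no effect, so start a new episode:
    env.reset()

    If you are running a newer gym or gymnasium release, step() instead returns five values (observation, reward, terminated, truncated, info), so adjust the unpacking accordingly.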
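
    To see why slippery mode makes the game hard, create a slippery environment (slippery is also the default) and repeat the same action from the start state a few times. A minimal sketch, again assuming the classic 4-tuple API:

    env = FrozenLakeEnv(is_slippery=True)

    for _ in range(5):
        env.reset()                                    # back to the start state
        state, reward, done, info = env.step(RIGHT)
        print(state, info)                             # the landing state varies (here among 0, 1 and 4)

    Each attempt to move right can slip to either perpendicular direction, so the cursor sometimes ends up below the start (state 4) or stays put against the wall (state 0) instead of moving right (state 1).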