Finite State Machine Environment
The FiniteStateMachineEnv class maps states in a finite state machine to functions that handle the logic of the state. At the end of each state, agents take observations, and at the start of the next step the agents provide actions based on the observations and their respective policies.
It is possible to restrict which agents take actions and compute rewards for each state with the acting_agents and rewarded_agents properties of the FSMStage class.
In each handler method the user must take care to call self.network.resolve(). This is left to the user so as to allow full flexibility over both when the messages on the network are resolved and, in advanced cases, which resolve method is called.
There are two methods to define the finite state machine structure. It is possible to use a mix of both methods. The following two examples are equivalent.
The first uses the FSMStage as a decorator directly on the state handler method:
import phantom as ph

class CustomEnv(ph.FiniteStateMachineEnv):
    def __init__(self):
        # MinimalAgent is assumed to be defined elsewhere
        agents = [MinimalAgent("agent")]
        network = ph.Network(agents)
        super().__init__(num_steps=10, network=network, initial_stage="A")

    # acting_agents is a required argument in the FSMStage signature
    @ph.FSMStage(stage_id="A", acting_agents=["agent"], next_stages=["A"])
    def handle(self):
        # Perform any pre-resolve tasks
        self.resolve_network()
        # Perform any post-resolve tasks
The second defines the states via a list of FSMStage instances passed to the FiniteStateMachineEnv init method. This method is needed when the values of parameters passed to the FSMStage initialisers are only known when the environment class is initialised (e.g. lists of agent IDs).
import phantom as ph

class CustomEnv(ph.FiniteStateMachineEnv):
    def __init__(self):
        # MinimalAgent is assumed to be defined elsewhere
        agents = [MinimalAgent("agent")]
        network = ph.Network(agents)
        super().__init__(
            num_steps=10,
            network=network,
            initial_stage="A",
            stages=[
                ph.FSMStage(
                    stage_id="A",
                    # acting_agents is a required argument in the FSMStage signature
                    acting_agents=["agent"],
                    next_stages=["A"],
                    handler=self.handle,
                )
            ],
        )

    def handle(self):
        # Perform any pre-resolve tasks
        self.resolve_network()
        # Perform any post-resolve tasks
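To illustrate the idea behind mapping stages to handlers, the following is a minimal pure-Python sketch that does not use Phantom. The names TinyFSM, add_stage and step are hypothetical and exist only for this illustration. The rule used to pick the next stage (the handler's return value, or the single allowed stage if the handler returns nothing) is an assumption, not a description of Phantom's implementation.

```python
from typing import Callable, Dict, Hashable, List, Optional


class TinyFSM:
    """Toy stage machine (hypothetical): each stage ID maps to a handler."""

    def __init__(self, initial_stage: Hashable) -> None:
        self.current_stage = initial_stage
        self._handlers: Dict[Hashable, Callable[[], Optional[Hashable]]] = {}
        self._next_stages: Dict[Hashable, List[Hashable]] = {}

    def add_stage(self, stage_id, next_stages, handler) -> None:
        self._handlers[stage_id] = handler
        self._next_stages[stage_id] = list(next_stages)

    def step(self) -> Hashable:
        """Run the current stage's handler, then move to the next stage."""
        allowed = self._next_stages[self.current_stage]
        chosen = self._handlers[self.current_stage]()
        # Assumption: with exactly one allowed stage the handler may return None.
        if chosen is None and len(allowed) == 1:
            chosen = allowed[0]
        if chosen not in allowed:
            raise RuntimeError(
                f"Illegal transition {self.current_stage!r} -> {chosen!r}"
            )
        self.current_stage = chosen
        return chosen


log: List[str] = []
fsm = TinyFSM(initial_stage="A")
# Stage A has one successor, so its handler does not need to return anything.
fsm.add_stage("A", ["B"], lambda: log.append("handled A"))
# Stage B has two possible successors, so its handler picks one explicitly.
fsm.add_stage("B", ["A", "B"], lambda: "A")

visited = [fsm.step() for _ in range(3)]
```

Running three steps from stage "A" visits "B", "A" and "B" in turn, and the stage "A" handler is invoked twice.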
Environment
- class phantom.fsm.FiniteStateMachineEnv(num_steps, network, initial_stage, env_supertype=None, agent_supertypes=None, stages=None)
Base environment class that allows implementation of a finite state machine to handle complex multi-step environment setups. This class should not be used directly and should instead be subclassed. Use the FSMStage decorator on handler methods within subclasses of this class to register stages to the FSM.
A ‘stage’ corresponds to a state in the finite state machine; however, to avoid any confusion with Environment states we refer to them as stages. Stage IDs can be of any hashable type, e.g. strings, ints, enums.
- Parameters:
num_steps (int) – The maximum number of steps the environment allows per episode.
network (Network) – A Network class or derived class describing the connections between agents in the environment.
initial_stage (Hashable) – The initial starting stage of the FSM. When the reset() method is called the environment is initialised into this stage.
env_supertype (Optional[Supertype]) – Optional Supertype class instance for the environment. If this is set, it will be sampled from and the env_type property set on the class with every call to reset().
agent_supertypes (Optional[Mapping[Hashable, Supertype]]) – Optional mapping of agent IDs to Supertype class instances. If these are set, each supertype will be sampled from and the type property set on the related agent with every call to reset().
stages (Optional[Sequence[FSMStage]]) – List of FSM stages. FSM stages can be defined via this list or alternatively via the FSMStage decorator.
- class Step(observations, rewards, terminations, truncations, infos)
- count(value, /)
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
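The count() and index() entries above are the standard methods of a Python tuple, which suggests that Step behaves like a named tuple. The sketch below defines a stand-in with the same five field names to show how such a value can be read. The field types and the sample values are assumptions made for illustration; the source only lists the field names.

```python
from typing import Any, Dict, Hashable, NamedTuple


class Step(NamedTuple):
    """Stand-in with the same field names as the Step class listed above."""

    observations: Dict[Hashable, Any]
    rewards: Dict[Hashable, float]
    terminations: Dict[Hashable, bool]
    truncations: Dict[Hashable, bool]
    infos: Dict[Hashable, Any]


step = Step(
    observations={"agent": 0.5},
    rewards={"agent": 1.0},
    terminations={"agent": False},
    truncations={"agent": False},
    infos={"agent": {}},
)

# Fields can be read by name or unpacked positionally.
observations, rewards, terminations, truncations, infos = step
agent_reward = step.rewards["agent"]

# index() and count() compare each field against the given value.
rewards_position = step.index({"agent": 1.0})
matching_fields = step.count({"agent": False})
```

Here index() returns 1 because rewards is the second field, and count() returns 2 because both terminations and truncations equal the given dictionary.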
- close()
After the user has finished using the environment, close contains the code necessary to “clean up” the environment.
This is critical for closing rendering windows, database or HTTP connections.
- property non_strategic_agent_ids: List[Hashable]
Return a list of the IDs of the agents that do not take actions.
- property np_random: Generator
Returns the environment’s internal _np_random that, if not set, will be initialised with a random seed.
- Returns:
Instances of np.random.Generator
- post_message_resolution()
Perform internal, post-message resolution updates to the environment.
- Return type:
- pre_message_resolution()
Perform internal, pre-message resolution updates to the environment.
- Return type:
- render()
Compute the render frames as specified by render_mode during the initialization of the environment.
The environment’s metadata render modes (env.metadata["render_modes"]) should contain the possible ways to implement the render modes. In addition, list versions of most render modes are achieved through gymnasium.make, which automatically applies a wrapper to collect rendered frames.
- Return type:
None
Note
As the render_mode is known during __init__, the objects used to render the environment state should be initialised in __init__.
By convention, if the render_mode is:
None (default): no render is computed.
“human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step() and render() doesn’t need to be called. Returns None.
“rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
“ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
“rgb_array_list” and “ansi_list”: List-based versions of render modes are possible (except Human) through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(..., render_mode="rgb_array_list"). The frames collected are popped after render() or reset() is called.
Note
Make sure that your class’s metadata "render_modes" key includes the list of supported modes.
Changed in version 0.25.0: The render function was changed to no longer accept parameters; instead, these parameters should be specified when the environment is initialised, e.g. gymnasium.make("CartPole-v1", render_mode="human").
- reset(seed=None, options=None)
Reset the environment and return an initial observation.
This method resets the step count and the network. This includes all the agents in the network.
- Parameters:
- Return type:
- Returns:
A dictionary mapping agent IDs to observations made by the respective agents. It is not required for all agents to make an initial observation.
An optional dictionary with auxiliary information, equivalent to the info dictionary in env.step().
- property strategic_agent_ids: List[Hashable]
Return a list of the IDs of the agents that take actions.
- property strategic_agents: List[StrategicAgent]
Return a list of agents that take actions.
- property unwrapped: Env
Returns the base non-wrapped environment (i.e., removes all wrappers).
- Returns:
The base non-wrapped gymnasium.Env instance
- Return type:
Env
Stages
- class phantom.fsm.FSMStage(stage_id, acting_agents, rewarded_agents=None, next_stages=None, handler=None)
Decorator used in the FiniteStateMachineEnv to declare the finite state machine structure and assign handler functions to stages.
A ‘stage’ corresponds to a state in the finite state machine; however, to avoid any confusion with Environment states we refer to them as stages.
- id
The name of this stage.
- acting_agents
The agents that will take an action at the end of the steps that belong to this stage.
- rewarded_agents
If provided, only the given agents will calculate and return a reward at the end of the step for this stage. If not provided, a reward will be computed for all acting agents for the current stage.
- next_stages
The stages that this stage can transition to.
- handler
Environment class method to be called when the FSM enters this stage.
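The fallback described for rewarded_agents can be sketched in plain Python as follows. StageSpec and agents_to_reward are hypothetical names used only for this illustration and are not part of Phantom; the stage and agent IDs are made up. The sketch mirrors the documented rule: when rewarded_agents is not provided, rewards are computed for all acting agents of the stage.

```python
from dataclasses import dataclass
from typing import Hashable, List, Optional, Sequence


@dataclass
class StageSpec:
    """Hypothetical stand-in holding the attributes documented above."""

    id: Hashable
    acting_agents: Sequence[Hashable]
    rewarded_agents: Optional[Sequence[Hashable]] = None
    next_stages: Optional[Sequence[Hashable]] = None


def agents_to_reward(stage: StageSpec) -> List[Hashable]:
    """Fall back to the acting agents when rewarded_agents is not provided."""
    if stage.rewarded_agents is None:
        return list(stage.acting_agents)
    return list(stage.rewarded_agents)


# Only the buyer acts in this stage; no rewarded_agents given.
bid = StageSpec(id="BID", acting_agents=["buyer"], next_stages=["CLEAR"])

# The seller acts, but both agents are rewarded at the end of the step.
clear = StageSpec(
    id="CLEAR",
    acting_agents=["seller"],
    rewarded_agents=["buyer", "seller"],
    next_stages=["BID"],
)

bid_rewarded = agents_to_reward(bid)
clear_rewarded = agents_to_reward(clear)
```

In this sketch the "BID" stage rewards only the buyer, while the "CLEAR" stage rewards both agents even though only the seller acts.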
Errors
- class phantom.fsm.FSMValidationError
Error raised when validating the FSM when initialising the FiniteStateMachineEnv.
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class phantom.fsm.FSMRuntimeError
Error raised when validating FSM stage changes when running an episode using the FiniteStateMachineEnv.
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
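The difference between an initialisation-time check and a run-time check can be illustrated with a small standalone sketch. StructureError, TransitionError, validate_structure and check_transition are hypothetical stand-ins written for this example. The specific checks shown are assumptions about what such validation might cover, not a description of the checks Phantom performs.

```python
from typing import Dict, Hashable, Sequence


class StructureError(Exception):
    """Hypothetical stand-in: raised when the FSM structure is inconsistent."""


class TransitionError(Exception):
    """Hypothetical stand-in: raised on an illegal stage change at run time."""


def validate_structure(
    initial_stage: Hashable, transitions: Dict[Hashable, Sequence[Hashable]]
) -> None:
    """Check once, up front, that every referenced stage is defined."""
    if initial_stage not in transitions:
        raise StructureError(f"initial stage {initial_stage!r} is not defined")
    for stage_id, targets in transitions.items():
        for target in targets:
            if target not in transitions:
                raise StructureError(
                    f"stage {stage_id!r} points to unknown stage {target!r}"
                )


def check_transition(
    current: Hashable,
    proposed: Hashable,
    transitions: Dict[Hashable, Sequence[Hashable]],
) -> None:
    """Check, on every stage change, that the move is allowed."""
    if proposed not in transitions[current]:
        raise TransitionError(f"cannot move from {current!r} to {proposed!r}")


transitions = {"A": ["B"], "B": ["A"]}
validate_structure("A", transitions)  # a consistent structure passes silently

try:
    validate_structure("A", {"A": ["C"]})
    structure_message = ""
except StructureError as err:
    structure_message = str(err)

try:
    check_transition("A", "A", transitions)
    transition_message = ""
except TransitionError as err:
    transition_message = str(err)
```

Separating the two checks means structural mistakes surface as soon as the environment is constructed, while illegal moves are reported at the step where they occur.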