Stackelberg Environment
- class phantom.stackelberg.StackelbergEnv(num_steps, network, leader_agents, follower_agents, env_supertype=None, agent_supertypes=None)[source]
An environment modelling a Stackelberg game/competition.
- Parameters:
  - num_steps (int) – The maximum number of steps the environment allows per episode.
  - network (Network) – A Network class or derived class describing the connections between the agents in the environment.
  - leader_agents (Sequence[Hashable]) – A list of Agent IDs to use as ‘leaders’.
  - follower_agents (Sequence[Hashable]) – A list of Agent IDs to use as ‘followers’.
  - env_supertype (Optional[Supertype]) – Optional Supertype class instance for the environment. If this is set, it will be sampled from and the env_type property set on the class with every call to reset().
  - agent_supertypes (Optional[Mapping[Hashable, Supertype]]) – Optional mapping of agent IDs to Supertype class instances. If these are set, each supertype will be sampled from and the type property set on the related agent with every call to reset().
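To ground the ‘leader’/‘follower’ terminology, here is a minimal, library-independent sketch of the Stackelberg interaction this environment models: the leader commits to an action first, and the follower observes it and best-responds. All names and payoff values below are illustrative and are not part of the Phantom API.

```python
# Illustrative only: a tiny two-player Stackelberg game solved by
# backward induction. Payoffs are (leader, follower) for each
# (leader_action, follower_action) pair; all values are made up.
payoffs = {
    ("high", "enter"): (3, 1),
    ("high", "stay_out"): (5, 0),
    ("low", "enter"): (2, 2),
    ("low", "stay_out"): (4, 0),
}
leader_actions = ["high", "low"]
follower_actions = ["enter", "stay_out"]

def follower_best_response(leader_action):
    # The follower observes the leader's committed action and
    # maximises its own payoff given that action.
    return max(follower_actions, key=lambda f: payoffs[(leader_action, f)][1])

def stackelberg_equilibrium():
    # The leader anticipates the follower's best response and
    # commits to the action that maximises its own payoff.
    best = max(leader_actions,
               key=lambda l: payoffs[(l, follower_best_response(l))][0])
    return best, follower_best_response(best)

leader, follower = stackelberg_equilibrium()
print(leader, follower)  # -> high enter
```

In the StackelbergEnv this ordering is imposed per step across the agents listed in leader_agents and follower_agents, rather than computed analytically as above.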
- class Step(observations, rewards, terminations, truncations, infos)
- count(value, /)
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
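The inherited count() and index() methods above suggest Step is a namedtuple-style container. As a library-independent illustration of such a container (the field names are taken from the signature above; the namedtuple basis is an assumption inferred from the inherited tuple methods):

```python
from collections import namedtuple

# Hypothetical stand-in for phantom's Step container. The field names
# come from the signature above; building it as a namedtuple is an
# assumption inferred from the inherited count()/index() methods.
Step = namedtuple("Step", ["observations", "rewards", "terminations",
                           "truncations", "infos"])

step = Step(observations={"a1": [0.0]}, rewards={"a1": 1.0},
            terminations={"a1": False}, truncations={"a1": False},
            infos={"a1": {}})

# The tuple methods operate positionally on the field values:
print(step.index({"a1": 1.0}))    # rewards is the second field -> 1
print(step.count({"a1": False}))  # terminations and truncations -> 2
```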
- close()
After the user has finished using the environment, close contains the code necessary to “clean up” the environment.
This is critical for closing rendering windows, database or HTTP connections.
- property non_strategic_agent_ids: List[Hashable]
Return a list of the IDs of the agents that do not take actions.
- property np_random: Generator
Returns the environment’s internal _np_random generator; if it is not set, it will be initialised with a random seed.
- Returns:
An instance of np.random.Generator
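For reference, an np.random.Generator of the kind this property returns can be created and used with plain NumPy (no Phantom code involved):

```python
import numpy as np

# np.random.default_rng returns an np.random.Generator, the same type
# this property exposes; passing a seed makes draws reproducible.
rng = np.random.default_rng(seed=42)
sample = rng.random(3)  # three floats in [0, 1)
print(type(rng).__name__, sample.shape)
```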
- post_message_resolution()
Perform internal, post-message resolution updates to the environment.
- Return type:
None
- pre_message_resolution()
Perform internal, pre-message resolution updates to the environment.
- Return type:
None
- render()
Compute the render frames as specified by render_mode during the initialization of the environment.
The environment’s metadata render modes (env.metadata["render_modes"]) should contain the possible ways to implement the render modes. In addition, list versions for most render modes are achieved through gymnasium.make, which automatically applies a wrapper to collect rendered frames.
- Return type:
None
Note
As the render_mode is known during __init__, the objects used to render the environment state should be initialised in __init__.
By convention, if the render_mode is:
- None (default): no render is computed.
- “human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step() and render() doesn’t need to be called. Returns None.
- “rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
- “ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
- “rgb_array_list” and “ansi_list”: List-based versions of these render modes are possible (except “human”) through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(..., render_mode="rgb_array_list"). The frames collected are popped after render() or reset() is called.
Note
Make sure that your class’s metadata "render_modes" key includes the list of supported modes.
Changed in version 0.25.0: The render function was changed to no longer accept parameters; rather, these parameters should be specified when the environment is initialised, e.g. gymnasium.make("CartPole-v1", render_mode="human")
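A quick, self-contained check of the frame contract described for “rgb_array” mode (NumPy only; no Phantom or Gymnasium code involved, and the frame here is fabricated):

```python
import numpy as np

# An "rgb_array" frame is an np.ndarray of shape (x, y, 3) holding
# RGB values; here we fabricate a 4x5 all-black uint8 frame.
frame = np.zeros((4, 5, 3), dtype=np.uint8)

assert frame.ndim == 3 and frame.shape[-1] == 3  # RGB channel last
print(frame.shape)  # -> (4, 5, 3)
```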
- reset(seed=None, options=None)[source]
Reset the environment and return initial observations from the leader agents.
This method resets the step count and the network. This includes all the agents in the network.
- Parameters:
  - seed (Optional[int]) – An optional random seed to use for the new episode.
  - options (Optional[Mapping]) – Optional additional information specifying how the environment is reset.
- Return type:
- Returns:
A dictionary mapping Agent IDs to observations made by the respective agents. It is not required for all agents to make an initial observation. A dictionary with auxiliary information, equivalent to the info dictionary in env.step().
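The two return values can be consumed like a Gymnasium-style reset. A minimal sketch with fabricated data follows; the structure is assumed from the Returns description above, and fake_reset and all IDs are hypothetical, not Phantom code.

```python
# Illustrative only: the shape of reset()'s two return values as
# described above -- observations keyed by agent ID (leaders here),
# plus an auxiliary info dictionary.
def fake_reset():
    observations = {"leader_1": [0.0, 1.0]}  # not every agent must observe
    infos = {"leader_1": {}}
    return observations, infos

obs, infos = fake_reset()
for agent_id, ob in obs.items():
    print(agent_id, ob)  # -> leader_1 [0.0, 1.0]
```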
- property strategic_agent_ids: List[Hashable]
Return a list of the IDs of the agents that take actions.
- property strategic_agents: List[StrategicAgent]
Return a list of agents that take actions.
- property unwrapped: Env
Returns the base non-wrapped environment (i.e., removes all wrappers).
- Returns:
The base non-wrapped gymnasium.Env instance
- Return type:
Env