Stackelberg Environment
- class phantom.stackelberg.StackelbergEnv(num_steps, network, leader_agents, follower_agents, env_supertype=None, agent_supertypes=None)
An environment modelling a Stackelberg game/competition.
- Parameters:
num_steps (int) – The maximum number of steps the environment allows per episode.
network (Network) – A Network class or derived class describing the connections between the agents in the environment.
leader_agents (Sequence[Hashable]) – A list of Agent IDs to use as ‘leaders’.
follower_agents (Sequence[Hashable]) – A list of Agent IDs to use as ‘followers’.
env_supertype (Optional[Supertype]) – Optional Supertype class instance for the environment. If set, it is sampled on every call to reset() and the result is assigned to the environment’s env_type property.
agent_supertypes (Optional[Mapping[Hashable, Supertype]]) – Optional mapping of agent IDs to Supertype class instances. If set, each supertype is sampled on every call to reset() and the result is assigned to the type property of the related agent.
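The leader/follower split reflects the sequential structure of a Stackelberg game: the leader commits first, and the follower responds after observing that choice. The sketch below illustrates this ordering with a textbook linear duopoly. It does not use Phantom; the function names and the demand and cost parameters (a, b, c) are assumptions made purely for illustration.

```python
from typing import Dict, Tuple


def follower_best_response(q_leader: float, a: float, b: float, c: float) -> float:
    """Quantity maximising the follower's profit q * (a - b*(q_leader + q) - c)."""
    return max(0.0, (a - c - b * q_leader) / (2 * b))


def leader_quantity(a: float, b: float, c: float) -> float:
    """Leader's optimal quantity, anticipating the follower's best response."""
    return max(0.0, (a - c) / (2 * b))


def play_round(a: float = 10.0, b: float = 1.0, c: float = 2.0) -> Dict[str, Tuple[float, float]]:
    """Play one leader-then-follower round; return (quantity, profit) per role."""
    # Step 1: the leader commits to a quantity.
    q_leader = leader_quantity(a, b, c)
    # Step 2: the follower observes the leader's choice and responds.
    q_follower = follower_best_response(q_leader, a, b, c)
    # Market price under linear inverse demand p = a - b * total quantity.
    price = a - b * (q_leader + q_follower)
    return {
        "leader": (q_leader, (price - c) * q_leader),
        "follower": (q_follower, (price - c) * q_follower),
    }
```

In this toy setting the leader earns more than the follower, which is the usual first-mover advantage. An environment such as the one documented here would replace the closed-form responses with learned agent policies.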
- class Step(observations, rewards, terminations, truncations, infos)
- count(value, /)
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)
Return first index of value.
Raises ValueError if the value is not present.
- close()
Contains the code necessary to “clean up” the environment once the user has finished using it.
This is critical for closing rendering windows, database or HTTP connections.
- property non_strategic_agent_ids: List[Hashable]
Return a list of the IDs of the agents that do not take actions.
- property np_random: Generator
Returns the environment’s internal _np_random generator, which is initialised with a random seed if it has not been set.
- Returns:
An instance of np.random.Generator
- post_message_resolution()
Perform internal, post-message resolution updates to the environment.
- Return type:
- pre_message_resolution()
Perform internal, pre-message resolution updates to the environment.
- Return type:
- render()
Compute the render frames as specified by render_mode during the initialization of the environment.
The environment’s metadata render modes (env.metadata["render_modes"]) should contain the possible ways to implement the render modes. In addition, list versions of most render modes are available through gymnasium.make, which automatically applies a wrapper to collect rendered frames.
- Return type:
None
Note
As the render_mode is known during __init__, the objects used to render the environment state should be initialised in __init__.
By convention, if the render_mode is:
None (default): no render is computed.
“human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step(), and render() doesn’t need to be called. Returns None.
“rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
“ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
“rgb_array_list” and “ansi_list”: List-based versions of the render modes (except “human”) are available through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(..., render_mode="rgb_array_list"). The collected frames are popped after render() or reset() is called.
Note
Make sure that your class’s metadata "render_modes" key includes the list of supported modes.
Changed in version 0.25.0: The render function no longer accepts parameters; these should instead be specified when the environment is initialised, i.e.,
gymnasium.make("CartPole-v1", render_mode="human")
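The render conventions described here can be sketched with a small stand-alone class. This is an illustration only: ToyRenderable and its position attribute are hypothetical, the class does not inherit from gymnasium.Env, and only the None and “ansi” modes are covered.

```python
from typing import Optional


class ToyRenderable:
    # The supported modes are declared in metadata, as the note recommends.
    metadata = {"render_modes": ["ansi"]}

    def __init__(self, render_mode: Optional[str] = None) -> None:
        # render_mode is fixed at construction time rather than passed to render().
        if render_mode is not None and render_mode not in self.metadata["render_modes"]:
            raise ValueError(f"Unsupported render mode: {render_mode!r}")
        self.render_mode = render_mode
        self.position = 0  # hypothetical state to draw

    def render(self) -> Optional[str]:
        if self.render_mode is None:
            # Default mode: no render is computed.
            return None
        # "ansi" mode: return a terminal-style text representation.
        return "." * self.position + "A"
```

Validating the mode in __init__ means an unsupported mode fails early, before any steps are taken.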
- reset(seed=None, options=None)
Reset the environment and return initial observations from the leader agents.
This method resets the step count and the network. This includes all the agents in the network.
- Parameters:
- Return type:
- Returns:
A dictionary mapping Agent IDs to the observations made by the respective agents. Not all agents are required to make an initial observation.
A dictionary with auxiliary information, equivalent to the info dictionary in env.step().
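The reset contract described here (step count cleared, two dictionaries returned, initial observations from the leaders only) can be sketched with a stand-alone class. ToyLeaderFollowerEnv, its current_step attribute and the placeholder observation values are assumptions for illustration and are not part of Phantom.

```python
from typing import Any, Dict, Hashable, Optional, Sequence, Tuple


class ToyLeaderFollowerEnv:
    def __init__(
        self,
        leader_agents: Sequence[Hashable],
        follower_agents: Sequence[Hashable],
    ) -> None:
        self.leader_agents = list(leader_agents)
        self.follower_agents = list(follower_agents)
        self.current_step = 0

    def reset(
        self, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None
    ) -> Tuple[Dict[Hashable, Any], Dict[Hashable, Any]]:
        # Start a new episode from step zero.
        self.current_step = 0
        # Only the leaders observe initially; followers are left out of the dict.
        observations = {agent_id: 0.0 for agent_id in self.leader_agents}
        # Auxiliary information, here empty for every observing agent.
        infos = {agent_id: {} for agent_id in self.leader_agents}
        return observations, infos
```

Callers should therefore not assume that every agent ID appears in the returned observations.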
- property strategic_agent_ids: List[Hashable]
Return a list of the IDs of the agents that take actions.
- property strategic_agents: List[StrategicAgent]
Return a list of agents that take actions.
- property unwrapped: Env[ObsType, ActType]
Returns the base non-wrapped environment.
- Returns:
The base non-wrapped gymnasium.Env instance
- Return type:
Env