Simple Line

This environment comes from the paper “Learning Transferable Cooperative Behavior in Multi-Agent Teams”.

Import	`from mpe2 import simple_line_v1`
Actions	Discrete/Continuous
Parallel API	Yes
Manual Control	No
Agents	`agents= [agent_0, ..., agent_N-1]`
Action Shape	(5)
Action Values	Discrete(5)/Box(0.0, 1.0, (5))
Observation Shape	(8)
Observation Values	(-inf, inf)

N agents must arrange themselves in a line between two landmarks. At reset, the two landmarks are placed at a fixed separation in a random direction; the ideal agent positions are evenly spaced along the segment joining them. Agents are assigned to target positions by minimum-cost bipartite matching (the Hungarian algorithm). All agents share one reward: the negative of the mean distance between each agent and its assigned position, with the distance clipped to [0, 2] before negation, so the reward lies in [-2, 0].
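The reward computation described above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the function names are hypothetical, the targets are assumed to include both landmark positions as endpoints, the clipping is assumed to apply to each agent's distance before averaging, and a brute-force search over permutations stands in for the Hungarian algorithm (it finds the same minimum-cost assignment, but is only practical for small N).

```python
import math
from itertools import permutations


def target_positions(lm0, lm1, n):
    """Evenly spaced 2D points from lm0 to lm1 (endpoints included: an assumption)."""
    if n == 1:
        return [((lm0[0] + lm1[0]) / 2, (lm0[1] + lm1[1]) / 2)]
    return [
        (
            lm0[0] + (lm1[0] - lm0[0]) * i / (n - 1),
            lm0[1] + (lm1[1] - lm0[1]) * i / (n - 1),
        )
        for i in range(n)
    ]


def best_assignment(agents, targets):
    """Brute-force minimum-cost matching of agents to targets.

    Returns (perm, dists), where agent i is matched to targets[perm[i]]
    and dists[i] is the corresponding Euclidean distance.
    """
    best = None
    for perm in permutations(range(len(targets))):
        dists = [math.dist(a, targets[j]) for a, j in zip(agents, perm)]
        if best is None or sum(dists) < sum(best[1]):
            best = (perm, dists)
    return best


def shared_reward(agents, lm0, lm1):
    """Negative mean of per-agent distances, each clipped to [0, 2] (assumed order)."""
    targets = target_positions(lm0, lm1, len(agents))
    _, dists = best_assignment(agents, targets)
    clipped = [min(max(d, 0.0), 2.0) for d in dists]
    return -sum(clipped) / len(clipped)
```

For larger teams, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would be the usual replacement for the permutation loop.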

Agent observations: [self_vel, self_pos, landmark_0_rel_pos, landmark_1_rel_pos]

Agent action space: [no_action, move_left, move_right, move_down, move_up]
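As a reading aid, the two layouts above can be unpacked as follows. This sketch assumes each of the four observation fields is a 2D vector, which matches the stated observation shape of 8; `ACTIONS` and `split_observation` are hypothetical helper names, not part of the library.

```python
# Discrete action indices, in the order listed in the action space above.
ACTIONS = ["no_action", "move_left", "move_right", "move_down", "move_up"]


def split_observation(obs):
    """Split a flat 8-element observation into its named 2D components."""
    if len(obs) != 8:
        raise ValueError(f"expected 8 values, got {len(obs)}")
    return {
        "self_vel": tuple(obs[0:2]),
        "self_pos": tuple(obs[2:4]),
        "landmark_0_rel_pos": tuple(obs[4:6]),
        "landmark_1_rel_pos": tuple(obs[6:8]),
    }
```

For example, `ACTIONS.index("move_up")` gives the integer to send for an upward move when `continuous_actions=False`.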

Arguments

simple_line_v1.env(N=4, max_cycles=25, continuous_actions=False, terminate_on_success=False)

N: number of agents

max_cycles: number of frames until the episode terminates

continuous_actions: whether action spaces are discrete (default) or continuous

terminate_on_success: when True, the episode ends as soon as every agent is within 0.05 units of its assigned target position.
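A minimal sketch of a random-policy rollout, written against the generic PettingZoo-style parallel API rather than this module specifically. The helper name `run_random_episode` is hypothetical; it only relies on `reset`, `step`, `agents`, `action_space`, and `close`, and it assumes the reward is shared, so it records the first agent's value at each step.

```python
def run_random_episode(env, seed=None):
    """Step a parallel-API environment with random actions until it ends.

    Returns the shared reward observed at each step.
    """
    observations, infos = env.reset(seed=seed)
    rewards_per_step = []
    # In the parallel API, env.agents becomes empty once the episode is over,
    # whether through max_cycles or through terminate_on_success.
    while env.agents:
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        if rewards:
            # The reward is shared, so any agent's entry is representative.
            rewards_per_step.append(next(iter(rewards.values())))
    env.close()
    return rewards_per_step
```

With MPE2 installed, and assuming the module exposes the usual `parallel_env` constructor with the arguments listed above, a call might look like `run_random_episode(simple_line_v1.parallel_env(N=4, max_cycles=25, terminate_on_success=True), seed=0)`.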

API

class mpe2.simple_line.simple_line.env(**kwargs)

class mpe2.simple_line.simple_line.raw_env(N=4, max_cycles=25, continuous_actions=False, render_mode=None, dynamic_rescaling=False, benchmark_data=False, terminate_on_success=False)