Simple Line
This environment is part of the “Learning Transferable Cooperative Behavior in Multi-Agent Teams” paper.
| Import | |
|---|---|
| Actions | Discrete/Continuous |
| Parallel API | Yes |
| Manual Control | No |
| Agents | |
| Action Shape | (5) |
| Action Values | Discrete(5)/Box(0.0, 1.0, (5)) |
| Observation Shape | (8) |
| Observation Values | (-inf, inf) |
N agents must arrange themselves in a line between two landmarks. At reset, the two landmarks are placed at a fixed separation in a random direction; the ideal agent positions are evenly spaced along the line joining them. Agents are assigned to target positions via minimum-cost bipartite matching (Hungarian algorithm). The shared reward is the negative mean distance between agents and their assigned positions, with the distance term clipped to [0, 2], so the reward lies in [-2, 0].
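The target layout, matching, and reward described above can be sketched in plain Python. This is an illustrative sketch, not the environment's implementation: the helper names are hypothetical, the targets are assumed to include both landmark positions as endpoints, the clip is assumed to apply to each distance before averaging, and a brute-force search over permutations stands in for the Hungarian algorithm (it finds the same minimum-cost assignment, but is only practical for small N).

```python
import itertools
import math


def evenly_spaced_targets(start, end, n):
    """Return n points evenly spaced on the segment from start to end.

    Assumption: both endpoints (the landmark positions) are included.
    """
    if n < 2:
        raise ValueError("need at least two target positions")
    return [
        (
            start[0] + (end[0] - start[0]) * i / (n - 1),
            start[1] + (end[1] - start[1]) * i / (n - 1),
        )
        for i in range(n)
    ]


def min_cost_assignment(agents, targets):
    """Return a tuple p where agent i is assigned to target p[i].

    Brute force over all permutations; a stand-in for the Hungarian
    algorithm that minimises the same total-distance cost.
    """
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(targets))):
        cost = sum(math.dist(agents[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm


def shared_reward(agents, targets, clip_max=2.0):
    """Negative mean of per-agent distances, each clipped to [0, clip_max]."""
    perm = min_cost_assignment(agents, targets)
    dists = [
        min(math.dist(agents[i], targets[j]), clip_max)
        for i, j in enumerate(perm)
    ]
    return -sum(dists) / len(dists)


if __name__ == "__main__":
    targets = evenly_spaced_targets((0.0, 0.0), (3.0, 0.0), 4)
    agents = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5), (3.0, 0.5)]
    print(shared_reward(agents, targets))
```

Because every agent receives the same value, an agent that is already on its target is still penalised while its teammates are out of place.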
Agent observations: [self_vel, self_pos, landmark_0_rel_pos, landmark_1_rel_pos]
Agent action space: [no_action, move_left, move_right, move_down, move_up]
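As a reading aid, the sketch below splits an 8-element observation into the four listed components and names the five discrete actions. It assumes each component is a 2-D vector stored in the order given above, which is consistent with the observation shape of (8); `LineObservation`, `unpack_observation`, and `ACTION_NAMES` are hypothetical names, not part of the library.

```python
from typing import NamedTuple, Sequence, Tuple

Vec2 = Tuple[float, float]

# Discrete action indices, in the order listed in the documentation.
ACTION_NAMES = ("no_action", "move_left", "move_right", "move_down", "move_up")


class LineObservation(NamedTuple):
    """Named view of one agent's observation (assumed 2-D components)."""

    self_vel: Vec2
    self_pos: Vec2
    landmark_0_rel_pos: Vec2
    landmark_1_rel_pos: Vec2


def unpack_observation(obs: Sequence[float]) -> LineObservation:
    """Split a flat 8-element observation into four (x, y) pairs."""
    if len(obs) != 8:
        raise ValueError(f"expected 8 values, got {len(obs)}")
    values = [float(x) for x in obs]
    pairs = [(values[i], values[i + 1]) for i in range(0, 8, 2)]
    return LineObservation(*pairs)


if __name__ == "__main__":
    parsed = unpack_observation([0.1, -0.2, 0.5, 0.5, -1.0, 0.0, 1.0, 0.0])
    print(parsed.self_pos, ACTION_NAMES[2])
```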
Arguments
simple_line_v1.env(N=4, max_cycles=25, continuous_actions=False, terminate_on_success=False)
N: number of agents
max_cycles: number of frames until the episode terminates
continuous_actions: whether action spaces are discrete (default) or continuous
terminate_on_success: when True, the episode ends as soon as every agent is within 0.05 units of its assigned target position.
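The arguments above might be used as in the following sketch, which runs one episode with random actions through the standard PettingZoo AEC loop. The import `from mpe2 import simple_line_v1` is an assumption (the Import field on this page is blank), and `all_agents_on_target` is a hypothetical helper that only illustrates the success condition; whether the library treats the 0.05 bound as inclusive is also assumed.

```python
SUCCESS_TOLERANCE = 0.05  # distance threshold quoted for terminate_on_success


def all_agents_on_target(distances, tolerance=SUCCESS_TOLERANCE):
    """True when every agent-to-target distance is within the tolerance.

    Assumption: the bound is inclusive.
    """
    return all(d <= tolerance for d in distances)


def run_random_episode(seed=0, **env_kwargs):
    """Run one episode with random actions and return per-agent returns."""
    # Imported lazily so this file loads without mpe2 installed.
    from mpe2 import simple_line_v1  # assumed import path

    env = simple_line_v1.env(**env_kwargs)
    env.reset(seed=seed)
    returns = {}
    for agent in env.agent_iter():
        observation, reward, termination, truncation, info = env.last()
        returns[agent] = returns.get(agent, 0.0) + reward
        # A finished agent must be stepped with None in the AEC API.
        done = termination or truncation
        action = None if done else env.action_space(agent).sample()
        env.step(action)
    env.close()
    return returns


if __name__ == "__main__":
    print(run_random_episode(N=4, max_cycles=25, terminate_on_success=True))
```

With `terminate_on_success=False` (the default), an episode is expected to run for the full `max_cycles` even if the agents reach their targets early.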
API
- class mpe2.simple_line.simple_line.env(**kwargs)
- class mpe2.simple_line.simple_line.raw_env(N=4, max_cycles=25, continuous_actions=False, render_mode=None, dynamic_rescaling=False, benchmark_data=False, terminate_on_success=False)