Simple Line

This environment is part of the "Learning Transferable Cooperative Behavior in Multi-Agent Teams" paper.

Import: from mpe2 import simple_line_v1

Actions: Discrete/Continuous

Parallel API: Yes

Manual Control: No

Agents: agents = [agent_0, ..., agent_N-1]

Action Shape: (5)

Action Values: Discrete(5) / Box(0.0, 1.0, (5))

Observation Shape: (8)

Observation Values: (-inf, inf)

N agents must arrange themselves in a line between two landmarks. At reset, the two landmarks are placed at a fixed separation in a random direction; the ideal agent positions are evenly spaced along the line connecting them. Agents are assigned to target positions via bipartite matching (the Hungarian algorithm), and the shared reward is the negative mean distance from agents to their assigned positions, with the mean distance clipped to [0, 2].
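The reward above can be sketched in a few lines of plain Python. This is an illustration, not the library's actual source: the function names are hypothetical, and a brute-force search over permutations stands in for the Hungarian algorithm (equivalent for the small N used here).

```python
from itertools import permutations
import math

def line_targets(lm0, lm1, n):
    """Evenly spaced target points along the segment from lm0 to lm1."""
    return [
        (lm0[0] + (lm1[0] - lm0[0]) * t, lm0[1] + (lm1[1] - lm0[1]) * t)
        for t in (i / (n - 1) for i in range(n))
    ]

def matched_reward(agent_pos, lm0, lm1):
    """Negative mean distance under the best agent-to-target assignment."""
    targets = line_targets(lm0, lm1, len(agent_pos))
    # Brute-force bipartite matching: minimize total matched distance.
    best = min(
        sum(math.dist(a, t) for a, t in zip(agent_pos, perm))
        for perm in permutations(targets)
    )
    mean_dist = best / len(agent_pos)
    # Clip the mean distance to [0, 2], then negate for the shared reward.
    return -min(max(mean_dist, 0.0), 2.0)
```

Agents sitting exactly on the target line (in any order) receive a reward of 0; the worst possible shared reward is -2.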

Agent observations: [self_vel, self_pos, landmark_0_rel_pos, landmark_1_rel_pos]

Agent action space: [no_action, move_left, move_right, move_down, move_up]
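Each observation component listed above is a 2-D vector, so the flat 8-dim observation can be split by fixed slices. A small sketch (the helper name is hypothetical), assuming the component ordering given above:

```python
def split_observation(obs):
    """Split the flat 8-dim observation into named 2-D components."""
    assert len(obs) == 8
    return {
        "self_vel": obs[0:2],
        "self_pos": obs[2:4],
        "landmark_0_rel_pos": obs[4:6],
        "landmark_1_rel_pos": obs[6:8],
    }
```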

Arguments

simple_line_v1.env(N=4, max_cycles=25, continuous_actions=False, terminate_on_success=False)

N: number of agents

max_cycles: maximum number of frames (a step for each agent) before the episode truncates

continuous_actions: whether action spaces are discrete (default) or continuous

terminate_on_success: when True, the episode ends as soon as every agent is within 0.05 units of its assigned target position.

API

class mpe2.simple_line.simple_line.env(**kwargs)
class mpe2.simple_line.simple_line.raw_env(N=4, max_cycles=25, continuous_actions=False, render_mode=None, dynamic_rescaling=False, benchmark_data=False, terminate_on_success=False)