Spaces

Space

class rl_coach.spaces.Space(shape: Union[int, tuple, list, numpy.ndarray], low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf)

A space defines a set of valid values.

Parameters:
  • shape – the shape of the space
  • low – the lowest values possible in the space. Can be either an array defining the lowest value per point, or a single value defining a common lowest value for all points
  • high – the highest values possible in the space. Can be either an array defining the highest value per point, or a single value defining a common highest value for all points
is_point_in_space_shape(point: numpy.ndarray) → bool

Checks if a given multidimensional point is within the bounds of the shape of the space

Parameters: point – a multidimensional point
Returns: True if the point is within the shape of the space, False otherwise
sample() → numpy.ndarray

Samples the defined space: uniformly if the space bounds are defined, or from a normal distribution if no bounds are defined

Returns: A numpy array sampled from the space
val_matches_space_definition(val: Union[int, float, numpy.ndarray]) → bool

Checks if the given value matches the space definition in terms of shape and values

Parameters: val – a value to check
Returns: True if val matches the space definition, False otherwise
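
The sampling and validation behaviour described above can be illustrated with a small standalone NumPy sketch. It does not import rl_coach: `sample_box` and `matches_box` are hypothetical helpers that only mimic the documented semantics, and the use of a standard normal for the unbounded case is an assumption. The library's actual implementation may differ in detail.

```python
import numpy as np

def _as_shape(shape):
    # Accept either an int or a sequence, as the Space signature does.
    return (shape,) if isinstance(shape, int) else tuple(shape)

def sample_box(shape, low=-np.inf, high=np.inf, rng=None):
    """Hypothetical helper: uniform sample if bounded, standard normal otherwise."""
    shape = _as_shape(shape)
    rng = np.random.default_rng() if rng is None else rng
    low = np.broadcast_to(np.asarray(low, dtype=float), shape)
    high = np.broadcast_to(np.asarray(high, dtype=float), shape)
    if np.all(np.isfinite(low)) and np.all(np.isfinite(high)):
        return rng.uniform(low, high)
    # Assumption: unbounded spaces are sampled from a standard normal.
    return rng.standard_normal(shape)

def matches_box(val, shape, low=-np.inf, high=np.inf):
    """Hypothetical helper: check both the shape and the value bounds."""
    val = np.asarray(val, dtype=float)
    if val.shape != _as_shape(shape):
        return False
    return bool(np.all(val >= low) and np.all(val <= high))
```

A value with the wrong shape fails the check before any bounds are compared, which mirrors the "shape and values" wording of val_matches_space_definition.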

Observation Spaces

class rl_coach.spaces.ObservationSpace(shape: Union[int, numpy.ndarray], low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf)
is_point_in_space_shape(point: numpy.ndarray) → bool

Checks if a given multidimensional point is within the bounds of the shape of the space

Parameters: point – a multidimensional point
Returns: True if the point is within the shape of the space, False otherwise
sample() → numpy.ndarray

Samples the defined space: uniformly if the space bounds are defined, or from a normal distribution if no bounds are defined

Returns: A numpy array sampled from the space
val_matches_space_definition(val: Union[int, float, numpy.ndarray]) → bool

Checks if the given value matches the space definition in terms of shape and values

Parameters: val – a value to check
Returns: True if val matches the space definition, False otherwise

VectorObservationSpace

class rl_coach.spaces.VectorObservationSpace(shape: int, low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf, measurements_names: List[str] = None)

An observation space which is defined as a vector of elements. This can be particularly useful for environments which return measurements, such as in robotic environments.

PlanarMapsObservationSpace

class rl_coach.spaces.PlanarMapsObservationSpace(shape: numpy.ndarray, low: int, high: int, channels_axis: int = -1)

An observation space which defines a stack of 2D observations. For example, an environment which returns a stack of segmentation maps like in Starcraft.
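
As a rough illustration of what a stack of 2D maps looks like as an array, the sketch below builds a stack with the channels on the last axis and moves them to the front. It uses plain NumPy only. The map size, the number of maps, and the reading of channels_axis as "the axis that indexes the individual maps" are assumptions made for this example.

```python
import numpy as np

# Assumed layout: three 84x84 maps stacked along the last axis (channels_axis = -1).
maps = np.zeros((84, 84, 3), dtype=int)
channels_axis = -1

# Move the channel axis to the front, e.g. for code that expects channels first.
channels_first = np.moveaxis(maps, channels_axis, 0)

# Each entry along the channel axis is a single 2D map.
first_map = channels_first[0]
```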

ImageObservationSpace

class rl_coach.spaces.ImageObservationSpace(shape: numpy.ndarray, high: int, channels_axis: int = -1)

An observation space that is a special case of the PlanarMapsObservationSpace, where the stack of 2D observations represents an RGB image or a grayscale image.

Action Spaces

class rl_coach.spaces.ActionSpace(shape: Union[int, numpy.ndarray], low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf, descriptions: Union[None, List, Dict] = None, default_action: Union[int, float, numpy.ndarray, List] = None)
clip_action_to_space(action: Union[int, float, numpy.ndarray, List]) → Union[int, float, numpy.ndarray, List]

Given an action, clip its values to fit to the action space ranges

Parameters: action – a given action
Returns: the clipped action
is_point_in_space_shape(point: numpy.ndarray) → bool

Checks if a given multidimensional point is within the bounds of the shape of the space

Parameters: point – a multidimensional point
Returns: True if the point is within the shape of the space, False otherwise
sample() → numpy.ndarray

Samples the defined space: uniformly if the space bounds are defined, or from a normal distribution if no bounds are defined

Returns: A numpy array sampled from the space
sample_with_info() → rl_coach.core_types.ActionInfo

Get a random action with additional “fake” info

Returns: An action info instance
val_matches_space_definition(val: Union[int, float, numpy.ndarray]) → bool

Checks if the given value matches the space definition in terms of shape and values

Parameters: val – a value to check
Returns: True if val matches the space definition, False otherwise
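
The clipping performed by clip_action_to_space can be pictured with NumPy's `np.clip`. The snippet below is a standalone sketch rather than the library's code: `clip_to_bounds` is a hypothetical helper, and the bounds are made-up values for a two-dimensional action.

```python
import numpy as np

def clip_to_bounds(action, low, high):
    """Hypothetical stand-in: clip each action component to [low, high]."""
    return np.clip(np.asarray(action, dtype=float), low, high)

# Made-up per-dimension bounds for a 2D continuous action.
low = np.array([-1.0, 0.0])
high = np.array([1.0, 5.0])

# 2.5 exceeds the first upper bound and -3.0 is below the second lower bound.
clipped = clip_to_bounds([2.5, -3.0], low, high)
```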

AttentionActionSpace

class rl_coach.spaces.AttentionActionSpace(shape: int, low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf, descriptions: Union[None, List, Dict] = None, default_action: numpy.ndarray = None, forced_attention_size: Union[None, int, float, numpy.ndarray] = None)

A box selection continuous action space, meaning that the actions are defined as selecting a multidimensional box from a given range. The actions will be in the form: [[low_x, low_y, …], [high_x, high_y, …]]
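
To make the action layout concrete, the following NumPy sketch applies an action of the form [[low_x, low_y], [high_x, high_y]] as a crop on a 2D array. It is an illustration only. Treating x as the column index, y as the row index, and the high corner as exclusive are assumptions of this example, not documented behaviour.

```python
import numpy as np

# A small 6x6 "image" whose values encode their own position (row * 6 + column).
image = np.arange(36).reshape(6, 6)

# Action in the documented layout: [[low_x, low_y], [high_x, high_y]].
action = np.array([[1, 2], [4, 5]])
(low_x, low_y), (high_x, high_y) = action

# Assumption: x indexes columns, y indexes rows, and the high corner is exclusive.
crop = image[low_y:high_y, low_x:high_x]
```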

BoxActionSpace

class rl_coach.spaces.BoxActionSpace(shape: Union[int, numpy.ndarray], low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf, descriptions: Union[None, List, Dict] = None, default_action: numpy.ndarray = None)

A multidimensional bounded or unbounded continuous action space

DiscreteActionSpace

class rl_coach.spaces.DiscreteActionSpace(num_actions: int, descriptions: Union[None, List, Dict] = None, default_action: numpy.ndarray = None)

A discrete action space with action indices as actions

MultiSelectActionSpace

class rl_coach.spaces.MultiSelectActionSpace(size: int, max_simultaneous_selected_actions: int = 1, descriptions: Union[None, List, Dict] = None, default_action: numpy.ndarray = None, allow_no_action_to_be_selected=True)

A discrete action space where multiple actions can be selected at once. The actions are encoded as multi-hot vectors
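
The multi-hot encoding can be sketched by enumerating every vector that selects at most a given number of entries. The helper below is hypothetical and standalone. Its parameters mirror the constructor arguments in name only, and whether the library enumerates actions this way is an assumption.

```python
from itertools import combinations

import numpy as np

def multi_hot_actions(size, max_selected=1, allow_none=True):
    """Hypothetical helper: list all multi-hot vectors with at most max_selected ones."""
    smallest = 0 if allow_none else 1
    actions = []
    for k in range(smallest, max_selected + 1):
        for selected in combinations(range(size), k):
            vec = np.zeros(size, dtype=int)
            for i in selected:
                vec[i] = 1
            actions.append(vec)
    return actions

# With 3 entries and up to 2 selected: 1 empty + 3 single + 3 pairs = 7 vectors.
all_actions = multi_hot_actions(3, max_selected=2, allow_none=True)
```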

CompoundActionSpace

class rl_coach.spaces.CompoundActionSpace(sub_spaces: List[rl_coach.spaces.ActionSpace])

An action space which consists of multiple sub-action spaces. For example, in Starcraft the agent should choose an action identifier from ~550 options (Discrete(550)), but it also needs to choose 13 different arguments for the selected action identifier, where each argument is by itself an action space. In Starcraft, the arguments are Discrete action spaces as well, but this is not mandatory.
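
A compound action can be thought of as one sub-action per sub-space. The sketch below draws one index from each of several discrete sub-spaces using plain NumPy. The list of sizes is hypothetical: only the 550-option identifier comes from the description above, and the two argument sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sub-space sizes: the action identifier (550 options)
# followed by two invented discrete arguments.
sub_space_sizes = [550, 4, 64]

# One independently sampled index per sub-space forms the compound action.
compound_action = [int(rng.integers(n)) for n in sub_space_sizes]
```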

Goal Spaces

class rl_coach.spaces.GoalsSpace(goal_name: str, reward_type: rl_coach.spaces.GoalToRewardConversion, distance_metric: Union[rl_coach.spaces.GoalsSpace.DistanceMetric, Callable])

A multidimensional space with a goal type definition. It also behaves as an action space, so that hierarchical agents can use it as an output action space. The class acts as a wrapper around the target space, so once the target space is set, all the values of the class (the shape, low, high, etc.) match the values of the target space.

Parameters:
  • goal_name – the name of the observation space to use as the achieved goal.
  • reward_type – the reward type to use for converting distances from goal to rewards
  • distance_metric – the distance metric to use. Can be either one of the distances in the DistanceMetric enum, or a custom function that takes two vectors as input and returns the distance between them
class DistanceMetric

An enumeration.
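
For the custom-function option of distance_metric, any callable that takes two vectors and returns a scalar distance fits the description. The function below is a hypothetical example using the Manhattan (L1) distance. Because the metric is symmetric, it does not depend on which argument is the goal.

```python
import numpy as np

def manhattan_distance(first, second):
    """Hypothetical custom metric: sum of absolute component differences."""
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    return float(np.sum(np.abs(first - second)))

distance = manhattan_distance([0.0, 0.0], [1.0, -2.0])
```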

clip_action_to_space(action: Union[int, float, numpy.ndarray, List]) → Union[int, float, numpy.ndarray, List]

Given an action, clip its values to fit to the action space ranges

Parameters: action – a given action
Returns: the clipped action
distance_from_goal(goal: numpy.ndarray, state: dict) → float

Given a state, check its distance from the goal

Parameters:
  • goal – a numpy array representing the goal
  • state – a dict representing the state
Returns: the distance from the goal

get_reward_for_goal_and_state(goal: numpy.ndarray, state: dict) → Tuple[float, bool]

Given a state, check if the goal was reached and return a reward accordingly

Parameters:
  • goal – a numpy array representing the goal
  • state – a dict representing the state
Returns: the reward for the current goal and state pair, and a boolean representing whether the goal was reached

goal_from_state(state: Dict)

Given a state, extract an observation according to the goal_name

Parameters: state – a dictionary of observations
Returns: the observation corresponding to the goal_name
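
The three goal-related methods above fit together roughly as follows: pick the achieved goal out of the state dict by name, measure its distance from the desired goal, and convert that distance into a reward and a "reached" flag. The sketch below is standalone and hypothetical. The Euclidean metric, the 0.05 threshold, and the 0 / -1 reward convention are assumptions chosen for illustration, not the behaviour of GoalToRewardConversion.

```python
import numpy as np

def goal_from_state(state, goal_name):
    """Hypothetical helper: the observation stored under goal_name."""
    return state[goal_name]

def reward_for_goal_and_state(goal, state, goal_name, threshold=0.05):
    """Hypothetical helper: return (reward, reached) for a goal and state pair."""
    achieved = goal_from_state(state, goal_name)
    # Assumption: Euclidean distance between the desired and achieved goal.
    distance = float(np.linalg.norm(np.asarray(goal) - np.asarray(achieved)))
    reached = distance <= threshold
    # Assumption: 0 when the goal is reached, -1 otherwise.
    reward = 0.0 if reached else -1.0
    return reward, reached

state = {"observation": np.zeros(4), "achieved_goal": np.array([0.5, 0.5])}
reward, reached = reward_for_goal_and_state(np.array([0.5, 0.52]), state, "achieved_goal")
```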
is_point_in_space_shape(point: numpy.ndarray) → bool

Checks if a given multidimensional point is within the bounds of the shape of the space

Parameters: point – a multidimensional point
Returns: True if the point is within the shape of the space, False otherwise
sample() → numpy.ndarray

Samples the defined space: uniformly if the space bounds are defined, or from a normal distribution if no bounds are defined

Returns: A numpy array sampled from the space
sample_with_info() → rl_coach.core_types.ActionInfo

Get a random action with additional “fake” info

Returns: An action info instance
val_matches_space_definition(val: Union[int, float, numpy.ndarray]) → bool

Checks if the given value matches the space definition in terms of shape and values

Parameters: val – a value to check
Returns: True if val matches the space definition, False otherwise