Environment Module

class Broker(leverage: int = 1)

Bases: object

Responsible for managing and placing orders. It reproduces the real-environment price behavior of orders.

get_current_orders() list[source.environment.order.Order]

Current orders getter.

Returns:

Copy of the list of currently ongoing trades.

Return type:

(list[Order])

get_leverage() int

Leverage getter.

Returns:

Copy of leverage coefficient.

Return type:

(int)

place_order(amount: float, is_buy_order: bool, stop_loss: float, take_profit: float) None

Creates a trade with the given parameters and attaches it to the broker's current orders.

Parameters:
  • amount (float) – Amount of money assigned to the order.

  • is_buy_order (bool) – Indicates whether the order should be treated as a buy (long) position when True, or as a sell (short) position when False.

  • stop_loss (float) – Coefficient used to close the order when the stock moves against expectations.

  • take_profit (float) – Coefficient used to close the order when the stock moves in line with expectations.

reset() None

Resets broker by clearing the currently ongoing and recently closed lists of orders.

update_orders(coefficient: float) list[source.environment.order.Order]

Updates and closes orders. The current value of each order is multiplied by the coefficient, and if its stop-loss or take-profit boundary is crossed, the order is closed.

Parameters:

coefficient (float) – Multiplier applied to the current order value.

Returns:

List of closed trades.

Return type:

(list[Order])
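
A minimal sketch of the order lifecycle, assuming Broker is importable from a source.environment.broker module (the exact import path is not shown in this reference):

    # Import path assumed for illustration.
    from source.environment.broker import Broker

    broker = Broker(leverage=10)
    broker.place_order(amount=100.0, is_buy_order=True,
                       stop_loss=0.95, take_profit=1.10)

    # On each price update, scale open orders by the observed coefficient;
    # orders whose stop-loss or take-profit boundary is crossed are
    # returned as closed.
    closed_orders = broker.update_orders(coefficient=1.12)

    broker.reset()  # clear the ongoing and recently closed order lists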

class LabelAnnotatorBase

Bases: ABC

Implements a base class for label annotators. It provides an interface for annotating data with labels based on price movements.

annotate(data: DataFrame) Series

Annotates the provided data with labels based on price movements.

Parameters:

data (pd.DataFrame) – The data to annotate, must contain a ‘close’ column.

Raises:

ValueError – If the output classes are not initialized before annotating data.

Returns:

A series of labels corresponding to the price movement trends.

Return type:

(pd.Series)

get_output_classes() SimpleNamespace

Returns the output classes for the label annotator.

Returns:

The SimpleNamespace containing the labels for classification.

Return type:

(SimpleNamespace)

class LabeledDataBalancer(balancers: list[imblearn.base.BaseSampler])

Bases: object

Implements a labeled data balancer that uses a list of samplers to balance the input and output data.

balance(input_data: list[list[float]], output_data: list[int]) tuple[list[list[float]], list[int]]

Balances the input and output data using the configured samplers.

Parameters:
  • input_data (list[list[float]]) – The input data to be balanced.

  • output_data (list[int]) – The output data to be balanced.

Raises:

ValueError – If the input data and output data do not have the same length.

Returns:

A tuple containing the balanced input data and output data.

Return type:

(tuple[list[list[float]], list[int]])
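
A minimal sketch of configuring and applying the balancer with standard imblearn samplers; the import path for LabeledDataBalancer is assumed for illustration:

    from imblearn.over_sampling import RandomOverSampler
    from imblearn.under_sampling import RandomUnderSampler

    # Import path assumed for illustration.
    from source.environment.labeled_data_balancer import LabeledDataBalancer

    balancer = LabeledDataBalancer(balancers=[
        RandomOverSampler(random_state=42),
        RandomUnderSampler(random_state=42),
    ])

    input_data = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
    output_data = [0, 0, 0, 1]  # imbalanced classes
    balanced_inputs, balanced_outputs = balancer.balance(input_data, output_data)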

class Order(amount: float, is_buy_order: bool, stop_loss: float, take_profit: float)

Bases: object

Class storing information regarding a particular order.

class PointsRewardValidator(rewarded_points: tuple[int, int] = (1, -1))

Bases: RewardValidatorBase

Awards a reward for successful or failed orders based on predefined constants.

validate_orders(orders: list[source.environment.order.Order]) float

Calculates the number of points to be awarded for a list of closed trades.

Parameters:

orders (list[Order]) – Orders to be validated.

Returns:

Calculated reward.

Return type:

(float)

class PriceRewardValidator(coefficient: float = 1.0, normalizable: bool = False)

Bases: RewardValidatorBase

Awards a reward for successful or failed orders based on the value gained or lost.

validate_orders(orders: list[source.environment.order.Order]) float

Calculates the number of points to be awarded for a list of closed trades.

Parameters:

orders (list[Order]) – Orders to be validated.

Returns:

Calculated reward.

Return type:

(float)

class RewardValidatorBase(*args)

Bases: ABC

Awards a reward for successful or failed orders based on the approach defined in the derived class.

abstract validate_orders(orders: list[source.environment.order.Order]) float

Calculates the number of points to be awarded for a list of closed trades.

Parameters:

orders (list[Order]) – Orders to be validated.

Returns:

Calculated reward.

Return type:

(float)
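
Concrete validators implement validate_orders. A minimal sketch of a hypothetical subclass that awards one point per closed order, regardless of outcome (the import path for RewardValidatorBase is assumed; source.environment.order.Order appears in the signatures above):

    from source.environment.order import Order

    # Import path assumed for illustration.
    from source.environment.reward_validator import RewardValidatorBase

    class CountRewardValidator(RewardValidatorBase):
        """Hypothetical validator: one point per closed order."""

        def validate_orders(self, orders: list[Order]) -> float:
            # The reward depends only on how many orders were closed.
            return float(len(orders))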

class SimpleLabelAnnotator(threshold: float = 0.01)

Bases: LabelAnnotatorBase

Implements a simple label annotator that classifies price movements into three classes:

  • Up trend

  • Down trend

  • No trend

annotate(data: DataFrame) Series

Annotates the provided data with labels based on price movements.

Parameters:

data (pd.DataFrame) – The data to annotate, must contain a ‘close’ column.

Raises:

ValueError – If the output classes are not initialized before annotating data.

Returns:

A series of labels corresponding to the price movement trends.

Return type:

(pd.Series)

get_output_classes() SimpleNamespace

Returns the output classes for the label annotator.

Returns:

The SimpleNamespace containing the labels for classification.

Return type:

(SimpleNamespace)
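
A minimal sketch of annotating a price series; the import path for SimpleLabelAnnotator is assumed for illustration, and the concrete label values come from get_output_classes():

    import pandas as pd

    # Import path assumed for illustration.
    from source.environment.label_annotator import SimpleLabelAnnotator

    annotator = SimpleLabelAnnotator(threshold=0.02)
    classes = annotator.get_output_classes()  # SimpleNamespace of labels

    # annotate() requires a 'close' column.
    data = pd.DataFrame({"close": [100.0, 103.0, 102.0, 101.9]})
    labels = annotator.annotate(data)  # pd.Series of trend labels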

class TradingEnvironment(data: DataFrame, initial_budget: float, max_amount_of_trades: int, window_size: int, validator: RewardValidatorBase, label_annotator: LabelAnnotatorBase, sell_stop_loss: float, sell_take_profit: float, buy_stop_loss: float, buy_take_profit: float, test_ratio: float = 0.2, penalty_starts: int = 0, penalty_stops: int = 10, static_reward_adjustment: float = 1, labeled_data_balancer: Optional[LabeledDataBalancer] = None, meta_data: Optional[dict[str, Any]] = None)

Bases: Env

Implements a stock market environment in which an actor can perform actions (place orders). It is used to train various models with various approaches, and can be configured to award rewards and impose penalties in several ways.

TEST_MODE = 'test'
TRAIN_MODE = 'train'
action_space: Space[ActType]
close()

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

get_broker() Broker

Broker getter.

Returns:

Copy of the broker used by the environment.

Return type:

(Broker)

get_data_for_iteration(columns: list[str], start: int = 0, stop: Optional[int] = None, step: int = 1) list[float]

Data getter for a given range of iterations.

Parameters:
  • columns (list[str]) – List of column names to extract from data.

  • start (int) – Start iteration index. Defaults to 0.

  • stop (Optional[int]) – Stop iteration index. Defaults to the environment length minus one.

  • step (int) – Step between iterations. Defaults to 1.

Returns:

Copy of the selected part of the data, with the specified columns over the specified iterations.

Return type:

(list[float])

get_environment_length() int

Environment length getter.

Returns:

Length of environment.

Return type:

(int)

get_environment_spatial_data_dimension() tuple[int, int]

Environment spatial data dimensionality getter.

Returns:

Dimensions of the spatial data in the environment.

Return type:

(tuple[int, int])

get_labeled_data(should_split: bool = True, should_balance: bool = True, verbose: bool = True) tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]

Prepares labeled data for training or testing the model. It extracts the relevant features and labels from the environment’s data.

Parameters:
  • should_split (bool) – Whether to split the data into training and testing sets. Defaults to True. If set to False, testing data will be empty.

  • should_balance (bool) – Whether to balance the labeled data. Defaults to True. Will be ignored if labeled_data_balancer is None.

  • verbose (bool) – Whether to log the class cardinality before and after balancing. Defaults to True.

Returns:

A tuple containing the input data, output data, test input data, and test output data.

Return type:

(tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray])
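
For example, a supervised dataset can be extracted as follows, where env is assumed to be an already-constructed TradingEnvironment with a label annotator and, optionally, a labeled data balancer:

    x_train, y_train, x_test, y_test = env.get_labeled_data(
        should_split=True,    # reserve test_ratio of the data for testing
        should_balance=True,  # ignored when labeled_data_balancer is None
        verbose=False,
    )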

get_mode() str

Mode getter.

Returns:

Current mode of the environment.

Return type:

(str)

get_trading_consts() SimpleNamespace

Trading constants getter.

Returns:

Copy of the namespace with all trading constants.

Return type:

(SimpleNamespace)

get_trading_data() SimpleNamespace

Trading data getter.

Returns:

Copy of the namespace with all trading data.

Return type:

(SimpleNamespace)

metadata: Dict[str, Any] = {'render_modes': []}
property np_random: Generator

Returns the environment's internal _np_random generator; if not set, it will be initialised with a random seed.

observation_space: Space[ObsType]
render() None

Renders the environment visualization. To be implemented later.

render_mode: Optional[str] = None
reset(randkey: Optional[int] = None) list[float]

Resets the environment. Typically used when the environment is finished, i.e. when there are no more steps to be taken within the environment or the finish conditions are fulfilled.

Parameters:

randkey (Optional[int]) – Value indicating which iteration should be treated as the starting point after reset.

Returns:

Current iteration observation state.

Return type:

(list[float])

reward_range = (-inf, inf)
set_mode(mode: str) None

Sets the mode of the environment to either TRAIN_MODE or TEST_MODE.

Parameters:

mode (str) – Mode to set for the environment.

Raises:

ValueError – If the provided mode is not valid.

spec: EnvSpec = None
step(action: int) tuple[list[float], float, bool, dict]

Performs the specified action on the environment, which results in the generation of a new observation. This function causes trades to be handled, the reward to be calculated, and the environment to be updated.

Parameters:

action (int) – Number specifying the action. Possible values are 0 for the buy action, 1 for the wait action, and 2 for the sell action.

Returns:

Tuple containing the next observation state, reward, finish indication, and an additional info dictionary.

Return type:

(tuple[list[float], float, bool, dict])

property unwrapped: Env

Returns the base non-wrapped environment.

Returns:

The base non-wrapped gym.Env instance

Return type:

Env
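
A minimal interaction loop over the environment, assuming env is an already-constructed TradingEnvironment (constructor arguments omitted); the random policy below is a placeholder for a trained model:

    import random

    env.set_mode(TradingEnvironment.TRAIN_MODE)
    observation = env.reset()

    done = False
    while not done:
        # 0 = buy, 1 = wait, 2 = sell; replace with a trained policy.
        action = random.choice([0, 1, 2])
        observation, reward, done, info = env.step(action)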