flexmeasures.data.models.forecasting.custom_models.lgbm_model

Classes

class flexmeasures.data.models.forecasting.custom_models.lgbm_model.CustomLGBM(max_forecast_horizon: int = 48, probabilistic: bool = True, models_params: dict | None = None, auto_regressive: bool = True, use_past_covariates: bool = False, use_future_covariates: bool = False, ensure_positive: bool = False, seasonal_lags_steps: list[int] | None = None, training_sample_count: int | None = None, min_samples_per_horizon: int = 2)

Multi-horizon forecasting model using LightGBM.

This class implements a forecasting model that utilizes LightGBM (LGBM) for multi-horizon forecasting. It inherits from BaseModel and is designed to forecast multiple horizons into the future based on the provided maximum forecast horizon.

Attributes:

max_forecast_horizon (int): The maximum number of sensor-resolution steps into the future for forecasting.
probabilistic (bool): Flag indicating whether the model is probabilistic.
models (List): List to hold multiple LGBM models.

__init__(max_forecast_horizon: int = 48, probabilistic: bool = True, models_params: dict | None = None, auto_regressive: bool = True, use_past_covariates: bool = False, use_future_covariates: bool = False, ensure_positive: bool = False, seasonal_lags_steps: list[int] | None = None, training_sample_count: int | None = None, min_samples_per_horizon: int = 2) → None

Initialize the LightGBM forecasting model.

Parameters:

max_forecast_horizon – Maximum number of sensor-resolution steps to forecast.
probabilistic – Whether to configure LightGBM for quantile predictions.
models_params – Optional LightGBM parameter overrides.
auto_regressive – Whether the target history should provide autoregressive features.
use_past_covariates – Whether past covariates are used for fitting and prediction.
use_future_covariates – Whether future covariates are used for fitting and prediction.
ensure_positive – Whether negative predictions should be clipped to zero.
seasonal_lags_steps – Candidate seasonal lag steps to keep if enough training samples remain. Include 1 in the list to account for the most recent observation (recommended).
training_sample_count – Optional number of target training samples, used to decide which lags are eligible.
min_samples_per_horizon – Minimum training rows required for each horizon model.

_filter_eligible_lags_for_horizon(horizon: int) → list[int]

Keep lag candidates that leave enough samples for this horizon.
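
One way to picture this filter is sketched below. The sample-counting rule used here (a lag of `lag` steps and a horizon of `horizon` steps leave `training_sample_count - lag - horizon + 1` usable training rows) is an assumption made for illustration, since the exact criterion is not stated here. `eligible_lags` is a hypothetical standalone helper, not the method itself.

```python
def eligible_lags(
    candidates: list[int],
    horizon: int,
    training_sample_count: int,
    min_samples_per_horizon: int = 2,
) -> list[int]:
    """Keep lag candidates that still leave enough training rows (assumed rule)."""
    kept = []
    for lag in candidates:
        # Assumption: each training row needs `lag` past steps
        # plus `horizon` future steps from the same series.
        remaining_rows = training_sample_count - lag - horizon + 1
        if remaining_rows >= min_samples_per_horizon:
            kept.append(lag)
    return kept


# With 100 samples and horizon 3, a weekly lag of 168 steps is dropped.
print(eligible_lags([1, 24, 168], horizon=3, training_sample_count=100))  # [1, 24]
```

Under this assumed rule, longer seasonal lags are the first to become ineligible as the training window shrinks.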

static _lags_for_horizon(horizon: int, max_forecast_horizon: int, seasonal_lag_steps: int) → list[int]

Build Darts target lags for a forecasting horizon.

For a forecast target at horizon h and a seasonal period s, the aligned seasonal reference point is:

(t + h) - s

expressed relative to prediction origin t.

The corresponding aligned Darts lag l is therefore:

l = -(s - (h % s))

where the modulo wraps the horizon within the seasonal cycle.

Returned lags

The returned lag list always contains:

l:
the lag corresponding to the aligned seasonal position

In most cases, it additionally contains:

l - 1:
the observation immediately preceding the aligned seasonal position

Including both lags helps the model capture short-term local dynamics around the seasonal reference point, rather than relying on a single aligned observation.

Example

timeline
    title Seasonal alignment example for h=3 and s=24

    section  Model lags
        t-25
        t-24 : seasonal anchor for s=24
        t-23
        t-22 : preceding point (second Darts lag)
    section Δ24h seasonal offset
        t-21 : aligned seasonal point for t+3 (first Darts lag)
        ... t+l ...
        t-1
        t : prediction origin (belief time)
        t+1
        t+2
    section Forecast horizons
        t+3 : forecast target at h=3 (event start)
        t+4
        ... t+H : max forecast horizon

For:

horizon = 3
seasonal_lag_steps = 24

we obtain:

l = -(24 - (3 % 24))
= -21

yielding:

[-21, -22]

corresponding to:

t-21 : aligned seasonal position for target t+3
t-22 : observation immediately preceding it
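
The formula and the worked example can be reproduced with a small standalone sketch. The function name `aligned_seasonal_lags` is hypothetical and not part of FlexMeasures. It covers only the common case in which both lags are returned, and it does not model the edge case near the maximum forecast horizon.

```python
def aligned_seasonal_lags(horizon: int, seasonal_lag_steps: int) -> list[int]:
    """Illustrative only: return [l, l - 1] for the common (non-edge) case."""
    if horizon < 1 or seasonal_lag_steps < 1:
        raise ValueError("horizon and seasonal_lag_steps must be positive")
    # Aligned seasonal position, expressed as a negative Darts lag.
    lag = -(seasonal_lag_steps - (horizon % seasonal_lag_steps))
    # Add the observation immediately preceding the aligned position.
    return [lag, lag - 1]


print(aligned_seasonal_lags(3, 24))   # [-21, -22]
print(aligned_seasonal_lags(27, 24))  # [-21, -22], since 27 % 24 == 3
```

The second call shows the effect of the modulo: a horizon one full seasonal cycle further out maps to the same aligned lag.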

Edge case near maximum forecast horizon

For horizons near the maximum forecast horizon, only l is returned.

This avoids generating additional lag references that are not guaranteed to exist consistently during recursive multi-horizon prediction.

_setup() → None

Set up the forecasting models.

Subclasses must implement this method to populate self.models. Typically, one model is created per forecast horizon (up to self.max_forecast_horizon). These models must provide fit() and predict() methods compatible with darts TimeSeries.

static _validate_lag_candidates(seasonal_lags_steps: list[int]) → list[int]

Validate lag candidates and return them without duplicates.
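
As a rough illustration of validation plus de-duplication, the sketch below rejects non-integer or non-positive candidates and drops repeats while preserving first-seen order. Both the validation rules and the ordering are assumptions, since only "validate" and "without duplicates" are documented here. `validate_lag_candidates` is a hypothetical standalone function, not the method itself.

```python
def validate_lag_candidates(seasonal_lags_steps: list[int]) -> list[int]:
    """Illustrative only: check candidates and remove duplicates (assumed rules)."""
    for step in seasonal_lags_steps:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(step, bool) or not isinstance(step, int) or step < 1:
            raise ValueError(f"Invalid lag candidate: {step!r}")
    # dict.fromkeys keeps the first occurrence of each value, in order.
    return list(dict.fromkeys(seasonal_lags_steps))


print(validate_lag_candidates([1, 24, 24, 168, 1]))  # [1, 24, 168]
```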