data module

conformalopt.data.get_scores(data_name, scores=None)

Loads and processes scores for a specified dataset. The scores are computed as |Y_t - \hat{Y}_t| for various base forecasters \hat{Y}_t. All datasets are described in detail in CITE. Datasets listed below as name* must be passed as f"{name}_{base_forecaster}_absolute-residual_scores", where base_forecaster is one of ar, prophet, theta, or transformer.
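As an illustration of the naming pattern for the name* datasets, the following minimal sketch builds a data_name string from a dataset name and a base forecaster. The helper build_data_name and the constant VALID_FORECASTERS are hypothetical and are not part of conformalopt; only the string template and the four forecaster names come from the description above.

```python
# Hypothetical helper, not part of conformalopt: it only assembles the
# data_name string following the template given in the docstring.
VALID_FORECASTERS = ("ar", "prophet", "theta", "transformer")


def build_data_name(name: str, base_forecaster: str) -> str:
    """Return the data_name string for a name* dataset."""
    if base_forecaster not in VALID_FORECASTERS:
        raise ValueError(
            f"base_forecaster must be one of {VALID_FORECASTERS}, "
            f"got {base_forecaster!r}"
        )
    return f"{name}_{base_forecaster}_absolute-residual_scores"


data_name = build_data_name("AMZN", "prophet")
```

The resulting string would then presumably be passed as the data_name argument of get_scores.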

Parameters:

data_name (str) –

The name of the dataset. Supported options:

  • elec: Elec2 data, with a one-day-delayed moving average as the base forecaster.

  • daily-climate*: Daily climate data.

  • AMZN*, GOOGL*, MSFT*: Stock data.

  • synthetic_AR_2_1M: 1_000_000 synthetic AR(2) observations generated with AR coefficients [0.3, -0.3] and standard normal noise.

  • gaussian: 10_000 i.i.d. Gaussian-distributed synthetic scores.

  • ercot_preregistered: ERCOT load and forecast data. This is the preregistered dataset used in the paper.
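To make the synthetic_AR_2_1M description concrete, here is a rough sketch of how an AR(2) series with coefficients [0.3, -0.3] and standard normal noise could be simulated with NumPy. This is not the library's generation code. The function simulate_ar2, the seed, the zero initial conditions, and the reading of the coefficient list as (lag 1, lag 2) are all assumptions made for illustration.

```python
import numpy as np


def simulate_ar2(n, phi=(0.3, -0.3), seed=0):
    """Simulate y_t = phi[0] * y_{t-1} + phi[1] * y_{t-2} + eps_t.

    Assumptions (not confirmed by the docs): phi is ordered (lag 1, lag 2),
    values before t = 0 are treated as zero, and eps_t ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(n):
        prev1 = y[t - 1] if t >= 1 else 0.0
        prev2 = y[t - 2] if t >= 2 else 0.0
        y[t] = phi[0] * prev1 + phi[1] * prev2 + noise[t]
    return y


# A short series for illustration; the dataset above has 1_000_000 points.
series = simulate_ar2(1_000)
```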

Returns:

A processed score array.

Return type:

np.ndarray
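For reference, the absolute-residual score used by these datasets can be sketched in a few lines of NumPy. The function absolute_residual_scores and the toy arrays are hypothetical and only illustrate the formula; how get_scores computes or loads its scores internally is not documented here.

```python
import numpy as np


def absolute_residual_scores(y_true, y_pred):
    """Return the scores |Y_t - hat(Y)_t| as a NumPy array."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return np.abs(y_true - y_pred)


# Toy observations and base-forecaster predictions (illustrative values).
scores = absolute_residual_scores([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```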