


Bases: DatasetV2, PredictiveTaskSpecificationMixin, SplitSpecificationV1Mixin

An instance of this class represents a Polaris competition.


Basic API usage:

import numpy as np
import polaris as po

# Load the competition from the Hub
competition = po.load_competition("dummy-user/dummy-name")

# Get the train and test data-loaders
train, test = competition.get_train_test_split()

# Use the training data to train your model
# Get the inputs and targets as arrays with 'train.inputs' and 'train.targets'
# Or simply iterate over the train object.
for x, y in train:
    pass  # Train your model on each (x, y) pair here

# Work your magic to accurately predict the test set
prediction_values = np.array([0.0 for x in test])

# Submit your predictions


Attributes:

start_time (datetime): The time at which the competition starts accepting prediction submissions.

end_time (datetime): The time at which the competition stops accepting prediction submissions.

n_classes (dict[ColumnName, int | None]): The number of classes within each target column that defines a classification task.

For additional metadata attributes, see the base classes.
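
As a minimal sketch of how start_time and end_time bound the submission window, the snippet below checks whether a given moment falls inside it. The helper name is_accepting_submissions is hypothetical and not part of Polaris; it also assumes timezone-aware datetimes and inclusive boundaries, neither of which is stated above.

```python
from datetime import datetime, timezone
from typing import Optional


def is_accepting_submissions(
    start_time: datetime, end_time: datetime, now: Optional[datetime] = None
) -> bool:
    """Hypothetical helper: True if `now` lies within [start_time, end_time]."""
    if now is None:
        # Assumption: the competition times are timezone-aware (UTC here).
        now = datetime.now(timezone.utc)
    return start_time <= now <= end_time


# Illustrative values, not taken from a real competition.
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
end = datetime(2025, 3, 1, tzinfo=timezone.utc)
during = datetime(2025, 2, 1, tzinfo=timezone.utc)
after = datetime(2025, 4, 1, tzinfo=timezone.utc)

print(is_accepting_submissions(start, end, now=during))
print(is_accepting_submissions(start, end, now=after))
```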


_validate_split_in_dataset() -> Self

Verifies that all split indices are valid for the dataset. The check uses the length of self because a competition entity bundles both the dataset and the benchmark in one artifact.
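
The idea behind this check can be sketched as follows. This is an illustration only, not the library's implementation: find_invalid_indices is a hypothetical helper, and the split layout (a dict of index lists) is assumed for the example.

```python
def find_invalid_indices(indices, n_rows):
    """Hypothetical helper: return the indices that fall outside [0, n_rows)."""
    return [i for i in indices if not 0 <= i < n_rows]


# Illustrative split over a dataset with 5 rows; index 7 is out of range.
split = {"train": [0, 1, 2], "test": [3, 7]}
n_rows = 5

invalid = {name: find_invalid_indices(idx, n_rows) for name, idx in split.items()}
print(invalid)
```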


_validate_cols_in_dataset() -> Self

Verifies that all specified columns are present in the dataset.
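
A column-presence check of this kind might look like the sketch below. The helper missing_columns and the column names are hypothetical; how the validator reports a missing column is not described here.

```python
def missing_columns(specified, available):
    """Hypothetical helper: columns that are specified but absent from the dataset."""
    return sorted(set(specified) - set(available))


# Illustrative column names, not from a real competition.
dataset_columns = ["smiles", "logp", "solubility"]
missing = missing_columns(["smiles", "logp", "pka"], dataset_columns)
print(missing)
```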


_validate_n_classes() -> Self

Validates n_classes, the number of classes specified for each of the target columns.
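
What exactly this validator enforces is not spelled out here. As a sketch under two assumptions (every key of n_classes must be a target column, and a class count, when given, must be at least 2), a comparable check could be written as below. The helper check_n_classes is hypothetical.

```python
def check_n_classes(n_classes, target_cols):
    """Hypothetical helper: list problems found in an n_classes mapping."""
    problems = []
    for col, n in n_classes.items():
        if col not in target_cols:
            # Assumption: class counts only make sense for target columns.
            problems.append(f"{col!r} is not a target column")
        elif n is not None and n < 2:
            # Assumption: a classification task needs at least two classes.
            problems.append(f"{col!r} needs at least 2 classes, got {n}")
    return problems


ok = check_n_classes({"toxic": 2, "logp": None}, ["toxic", "logp"])
bad = check_n_classes({"other": 3}, ["toxic"])
print(ok, bad)
```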


_get_subset(indices, hide_targets=True, featurization_fn=None) -> Subset

Returns a Subset using the given indices. Used internally to construct the train and test sets.


    hide_targets=True, featurization_fn: Callable | None = None
) -> dict[str, Subset]

Construct the test set(s), given the split in the competition specification. Used internally to construct the test set for client use and evaluation.


get_train_test_split(
    featurization_fn: Callable | None = None,
) -> tuple[Subset, Subset | dict[str, Subset]]

Construct the train and test sets, given the split in the competition specification.

Returns Subset objects, which offer several ways of accessing the data and can therefore serve as a basis for framework-specific (e.g. PyTorch, TensorFlow) data-loaders.


Parameters:

featurization_fn (Callable | None, default None): A function to apply to the input data. For a multi-input benchmark, this function receives its input in the format specified by the input_format parameter.
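
To make featurization_fn concrete, here is a minimal sketch of a function that could be passed as featurization_fn. The featurizer itself is hypothetical, and it assumes the function is called on a collection of SMILES strings. Whether Polaris calls it per batch or per data point is not stated here.

```python
def length_and_carbon_count(smiles_batch):
    """Hypothetical featurizer: map each SMILES string to [length, number of 'C' characters]."""
    return [[len(s), s.count("C")] for s in smiles_batch]


# Called directly here for illustration; in practice it would be passed as
# get_train_test_split(featurization_fn=length_and_carbon_count).
features = length_and_carbon_count(["CCO", "c1ccccc1"])
print(features)
```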



Returns:

tuple[Subset, Subset | dict[str, Subset]]: A tuple with the train Subset and test Subset objects. If there are multiple test sets, they are returned in a dictionary keyed by test-set name. The targets of the test set cannot be accessed.
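
Because the second element is either a single Subset or a dictionary of named Subsets, downstream code may want to handle both shapes uniformly. The sketch below shows one way to do that. The helper as_named_test_sets and the default name "test" are assumptions for illustration, and plain strings stand in for Subset objects.

```python
def as_named_test_sets(test, default_name="test"):
    """Hypothetical helper: always return a {name: test_set} mapping."""
    if isinstance(test, dict):
        return dict(test)
    # Assumption: a lone test set gets a placeholder name.
    return {default_name: test}


# Stand-in values; in practice these would be Subset objects.
single = as_named_test_sets("subset-object")
multiple = as_named_test_sets({"scaffold": "subset-a", "random": "subset-b"})

for name, test_set in multiple.items():
    print(name, test_set)
```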


    predictions: IncomingPredictionsType,
    prediction_name: SlugCompatibleStringType,
    prediction_owner: str,
    report_url: HttpUrlString,
    contributors: list[HubUser] | None = None,
    github_url: HttpUrlString | None = None,
    description: str = "",
    tags: list[str] | None = None,
    user_attributes: dict[str, str] | None = None,
) -> None

Convenience wrapper around the PolarisHubClient.submit_competition_predictions method. It automatically creates the standardized predictions object that the Hub expects.


Parameters:

prediction_name (SlugCompatibleStringType, required): The name of the prediction.

prediction_owner (str, required): The slug of the user or organization that owns the prediction.

predictions (IncomingPredictionsType, required): The predictions for each test set defined in the competition.

report_url (HttpUrlString, required): A URL to a report, paper, or write-up that describes the methods used to generate the predictions.

contributors (list[HubUser] | None, default None): The users credited with generating these predictions.

github_url (HttpUrlString | None, default None): An optional URL to a code repository containing the code used to generate these predictions.

description (str, default ""): An optional, short description of the predictions.

tags (list[str] | None, default None): An optional list of tags to categorize the prediction by.

user_attributes (dict[str, str] | None, default None): An optional dict with additional, textual user attributes.
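
As a rough sketch of how these parameters fit together, the snippet below assembles the keyword arguments for a submission and checks that the required ones are present. All values are placeholders, and the exact structure accepted by IncomingPredictionsType is not specified here, so the flat list of predictions is an assumption.

```python
# Placeholder values for illustration only.
submission_kwargs = {
    "predictions": [0.0, 0.0, 0.0],  # assumption: a flat sequence for a single test set
    "prediction_name": "my-first-submission",
    "prediction_owner": "dummy-user",
    "report_url": "https://example.com/report",
    "description": "Constant baseline",
    "tags": ["baseline"],
}

# The four parameters without a default value in the signature.
required = {"predictions", "prediction_name", "prediction_owner", "report_url"}
missing = required - submission_kwargs.keys()
print(sorted(missing))
```

The resulting dictionary can then be unpacked into the submission method with `**submission_kwargs`.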




For pretty printing in Jupyter.