Client

polaris.hub.settings.PolarisHubSettings

Bases: BaseSettings

Settings for the OAuth2 Polaris Hub API Client.

Secrecy of these settings

Since the Polaris Hub uses PKCE (Proof Key for Code Exchange) for OAuth2, these values do not have to be kept secret. See RFC 7636 for more info.

Attributes:

Name Type Description
hub_url HttpUrlString

The URL to the main page of the Polaris Hub.

api_url HttpUrlString | None

The URL to the main entrypoint of the Polaris API.

authorize_url HttpUrlString

The URL of the OAuth2 authorization endpoint.

callback_url HttpUrlString

The URL to which the user is redirected after authorization.

token_fetch_url HttpUrlString

The URL of the OAuth2 token endpoint.

user_info_url HttpUrlString

The URL of the OAuth2 user info endpoint.

scopes str

The OAuth2 scopes that are requested.

client_id str

The OAuth2 client ID.

ca_bundle Union[str, bool, None]

The path to a CA bundle file for requests. Allows for custom SSL certificates to be used.


polaris.hub.client.PolarisHubClient

PolarisHubClient(settings: PolarisHubSettings | None = None, cache_auth_token: bool = True, **kwargs: dict)

Bases: OAuth2Client

A client for the Polaris Hub API. The Polaris Hub is a central repository of datasets, benchmarks and results. Visit it here: https://polarishub.io/.

Extends the authlib client, which in turn extends the httpx client. See the relevant docs to learn more about how to use these clients outside of the integration with the Polaris Hub.

Closing the client

The client should be closed after all requests have been made. For convenience, you can also use the client as a context manager to automatically close the client when the context is exited. Note that once the client has been closed, it cannot be used anymore.

from polaris.hub.client import PolarisHubClient

# Make sure to close the client once finished
client = PolarisHubClient()
client.get(...)
client.close()

# Or use the client as a context manager
with PolarisHubClient() as client:
    client.get(...)

Interacting with artifacts owned by an organization

Shortly after you are added to a new organization on Polaris, there may be a delay of a few minutes during which you cannot upload or download artifacts owned by that organization. If this occurs, log in again via polaris login --overwrite and retry.

Async Client

authlib also supports an async client. Since we don't expect to make multiple requests to the Hub in parallel and due to the added complexity stemming from using the Python asyncio API, we are sticking to the sync client - at least for now.

Parameters:

Name Type Description Default
settings PolarisHubSettings | None

A PolarisHubSettings instance.

None
cache_auth_token bool

Whether to cache the auth token to a file.

True
**kwargs dict

Additional keyword arguments passed to the authlib OAuth2Client constructor.

{}

get_metadata_from_response

get_metadata_from_response(response: Response, key: str) -> str | None

Get custom metadata saved to the R2 object from the headers.

login

login(overwrite: bool = False, auto_open_browser: bool = True)

Log in to the Polaris Hub using the OAuth2 protocol.

Headless authentication

It is currently not possible to log in to the Polaris Hub without a browser. See this GitHub issue for more info.

Parameters:

Name Type Description Default
overwrite bool

Whether to overwrite the current token if the user is already logged in.

False
auto_open_browser bool

Whether to automatically open the browser to visit the authorization URL.

True

list_datasets

list_datasets(limit: int = 100, offset: int = 0) -> list[str]

List all available datasets on the Polaris Hub.

Parameters:

Name Type Description Default
limit int

The maximum number of datasets to return.

100
offset int

The offset from which to start returning datasets.

0

Returns:

Type Description
list[str]

A list of dataset names in the format owner/dataset_name.

get_dataset

get_dataset(owner: str | HubOwner, name: str, verify_checksum: ChecksumStrategy = 'verify_unless_zarr') -> DatasetV1 | DatasetV2

Load a standard dataset from the Polaris Hub.

Parameters:

Name Type Description Default
owner str | HubOwner

The owner of the dataset. Can be either a user or organization from the Polaris Hub.

required
name str

The name of the dataset.

required
verify_checksum ChecksumStrategy

Whether to use the checksum to verify the integrity of the dataset. If None, will infer a practical default based on the dataset's storage location.

'verify_unless_zarr'

Returns:

Type Description
DatasetV1 | DatasetV2

A Dataset instance, if it exists.
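
The list methods return names in the owner/dataset_name format, while get_dataset takes the owner and the name as separate arguments. The sketch below shows one way to bridge the two. The helper `split_artifact_slug` and the example slug are hypothetical and not part of the Polaris client.

```python
def split_artifact_slug(slug: str) -> tuple[str, str]:
    """Split 'owner/name' into (owner, name), rejecting malformed input."""
    owner, sep, name = slug.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"Expected 'owner/name', got {slug!r}")
    return owner, name


# Hypothetical slug, used only for illustration.
owner, name = split_artifact_slug("example-org/example-dataset")

# With a logged-in client, the pair could then be passed on:
# dataset = client.get_dataset(owner, name)
```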

list_benchmarks

list_benchmarks(limit: int = 100, offset: int = 0) -> list[str]

List all available benchmarks on the Polaris Hub.

Parameters:

Name Type Description Default
limit int

The maximum number of benchmarks to return.

100
offset int

The offset from which to start returning benchmarks.

0

Returns:

Type Description
list[str]

A list of benchmark names in the format owner/benchmark_name.

get_benchmark

get_benchmark(owner: str | HubOwner, name: str, verify_checksum: ChecksumStrategy = 'verify_unless_zarr') -> BenchmarkSpecification

Load a benchmark from the Polaris Hub.

Parameters:

Name Type Description Default
owner str | HubOwner

The owner of the benchmark. Can be either a user or organization from the Polaris Hub.

required
name str

The name of the benchmark.

required
verify_checksum ChecksumStrategy

Whether to use the checksum to verify the integrity of the benchmark.

'verify_unless_zarr'

Returns:

Type Description
BenchmarkSpecification

A BenchmarkSpecification instance, if it exists.

upload_results

upload_results(results: BenchmarkResults, access: AccessType = 'private', owner: HubOwner | str | None = None)

Upload the results to the Polaris Hub.

Owner

The owner of the results will automatically be inferred by the hub from the user making the request. Contrary to other artifact types, an organization cannot own a set of results. However, you can specify the BenchmarkResults.contributors field to share credit with other hub users.

Required meta-data

The Polaris client and the hub differ in which meta-data they require. The hub's requirements are stricter, so uploading may fail with errors about missing meta-data. Make sure to fill in as much of the meta-data as possible before uploading.

Benchmark name and owner

Importantly, results.benchmark_name and results.benchmark_owner must be specified and match an existing benchmark on the Polaris Hub. If these results were generated by benchmark.evaluate(...), this is done automatically.

Parameters:

Name Type Description Default
results BenchmarkResults

The results to upload.

required
access AccessType

Grant public or private access to the results.

'private'
owner HubOwner | str | None

Which Hub user or organization owns the artifact. Takes precedence over results.owner.

None
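
As noted above, results.benchmark_name and results.benchmark_owner must be set before uploading. A minimal pre-upload check might look like the sketch below. The helper `missing_benchmark_reference` is hypothetical, and it only checks that the two fields are filled in, not that they match an existing benchmark on the Hub. A SimpleNamespace stands in for a BenchmarkResults object so the sketch runs on its own.

```python
from types import SimpleNamespace


def missing_benchmark_reference(results) -> list[str]:
    """Return the names of the benchmark reference fields that are unset or empty."""
    required = ("benchmark_name", "benchmark_owner")
    return [field for field in required if not getattr(results, field, None)]


# Stand-in objects with made-up values; real results come from benchmark.evaluate(...).
complete = SimpleNamespace(benchmark_name="example-benchmark", benchmark_owner="example-org")
incomplete = SimpleNamespace(benchmark_name="example-benchmark", benchmark_owner=None)

missing = missing_benchmark_reference(incomplete)
if missing:
    print(f"Fill in before uploading: {', '.join(missing)}")
```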

upload_dataset

upload_dataset(dataset: DatasetV1 | DatasetV2, access: AccessType = 'private', timeout: TimeoutTypes = (10, 200), owner: HubOwner | str | None = None, if_exists: ZarrConflictResolution = 'replace')

Upload a dataset to the Polaris Hub.

Owner

You have to manually specify the owner in the dataset data model. Because the owner could be a user or an organization, we cannot automatically infer this from just the logged-in user.

Required meta-data

The Polaris client and the hub differ in which meta-data they require. The hub's requirements are stricter, so uploading may fail with errors about missing meta-data. Make sure to fill in as much of the meta-data as possible before uploading.

Parameters:

Name Type Description Default
dataset DatasetV1 | DatasetV2

The dataset to upload.

required
access AccessType

Grant public or private access to the dataset.

'private'
timeout TimeoutTypes

Request timeout values. This can be a single value giving the timeout in seconds for all IO operations, or a more granular tuple of (connect_timeout, write_timeout). The type of the timeout parameter comes from httpx. Since datasets can get large, you may need to increase the write timeout when uploading larger datasets. See also: https://www.python-httpx.org/advanced/#timeout-configuration

(10, 200)
owner HubOwner | str | None

Which Hub user or organization owns the artifact. Takes precedence over dataset.owner.

None
if_exists ZarrConflictResolution

Action for handling existing files in the Zarr archive. Options are 'raise' to throw an error, 'replace' to overwrite, or 'skip' to proceed without altering the existing files.

'replace'

upload_benchmark

upload_benchmark(benchmark: BenchmarkSpecification, access: AccessType = 'private', owner: HubOwner | str | None = None)

Upload the benchmark to the Polaris Hub.

Owner

You have to manually specify the owner in the benchmark data model. Because the owner could be a user or an organization, we cannot automatically infer this from the logged-in user.

Required meta-data

The Polaris client and the hub differ in which meta-data they require. The hub's requirements are stricter, so uploading may fail with errors about missing meta-data. Make sure to fill in as much of the meta-data as possible before uploading.

Non-existent datasets

The client will not upload the associated dataset to the hub if it does not yet exist. Make sure to specify an existing dataset or upload the dataset first.

Parameters:

Name Type Description Default
benchmark BenchmarkSpecification

The benchmark to upload.

required
access AccessType

Grant public or private access to the benchmark.

'private'
owner HubOwner | str | None

Which Hub user or organization owns the artifact. Takes precedence over benchmark.owner.

None

get_competition

get_competition(owner: str | HubOwner, name: str, verify_checksum: ChecksumStrategy = 'verify_unless_zarr') -> CompetitionSpecification

Load a competition from the Polaris Hub.

Parameters:

Name Type Description Default
owner str | HubOwner

The owner of the competition. Can be either a user or organization from the Polaris Hub.

required
name str

The name of the competition.

required
verify_checksum ChecksumStrategy

Whether to use the checksum to verify the integrity of the competition's dataset.

'verify_unless_zarr'

Returns:

Type Description
CompetitionSpecification

A CompetitionSpecification instance, if it exists.

list_competitions

list_competitions(limit: int = 100, offset: int = 0) -> list[str]

List all available competitions on the Polaris Hub.

Parameters:

Name Type Description Default
limit int

The maximum number of competitions to return.

100
offset int

The offset from which to start returning competitions.

0

Returns:

Type Description
list[str]

A list of competition names in the format owner/competition_name.

evaluate_competition

evaluate_competition(competition: CompetitionSpecification, competition_predictions: CompetitionPredictions) -> CompetitionResults

Evaluate the predictions for a competition on the Polaris Hub. Target labels are fetched by Polaris Hub and used only internally.

Parameters:

Name Type Description Default
competition CompetitionSpecification

The competition to evaluate the predictions for.

required
competition_predictions CompetitionPredictions

The predictions and associated metadata to be submitted for evaluation by the Hub.

required

Returns:

Type Description
CompetitionResults

A CompetitionResults object.