PolarisFileSystem

polaris.hub.polarisfs.PolarisFileSystem

PolarisFileSystem(polaris_client: PolarisHubClient, dataset_owner: str, dataset_name: str, **kwargs: dict)

Bases: AbstractFileSystem

A file system interface for accessing datasets on the Polaris platform.

This class extends fsspec.AbstractFileSystem and provides methods to list objects within a Polaris dataset and fetch the content of a file from the dataset.

Zarr Integration

This file system can be used with Zarr to load multidimensional array data stored in a Dataset on the Polaris infrastructure. The class is needed because Zarr is a folder-based data format, and signed URLs cannot otherwise be generated for folders.

import zarr
from polaris.hub.polarisfs import PolarisFileSystem

polaris_fs = PolarisFileSystem(...)
store = zarr.storage.FSStore(..., fs=polaris_fs)
root = zarr.open(store, mode="r")

Parameters:

Name Type Description Default
polaris_client PolarisHubClient

The Polaris Hub client used to make API requests.

required
dataset_owner str

The owner of the dataset.

required
dataset_name str

The name of the dataset.

required

is_polarisfs_path staticmethod

is_polarisfs_path(path: str) -> bool

Check if the given path is a PolarisFS path.

Parameters:

Name Type Description Default
path str

The path to check.

required

Returns:

Type Description
bool

True if the path is a PolarisFS path; otherwise, False.
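
As a rough illustration of what such a check might look like, the sketch below assumes that PolarisFS paths are identified by a `polarisfs://` scheme prefix. Both the prefix and the helper name `looks_like_polarisfs_path` are assumptions made for illustration and are not confirmed by this page; real code should call `PolarisFileSystem.is_polarisfs_path` instead.

```python
# Assumed scheme prefix; this page does not state the actual format.
ASSUMED_PREFIX = "polarisfs://"


def looks_like_polarisfs_path(path: str) -> bool:
    """Hypothetical stand-in: True if `path` starts with the assumed prefix."""
    return path.startswith(ASSUMED_PREFIX)


# A local path versus a path carrying the assumed scheme.
local = looks_like_polarisfs_path("/tmp/data.zarr")
remote = looks_like_polarisfs_path("polarisfs://data.zarr")
```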

ls

ls(path: str, detail: bool = False, timeout: Optional[TimeoutTypes] = None, **kwargs: dict) -> Union[List[str], List[Dict[str, Any]]]

List objects in the specified path within the Polaris dataset.

Parameters:

Name Type Description Default
path str

The path within the dataset to list objects.

required
detail bool

If True, returns detailed information about each object.

False
timeout Optional[TimeoutTypes]

Maximum time (in seconds) to wait for the request to complete.

None

Returns:

Type Description
Union[List[str], List[Dict[str, Any]]]

A list of dictionaries if detail is True; otherwise, a list of object names.
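
The two return shapes can be sketched as follows. This example assumes that the detailed entries follow the usual fsspec convention of `name`, `size`, and `type` keys, which this page does not spell out. `FakeFS` and `split_listing` are hypothetical names, used only so the sketch runs without access to the Polaris Hub; any object with a compatible `ls` method, such as a `PolarisFileSystem` instance, could be passed instead.

```python
from typing import Any, Dict, List, Tuple


def split_listing(fs: Any, path: str) -> Tuple[List[str], List[str]]:
    """Return (files, directories) listed under `path`, using detail=True."""
    entries: List[Dict[str, Any]] = fs.ls(path, detail=True)
    files = [e["name"] for e in entries if e.get("type") == "file"]
    dirs = [e["name"] for e in entries if e.get("type") == "directory"]
    return files, dirs


class FakeFS:
    """Hypothetical in-memory stand-in for a file system with an `ls` method."""

    _entries = [
        {"name": "data.zarr/.zgroup", "size": 24, "type": "file"},
        {"name": "data.zarr/A", "size": 0, "type": "directory"},
    ]

    def ls(self, path, detail=False, **kwargs):
        # detail=True yields dictionaries; detail=False yields names only.
        if detail:
            return list(self._entries)
        return [e["name"] for e in self._entries]


files, dirs = split_listing(FakeFS(), "data.zarr")
names = FakeFS().ls("data.zarr")
```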

cat_file

cat_file(path: str, start: Union[int, None] = None, end: Union[int, None] = None, timeout: Optional[TimeoutTypes] = None, **kwargs: dict) -> bytes

Fetch and return the content of a file from the Polaris dataset.

Parameters:

Name Type Description Default
path str

The path to the file within the dataset.

required
start Union[int, None]

The byte offset at which to start reading. If None, reading starts at the beginning of the file.

None
end Union[int, None]

The byte offset at which to stop reading. If None, reading continues to the end of the file.

None
timeout Optional[TimeoutTypes]

Maximum time (in seconds) to wait for the request to complete.

None
kwargs dict

Extra arguments passed to fsspec.open().

{}

Returns:

Type Description
bytes

The content of the requested file.
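
A partial read with `start` and `end` might look like the sketch below. It assumes the two arguments behave like the bounds of a Python slice over the file's bytes, which is how fsspec file systems commonly treat them but is not stated on this page. `FakeFS` and `read_prefix` are hypothetical names introduced so the example runs offline.

```python
from typing import Any, Dict, Optional


def read_prefix(fs: Any, path: str, n: int) -> bytes:
    """Fetch the first `n` bytes of `path` through `cat_file`."""
    return fs.cat_file(path, start=0, end=n)


class FakeFS:
    """Hypothetical stand-in; assumes start/end act like slice bounds."""

    def __init__(self, files: Dict[str, bytes]):
        self._files = files

    def cat_file(
        self,
        path: str,
        start: Optional[int] = None,
        end: Optional[int] = None,
        **kwargs,
    ) -> bytes:
        # None for either bound means "no limit" on that side.
        return self._files[path][start:end]


fs = FakeFS({"data.zarr/.zgroup": b'{"zarr_format": 2}'})
prefix = read_prefix(fs, "data.zarr/.zgroup", 5)
whole = fs.cat_file("data.zarr/.zgroup")
```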

rm

rm(path: str, recursive: bool = False, maxdepth: Optional[int] = None) -> None

Remove a file or directory from the Polaris dataset.

This method is provided for compatibility with the Zarr storage interface. It may be called by the Zarr store when removing a file or directory.

Parameters:

Name Type Description Default
path str

The path to the file or directory to be removed.

required
recursive bool

If True, remove directories and their contents recursively.

False
maxdepth Optional[int]

The maximum depth to recurse when removing directories.

None

Returns:

Type Description
None

None

Note

This method currently does not perform any removal. It is included as a placeholder to satisfy the expectations of the Zarr interface.

pipe_file

pipe_file(path: str, content: Union[bytes, str], timeout: Optional[TimeoutTypes] = None, **kwargs: dict) -> None

Write the given content to a file in the Polaris dataset.

Parameters:

Name Type Description Default
path str

The path to the file within the dataset.

required
content Union[bytes, str]

The content to be piped into the file.

required
timeout Optional[TimeoutTypes]

Maximum time (in seconds) to wait for the request to complete.

None

Returns:

Type Description
None

None
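
Because `content` may be either `bytes` or `str`, a caller may prefer to encode text explicitly before writing, so the encoding does not depend on how the file system handles strings. The sketch below shows one way to do that. `pipe_text` and `FakeFS` are hypothetical names, and the UTF-8 default is an assumption rather than documented behavior of `PolarisFileSystem`.

```python
from typing import Any, Dict, Union


def pipe_text(fs: Any, path: str, content: Union[bytes, str], encoding: str = "utf-8") -> bytes:
    """Encode `content` if it is a str, write it via `pipe_file`, return the bytes sent."""
    payload = content.encode(encoding) if isinstance(content, str) else content
    fs.pipe_file(path, payload)
    return payload


class FakeFS:
    """Hypothetical stand-in that records what was written."""

    def __init__(self) -> None:
        self.written: Dict[str, bytes] = {}

    def pipe_file(self, path, content, timeout=None, **kwargs) -> None:
        self.written[path] = content


fs = FakeFS()
sent = pipe_text(fs, "data.zarr/.zattrs", '{"unit": "nM"}')
```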