PolarisFileSystem
polaris.hub.polarisfs.PolarisFileSystem
PolarisFileSystem(polaris_client: PolarisHubClient, dataset_owner: str, dataset_name: str, **kwargs: dict)
Bases: AbstractFileSystem
A file system interface for accessing datasets on the Polaris platform.
This class extends fsspec.AbstractFileSystem and provides methods to list objects within a Polaris dataset and fetch the content of a file from the dataset.
Zarr Integration
This file system can be used with Zarr to load multidimensional array data stored in a dataset on the Polaris infrastructure. The class is needed because signed URLs cannot otherwise be generated for folders, and Zarr is a folder-based data format.
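As a rough illustration of the Zarr integration described above, the sketch below wraps the file system in a Zarr store. It assumes zarr v2 (which provides `zarr.storage.FSStore` with an `fs` argument) and an already authenticated `PolarisHubClient`. The helper name `open_dataset_zarr` and its `path` argument are hypothetical and not part of the Polaris API.

```python
def open_dataset_zarr(client, owner: str, name: str, path: str):
    """Open a Zarr hierarchy stored in a Polaris dataset (read-only sketch)."""
    # Local imports keep the sketch readable without the packages installed.
    import zarr  # assumption: zarr v2, which ships zarr.storage.FSStore
    from polaris.hub.polarisfs import PolarisFileSystem

    # Constructor arguments follow the signature documented on this page.
    fs = PolarisFileSystem(
        polaris_client=client,
        dataset_owner=owner,
        dataset_name=name,
    )
    # FSStore maps Zarr keys onto paths served by the fsspec file system.
    store = zarr.storage.FSStore(path, fs=fs)
    return zarr.open(store, mode="r")
```

The store calls back into `ls` and `cat_file` on the file system, so each file is requested individually rather than requiring a signed URL for the whole folder.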
Parameters:
Name | Type | Description | Default |
---|---|---|---|
polaris_client | PolarisHubClient | The Polaris Hub client used to make API requests. | required |
dataset_owner | str | The owner of the dataset. | required |
dataset_name | str | The name of the dataset. | required |
is_polarisfs_path
staticmethod
Check if the given path is a PolarisFS path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to check. | required |
Returns:
Type | Description |
---|---|
bool | True if the path is a PolarisFS path; otherwise, False. |
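This page does not say what makes a path a PolarisFS path. A minimal sketch of such a check, assuming paths are identified by a `polarisfs://` protocol prefix (the prefix and the function name `looks_like_polarisfs_path` are assumptions for illustration), could look like this:

```python
# Assumed protocol prefix; not confirmed by this page.
POLARISFS_PREFIX = "polarisfs://"


def looks_like_polarisfs_path(path: str) -> bool:
    """Return True if the path starts with the assumed PolarisFS prefix."""
    return path.startswith(POLARISFS_PREFIX)
```

In real code, prefer calling `PolarisFileSystem.is_polarisfs_path(path)` directly. Because it is a static method, no instance is needed.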
ls
ls(path: str, detail: bool = False, timeout: Optional[TimeoutTypes] = None, **kwargs: dict) -> Union[List[str], List[Dict[str, Any]]]
List objects in the specified path within the Polaris dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path within the dataset to list objects. | required |
detail | bool | If True, returns detailed information about each object. | False |
timeout | Optional[TimeoutTypes] | Maximum time (in seconds) to wait for the request to complete. | None |
Returns:
Type | Description |
---|---|
Union[List[str], List[Dict[str, Any]]] | A list of dictionaries if detail is True; otherwise, a list of object names. |
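The two return shapes can be illustrated with a small stand-in that does not contact the Hub. The entries below are hypothetical. The `name`, `size`, and `type` keys follow the usual fsspec convention for detailed listings, and it is an assumption that Polaris returns exactly these keys.

```python
from typing import Any, Dict, List, Union

# Hypothetical listing, standing in for what the Hub would return.
_ENTRIES: List[Dict[str, Any]] = [
    {"name": "data.zarr/.zgroup", "size": 24, "type": "file"},
    {"name": "data.zarr/A", "size": 0, "type": "directory"},
]


def ls_sketch(path: str, detail: bool = False) -> Union[List[str], List[Dict[str, Any]]]:
    """Mimic the documented return shapes of ls for a fixed listing."""
    prefix = path.rstrip("/") + "/"
    entries = [e for e in _ENTRIES if e["name"].startswith(prefix)]
    if detail:
        return entries  # one dictionary per object
    return [e["name"] for e in entries]  # object names only
```

Calling `ls_sketch("data.zarr")` yields a list of names, while `ls_sketch("data.zarr", detail=True)` yields one dictionary per object.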
cat_file
cat_file(path: str, start: Union[int, None] = None, end: Union[int, None] = None, timeout: Optional[TimeoutTypes] = None, **kwargs: dict) -> bytes
Fetches and returns the content of a file from the Polaris dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to the file within the dataset. | required |
start | Union[int, None] | The starting index of the content to retrieve. | None |
end | Union[int, None] | The ending index of the content to retrieve. | None |
timeout | Optional[TimeoutTypes] | Maximum time (in seconds) to wait for the request to complete. | None |
kwargs | dict | Extra arguments passed to | {} |
Returns:
Type | Description |
---|---|
bytes | The content of the requested file. |
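The `start` and `end` parameters select a byte range. The sketch below assumes they behave like Python slice bounds (`end` exclusive, `None` meaning "unbounded"), which is how fsspec documents `cat_file`. Whether Polaris applies the range on the server or after download is not stated on this page. The helper `read_range` is hypothetical.

```python
from typing import Optional


def read_range(content: bytes, start: Optional[int] = None, end: Optional[int] = None) -> bytes:
    """Return the bytes between start (inclusive) and end (exclusive)."""
    # None for either bound leaves that side of the slice open.
    return content[start:end]
```

Under this assumption, omitting both bounds returns the whole file, and passing only `start` reads from that offset to the end.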
rm
Remove a file or directory from the Polaris dataset.
This method is provided for compatibility with the Zarr storage interface. It may be called by the Zarr store when removing a file or directory.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to the file or directory to be removed. | required |
recursive | bool | If True, remove directories and their contents recursively. | False |
maxdepth | Optional[int] | The maximum depth to recurse when removing directories. | None |
Returns:
Type | Description |
---|---|
None | None |
Note
This method currently does not perform any removal operations. It is included as a placeholder that matches what the Zarr interface expects.
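The placeholder behaviour can be pictured with a stand-in class. This is not the Polaris implementation: `ReadOnlyStoreSketch` and its `files` attribute are hypothetical, and only the `rm` parameters mirror the table above.

```python
from typing import Dict, Optional


class ReadOnlyStoreSketch:
    """Stand-in store whose rm accepts the documented arguments but removes nothing."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = dict(files)

    def rm(self, path: str, recursive: bool = False, maxdepth: Optional[int] = None) -> None:
        # Intentionally a no-op: the call succeeds so the caller does not fail,
        # but the stored files are left untouched.
        return None
```

A caller that invokes `rm` therefore sees no error, and the data remains in place afterwards.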
pipe_file
pipe_file(path: str, content: Union[bytes, str], timeout: Optional[TimeoutTypes] = None, **kwargs: dict) -> None
Writes (pipes) the given content to a file in the Polaris dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to the file within the dataset. | required |
content | Union[bytes, str] | The content to be piped into the file. | required |
timeout | Optional[TimeoutTypes] | Maximum time (in seconds) to wait for the request to complete. | None |
Returns:
Type | Description |
---|---|
None | None |