Curator
auroris.curation.Curator
Bases: BaseModel
A curator is a serializable collection of actions that are applied to a dataset.
Attributes:
| Name | Type | Description | 
|---|---|---|
| steps | List[BaseAction] | Ordered list of curation actions to apply to the dataset. | 
| src_dataset_path | Optional[str] | An optional path to load the source dataset from. Can be used to specify a reproducible workflow. | 
| verbosity | VerbosityLevel | Verbosity level for logging. | 
| parallelized_kwargs | dict | Keyword arguments that control parallelization within the steps. | 
transform
Runs the curation process.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| dataset | Optional[DataFrame] | The dataset to be curated. If  | None | 
Returns:
| Type | Description | 
|---|---|
| Tuple[DataFrame, CurationReport] | A tuple of the curated dataset and a report summarizing the changes made. | 
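The contract described above (ordered steps in, a curated dataset plus a report out) can be illustrated with a small standard-library sketch. It does not use auroris and is not the library's implementation: `MiniCurator`, `drop_missing`, and `clip_values` are hypothetical names, rows are plain dicts rather than a DataFrame, and the report is a plain list of strings rather than a CurationReport.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]
# A step takes rows and returns (new rows, one-line note for the report).
Step = Callable[[List[Row]], Tuple[List[Row], str]]


def drop_missing(rows: List[Row]) -> Tuple[List[Row], str]:
    """Hypothetical step: remove rows whose 'value' is None."""
    kept = [r for r in rows if r.get("value") is not None]
    return kept, f"drop_missing: removed {len(rows) - len(kept)} row(s)"


def clip_values(rows: List[Row]) -> Tuple[List[Row], str]:
    """Hypothetical step: cap 'value' at 10.0 (assumes no None values remain)."""
    clipped = sum(1 for r in rows if r["value"] > 10.0)
    out = [{**r, "value": min(r["value"], 10.0)} for r in rows]
    return out, f"clip_values: clipped {clipped} value(s)"


class MiniCurator:
    """Hypothetical stand-in: applies steps in order and collects notes."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = steps

    def transform(self, rows: List[Row]) -> Tuple[List[Row], List[str]]:
        report: List[str] = []
        current = [dict(r) for r in rows]  # copy so the input is not mutated
        for step in self.steps:
            current, note = step(current)
            report.append(note)
        return current, report


raw = [{"value": 3.0}, {"value": None}, {"value": 42.0}]
curated, report = MiniCurator([drop_missing, clip_values]).transform(raw)
```

Because the steps run in list order, swapping `drop_missing` and `clip_values` here would fail on the `None` row, which is why the order of `steps` matters.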
load_dataset
staticmethod
Loads a dataset, to be curated, from a path.
File-format support
Only CSV and Parquet files are currently supported, and they are read with the
default parameters of pd.read_csv and pd.read_parquet. If you need more flexibility,
load the data yourself and pass it directly to Curator.transform(dataset=...).
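If the default reader parameters are not enough, one option is a small loader of your own. The following is a minimal sketch, not the auroris implementation: `pick_reader_name` and `load_table` are hypothetical helpers, and pandas is imported lazily so the extension check can be used without it.

```python
from pathlib import Path


def pick_reader_name(path: str) -> str:
    """Return the name of the pandas reader matching the file extension."""
    suffix = Path(path).suffix.lower()
    readers = {".csv": "read_csv", ".parquet": "read_parquet"}
    if suffix not in readers:
        raise ValueError(f"Unsupported file format: {suffix!r}")
    return readers[suffix]


def load_table(path: str, **reader_kwargs):
    """Load a table, forwarding caller-supplied options to the pandas reader."""
    import pandas as pd  # imported lazily; only needed when a file is read

    return getattr(pd, pick_reader_name(path))(path, **reader_kwargs)
```

For example, `load_table("assay.csv", sep=";")` (with `assay.csv` as a placeholder file name) would return a DataFrame that can then be passed to Curator.transform(dataset=...).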
from_json
classmethod
Loads a curation workflow from a JSON file.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| path | str | Path of the JSON file to load the workflow from. | required |
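The idea of a workflow that survives a round trip through JSON can be sketched with the standard library alone. This is an illustration under stated assumptions, not the file layout auroris writes: `Workflow`, `StepConfig`, and the `name`/`params` schema are hypothetical.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class StepConfig:
    """Hypothetical description of one step: a name plus its parameters."""
    name: str
    params: dict = field(default_factory=dict)


@dataclass
class Workflow:
    """Hypothetical stand-in for a serializable, ordered list of steps."""
    steps: List[StepConfig]

    def to_json(self, path: str) -> None:
        # asdict converts nested dataclasses into plain dicts and lists.
        Path(path).write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def from_json(cls, path: str) -> "Workflow":
        data = json.loads(Path(path).read_text())
        return cls(steps=[StepConfig(**s) for s in data["steps"]])


original = Workflow(
    steps=[StepConfig("drop_missing"), StepConfig("clip_values", {"upper": 10.0})]
)
with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "workflow.json")
    original.to_json(path)
    restored = Workflow.from_json(path)
```

Storing only step names and parameters keeps the file human-readable and makes the order of the steps explicit, which is what lets a saved workflow be re-run reproducibly.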