Actions
auroris.curation.actions.BaseAction
Bases: BaseModel, ABC
An action in the curation process.
The importance of reproducibility
One of the main goals in designing auroris is to make it easy to reproduce the curation process. Reproducibility is key to scientific research. This is why a BaseAction needs to be serializable and uniquely identified by a name.
Attributes:
Name | Type | Description |
---|---|---|
name | str | The name that uniquely identifies the action. This is used to serialize and deserialize the action. |
prefix | str | This prefix is used when an action adds columns to a dataset. If not set, it defaults to the name in uppercase. |
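The serialization requirement can be illustrated with a standard-library sketch. This is not the auroris implementation (which builds on pydantic's BaseModel); the names SketchAction, DropMissing, _REGISTRY, to_json and from_json are hypothetical and exist only for this example. It shows why a unique name matters: the name is the key used to find the right class again when deserializing.

```python
import json
from abc import ABC, abstractmethod

# Hypothetical registry mapping each unique action name to its class.
_REGISTRY = {}


class SketchAction(ABC):
    """Illustrative stand-in for a serializable action (not the auroris class)."""

    name = None  # every concrete subclass sets a unique name

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.name is None:
            return
        if cls.name in _REGISTRY:
            raise ValueError(f"Duplicate action name: {cls.name}")
        _REGISTRY[cls.name] = cls

    def __init__(self, prefix=None, **params):
        # Mirror the documented default: the name in uppercase.
        self.prefix = prefix if prefix is not None else self.name.upper()
        self.params = params

    @abstractmethod
    def transform(self, rows):
        """Apply the action to a list of row dictionaries."""

    def to_json(self):
        payload = {"name": self.name, "prefix": self.prefix, "params": self.params}
        return json.dumps(payload)

    @staticmethod
    def from_json(payload):
        data = json.loads(payload)
        action_cls = _REGISTRY[data["name"]]  # the unique name selects the class
        return action_cls(prefix=data["prefix"], **data["params"])


class DropMissing(SketchAction):
    """Hypothetical action that drops rows with a missing value in one column."""

    name = "drop_missing"

    def transform(self, rows):
        column = self.params["column"]
        return [row for row in rows if row.get(column) is not None]


action = DropMissing(column="y")
restored = SketchAction.from_json(action.to_json())
```

Because the serialized form carries only the name, the prefix and the parameters, a curation recipe stored this way can be replayed later as long as the same action classes are available.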
StereoIsomerACDetection
Bases: BaseAction
Automatic detection of activity shift between stereoisomers.
See auroris.curation.functional.detect_streoisomer_activity_cliff for the docs of the stereoisomer_id_col, y_cols and threshold attributes.
Attributes:
Name | Type | Description |
---|---|---|
mol_col | Optional[str] | Column with the SMILES or RDKit Molecule objects. If specified, will be used to render an image for the activity cliffs. |
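As a rough illustration of what detecting an activity shift between stereoisomers can mean, the sketch below groups rows by a stereoisomer identifier and flags every group whose activity values spread by more than a threshold. The criterion (maximum minus minimum) and the function name flag_stereo_cliffs are assumptions made for this example; the actual rule is documented with the functional API referenced above.

```python
from collections import defaultdict


def flag_stereo_cliffs(rows, stereoisomer_id_col, y_col, threshold):
    """Flag rows whose stereoisomer group spans more than `threshold` in `y_col`.

    Assumed criterion for this sketch: max(values) - min(values) > threshold.
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[row[stereoisomer_id_col]].append(index)

    flags = [False] * len(rows)
    for indices in groups.values():
        values = [rows[i][y_col] for i in indices]
        # A group with a single member cannot form a cliff.
        if len(values) > 1 and max(values) - min(values) > threshold:
            for i in indices:
                flags[i] = True
    return flags


rows = [
    {"id": "a", "y": 1.0},
    {"id": "a", "y": 3.5},
    {"id": "b", "y": 2.0},
    {"id": "b", "y": 2.1},
    {"id": "c", "y": 9.0},
]
flags = flag_stereo_cliffs(rows, stereoisomer_id_col="id", y_col="y", threshold=1.0)
```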
Deduplication
Bases: BaseAction
Automatic deduplication of the dataset.
See auroris.curation.functional.deduplicate for the docs of the deduplicate_on, y_cols, keep and method attributes.
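One way to picture how the deduplicate_on, y_cols, keep and method attributes could interact is the standard-library sketch below. The semantics are assumptions for illustration only: rows sharing the same values in the deduplicate_on columns are merged, the y_cols are aggregated with the chosen method, and the remaining columns are taken from the first or last row of each group. The function name deduplicate_rows and the supported options are hypothetical; consult the referenced function for the actual behavior.

```python
from statistics import mean, median

# Aggregators supported by this sketch (an assumption, not the library's list).
_AGGREGATORS = {"mean": mean, "median": median}


def deduplicate_rows(rows, deduplicate_on, y_cols, keep="first", method="median"):
    """Merge rows that share the same key and aggregate their target columns."""
    if keep not in ("first", "last"):
        raise ValueError(f"Unsupported keep option: {keep}")
    aggregate = _AGGREGATORS[method]

    # Group rows by the tuple of values in the key columns, preserving order.
    groups = {}
    for row in rows:
        key = tuple(row[col] for col in deduplicate_on)
        groups.setdefault(key, []).append(row)

    result = []
    for members in groups.values():
        representative = dict(members[0] if keep == "first" else members[-1])
        for col in y_cols:
            representative[col] = aggregate([member[col] for member in members])
        result.append(representative)
    return result


rows = [
    {"smiles": "CCO", "y": 1.0, "source": "a"},
    {"smiles": "CCO", "y": 3.0, "source": "b"},
    {"smiles": "CC", "y": 5.0, "source": "c"},
]
deduplicated = deduplicate_rows(rows, deduplicate_on=["smiles"], y_cols=["y"])
```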
Discretization
Bases: BaseAction
Thresholding bioactivity columns to binary or multiclass labels.
See auroris.curation.functional.discretize for the docs of the thresholds, inplace, allow_nan and label_order attributes.
Attributes:
Name | Type | Description |
---|---|---|
input_column | str | The column to discretize. |
log_scale | bool | Whether a visual depiction of the discretization should be on a log scale. |
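Thresholding to binary or multiclass labels can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the auroris implementation: a value equal to a threshold falls into the upper class, missing values are represented as NaN or None and mapped to None, and label_order accepts "ascending" or "descending". The function name discretize_values is hypothetical.

```python
import math
from bisect import bisect_right


def discretize_values(values, thresholds, label_order="ascending", allow_nan=True):
    """Map continuous values to integer class labels using sorted thresholds."""
    if label_order not in ("ascending", "descending"):
        raise ValueError(f"Unsupported label_order: {label_order}")
    boundaries = sorted(thresholds)

    labels = []
    for value in values:
        if value is None or math.isnan(value):
            if not allow_nan:
                raise ValueError("Missing value encountered while allow_nan is False")
            labels.append(None)
            continue
        # Number of thresholds less than or equal to the value.
        label = bisect_right(boundaries, value)
        if label_order == "descending":
            label = len(boundaries) - label
        labels.append(label)
    return labels


values = [0.5, 1.0, 2.5, float("nan")]
multiclass = discretize_values(values, thresholds=[1.0, 2.0])
binary = discretize_values(values, thresholds=[1.0])
```

With a single threshold the result is a binary label; with n thresholds it has n + 1 classes.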
ContinuousDistributionVisualization
Bases: BaseAction
Visualize one or more continuous distribution(s).
See auroris.visualization.visualize_continuous_distribution for the docs of the log_scale and bins attributes.
Attributes:
Name | Type | Description |
---|---|---|
y_cols | List[str] | The columns whose distributions should be visualized. |
MoleculeCuration
Bases: BaseAction
Automated molecule curation and chemistry space distribution.
See auroris.curation.functional.curate_molecules for the docs of the remove_stereo, fix_mol, count_stereoisomers, and count_stereocenters attributes.
Attributes:
Name | Type | Description |
---|---|---|
input_column | str | The name of the column that has the molecules (either |
X_col | Optional[str] | Column with custom features for each of the molecules. If None, will use ECFP. |
y_cols | Optional[Union[str, List[str]]] | Column names for bioactivities, which will be used to color-code the chemical space visualization. |
OutlierDetection
Bases: BaseAction
Automatic detection of outliers.
See auroris.curation.functional.detect_outliers for the docs of the method and kwargs attributes.
Attributes:
Name | Type | Description |
---|---|---|
columns | List[str] | The columns for which to detect outliers. |
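To make the idea of per-column outlier detection concrete, here is a minimal z-score sketch using only the standard library. The z-score rule is a common technique chosen for illustration; which methods the library supports and how they are configured is documented with the referenced function. The names zscore_flags and detect_outliers_sketch and the default threshold are assumptions of this example.

```python
from statistics import mean, pstdev


def zscore_flags(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds `threshold`."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return [False] * len(values)
    return [abs((value - mu) / sigma) > threshold for value in values]


def detect_outliers_sketch(rows, columns, threshold=3.0):
    """Return a mapping from column name to per-row outlier flags."""
    return {
        column: zscore_flags([row[column] for row in rows], threshold)
        for column in columns
    }


rows = [{"y": 1.0, "z": 4.0} for _ in range(10)] + [{"y": 100.0, "z": 4.0}]
outliers = detect_outliers_sketch(rows, columns=["y", "z"], threshold=2.0)
```

Each column is evaluated independently, so a row can be flagged in one column and not in another.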