Participating in a Competition¶
In short
This tutorial walks you through how to fetch an active competition from Polaris, prepare your predictions and then submit them for secure evaluation by the Polaris Hub.
Participating in a competition on Polaris is very similar to participating in a standard benchmark. The main difference lies in how predictions are prepared and how they are evaluated. We'll touch on each of these topics later in the tutorial.
Before continuing, please ensure you are logged into Polaris.
import polaris as po
from polaris.hub.client import PolarisHubClient
# Don't forget to add your Polaris Hub username below!
MY_POLARIS_USERNAME = ""
client = PolarisHubClient()
client.login()
2024-08-09 18:05:23.205 | SUCCESS | polaris.hub.client:login:267 - You are successfully logged in to the Polaris Hub.
Fetching a Competition¶
As with standard benchmarks, Polaris provides simple APIs that allow you to quickly fetch a competition from the Polaris Hub. All you need is the unique identifier for the competition, which follows the format competition_owner/competition_name.
competition_id = "polaris/hello-world-competition"
competition = po.load_competition(competition_id)
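The identifier format described above can be checked locally before any request is made to the Hub. The helper below is a minimal sketch, not part of the Polaris library: the name split_competition_id is hypothetical, and it assumes that an identifier contains exactly one slash with a non-empty owner and name.

```python
def split_competition_id(competition_id: str) -> tuple[str, str]:
    """Split 'competition_owner/competition_name' into (owner, name).

    Hypothetical helper for illustration; Polaris does not provide it.
    """
    owner, sep, name = competition_id.partition("/")
    # Reject a missing slash, an empty part, or more than one slash.
    if not sep or not owner or not name or "/" in name:
        raise ValueError(
            f"Expected 'competition_owner/competition_name', got {competition_id!r}"
        )
    return owner, name


owner, name = split_competition_id("polaris/hello-world-competition")
print(owner, name)  # prints: polaris hello-world-competition
```

A malformed identifier such as "hello-world-competition" raises a ValueError here, so the typo surfaces before the call to the Hub.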
Participate in the Competition¶
The Polaris library is designed to make it easy to participate in a competition. In just a few lines of code, we can get the train and test partitions, access the associated data in various ways and evaluate our predictions. There are two main API endpoints:
get_train_test_split(): For creating objects through which we can access the different dataset partitions.
evaluate(): For evaluating a set of predictions in accordance with the competition protocol.
train, test = competition.get_train_test_split()
The created test and train objects support various flavours to access the data.
# The objects are iterable
for x, y in train:
    pass
# The objects can be indexed
for i in range(len(train)):
    x, y = train[i]
# The objects have properties to access all data at once
x = train.inputs
y = train.targets
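To make the three access patterns concrete without a Hub connection, here is a minimal sketch of a toy container that supports iteration, indexing and whole-array properties. ToySubset is a hypothetical stand-in written for this illustration; it is not the class that Polaris returns, and the SMILES strings and target values are made up.

```python
from typing import Iterator, Sequence


class ToySubset:
    """Hypothetical stand-in for a dataset partition; not the Polaris class."""

    def __init__(self, inputs: Sequence[str], targets: Sequence[float]) -> None:
        if len(inputs) != len(targets):
            raise ValueError("inputs and targets must have the same length")
        self.inputs = list(inputs)    # all inputs at once
        self.targets = list(targets)  # all targets at once

    def __len__(self) -> int:
        return len(self.inputs)

    def __getitem__(self, i: int) -> tuple[str, float]:
        # Indexing returns one (input, target) pair.
        return self.inputs[i], self.targets[i]

    def __iter__(self) -> Iterator[tuple[str, float]]:
        # Iteration yields the same pairs, in order.
        return iter(zip(self.inputs, self.targets))


toy = ToySubset(["CCO", "CCN", "c1ccccc1"], [0.1, 0.2, 0.3])

pairs_by_iteration = [(x, y) for x, y in toy]
pairs_by_index = [toy[i] for i in range(len(toy))]
pairs_by_property = list(zip(toy.inputs, toy.targets))
```

In this sketch all three lists hold the same pairs, so the choice between the patterns is a matter of convenience.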
Now, let's create some predictions against the official Polaris hello-world-competition. We will train a simple random forest model on the ECFP representation through scikit-learn and datamol, and then we will submit our results for secure evaluation by the Polaris Hub.
import datamol as dm
from sklearn.ensemble import RandomForestRegressor
# Load the competition (automatically loads the underlying dataset as well)
competition = po.load_competition("polaris/hello-world-competition")
# Get the split and convert SMILES to ECFP fingerprints by specifying a featurization function.
train, test = competition.get_train_test_split(featurization_fn=dm.to_fp)
# Define a model and train
model = RandomForestRegressor(max_depth=2, random_state=0)
model.fit(train.X, train.y)
predictions = model.predict(test.X)
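Because evaluation happens on the Hub, a quick local sanity check on the predictions can catch obvious problems before submission. The sketch below is an assumption-laden illustration: check_predictions is a hypothetical helper, not a Polaris function, and it only checks that there is one finite value per test point.

```python
import math
from typing import Sequence


def check_predictions(predictions: Sequence[float], expected_size: int) -> None:
    """Raise ValueError if the predictions look unusable.

    Hypothetical helper for illustration; not part of Polaris.
    """
    if len(predictions) != expected_size:
        raise ValueError(
            f"Expected {expected_size} predictions, got {len(predictions)}"
        )
    # Collect the positions of NaN or infinite values, if any.
    bad = [i for i, p in enumerate(predictions) if not math.isfinite(p)]
    if bad:
        raise ValueError(f"Non-finite predictions at positions {bad}")


# Made-up values standing in for model output on three test points.
check_predictions([0.12, 0.47, 0.33], expected_size=3)
```

With the objects from this tutorial, the call would be check_predictions(predictions, len(test)), assuming the test object supports len() in the same way as the train object.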
Now that we have created some predictions, we can construct a CompetitionPredictions object that will prepare our predictions for evaluation by the Polaris Hub. Here, you can also add metadata to your predictions to better describe your results and how you achieved them.
from polaris.evaluate import CompetitionPredictions
competition_predictions = CompetitionPredictions(
    name="hello-world-result",
    predictions=predictions,
    target_labels=competition.target_cols,
    test_set_labels=competition.test_set_labels,
    test_set_sizes=competition.test_set_sizes,
    github_url="https://github.com/polaris-hub/polaris-hub",
    paper_url="https://polarishub.io/",
    description="Hello, World!",
)
Once your CompetitionPredictions object is created, you're ready to submit it for evaluation! This will automatically save your result to the Polaris Hub, but it will be private. You can choose to make it public through the Polaris web application.
results = competition.evaluate(competition_predictions)
client.close()
That's it! Just like that you have partaken in your first Polaris competition. Keep an eye on that leaderboard and best of luck in your future competitions!
The End.