Creating a project and running the report

Linking models and datasets

Once you upload models and datasets to the Unbox platform, they become available for your whole team to explore and can be accessed from their respective sections under Registry. As the names suggest, your models live under the Models section and your datasets under the Datasets section.

Although there are many interesting things you can do with models and datasets in isolation (which we’ll discuss in another part of the tutorial), the true power of Unbox appears when we link models and datasets.

It’s time to conceptually link the two into a single logical unit to deeply investigate how our model behaves on the dataset we uploaded.

To link a model and a dataset, we need to create a Project.

Creating your first project

It’s time to create your first project!

First, go to the Projects page by clicking on Projects in the sidebar, right under Registry.

You should see a panel that says Create a project with empty fields. To create your first project, fill in the project name (in our case, we’ll call it “My first project”) and click on +Add beside Models and Datasets. Each time you click on +Add, you will briefly be taken to a page that shows the linkable models or datasets.

🚧

It is important that your models and datasets are compatible with one another. If something appears to be disabled on the page that shows the linkable models or datasets, it means the class names do not match those of the model, dataset, or project to which you are linking.
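To make the compatibility rule concrete, here is a minimal sketch of a class-name check you could run locally before uploading. It is not part of the Unbox API: the function names are hypothetical, and it assumes that compatibility means both sides declare the same set of class names, regardless of order.

```python
def find_class_name_mismatches(model_classes, dataset_classes):
    """Return the class names that appear on one side but not the other.

    Hypothetical helper for illustration only; the platform's actual
    compatibility check may be stricter (for example, order-sensitive).
    """
    model_set = set(model_classes)
    dataset_set = set(dataset_classes)
    return {
        "only_in_model": sorted(model_set - dataset_set),
        "only_in_dataset": sorted(dataset_set - model_set),
    }


def are_linkable(model_classes, dataset_classes):
    """True when both sides declare exactly the same class names."""
    mismatches = find_class_name_mismatches(model_classes, dataset_classes)
    return not mismatches["only_in_model"] and not mismatches["only_in_dataset"]


# Same names in a different order: treated as compatible in this sketch.
print(are_linkable(["negative", "positive"], ["positive", "negative"]))

# Different spellings: the mismatch report shows what to fix.
print(find_class_name_mismatches(["negative", "positive"], ["neg", "pos"]))
```

If a model or dataset appears disabled, comparing the two class-name lists this way is a quick first check.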

Now, all of the fields on the project panel should be filled out. Click on Create to create your first project.

Your project should now appear under NLP, since this is a project with textual data.

The project page

Click on your newly created project to see what’s inside the project page.

As we mentioned earlier, a project is created when we link models and datasets together. This is where most of the error analysis happens. Error analysis is the process of isolating, observing, and diagnosing erroneous ML predictions to understand when, how, and why a model fails, and where its pockets of high and low performance are.

Inside a project, you will find the Report, all the tests you create based on that model-dataset pair, and other information, such as metadata and the linked model and dataset.

The Report is one of the most important things you’ll find inside the project page. It is a byproduct of the link between a model and a dataset and provides a comprehensive view of the performance of that model on that dataset.

👍

Running the report

Before moving on with the tutorial, click on Run on the Report. Once it finishes running, click on Open to see the report, which is what we’ll explore next.

The report page

After the report is generated, you will see a block with some basic aggregate metrics, such as accuracy, precision, recall, and F1. Click on Open to go to the report page.
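As a refresher on what these aggregate metrics mean, the sketch below computes them by hand for a binary problem. The function name and the toy labels are made up for illustration, and the tutorial does not say how Unbox aggregates metrics across classes (for example, macro versus micro averaging), so treat this as the textbook binary definition rather than the platform’s implementation.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels: 3 true positives, 1 false positive, 1 false negative, 1 true negative.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(binary_metrics(y_true, y_pred))
```

With these toy labels, precision and recall are both 3/4 and accuracy is 4/6, which shows how a single aggregate number can hide which kinds of errors the model makes.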

The report page has three parts: the metadata (shown on the right-hand side), the Error analysis panel (on top), and the data slice (on the bottom).

The metadata shows the model’s performance on the dataset, measured by aggregate metrics, along with other details such as the date of creation and the model version.

What is fundamental to understand is the relationship between the Error analysis panel and the data slice at the bottom. The Error analysis panel lets you filter and slice your dataset, and the data shown at the bottom always reflects the selections you made in the panel.
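Conceptually, selecting something in the panel acts like a filter over the dataset rows, and the data slice is whatever passes that filter. Here is a minimal sketch of that idea in plain Python. The rows, field names, and helper functions are invented for illustration and do not reflect how Unbox stores or filters data.

```python
# Toy dataset: each row holds the text, the true label, and the model's prediction.
rows = [
    {"text": "great movie", "label": "positive", "prediction": "positive"},
    {"text": "not good", "label": "negative", "prediction": "positive"},
    {"text": "terrible plot", "label": "negative", "prediction": "negative"},
    {"text": "not bad at all", "label": "positive", "prediction": "negative"},
]


def select_slice(rows, predicate):
    """Keep only the rows for which the predicate returns True."""
    return [row for row in rows if predicate(row)]


def accuracy(rows):
    """Fraction of rows where the prediction matches the label (None if empty)."""
    if not rows:
        return None
    return sum(row["label"] == row["prediction"] for row in rows) / len(rows)


# Slice 1: only the rows the model got wrong.
errors = select_slice(rows, lambda row: row["label"] != row["prediction"])

# Slice 2: only the rows whose text contains the word "not".
contains_not = select_slice(rows, lambda row: "not" in row["text"].split())

print(len(errors), accuracy(rows), accuracy(contains_not))
```

In this toy data, overall accuracy is 0.5, but the slice containing "not" has an accuracy of 0.0. Low-performance pockets like this are what slicing is meant to surface.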

👍

The error analysis panel and the data slice

Feel free to play a little with the error analysis panel, particularly the Data distribution and Feature importance tabs. Click on a few of the elements that appear in the panel. Can you see what happens to the data shown at the bottom?

Now that you understand the relationship between the error analysis panel and the data shown at the bottom, let’s move on to the next part of the tutorial!

Creating more projects

As a side note, you can create multiple projects. They can be projects with any model or dataset that you uploaded to the platform, as long as they are compatible. For a refresher on how to upload models and datasets, check out the previous section of our tutorial.

If you have already created your first project, you won’t see the Create a project panel in the center of the screen. Instead, there is always a Create project button in the upper right corner.

Hover over that button and fill out all the fields to create new projects. Once you hit Create, the project should appear on the Projects page.

