Curate datasets carefully so that outliers do not introduce noise into the model.
You can curate a dataset at any time after it has been created.
From the PhysicsAI ribbon, select the Manage
Dataset tool.
Figure 1.
The Datasets dialog opens.
Identify Outliers
Outliers in the datasets are automatically detected based on one or more responses or
model properties.
A Z-score distribution is used to highlight the samples that fall in the 3-sigma
tails.
Select a dataset and properties.
Figure 2.
Switch between the Table and Plot tabs to identify outliers in the tabular and
graphical views.
Compare data samples on multiple criteria by using side-by-side plots.
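The 3-sigma Z-score criterion described above can be illustrated outside the tool with a minimal sketch. This is not the PhysicsAI implementation; the function name, the threshold parameter, and the sample values are hypothetical, and the sketch assumes a single scalar response per sample and uses the population standard deviation.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return the indices of samples whose |Z-score| exceeds the threshold.

    Assumption: `values` holds one scalar response (or model property)
    per data sample. A threshold of 3.0 corresponds to the 3-sigma tails.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all samples identical, so nothing can be an outlier
    return [i for i, v in enumerate(values) if abs((v - mu) / sigma) > threshold]


# Hypothetical response values: 20 similar samples and one extreme sample.
responses = [10.0] * 20 + [100.0]
flagged = zscore_outliers(responses)
print(flagged)
```

Note that with very small datasets a single extreme sample may never reach a Z-score of 3, because the outlier itself inflates the standard deviation.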
Edit Datasets
In the tabular view, click the add icon to add files to a
dataset.
The Add files to Dataset dialog opens, where you
can browse and select files to add to the dataset.
Select files and click the remove icon to remove data
samples.
Select files and click Move to to move them to other
datasets.
Note: A potential source of noise is rotational or translational non-alignment of
the data samples. While translational non-alignment can be corrected (see
Train Models), rotational
non-alignment cannot be corrected. To detect rotational outliers, use the
dimensions of the bounding box and the coordinates of the center of gravity.
Figure 3.
Figure 4.
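As a rough illustration of the idea in the note, the sketch below flags samples whose bounding-box dimensions or center of gravity differ strongly from the rest of a dataset. It is not how PhysicsAI performs the check; all function names, the `rel_tol` parameter, and the box geometry are hypothetical. It assumes each sample is a list of (x, y, z) node coordinates and uses the unweighted node centroid as a stand-in for the center of gravity, measured from the minimum corner of the bounding box so that pure translation does not trigger a flag.

```python
from statistics import median


def bbox_and_cog(points):
    """Return (bounding-box dimensions, centroid relative to the box minimum corner)."""
    mins = [min(p[k] for p in points) for k in range(3)]
    maxs = [max(p[k] for p in points) for k in range(3)]
    dims = tuple(hi - lo for lo, hi in zip(mins, maxs))
    # Unweighted node centroid: an assumed proxy for the center of gravity.
    cog = tuple(sum(p[k] for p in points) / len(points) - mins[k] for k in range(3))
    return dims, cog


def rotational_outliers(samples, rel_tol=0.2):
    """Return indices of samples whose features deviate from the dataset median."""
    features = []
    for pts in samples:
        dims, cog = bbox_and_cog(pts)
        features.append(dims + cog)  # six numbers per sample
    medians = [median(f[k] for f in features) for k in range(6)]
    scale = max(medians[:3])  # largest typical box edge, used to normalise deviations
    if scale == 0:
        return []
    return [
        i
        for i, f in enumerate(features)
        if any(abs(f[k] - medians[k]) > rel_tol * scale for k in range(6))
    ]


def box(dx, dy, dz, origin=(0.0, 0.0, 0.0)):
    """Corner nodes of an axis-aligned box (hypothetical test geometry)."""
    ox, oy, oz = origin
    return [(ox + x, oy + y, oz + z) for x in (0, dx) for y in (0, dy) for z in (0, dz)]


# Four aligned (some translated) samples and one rotated 90 degrees about z.
samples = [
    box(4, 2, 1),
    box(4, 2, 1, origin=(10.0, 0.0, 0.0)),
    box(4, 2, 1, origin=(0.0, 5.0, 3.0)),
    box(4, 2, 1),
    box(2, 4, 1),  # same part with x and y extents swapped
]
rotated = rotational_outliers(samples)
print(rotated)
```

A check like this only catches rotations that change the axis-aligned extents or shift the centroid within the box; a part that is symmetric under the rotation in question would pass unnoticed.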
Clone Datasets
Copy an entire dataset to preserve the original without repeating the dataset
creation process.