Tune a Model

Model Tuning Utilities

source

calculate_metrics_by_thresh_binary

 calculate_metrics_by_thresh_binary (y_true:np.array, y_prob:np.array,
                                     metrics:Union[Callable,Sequence[Callable]],
                                     thresholds:Optional[Sequence]=None)

Calculate binary classification metrics as a function of threshold

Takes the prediction to be 1 when y_prob[:, 1] (the predicted probability of class 1) is greater than the threshold, and 0 otherwise.
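
Equivalently, for a single threshold the implied hard labels can be sketched as follows (an illustrative sketch of the rule above, not the library's internal code):

import numpy as np

# Predict class 1 whenever its predicted probability exceeds the threshold
y_prob = np.array([[0.9, 0.1], [0.3, 0.7]])
thresh = 0.5
y_pred = (y_prob[:, 1] > thresh).astype(int)  # array([0, 1])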

Parameters:

  • y_true: Ground-truth values with shape (n_items,)
  • y_prob: Probability distributions with shape (n_items, 2)
  • metrics: Callables that take y_true, y_pred as positional arguments and return a number. Must have a __name__ attribute.
  • thresholds: Sequence of float threshold values to use. By default uses 0 and the values that appear in y_prob[:, 1], which is a minimal set that covers all of the relevant possibilities. One reason to override that default would be to save time with a large dataset.

Returns: DataFrame with one column “thresh” indicating the thresholds used and an additional column for each input metric giving the value of that metric at that threshold.

For instance, we can use calculate_metrics_by_thresh_binary to find the threshold that maximizes a model’s F1 score.

import numpy as np
from sklearn import metrics

y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
y_prob = np.array(
    [
        [0.9, 0.1],
        [0.7, 0.3],
        [0.6, 0.4],
        [0.6, 0.4],
        [0.4, 0.6],
        [0.6, 0.4],
        [0.4, 0.6],
        [0.4, 0.6],
        [0.3, 0.7],
        [0.1, 0.9],
    ]
)

results = calculate_metrics_by_thresh_binary(
    y_true=y_true,
    y_prob=y_prob,
    metrics=[metrics.recall_score, metrics.precision_score, metrics.f1_score],
).iloc[:-1, :]  # drop the last row, where no items are predicted positive
results
   thresh  recall_score  precision_score  f1_score
0     0.0           1.0         0.500000  0.666667
1     0.1           1.0         0.555556  0.714286
2     0.3           1.0         0.625000  0.769231
3     0.4           0.8         0.800000  0.800000
4     0.6           0.4         1.000000  0.571429
5     0.7           0.2         1.000000  0.333333
ax = results.plot(x="thresh")
best_row = results.loc[results.loc[:, "f1_score"].idxmax()]
best_thresh = best_row.loc["thresh"]
ax.axvline(best_thresh, c="k")

print("Best result:")
best_row
Best result:
thresh             0.4
recall_score       0.8
precision_score    0.8
f1_score           0.8
Name: 3, dtype: float64


source

plot_pr_curve

 plot_pr_curve (y_true:np.array, y_prob:np.array,
                ax:Optional[matplotlib.axes._axes.Axes]=None,
                pos_label=None, sample_weight=None)

Plot the precision-recall curve for a binary classification problem.

Parameters:

  • y_true: Ground-truth values with shape (n_items,)

  • y_prob: Probability distributions with shape (n_items, 2)

  • ax: Matplotlib Axes object. Plot will be added to this object if provided; otherwise a new Axes object will be generated.

    Remaining parameters are passed to sklearn.metrics.precision_recall_curve.

ax = plot_pr_curve(
    y_true=y_true,
    y_prob=y_prob,
)
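
The ax parameter makes it possible to draw the curve into an existing figure. For example (a sketch; the second panel is just an arbitrary companion plot):

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
plot_pr_curve(y_true=y_true, y_prob=y_prob, ax=axes[0])
axes[1].hist(y_prob[:, 1])  # distribution of predicted probabilities for class 1
axes[1].set_xlabel("predicted probability of class 1")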


source

calculate_metrics_by_thresh_multi

 calculate_metrics_by_thresh_multi (y_true:np.array, y_prob:np.array,
                                    metrics:Union[Callable,Sequence[Callable]],
                                    thresholds:Optional[Sequence]=None)

Calculate multiclass metrics as a function of threshold

Takes the prediction to be the index of the column in y_prob with the greatest value if that value is greater than the threshold, and np.nan otherwise.
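
Equivalently, for a single threshold the implied predictions can be sketched as follows (an illustrative sketch of the rule above, not the library's internal code):

import numpy as np

# Predict the argmax column when its probability exceeds the threshold;
# otherwise abstain with np.nan
y_prob = np.array([[0.2, 0.8, 0.0], [0.4, 0.3, 0.3]])
thresh = 0.5
top = y_prob.argmax(axis=1)
y_pred = np.where(y_prob.max(axis=1) > thresh, top, np.nan)  # array([1., nan])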

Parameters:

  • y_true: Ground-truth values
  • y_prob: Probability distributions
  • metrics: Callables that take y_true, y_pred as positional arguments and return a number. Must have a __name__ attribute.
  • thresholds: Sequence of float threshold values to use. By default uses 0 and all values that appear in y_prob, which is a minimal set that covers all of the relevant possibilities. One reason to override that default would be to save time with a large dataset.

Returns: DataFrame with one column “thresh” indicating the thresholds used and an additional column for each input metric giving the value of that metric at that threshold.

Suppose that in a multiclass problem we want to track two metrics: coverage (how often we make a prediction) and precision (how often our predictions are right when we make them). We will choose the threshold that maximizes an \(F_\beta\)-like metric: a weighted harmonic mean of those two metrics that puts twice as much weight on precision as on coverage.


source

coverage

 coverage (y_true:np.array, y_pred:np.array)

How often the model makes a prediction, i.e. the fraction of predictions that are not np.nan (np.nan indicates abstaining from predicting).

Parameters:

  • y_true: Ground-truth values
  • y_pred: Predicted values, possibly including np.nan to indicate abstaining from predicting
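
For example, with one abstention out of four predictions, coverage should be 0.75 (a minimal usage sketch):

import numpy as np

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0.0, np.nan, 1.0, 0.0])  # one abstention

coverage(y_true, y_pred)  # expected: 0.75, since the model predicts on 3 of 4 items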

source

calculate_metric_ignoring_nan

 calculate_metric_ignoring_nan (y_true:np.array, y_pred:np.array,
                                metric:Callable, *args, **kwargs)

Calculate metric ignoring np.nan predictions

Parameters:

  • y_true: Ground-truth values
  • y_pred: Predicted values, possibly including np.nan to indicate abstaining from predicting
  • metric: Function that takes y_true, y_pred as keyword arguments

Any additional arguments will be passed to metric
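
For example (a minimal sketch assuming sklearn's accuracy_score, which accepts y_true and y_pred as keyword arguments):

import numpy as np
from sklearn import metrics

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0.0, np.nan, 1.0, 1.0])

# Accuracy is computed only on the three non-NaN predictions
calculate_metric_ignoring_nan(y_true, y_pred, metric=metrics.accuracy_score)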


source

fbeta

 fbeta (precision:float, recall:float, beta:float=1)
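
The values in the example below are consistent with the standard \(F_\beta\) definition, applied here with coverage in the role of recall:

\[F_\beta = \frac{(1 + \beta^2) \cdot \text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}\]

With \(\beta = 0.5\), precision is weighted twice as heavily as recall (here, coverage).
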
from functools import partial

precision_ignoring_nan = partial(
    calculate_metric_ignoring_nan,
    metric=partial(metrics.precision_score, average="micro"),
)
# Metrics passed to calculate_metrics_by_thresh_multi must have a __name__
precision_ignoring_nan.__name__ = "precision_ignoring_nan"

y_true = np.array([0, 0, 1, 2])
y_prob = np.array([[0.9, 0.1, 0], [0.2, 0.8, 0], [0.2, 0.8, 0], [0.3, 0.4, 0.3]])

results = calculate_metrics_by_thresh_multi(
    y_true=y_true,
    y_prob=y_prob,
    metrics=[coverage, precision_ignoring_nan],
).iloc[:-1, :]  # drop the last row, where the model abstains on every item
results.loc[:, "quasi_fbeta"] = results.apply(
    lambda row: fbeta(
        precision=row.loc["precision_ignoring_nan"],
        recall=row.loc["coverage"],
        beta=0.5,
    ),
    axis="columns",
)
results
   thresh  coverage  precision_ignoring_nan  quasi_fbeta
0     0.0      1.00                0.500000     0.555556
1     0.0      1.00                0.500000     0.555556
2     0.1      1.00                0.500000     0.555556
3     0.2      1.00                0.500000     0.555556
4     0.3      1.00                0.500000     0.555556
5     0.4      0.75                0.666667     0.681818
6     0.8      0.25                1.000000     0.625000
ax = results.plot(x="thresh")
best_row = results.loc[results.loc[:, "quasi_fbeta"].idxmax(), :]
ax.axvline(best_row.loc["thresh"], c="k")
print("Best result:")
best_row
Best result:
thresh                    0.400000
coverage                  0.750000
precision_ignoring_nan    0.666667
quasi_fbeta               0.681818
Name: 5, dtype: float64


source

confusion_matrix

 confusion_matrix (y_true:Union[np.array,pandas.core.series.Series],
                   y_pred:Union[np.array,pandas.core.series.Series],
                   shade_axis:Union[str,int,NoneType]=None,
                   sample_weight:Optional[np.array]=None,
                   normalize:Optional[str]=None)

Get confusion matrix

Parameters:

  • y_true: Ground-truth values
  • y_pred: Predicted values
  • shade_axis: axis argument to pass to pd.DataFrame.style.background_gradient

The remaining parameters are passed to sklearn.metrics.confusion_matrix.

confusion_matrix(
    y_true=np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1]),
    y_pred=np.array([0, 0, 0, 0, 1, 0, 1, 1, 0, 1]),
)
          Predicted 0  Predicted 1  Totals
Actual 0            4            1       5
Actual 1            2            3       5
Totals              6            4      10
confusion_matrix(
    y_true=np.array([0, 0, 2, 0, 0, 1, 1, 1, 1, 1]),
    y_pred=np.array([0, 0, 2, 0, 1, 2, 1, 1, 0, 1]),
    shade_axis="rows",
)
          Predicted 0  Predicted 1  Predicted 2  Totals
Actual 0            3            1            0       4
Actual 1            1            3            1       5
Actual 2            0            0            1       1
Totals              4            4            2      10
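
Because the remaining keyword arguments are forwarded to sklearn.metrics.confusion_matrix, normalized matrices can be requested directly. A sketch using sklearn's normalize option to normalize over the true (row) labels:

import numpy as np

confusion_matrix(
    y_true=np.array([0, 0, 2, 0, 0, 1, 1, 1, 1, 1]),
    y_pred=np.array([0, 0, 2, 0, 1, 2, 1, 1, 0, 1]),
    normalize="true",  # passed through to sklearn.metrics.confusion_matrix
)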