When evaluating machine learning models, accuracy is one of the most commonly used metrics for classification tasks. In this blog post, we’ll dive into the `accuracy_score` function provided by Scikit-Learn’s `metrics` module, understand how it works, and compare it with calculating accuracy manually. By the end, you’ll have a solid understanding of accuracy as a metric and be able to calculate it both with and without Scikit-Learn.
What is Accuracy?
Accuracy is a metric used to evaluate classification models, defined as the ratio of correct predictions to the total number of predictions. It’s simple yet effective for balanced datasets where all classes are represented equally.
Accuracy Formula

Accuracy = (Number of Correct Predictions) / (Total Number of Predictions)
This metric is particularly useful for classification tasks with balanced class distributions but can be misleading if your dataset is imbalanced. For example, in a dataset where 90% of the samples belong to one class, even a model that only predicts the majority class will have 90% accuracy.
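A quick way to see this pitfall in code: the toy "model" below (made up for illustration) always predicts the majority class, never identifies the minority class, and still scores 90% accuracy.

```python
from sklearn.metrics import accuracy_score

# Imbalanced toy dataset: 9 samples of class 0, 1 sample of class 1
y_true = [0] * 9 + [1]

# A trivial "model" that always predicts the majority class
y_pred = [0] * 10

majority_accuracy = accuracy_score(y_true, y_pred)
print(f"Majority-class accuracy: {majority_accuracy}")  # 0.9, yet class 1 is never found
```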
Using sklearn.metrics.accuracy_score
Scikit-Learn’s `accuracy_score` function makes it easy to calculate accuracy for classification tasks. Here’s a quick example to demonstrate its usage.
```python
from sklearn.metrics import accuracy_score

# True labels and predicted labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 0]

# Calculate accuracy
accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy: {accuracy}")
```
The output is:

```
Accuracy: 0.8
```
In this example:

- `y_true` is the list of true labels.
- `y_pred` is the list of predicted labels from a classifier.
- `accuracy_score` computes the ratio of matching values in `y_true` and `y_pred`.

The function returns `0.8`, meaning 80% of predictions were correct.
`accuracy_score` Parameters
`accuracy_score` has additional parameters that provide flexibility:

- `normalize` (default=`True`): If `True`, returns the fraction of correctly classified samples; if `False`, returns the number of correct predictions.
- `sample_weight`: Allows you to assign different weights to samples, which is useful when certain samples are more important.
For example, if you set `normalize=False`, `accuracy_score` will return the count of correct predictions instead of the ratio:
```python
accuracy_count = accuracy_score(y_true, y_pred, normalize=False)
print(f"Number of Correct Predictions: {accuracy_count}")
```
The output is:

```
Number of Correct Predictions: 8
```
In this case, `accuracy_score` returns `8`, indicating there were 8 correct predictions out of 10.
Calculating Accuracy Manually

To understand how `accuracy_score` works internally, let’s calculate accuracy manually. Here’s a step-by-step guide:
- Count the number of correct predictions by comparing each element in `y_true` and `y_pred`.
- Divide the count by the total number of predictions to get the accuracy.
Here’s how you can do this in Python:
```python
# True labels and predicted labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 0]

# Manual accuracy calculation
correct_predictions = sum(1 for true, pred in zip(y_true, y_pred) if true == pred)
total_predictions = len(y_true)
manual_accuracy = correct_predictions / total_predictions
print(f"Manual Accuracy: {manual_accuracy}")
```
The output is:

```
Manual Accuracy: 0.8
```
The result matches `accuracy_score`, confirming that our manual calculation agrees with Scikit-Learn’s.
Explanation of the Code
- Zip and Compare: `zip(y_true, y_pred)` pairs each true label with its corresponding predicted label, and a generator expression counts the matches.
- Sum Matches: `sum(1 for true, pred in zip(y_true, y_pred) if true == pred)` counts the correct predictions by summing `1` for each match.
- Divide by Total: Dividing by the total number of predictions (`len(y_true)`) gives the accuracy as a fraction.
Comparing `accuracy_score` with Manual Calculation
The Scikit-Learn `accuracy_score` function performs the same basic operations we used in the manual calculation:

- It compares each element of `y_true` and `y_pred`.
- It counts matches and divides by the total count if `normalize=True`.
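Internally, Scikit-Learn operates on NumPy arrays, so the comparison is vectorized rather than looped. A rough sketch of the equivalent vectorized logic (not Scikit-Learn’s actual source):

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1, 1, 0])

# Element-wise comparison produces a boolean array of matches
matches = y_true == y_pred

print(matches.mean())  # fraction correct, like normalize=True
print(matches.sum())   # count correct, like normalize=False
```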
Both methods give identical results, so choosing between them is a matter of preference and convenience. For small projects or quick checks, a manual calculation is straightforward. However, `accuracy_score` is better suited to production code because it’s optimized, tested, and provides additional options like sample weights.
Example with `normalize=False`

Let’s compare the output of `accuracy_score` with `normalize=False` against the count of correct predictions from our manual calculation.
```python
# Using sklearn with normalize=False
accuracy_count_sklearn = accuracy_score(y_true, y_pred, normalize=False)
print(f"Correct Predictions (sklearn): {accuracy_count_sklearn}")

# Manual count of correct predictions
correct_count_manual = sum(1 for true, pred in zip(y_true, y_pred) if true == pred)
print(f"Correct Predictions (manual): {correct_count_manual}")
```
The output is:

```
Correct Predictions (sklearn): 8
Correct Predictions (manual): 8
```
Both methods give `8` as the count of correct predictions, showing consistency between Scikit-Learn and the manual calculation even with different options.
Pros and Cons
| Method | Pros | Cons |
| --- | --- | --- |
| `accuracy_score` | Easy to use; optimized and tested; flexible | Requires an import; might be overkill for simple tasks |
| Manual calculation | No dependency on external libraries; clear understanding of the calculation | Longer code; limited flexibility |
When to Use Accuracy
Accuracy is a simple and intuitive metric, but it’s not always the best choice. Consider alternatives if:
- Class Imbalance: If your dataset has imbalanced classes, consider metrics like F1-score, precision, or recall.
- Multi-Class Problems: For multi-class classification, accuracy can be misleading if the model performs well on certain classes but poorly on others.
In such cases, look into Scikit-Learn metrics like `f1_score`, `precision_score`, and `recall_score`, or use metrics like Cohen’s Kappa or the confusion matrix for a more nuanced view of model performance.
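Using the same labels as earlier, these alternative metrics are just as easy to compute (in this particular example they all happen to equal 0.8, because the errors are symmetric between the classes):

```python
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 0]

precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 4/5
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 4/5
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(f"Precision: {precision}, Recall: {recall}, F1: {f1}")
print(confusion_matrix(y_true, y_pred))  # rows: true class, columns: predicted class
```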
Conclusion
The `accuracy_score` function in Scikit-Learn is a handy tool for quickly calculating accuracy, and it’s a reliable choice for balanced datasets and straightforward classification tasks. For cases where accuracy might not tell the full story, consider other metrics that account for class balance and other complexities.
Manually calculating accuracy is simple and can be useful for quick checks or for understanding the underlying calculation. In most cases, however, using `accuracy_score` is more efficient and less error-prone, thanks to Scikit-Learn’s optimizations.
By understanding both approaches, you gain flexibility in your work with machine learning models, whether you’re using Scikit-Learn or working in a more controlled or lightweight environment without it.