In machine learning performance metrics, numbers have an important story to tell. They rely on you to give them a voice.

• Performance metrics assess machine learning algorithms.
• Machine learning models are evaluated against the performance measures you select.
• They also help in evaluating the efficiency and accuracy of machine learning models.

## Need for Performance Metrics

Whether you are a non-technical person in sales, marketing or operations, or come from a technical background such as data science, engineering or development, it is equally important to understand how performance metrics work for machine learning.

There are many different types of machine learning algorithms. This raises several questions: how do you rank machine learning algorithms, how do you pick one algorithm over another, and how do you measure and compare them?

Performance metrics are the answer to these questions. They help you measure and compare algorithms in the simplest form.

## Key methods of Machine Learning Performance Metrics

• Confusion Matrix – It is one of the most intuitive and easiest tools for assessing the correctness and accuracy of a model. It is used for classification problems where the output can belong to two or more classes. The confusion matrix is not in itself a performance metric, but almost all performance metrics are based on it.
Terminology used in confusion matrices:
1. True Positive – The cases where the actual class of the data point is 1 (true) and the predicted class is also 1 (true). For example, the case where a person has cancer and the model classifies the case as cancer positive is a true positive.
2. True Negative – The cases where the actual class of the data point is 0 (false) and the predicted class is also 0 (false). It is negative because the class predicted was a negative one. For example, the case where a person does not have cancer and the model classifies the case as cancer negative is a true negative.
3. False Positive – The cases where the actual class of the data point is 0 (false) and the predicted class is 1 (true). It is false because the model has predicted incorrectly, and positive because the class predicted was a positive one. For example, the case where a person does not have cancer and the model classifies the case as cancer positive is a false positive.
4. False Negative – The cases where the actual class of the data point is 1 (true) and the predicted class is 0 (false). It is false because the model has predicted incorrectly, and negative because the class predicted was a negative one. For example, the case where a person has cancer and the model classifies the case as cancer negative is a false negative.
• Accuracy – In classification problems, accuracy is the number of correct predictions divided by the total number of predictions: (TP + TN)/(TP + TN + FP + FN).
• Precision – Measures the proportion of positive predictions that are actually positive: TP/(TP + FP).
• Recall – Recall or sensitivity measures the proportion of actual positives that are correctly identified. Precision is about being precise, whereas recall is about capturing all the cases.
• Specificity – Measures the proportion of actual negatives that are correctly identified: TN/(TN + FP). It is the probability of a negative prediction when the input is a negative example.
• F1 Score – A single score that combines Precision and Recall as their harmonic mean: F1 = (2*Precision*Recall)/(Precision + Recall)
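The definitions in this list can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `confusion_counts` and `classification_metrics` are hypothetical, the labels are made-up data, and returning 0.0 when a denominator is zero is a convention assumed here for simplicity.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1          # has cancer, predicted cancer
        elif a == 0 and p == 0:
            tn += 1          # no cancer, predicted no cancer
        elif a == 0 and p == 1:
            fp += 1          # no cancer, predicted cancer
        else:
            fn += 1          # has cancer, predicted no cancer
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Derive the metrics listed above from the four confusion-matrix counts."""
    def ratio(num, den):
        # Assumption for this sketch: report 0.0 rather than divide by zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Made-up example data: 1 = cancer, 0 = no cancer.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
metrics = classification_metrics(tp, tn, fp, fn)

print("TP, TN, FP, FN:", tp, tn, fp, fn)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these labels there are 3 true positives, 5 true negatives, 1 false positive and 1 false negative, so accuracy is 8/10 while precision, recall and F1 all come out at 3/4.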

#### Minimize False Cases

• A model is commonly judged by its accuracy, but accuracy alone does not show which kind of false case it produces.
• No fixed rule defines which false cases (false positives or false negatives) should be minimized.
• It depends on the business requirements and context of the problem.
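One way to see why the choice depends on context is to attach a cost to each kind of false case. The sketch below is purely illustrative: the two models, their error counts, and the cost values are all hypothetical, and `total_cost` is a helper name invented for this example.

```python
def total_cost(fp, fn, cost_fp, cost_fn):
    """Total cost of a model's mistakes under the given per-error costs."""
    return fp * cost_fp + fn * cost_fn


# Two hypothetical models that make the same number of mistakes (10 each),
# so on the same data set they would have the same accuracy.
model_a = {"fp": 8, "fn": 2}   # more false alarms, fewer misses
model_b = {"fp": 2, "fn": 8}   # fewer false alarms, more misses

# Assumed context 1: a miss (false negative) is 10x as costly as a false alarm,
# as might be argued for a medical screening test.
screening = {"cost_fp": 1, "cost_fn": 10}

# Assumed context 2: a false alarm (false positive) is 10x as costly as a miss.
filtering = {"cost_fp": 10, "cost_fn": 1}

for label, costs in (("screening", screening), ("filtering", filtering)):
    cost_a = total_cost(model_a["fp"], model_a["fn"], **costs)
    cost_b = total_cost(model_b["fp"], model_b["fn"], **costs)
    better = "A" if cost_a < cost_b else "B"
    print(f"{label}: cost A = {cost_a}, cost B = {cost_b}, prefer model {better}")
```

Under the first set of assumed costs model A is cheaper, and under the second model B is, even though both make the same number of errors.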

#### Harmonic Mean

• The harmonic mean equals the ordinary (arithmetic) average when x and y are equal.
• When x and y differ, the harmonic mean is smaller than the arithmetic average and lies closer to the smaller of the two values.
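A small sketch can make this behaviour concrete. The helper names and the sample values below are assumptions chosen for illustration; x and y stand in for Precision and Recall.

```python
def arithmetic_mean(x, y):
    """Ordinary average of two numbers."""
    return (x + y) / 2


def harmonic_mean(x, y):
    """Harmonic mean of two non-negative numbers: 2xy / (x + y)."""
    if x + y == 0:
        return 0.0  # convention assumed for this sketch
    return 2 * x * y / (x + y)


# Equal values: both means agree.
equal_arith = arithmetic_mean(0.5, 0.5)
equal_harm = harmonic_mean(0.5, 0.5)

# Very different values: the harmonic mean is pulled toward the smaller one.
skew_arith = arithmetic_mean(0.9, 0.1)
skew_harm = harmonic_mean(0.9, 0.1)

print(f"x = y = 0.5      arithmetic = {equal_arith:.2f}, harmonic = {equal_harm:.2f}")
print(f"x = 0.9, y = 0.1 arithmetic = {skew_arith:.2f}, harmonic = {skew_harm:.2f}")
```

This is why the F1 score drops sharply when either Precision or Recall is low, even if the other is high.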

With reference to the fraud detection example, the F1 score can be calculated as:
F1 Score = (2*Precision*Recall)/(Precision + Recall)

For more information on performance metrics, refer to Mohammed Sunasra's post on performance metrics.