# How Gauss would compute a confusion matrix for their classification model

###### PUBLISHED ON MAY 19, 2022 / 2 MIN READ — LINEAR-ALGEBRA, ML, NUMPY, PYTHON

A confusion matrix is a practical and conceptually simple tool for evaluating a classification model. So let's honour it by computing it the way Gauss might have, without the magic of `from sklearn.metrics import confusion_matrix`, using only simple linear algebra operations:

A confusion matrix is the matrix product of the true and predicted labels, both encoded as one-hot vectors.

If the true labels of 4 observations are the vector $$\boldsymbol y = [1, 0, 2, 1]$$, and there are 3 different classes (i.e. 0, 1 and 2), the one-hot encoding is:

$\boldsymbol T = \begin{bmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{bmatrix}~\in~\{0,1\}^{4~\times~3}$
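As an aside, this one-hot matrix can be built in NumPy by indexing the identity matrix with the label vector (a minimal sketch; it assumes the labels are $$0, \dots, k-1$$):

```python
import numpy as np

y = np.array([1, 0, 2, 1])   # true labels from the example
T = np.eye(3, dtype=int)[y]  # row i of the identity is the one-hot vector for class i
print(T)
# [[0 1 0]
#  [1 0 0]
#  [0 0 1]
#  [0 1 0]]
```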

Some classification model gives us the predicted label for each observation in the vector $$\hat{\boldsymbol y} = [2, 0, 2, 0]$$. By the same logic as above, the one-hot encoding is:

$\hat{\boldsymbol T} = \begin{bmatrix} 0 & 0 & 1\\ 1 & 0 & 0\\ 0 & 0 & 1\\ 1 & 0 & 0 \end{bmatrix}~\in~\{0,1\}^{4~\times~3}$

We now have everything we need to compute the confusion matrix: it is $$\boldsymbol T^{\top}\hat{\boldsymbol T}~\in~\mathbb{Z}_{0+}^{3\times3}$$. So again,

A confusion matrix is the matrix product of the true and predicted labels, both encoded as one-hot vectors.

$\boldsymbol T^{\top}\hat{\boldsymbol T} = \begin{bmatrix} 1 & 0 & 0\\ 1 & 0 & 1\\ 0 & 0 & 1 \end{bmatrix}~\in~\mathbb{Z}_{0+}^{3~\times~3}$

As you can see, the confusion matrix correctly summarizes the information in both vectors.

$\boldsymbol y = [1,0,2,1] \\ \hat{\boldsymbol y} = [2,0,2,0]$

- The sum of the diagonal elements tells us that two observations were correctly classified by the model
- The model correctly classified the one observation of class 0
- The model never predicted class 1 (its column is all zeros), producing two errors
- The model confuses the two class-1 observations: one is predicted as class 2 and the other as class 0 (look at the second row)
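The reading above is easy to check numerically; here is a minimal sketch using the two one-hot matrices from the example:

```python
import numpy as np

T = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1],
              [0, 1, 0]])        # one-hot true labels
T_hat = np.array([[0, 0, 1],
                  [1, 0, 0],
                  [0, 0, 1],
                  [1, 0, 0]])    # one-hot predicted labels

C = T.T @ T_hat                  # rows: true class, columns: predicted class
print(np.trace(C))               # 2 -> correct classifications (the diagonal)
print(C.sum(axis=1))             # [1 2 1] -> how many observations each true class has
print(C[:, 1])                   # [0 0 0] -> class 1 is never predicted
```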

### Now with `import numpy as np`

We need two steps to compute our confusion matrix.

First, we need a way to transform a vector $$\boldsymbol v$$ of k-class labels into its one-hot encoded version, `v_one_hot = one_hot_encoding(v)`:

```python
def one_hot_encoding(v, num_classes=None):
    '''Return the one-hot encoding matrix for a k-class label vector'''
    if num_classes is None:
        # np.unique(v).size would fail when a class is missing from v
        # (e.g. y_pred = [2, 0, 2, 0] has only 2 unique labels but 3 classes),
        # so infer k from the largest label instead, assuming labels 0..k-1
        num_classes = np.max(v) + 1
    return np.eye(num_classes, dtype=int)[v]
```

Second, compute the confusion matrix, $$~\boldsymbol T^{\top}\hat{\boldsymbol T}~\in~\mathbb{Z}_{0+}^{K\times K}~$$, for k classes; there are many ways to do it with numpy, as the following code shows. Below I use the canonical names for the true labels (`y`) and the predicted ones (`y_pred`):

```python
# 1st option: Using the matrix multiplication '@' operator
one_hot_encoding(y).T @ one_hot_encoding(y_pred)

# 2nd option: Using np.dot()
np.dot(one_hot_encoding(y).T, one_hot_encoding(y_pred))

# 3rd option: Using np.matmul()
np.matmul(one_hot_encoding(y).T, one_hot_encoding(y_pred))
```
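Putting the two steps together end to end (a self-contained sketch; the helper is repeated here, with an explicit `num_classes` so both encodings share the same width even when one vector misses a class):

```python
import numpy as np

def one_hot_encoding(v, num_classes=None):
    '''Return the one-hot encoding matrix for a k-class label vector'''
    if num_classes is None:
        num_classes = np.max(v) + 1  # assumes labels are 0..k-1
    return np.eye(num_classes, dtype=int)[v]

y = np.array([1, 0, 2, 1])       # true labels
y_pred = np.array([2, 0, 2, 0])  # predicted labels
k = 3                            # number of classes

C = one_hot_encoding(y, k).T @ one_hot_encoding(y_pred, k)
print(C)
# [[1 0 0]
#  [1 0 1]
#  [0 0 1]]
```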

And we are done! Of course, you can always get your confusion matrix from your favourite store ;)

```python
from sklearn.metrics import confusion_matrix
confusion_matrix(y, y_pred)
```

That’s the way computers talk to each other.