Evaluation metrics

Each metric compares a recommended item list with the corresponding ground truth items, and the per-list scores are then averaged over all recommendation lists. Here is an example:

# Two recommendations, each a collection of predicted item ids in descending order of score.
recommends = [
    [1, 2],
    [4, 5]
]
# Ground truth item ids corresponding to each recommendation
ground_truth = [
    [1],
    [4, 5]
]

The Precision@2 for the first entry is $1/2=0.5$, while the second is $2/2=1$. Therefore, the mean Precision@2 is $0.75$.

In Recommenders.jl, this computation is done by

using Recommenders: MeanPrecision
prec2 = MeanPrecision(2) # metrics are implemented as callable structs
prec2(recommends, ground_truth)
# 0.75
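
To make the averaging step explicit, the same number can be reproduced by hand. This is only a minimal sketch: `precision_at_k` is a hypothetical helper written for illustration, not a function exported by Recommenders.jl, and it assumes each recommendation list contains no duplicate items.

```julia
# Hypothetical helper (not part of Recommenders.jl): Precision@k for one list.
# Counts how many of the first k recommended items appear in the ground truth.
precision_at_k(recommended, truth, k) =
    count(in(truth), Iterators.take(recommended, k)) / k

recommends = [[1, 2], [4, 5]]
ground_truth = [[1], [4, 5]]

# One score per recommendation list, then the plain average over lists.
per_list = [precision_at_k(r, t, 2) for (r, t) in zip(recommends, ground_truth)]
mean_precision = sum(per_list) / length(per_list)
# per_list == [0.5, 1.0], mean_precision == 0.75
```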

Currently, the following metrics are implemented. They are all subtypes of the MeanMetric type.

Recommenders.MeanDCG — Method

MeanDCG(k)

Create a callable struct to compute DCG@k averaged over all predictions. DCG@k is defined by

\[\mathrm{DCG@k} = \sum_{i=1}^{\mathrm{min}(k, \mathrm{length}(\text{prediction}))} \frac{2^{r_i}-1}{\log(i+1)}\,,\]

where $r_i$ is the true relevance for the i-th predicted item (binary for implicit feedback).

Example

dcg10 = MeanDCG(10)
dcg10(predictions, ground_truth)
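
For a single prediction list, the DCG@k formula above can be sketched as follows. `dcg_at_k` is a hypothetical helper, not the package's implementation. It assumes binary relevance and uses the natural logarithm because the formula is written with a plain $\log$; many references use $\log_2$ instead, and which base the package uses is not confirmed here.

```julia
# Hypothetical helper (not part of Recommenders.jl): DCG@k for one list,
# with binary relevance and the natural log, as the formula is written.
function dcg_at_k(prediction, truth, k)
    total = 0.0
    for i in 1:min(k, length(prediction))
        # r is 1 if the i-th predicted item is relevant, otherwise 0,
        # so the gain 2^r - 1 is also 1 or 0.
        r = prediction[i] in truth ? 1 : 0
        total += (2^r - 1) / log(i + 1)
    end
    return total
end

dcg_at_k([1, 2], [1], 2)     # only rank 1 is relevant: 1 / log(2)
dcg_at_k([4, 5], [4, 5], 2)  # both ranks are relevant: 1 / log(2) + 1 / log(3)
```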


Recommenders.MeanNDCG — Method

MeanNDCG(k)

Create a callable struct to compute NDCG@k averaged over all predictions. NDCG@k is defined by

\[\mathrm{NDCG@k} = \frac{\mathrm{DCG}@k}{\mathrm{IDCG}@k}\]

where IDCG@k is the ideal DCG@k, i.e. the DCG@k obtained when the predictions are sorted by true relevance. Note that if the number of ground truth items is smaller than $k$, the predicted item list is truncated to that length.

Example

ndcg10 = MeanNDCG(10)
ndcg10(predictions, ground_truth)
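
The ratio above can be sketched for a single list with binary relevance. `discount` and `ndcg_at_k` are hypothetical helpers, not the package's implementation. The sketch assumes the ideal ranking places $\min(k, |\text{ground truth}|)$ relevant items at the top, which is one reading of the truncation note; the package may handle truncation differently. Returning 0 for an empty ground truth is also an assumption. Because the same discount appears in the numerator and the denominator, the result does not depend on the base of the logarithm.

```julia
# Hypothetical helpers (not part of Recommenders.jl).
# Discount for rank i, following the DCG formula as written.
discount(i) = 1 / log(i + 1)

function ndcg_at_k(prediction, truth, k)
    # DCG@k: with binary relevance, each relevant item contributes its discount.
    dcg = 0.0
    for i in 1:min(k, length(prediction))
        if prediction[i] in truth
            dcg += discount(i)
        end
    end
    # IDCG@k: assumed ideal ranking with all relevant items first.
    idcg = 0.0
    for i in 1:min(k, length(truth))
        idcg += discount(i)
    end
    # Assumption: an empty ground truth scores 0 instead of dividing by zero.
    return idcg == 0 ? 0.0 : dcg / idcg
end

ndcg_at_k([1, 2], [1], 2)  # relevant item at rank 1: equals the ideal, 1.0
ndcg_at_k([2, 1], [1], 2)  # relevant item at rank 2: log(2) / log(3)
```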


Recommenders.MeanPrecision — Method

MeanPrecision(k)

Create a callable struct to compute Precision@k averaged over all predictions. Precision@k is defined by

\[\mathrm{Precision@k} = \frac{|(\text{ground truth}) \cap (\text{top k prediction})|}{k}\]

Example

prec10 = MeanPrecision(10)
prec10(predictions, ground_truth)


Recommenders.MeanRecall — Method

MeanRecall(k)

Create a callable struct to compute Recall@k averaged over all predictions. Recall@k is defined by

\[\mathrm{Recall@k} = \frac{|(\text{ground truth}) \cap (\text{top k prediction})|}{|(\text{ground truth})|}\]

Example

recall10 = MeanRecall(10)
recall10(predictions, ground_truth)
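
As a rough illustration of the Recall@k formula, the sketch below computes per-list recall by hand and averages it. `recall_at_k` is a hypothetical helper, not a function exported by Recommenders.jl; it assumes non-empty ground truth lists and no duplicate items within a prediction list.

```julia
# Hypothetical helper (not part of Recommenders.jl): Recall@k for one list.
# The numerator matches Precision@k; only the denominator changes.
recall_at_k(prediction, truth, k) =
    count(in(truth), Iterators.take(prediction, k)) / length(truth)

predictions = [[1, 2], [4, 5]]
ground_truth = [[1], [4, 5]]

# With k = 1 only the top item of each list is considered.
per_list = [recall_at_k(p, t, 1) for (p, t) in zip(predictions, ground_truth)]
mean_recall = sum(per_list) / length(per_list)
# per_list == [1.0, 0.5], mean_recall == 0.75
```

With $k = 2$ both lists recover all of their ground truth items, so the mean Recall@2 for this data is 1.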
