Evaluation metrics
A recommendation metric compares each recommended item list with its ground truth items and averages the per-list scores over all recommendation lists. Here is an example:
# Two recommendations, each a collection of predicted item ids in descending order of score.
recommends = [
[1, 2],
[4, 5]
]
# Ground truth item ids corresponding to each recommendation
ground_truth = [
[1],
[4, 5]
]
The Precision@2 for the first entry is $1/2=0.5$, while the second is $2/2=1$. Therefore, the mean Precision@2 is $0.75$.
In Recommenders.jl, this computation is done by
using Recommenders: MeanPrecision
prec2 = MeanPrecision(2) # metrics are implemented as callable structs
prec2(recommends, ground_truth)
# 0.75
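To make the arithmetic behind the 0.75 explicit, here is a minimal plain-Julia sketch of the same computation. The helper names `precision_at_k` and `mean_precision_at_k` are hypothetical and are not part of Recommenders.jl; the sketch only follows the definition of Precision@k and is not meant to mirror the package internals.

```julia
# Hypothetical helpers for illustration; not the Recommenders.jl implementation.

# Precision@k for one list: number of hits among the top-k items, divided by k.
function precision_at_k(recommended, truth, k)
    topk = recommended[1:min(k, length(recommended))]
    hits = count(in(Set(truth)), topk)
    return hits / k
end

# Average the per-list scores over all recommendation lists.
function mean_precision_at_k(recommends, ground_truth, k)
    scores = [precision_at_k(r, t, k) for (r, t) in zip(recommends, ground_truth)]
    return sum(scores) / length(scores)
end

recommends = [[1, 2], [4, 5]]
ground_truth = [[1], [4, 5]]
mean_precision_at_k(recommends, ground_truth, 2)  # 0.75
```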
Currently, the following metrics are implemented. They are all descendants of the MeanMetric type.
Recommenders.MeanDCG - Method
MeanDCG(k)
Create a callable struct that computes DCG@k averaged over all predictions. DCG@k is defined by
\[\mathrm{DCG@k} = \sum_{i=1}^{\mathrm{min}(k, \mathrm{length}(\text{prediction}))} \frac{2^{r_i}-1}{\log(i+1)}\,,\]
where $r_i$ is the true relevance for the i-th predicted item (binary for implicit feedback).
Example
dcg10 = MeanDCG(10)
dcg10(predictions, ground_truth)
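As a rough illustration of the formula for a single prediction, the sketch below evaluates DCG@k by hand. It assumes binary relevance (an item is relevant exactly when it appears in the ground truth) and the natural logarithm; the source formula does not state the base, so that choice is an assumption. The name `dcg_at_k` is hypothetical and not part of the package.

```julia
# Hypothetical helper; assumes binary relevance and the natural logarithm.
function dcg_at_k(prediction, truth, k)
    relevant = Set(truth)
    n = min(k, length(prediction))
    total = 0.0
    for i in 1:n
        r = prediction[i] in relevant ? 1 : 0   # r_i in the formula
        total += (2^r - 1) / log(i + 1)
    end
    return total
end

dcg_at_k([1, 2], [1], 2)      # 1 / log(2): only the first item is relevant
dcg_at_k([4, 5], [4, 5], 2)   # 1 / log(2) + 1 / log(3): both items are relevant
```

Averaging these per-prediction values over all predictions would give the mean that the metric reports.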
Recommenders.MeanNDCG - Method
MeanNDCG(k)
Create a callable struct that computes NDCG@k averaged over all predictions. NDCG@k is defined by
\[\mathrm{NDCG@k} = \frac{\mathrm{DCG@k}}{\mathrm{IDCG@k}}\,,\]
where IDCG@k is the ideal DCG@k, that is, the DCG@k of the prediction sorted by true relevance. Note that if the number of ground truth items is smaller than $k$, the predicted item list is truncated to that length.
Example
ndcg10 = MeanNDCG(10)
ndcg10(predictions, ground_truth)
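The following sketch shows one way to read the definition and the truncation note for a single prediction. It assumes binary relevance, so the ideal ordering places all ground truth items first. It also assumes that an empty ground truth yields 0.0, which the source does not specify. The name `ndcg_at_k` is hypothetical, and the package may handle edge cases differently.

```julia
# Hypothetical helper; binary relevance. The log base cancels in the ratio.
function ndcg_at_k(prediction, truth, k)
    relevant = Set(truth)
    n = min(k, length(truth))          # truncate to the number of ground truth items
    # Ideal DCG: every one of the first n positions holds a relevant item.
    idcg = 0.0
    for i in 1:n
        idcg += 1.0 / log(i + 1)
    end
    idcg == 0.0 && return 0.0          # assumption: empty ground truth scores 0
    # DCG of the (truncated) prediction.
    dcg = 0.0
    for i in 1:min(n, length(prediction))
        if prediction[i] in relevant
            dcg += 1.0 / log(i + 1)
        end
    end
    return dcg / idcg
end

ndcg_at_k([1, 2], [1], 2)      # 1.0: the single relevant item is ranked first
ndcg_at_k([4, 5], [4, 5], 2)   # 1.0: both relevant items fill the top two ranks
ndcg_at_k([5, 9], [4, 5], 2)   # about 0.613: one of two relevant items is found
```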
Recommenders.MeanPrecision - Method
MeanPrecision(k)
Create a callable struct that computes Precision@k averaged over all predictions. Precision@k is defined by
\[\mathrm{Precision@k} = \frac{|(\text{ground truth}) \cap (\text{top k prediction})|}{k}\]
Example
prec10 = MeanPrecision(10)
prec10(predictions, ground_truth)
Recommenders.MeanRecall - Method
MeanRecall(k)
Create a callable struct that computes Recall@k averaged over all predictions. Recall@k is defined by
\[\mathrm{Recall@k} = \frac{|(\text{ground truth}) \cap (\text{top k prediction})|}{|(\text{ground truth})|}\]
Example
recall10 = MeanRecall(10)
recall10(predictions, ground_truth)
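For comparison with Precision@k, here is a minimal sketch of Recall@k for a single prediction, where the denominator is the number of ground truth items rather than $k$. The name `recall_at_k` is hypothetical and not part of the package, and the sketch assumes a non-empty ground truth list.

```julia
# Hypothetical helper; assumes `truth` is non-empty.
function recall_at_k(prediction, truth, k)
    topk = prediction[1:min(k, length(prediction))]
    hits = count(in(Set(truth)), topk)
    return hits / length(truth)
end

recall_at_k([1, 2], [1], 2)      # 1.0: the only ground truth item is in the top 2
recall_at_k([4, 5], [4, 5], 2)   # 1.0: both ground truth items are in the top 2
recall_at_k([4, 5], [4, 5], 1)   # 0.5: the top 1 covers one of two ground truth items
```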