Replace eval metric with lenskit TopN #15

sophiasun0515 · 2024-06-05T18:53:03Z

Prior values for first 10 users:

evaluation using the first 1 is NDCG@5 = 0.5, NDCG@10 = 0.5, RR = 0.3333333333333333
evaluation using the first 2 is NDCG@5 = 0.3391602052736161, NDCG@10 = 0.41039156802332444, RR = 0.2142073313555942
evaluation using the first 3 is NDCG@5 = 1.0, NDCG@10 = 1.0, RR = 1.0
evaluation using the first 4 is NDCG@5 = 0.9197207891481876, NDCG@10 = 0.9197207891481876, RR = 0.6666666666666666
evaluation using the first 5 is NDCG@5 = 0.0, NDCG@10 = 0.0, RR = 0.038461538461538464
evaluation using the first 6 is NDCG@5 = 0.0, NDCG@10 = 0.0, RR = 0.030303030303030304
evaluation using the first 7 is NDCG@5 = 0.38685280723454163, NDCG@10 = 0.38685280723454163, RR = 0.2
evaluation using the first 8 is NDCG@5 = 0.0, NDCG@10 = 0.0, RR = 0.02658371040723982
evaluation using the first 9 is NDCG@5 = 0.43067655807339306, NDCG@10 = 0.43067655807339306, RR = 0.25
evaluation using the first 10 is NDCG@5 = 0.0, NDCG@10 = 0.0, RR = 0.03125

Lenskit eval values for first 10 users:

evaluation using the first 1 is ndcg5 = 0.6309297535714575, ndcg10 = 0.6309297535714575, mrr = 1.0
evaluation using the first 2 is ndcg5 = 0.2807721888661444, ndcg10 = 0.35123899361230887, mrr = 1.0
evaluation using the first 3 is ndcg5 = 1.0, ndcg10 = 1.0, mrr = 1.0
evaluation using the first 4 is ndcg5 = 0.8154648767857288, ndcg10 = 0.8154648767857288, mrr = 1.0
evaluation using the first 5 is ndcg5 = 0.0, ndcg10 = 0.0, mrr = 1.0
evaluation using the first 6 is ndcg5 = 0.0, ndcg10 = 0.0, mrr = 1.0
evaluation using the first 7 is ndcg5 = 0.43067655807339306, ndcg10 = 0.43067655807339306, mrr = 1.0
evaluation using the first 8 is ndcg5 = 0.0, ndcg10 = 0.0, mrr = 1.0
evaluation using the first 9 is ndcg5 = 0.5, ndcg10 = 0.5, mrr = 1.0
evaluation using the first 10 is ndcg5 = 0.0, ndcg10 = 0.0, mrr = 1.0

mdekstrand · 2024-06-05T20:02:25Z

The value discrepancy is probably because LensKit and the original code discount ranks differently, most visibly for the first item: the original code uses $\log_2(i+1)$, while LensKit uses $\log_2(\max(i, 2))$, where $i$ is the 1-based rank. Both have precedent in the literature; the max approach was used in the original nDCG paper, but $i+1$ is widely used. LensKit's behavior can be changed by passing a custom discount function to ndcg:

import numpy as np

def _discount_log1p(ranks):
    return np.log2(ranks + 1)

and then:

topn.ndcg(recs, truth, discount=_discount_log1p, k=??)

A future LensKit release will provide a +1 log option (lenskit/lkpy#417).
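As a rough check of this explanation, the sketch below recomputes nDCG under both discounts using only the standard library. It is not the code from either implementation: the helper names are hypothetical, relevance is assumed to be binary, and the ranks of the relevant items (rank 3 for the first user, ranks 1 and 3 for the fourth) are inferred from the RR values in the PR description rather than taken from the data.

```python
import math


def discount_log1p(rank):
    # log2(i + 1): the discount the original code appears to use.
    return math.log2(rank + 1)


def discount_max2(rank):
    # log2(max(i, 2)): the discount attributed to LensKit in this thread.
    return math.log2(max(rank, 2))


def ndcg(relevant_ranks, discount, k):
    """Binary-relevance nDCG@k given the 1-based ranks of the relevant items."""
    hits = [r for r in relevant_ranks if r <= k]
    dcg = sum(1.0 / discount(r) for r in hits)
    # Ideal ordering: all relevant items packed at the top of the list.
    n_ideal = min(len(relevant_ranks), k)
    ideal = sum(1.0 / discount(r) for r in range(1, n_ideal + 1))
    return dcg / ideal if ideal > 0 else 0.0


# Assumed ranks, inferred from the reported RR values (not from the data).
user1 = [3]
user4 = [1, 3]

for name, ranks in [("user 1", user1), ("user 4", user4)]:
    print(
        name,
        round(ndcg(ranks, discount_log1p, 5), 4),
        round(ndcg(ranks, discount_max2, 5), 4),
    )
```

Under these assumptions the two discounts give roughly 0.5 vs. 0.631 for the first user and 0.920 vs. 0.815 for the fourth, in line with the prior and LensKit values listed in the description, which supports the discount difference as the source of the nDCG gap.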

karlhigley · 2024-06-05T20:41:24Z

src/poprox_recommender/test_json.py

+    # single_rr = compute_mrr(recommended_list, impressions_truth)
+    # single_ndcg5 = compute_ndcg(recommended_list, impressions_truth, 5)
+    # single_ndcg10 = compute_ndcg(recommended_list, impressions_truth, 10)


This code is in the commit history, so it's safe to delete it rather than leave it commented out; we can still retrieve it later if we need it:

Suggested change

# single_rr = compute_mrr(recommended_list, impressions_truth)

# single_ndcg5 = compute_ndcg(recommended_list, impressions_truth, 5)

# single_ndcg10 = compute_ndcg(recommended_list, impressions_truth, 10)

We can also remove the old compute_ndcg/mrr functions.

Fix paths to the safetensors files

replace eval metric with lenskit topn, found value inconsistency

081d0cf

sophiasun0515 requested review from imrecommender, karlhigley and mdekstrand June 5, 2024 18:53

karlhigley reviewed Jun 5, 2024

karlhigley approved these changes Jun 5, 2024

sophiasun0515 added 2 commits June 6, 2024 01:41

Remove old functions and comments

c229b06

rename offline test file

e95687e

sophiasun0515 changed the title ~~Replace eval metric with lenskit TopN, but found value inconsistency~~ Replace eval metric with lenskit TopN Jun 6, 2024

karlhigley merged commit 108b7d6 into main Jun 6, 2024

karlhigley pushed a commit that referenced this pull request Feb 14, 2025

Merge pull request #15 from zentavious/karl/fix/safetensors-paths

06465da

Fix paths to the safetensors files
