MSCI 541 : BM25

As presented in this video, BM25 can return negative scores for very frequent terms (those that appear in more than half of the documents), or for a doc that contains only such terms. One solution is to add 1 inside the log when computing IDF:

log( (N-n_i+0.5)/(n_i+0.5) + 1) 
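
To see the effect of the formula above, here is a minimal Python sketch comparing it with the unadjusted form. The function names and the sample counts (N = 1000 docs, n_i = 900 docs containing the term) are illustrative assumptions, and the unadjusted IDF is assumed to be the same expression without the + 1.

```python
import math

def idf_unadjusted(N, n_i):
    # Assumed unadjusted form: the same ratio without the + 1.
    # Goes negative once the term appears in more than half of the docs.
    return math.log((N - n_i + 0.5) / (n_i + 0.5))

def idf_plus_one(N, n_i):
    # Adjusted form from the description: add 1 before taking the log,
    # so the argument of the log is always greater than 1.
    return math.log((N - n_i + 0.5) / (n_i + 0.5) + 1)

if __name__ == "__main__":
    N, n_i = 1000, 900  # hypothetical collection: term occurs in 90% of docs
    print("unadjusted:", idf_unadjusted(N, n_i))
    print("plus one:  ", idf_plus_one(N, n_i))
```

Because the ratio (N-n_i+0.5)/(n_i+0.5) is always positive, adding 1 keeps the log argument above 1, so the adjusted IDF stays positive even when a term appears in every doc.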

You can see other approaches and formulations of BM25 here: 
