IR Course Lecture 9: Index Compression

Our inverted index can grow very large when we deal with web-scale data. In this lecture, we discuss some fundamental techniques for storing the index in less space. Specifically, we discuss the rule of 30, Zipf's Law, and Heaps' Law. We also discuss the dictionary-as-a-string and front-coding techniques. Finally, I introduce the variable byte encoding technique for storing numbers efficiently.
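
To give a flavor of the last technique, here is a minimal sketch of variable byte encoding in Python. It is not taken from the lecture itself: the function names are hypothetical, and it assumes the common textbook convention in which each byte carries 7 payload bits and the high bit is set only on the final byte of each number.

```python
def vb_encode_number(n: int) -> bytes:
    """Encode one non-negative integer with variable byte encoding.

    Assumed convention: 7 payload bits per byte, most significant
    group first, high bit set on the last byte of the number.
    """
    if n < 0:
        raise ValueError("variable byte encoding needs a non-negative integer")
    groups = [n % 128]
    n //= 128
    while n > 0:
        groups.insert(0, n % 128)  # prepend the next 7-bit group
        n //= 128
    groups[-1] += 128  # mark the final byte of this number
    return bytes(groups)


def vb_encode(numbers) -> bytes:
    """Concatenate the encodings of a sequence of integers."""
    return b"".join(vb_encode_number(n) for n in numbers)


def vb_decode(data: bytes) -> list:
    """Decode a byte stream produced by vb_encode back into integers."""
    numbers = []
    n = 0
    for byte in data:
        if byte < 128:
            # High bit clear: more bytes of this number follow.
            n = 128 * n + byte
        else:
            # High bit set: this byte completes the number.
            numbers.append(128 * n + (byte - 128))
            n = 0
    return numbers


if __name__ == "__main__":
    gaps = [824, 5, 214577]
    encoded = vb_encode(gaps)
    print(encoded.hex(" "))
    print(vb_decode(encoded))
```

Small numbers, such as the gaps between nearby document IDs in a postings list, fit in a single byte under this scheme, while larger ones use as many bytes as they need.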