LinkedIn's New Search Engine | DeText: A Deep Text Ranking Framework with BERT | Deep Ranking Model

This video explains LinkedIn's latest deep-learning-based ranking models and how they were deployed to production. Deploying deep learning models is very challenging, especially at LinkedIn's scale, which makes this a practical and useful paper. If you want to build a semantic search engine, make sure you check it out!

0:00 - Intro
2:50 - What is a search engine
3:11 - Search vs. Ranking
4:08 - Representation-based ranking
8:12 - Interaction-based ranking
9:28 - 3 search verticals on LinkedIn
11:30 - DeText framework
13:48 - Interaction layer
15:22 - Traditional feature processing
16:55 - Learning-to-rank layer
18:38 - DeText-BERT for ranking
20:56 - Linkedin data for BERT pre-training
22:31 - Document embedding pre-computing
23:41 - 2-pass ranking (DeText-CNN)
25:00 - Experiment settings
26:17 - Training data
27:36 - How good DeText is
30:18 - General BERT vs. in-domain BERT
32:42 - Traditional feature ablation study
33:52 - Metrics for online experiments
36:02 - BERT vs. CNN
37:42 - 99th percentile latency
39:29 - Summary

Related Video:
Neural Information Retrieval | REALM: Retrieval-Augmented Language Model Pre-training

Paper: DeText: A Deep Text Ranking Framework with BERT (Weiwei Guo et al.)

Abstract
Ranking is the most important component in a search system. Most search systems deal with large amounts of natural language data, hence an effective ranking system requires a deep understanding of text semantics. Recently, deep learning based natural language processing (deep NLP) models have generated promising results on ranking systems. BERT is one of the most successful models that learn contextual embedding, which has been applied to capture complex query-document relations for search ranking. However, this is generally done by exhaustively interacting each query word with each document word, which is inefficient for online serving in search product systems. In this paper, we investigate how to build an efficient BERT-based ranking model for industry use cases. The solution is further extended to a general ranking framework, DeText, that is open sourced and can be applied to various ranking productions. Offline and online experiments of DeText on three real-world search systems present significant improvement over state-of-the-art approaches.
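The efficiency idea in the abstract — avoid exhaustively interacting every query word with every document word, and instead pre-compute document embeddings offline so that only the query is encoded at serving time — can be sketched as below. This is a minimal illustration, not the DeText implementation: the toy `embed` function (a hashed bag-of-words) stands in for the CNN/BERT encoders the paper actually uses, and the document names are made up.

```python
import math
import zlib
from collections import Counter

def embed(text, dim=32):
    # Toy stand-in for a learned text encoder (DeText uses CNN/BERT towers):
    # hashes token counts into a fixed-size vector, then L2-normalizes it.
    vec = [0.0] * dim
    for token, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(token.encode()) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Offline: pre-compute one embedding per document (the expensive pass
# happens once, not per query).
docs = {
    "d1": "deep learning engineer with BERT experience",
    "d2": "sales manager for enterprise accounts",
}
doc_embeddings = {doc_id: embed(text) for doc_id, text in docs.items()}

def rank(query):
    # Online: encode only the query, then score each document by the
    # dot product of the (unit-norm) embeddings, i.e. cosine similarity.
    q = embed(query)
    scores = {doc_id: sum(a * b for a, b in zip(q, e))
              for doc_id, e in doc_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(rank("deep learning with BERT"))
```

Because the documents are encoded ahead of time, online serving cost is one query encoding plus cheap vector dot products, which is what makes a representation-based tower practical at search latency budgets.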
Comments

This is great! You have a lot of practical knowledge, clearly from production experience I would guess.

_tnk_