'Tuning Elasticsearch for English-Language Precision' by Erin McKean

Everyone knows that English is weird, but just how weird it is becomes glaringly apparent once you try to build word-driven search. English is full of edge cases, irregularities, and just plain head-scratchers. Is "911" a word? How about "B2B"? How about "one-and-done", "look (something) up", or "AWOL"? How much do you have to know about language and linguistics to build something that covers the full range of possible English words? With Elasticsearch, you can create custom analyzers, tokenizers, and mappings that help you find "the right word", no matter how weird that word might be, and it's easier than you might expect. Using real-world examples from a large online dictionary project, you'll see how to juggle the tradeoffs between precision and recall, how to rank and score results, and how to push Elasticsearch to handle the full panoply of the English language. (And there might even be emoji!)

Erin McKean
IBM/WORDNIK
