Query-time Nonparametric Regression with Temporally Bounded Models - Patrick Heck & David Smiley

Patrick Heck, Needham Software & David Smiley, D W Smiley LLC
Presented at Activate 2018

Discussion and demonstration of an architecture that knits together several pieces of Solr’s infrastructure, with further detail on Solr’s new Time Routed Aliases (TRAs). The system is a machine learning system based on a nonparametric regression methodology taken from habitat ecology. The model is partially pre-calculated and stored in Solr so that it can be assembled on the fly to recommend documents a user may be interested in, based on recent data. What counts as “recent” is defined by a Solr filter query. Solr TRAs are used to help scale the system and sunset old data. Technologies discussed in this talk include predictive modeling, Solr streaming expressions, indexing with JesterJ, and Solr TRAs. The latter half of the presentation goes into some depth on TRAs, which are useful for avoiding performance degradation due to index growth in systems based on continuously acquired timestamped data (such as the system presented). Both presenters helped build Solr’s TRA capability.
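
As a rough illustration of the two Solr pieces the abstract mentions, the sketch below builds a Collections API CREATEALIAS request for a time routed alias and a filter query that bounds the model to "recent" data. It is not taken from the talk: the helper names, the alias `events`, the field `timestamp_dt`, the 90-day retention, and the 7-day window are all assumptions chosen for illustration. Only the `router.*` and `create-collection.*` parameter names come from Solr's documented CREATEALIAS command.

```python
from urllib.parse import urlencode


def build_tra_create_url(solr_base, alias, time_field, start, interval,
                         auto_delete_age, config_name):
    """Build (but do not send) a CREATEALIAS URL for a time routed alias.

    Hypothetical helper: every argument value is supplied by the caller.
    """
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "router.name": "time",                   # selects time-based routing
        "router.field": time_field,              # timestamp field used for routing
        "router.start": start,                   # start of the first collection
        "router.interval": interval,             # date math span of each collection
        "router.autoDeleteAge": auto_delete_age, # date math for sunsetting old data
        "create-collection.collection.configName": config_name,
        "create-collection.numShards": "1",
    }
    return f"{solr_base}/admin/collections?{urlencode(params)}"


def recent_filter_query(time_field, window):
    """Return a Solr fq string restricting results to the last `window`.

    Hypothetical helper; `window` is Solr date math without the sign, e.g. "7DAYS".
    """
    return f"{time_field}:[NOW-{window} TO NOW]"


if __name__ == "__main__":
    # Assumed values: one collection per day, data older than 90 days removed.
    print(build_tra_create_url("http://localhost:8983/solr", "events",
                               "timestamp_dt", "NOW/DAY", "+1DAY",
                               "/DAY-90DAYS", "_default"))
    print(recent_filter_query("timestamp_dt", "7DAYS"))
```

Changing the window passed to `recent_filter_query` changes which stored data the query-time model sees, while `router.autoDeleteAge` controls how long old collections remain before being removed.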
