How to perform in-database text analytics using a pre-built ML model

Did you know that you can perform document similarity, topic modeling, and term/document classification in-database using a pre-built machine learning model that’s based on millions of Wikipedia articles? This pre-built model was produced using the Explicit Semantic Analysis (ESA) algorithm. ESA models do not find latent features; instead, they use explicit features drawn from a knowledge base, which in this case is a selected set of Wikipedia articles. Preparing the data and training such a model can be involved, so having the model pre-built saves that effort. We’ll demonstrate the Wikipedia model through the soon-to-be-released OML4R interface on Oracle Autonomous Database. Using OML4R, we show how to create a model proxy object for this pre-built Wikipedia model and use it for document similarity and document classification use cases.
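
To make the idea of explicit features concrete, here is a minimal Python sketch of the general ESA approach. It is not the OML4R API and not Oracle's implementation: the three-entry knowledge base, the function names, and the TF-IDF weighting are all assumptions chosen for illustration. Each knowledge-base article title acts as one named feature, a text is scored against every article, and the resulting vectors are compared with cosine similarity.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base standing in for Wikipedia articles.
# Each title is an explicit, human-readable feature (concept).
KNOWLEDGE_BASE = {
    "Finance": "bank money loan interest investment stock market",
    "Medicine": "doctor patient hospital disease treatment medicine",
    "Sports": "team game player score match league football",
}


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def build_concept_index(kb):
    """Map each term to {concept: TF-IDF weight} across the knowledge base."""
    n_concepts = len(kb)
    term_counts = {c: Counter(tokenize(t)) for c, t in kb.items()}
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    index = {}
    for concept, counts in term_counts.items():
        for term, tf in counts.items():
            idf = math.log(1 + n_concepts / doc_freq[term])
            index.setdefault(term, {})[concept] = tf * idf
    return index


def esa_vector(text, index, concepts):
    """Project a text onto the concept space: one score per concept."""
    vec = dict.fromkeys(concepts, 0.0)
    for term in tokenize(text):
        for concept, weight in index.get(term, {}).items():
            vec[concept] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two concept vectors with the same keys."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


concepts = list(KNOWLEDGE_BASE)
index = build_concept_index(KNOWLEDGE_BASE)

doc_a = esa_vector("the bank approved the loan", index, concepts)
doc_b = esa_vector("stock market investment", index, concepts)
doc_c = esa_vector("the player scored in the match", index, concepts)

# Document similarity: doc_a and doc_b share no words, yet both map to Finance.
sim_ab = cosine(doc_a, doc_b)
sim_ac = cosine(doc_a, doc_c)

# Document classification: pick the concept with the highest score.
label_a = max(doc_a, key=doc_a.get)
```

In this sketch, two documents with no words in common can still come out as similar because both load on the same named concept, and the top-scoring concept serves as a readable class label. That interpretability is what distinguishes explicit features from latent ones.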