Building ML Blocks with Kubeflow Orchestration with Feature Store //Aniruddha Choudhury// Meetup #72

MLOps Community meetup #72! Last Wednesday, we talked to Aniruddha Choudhury, Senior Data Scientist at Publicis Sapient.

// Abstract
In this talk, Aniruddha presents end-to-end training and serving with Kubeflow and a feature store: building the Lego blocks for online and offline data ingestion with Spark and Kafka alongside Redis, monitoring metrics in Grafana and Prometheus, and building the whole architecture on GCP.
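
The monitoring piece of that stack usually amounts to the serving process exposing metrics for Prometheus to scrape and Grafana to chart. As a rough, illustrative sketch (not the setup from the talk; the metric names, port, and dummy model below are invented), a Python prediction service could do this with the prometheus_client library:

# Illustrative only: a model-serving process exposing metrics for Prometheus to
# scrape (and Grafana to chart). Metric names, port, and the dummy model are
# invented for the example, not taken from the talk.
import time

from prometheus_client import Counter, Histogram, start_http_server

PREDICTIONS = Counter("model_predictions_total", "Number of predictions served")
LATENCY = Histogram("model_prediction_latency_seconds", "Prediction latency in seconds")

def predict(features):
    # Placeholder for the real model call (e.g. a model loaded from a GCS bucket).
    return sum(features)

def handle_request(features):
    start = time.time()
    result = predict(features)
    PREDICTIONS.inc()                     # count every served prediction
    LATENCY.observe(time.time() - start)  # record how long the call took
    return result

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://<host>:8000/metrics
    while True:
        handle_request([0.1, 0.2, 0.3])
        time.sleep(1)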

// Bio
Aniruddha is a Senior Data Scientist with over 5 years of experience across various industries on the AI/ML journey. He has experience building MLOps solutions across multiple clouds for a range of AI use cases.

Aniruddha is a passionate AI researcher who transforms data into end-to-end machine learning production journeys to ease the life of business users.

// Takeaways
How to set up Kubeflow on GCP and build a Kubeflow pipeline (a minimal pipeline sketch follows below).
How to use the Feast feature store for training and serving (Feast sketches follow after the timestamps).
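
For a sense of what building a Kubeflow pipeline looks like in code, here is a minimal sketch using the kfp v1-style SDK. The step body, base image, bucket paths, and host URL are illustrative assumptions, not the ones from the demo:

# Minimal kfp (v1-style SDK) sketch: a single training step compiled into a
# pipeline and submitted to a Kubeflow Pipelines endpoint. The step body,
# base image, bucket paths, and host URL are placeholders.
import kfp
from kfp import dsl
from kfp.components import create_component_from_func

def train_model(train_data_path: str) -> str:
    """Toy training step; in the talk this step would read features staged by Feast."""
    print(f"training on {train_data_path}")
    return "gs://example-bucket/model/"  # hypothetical model artifact location

train_op = create_component_from_func(train_model, base_image="python:3.8")

@dsl.pipeline(
    name="feast-training-pipeline",
    description="Illustrative pipeline with a single training step",
)
def training_pipeline(train_data_path: str = "gs://example-bucket/train.parquet"):
    train_op(train_data_path)

if __name__ == "__main__":
    # Compile to a package that can be uploaded through the Kubeflow Pipelines UI...
    kfp.compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")
    # ...or submitted directly against a running Pipelines host (placeholder URL).
    client = kfp.Client(host="http://localhost:8080/pipeline")
    client.create_run_from_pipeline_func(training_pipeline, arguments={})

Compiling and uploading the pipeline is roughly what the demo covers around [1:14:10].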

// Other Links

--------------- ✌️Connect With Us ✌️ -------------
Follow us on Twitter: @mlopscommunity

Timestamps:
[00:00] Introduction to Aniruddha Choudhury
[02:56] Building blocks for productionizing ML projects with Feature Store in Kubeflow Orchestration
[03:16] Aniruddha's background in tech
[03:33] What the session will cover
[04:24] Touch base components
[05:02] Feature Store: Why?
[07:08] AI/ML Workflow Without Feature Store
[09:50] What is Feast Feature Store?
[11:43] Five Components of Modern Feature Store
[13:48] Feature Life Cycle with Feast
[15:01] Scenario: Churn Modeling
[16:17] What is Feast not?
[17:10] Feature Store Benefits
[18:29] High-Level Components
[19:20] Operational Flow of Feature Store with Feast
[21:23] Kubeflow + Kubernetes
[21:44] What and why Kubernetes?
[22:49] What is Kubeflow?
[24:27] Feature Store with Kubeflow
[25:10] Kubeflow with Feast Architecture
[26:55] Demo
[27:18] Demo Overview
[30:40] Kubeflow Environment
[32:12] Inference Layer
[32:17] High-Level Architecture
[34:08] Initialize all environment variables locally
[35:24] Create a Staging Bucket
[37:41] Describe the entity
[39:37] Declare the identity
[40:51] Declare a schema
[42:40] Register the features of the data frame
[43:32] Check the schema chapter
[44:21] Ingest the data
[46:43] Start offline to online ingestion
[47:06] Declare the Kafka
[47:49] Declare the streaming source
[50:01] Save the data driver ID
[51:52] Train the data
[52:02] Save the artifacts
[52:53] Feature Store vs database where you store features
[55:36] Building the Kubeflow pipeline
[58:30] Initialize the environment and load the data from the artifacts
[59:22] Initialize the Feast feature store SDK and retrieve historical features
[59:59] Staging bucket
[1:00:35] Save the data in the file store
[1:01:50] Pass all the arguments
[1:02:25] Retrieve the data
[1:02:36] Run the model and save the logs
[1:03:00] Retrieve the Katib artifacts
[1:04:08] Upload the model saved in local
[1:04:49] Publish the model weights to the GCS bucket for prediction
[1:05:01] Go to the pipeline
[1:05:22] Max trial count
[1:07:30] Declare a JSON template
[1:08:13] Fit all trial specs into the trial template
[1:09:46] Feature Store retrieval
[1:10:27] Name the experiment
[1:10:46] Serialize
[1:11:21] Converting the results
[1:11:45] Evaluation
[1:13:14] Deployment
[1:14:10] Compile and upload the pipeline
[1:18:16] Predict the model to get the output
[1:21:10] Monitor the metrics
[1:22:19] Check the health metrics
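
To make the Feast steps in the demo walkthrough above more concrete (declaring an entity, registering features, then retrieving historical features for training), here is a minimal sketch. The demo uses the older Feast Client/FeatureTable API against GCS, Kafka, and Redis; this sketch uses the newer FeatureView/FeatureStore style with invented paths, names, and TTL, so treat it as an illustration rather than the demo code:

# Illustrative Feast sketch using the newer FeatureView-style API (the demo
# itself uses the older Client/FeatureTable API, and exact signatures vary
# across Feast versions). Paths, feature names, and TTL are invented.
from datetime import datetime, timedelta

import pandas as pd
from feast import Entity, Feature, FeatureStore, FeatureView, FileSource, ValueType

# Entity: the join key for features (the demo uses a driver_id-style entity).
driver = Entity(name="driver_id", value_type=ValueType.INT64, description="Driver ID")

# Batch (offline) source backing the feature view.
driver_stats_source = FileSource(
    path="data/driver_stats.parquet",  # placeholder path
    event_timestamp_column="event_timestamp",
    created_timestamp_column="created",
)

driver_stats_view = FeatureView(
    name="driver_hourly_stats",
    entities=["driver_id"],
    ttl=timedelta(days=1),  # bounds how far back values are joined for training data
    features=[
        Feature(name="conv_rate", dtype=ValueType.FLOAT),
        Feature(name="acc_rate", dtype=ValueType.FLOAT),
    ],
    batch_source=driver_stats_source,
)

# The definitions above would normally live in a feature repo and be registered
# with `feast apply`; a feature_store.yaml is assumed to exist in repo_path.
store = FeatureStore(repo_path=".")

# Point-in-time correct training set, joined against an entity dataframe.
entity_df = pd.DataFrame(
    {"driver_id": [1001, 1002], "event_timestamp": [datetime.utcnow()] * 2}
)
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:acc_rate"],
).to_df()
print(training_df.head())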

Comments:

Excellent session..amazing info..thanks

mevinodyou

a) I am trying to understand w.r.t. the online store. Say, for example, I have customer 1 with a ton of transaction data and I have trained the model on that transaction data. Ideally, I will have these details in the feature store and the same will be in the online store. Now say the same customer makes a transaction and I want to predict the legitimacy of that transaction. Will this entire transaction be synced to the feature store and materialized to the online store in real time to support the prediction?
b) So the customer id and the event timestamp form the composite key?
c) The online store has the data pertaining to the customer's recent transactions, and the older data gets evicted from the online store based on the TTL?

Appreciate your reply pls.

Ash-hznc
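
For context on the online-store flow the question above is about, this is roughly the shape of Feast's offline-to-online calls: materialization copies the latest feature values per entity key into the online store (Redis in the talk), and serving looks them up by entity key. A hedged sketch with the same invented names as the earlier sketch (parameter names vary across Feast versions):

# Illustrative offline-to-online flow (same invented names as the sketch above;
# parameter names vary across Feast versions).
from datetime import datetime

from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Materialization copies the latest feature values per driver_id from the
# offline store into the online store (Redis in the talk).
store.materialize_incremental(end_date=datetime.utcnow())

# At serving time, features are looked up by entity key; the online store keeps
# the most recent value per driver_id.
online_features = store.get_online_features(
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:acc_rate"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
print(online_features)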