Ingesting and Querying Netflix Data using Chroma DB | Step-by-Step Python Tutorial


Join me as we dive into the world of data engineering with Python! In this step-by-step tutorial, we'll be using the Netflix Movies and TV Shows CSV dataset to build a powerful search engine using Chroma DB. Whether you're a data enthusiast, a budding data scientist, or someone interested in database applications, this video is packed with valuable insights and practical Python coding!

📊 What You'll Learn:

Data Ingestion: How to import 1000 movies and TV shows from the Netflix dataset into a Jupyter notebook.
Data Enrichment: Creating enriched information strings that capture title, type, and category for use as documents.
Database Operations: Ingesting documents into Chroma DB, then using methods like count(), get(), and peek() to inspect what was stored.
Advanced Metadata: Enhancing your search capabilities by adding metadata like release type, country, and release year.
Performing Queries: Making complex search queries to find specific movies and shows based on your criteria.
Clean-Up: Safely deleting collections from Chroma DB once you're done.
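
To make the enrichment and ingestion steps concrete, here is a minimal sketch, not the exact code from the video. It assumes the rows are plain dicts (for example from `df.to_dict("records")` after dropping NaN rows) and that the CSV uses the column names `show_id`, `title`, `type`, `listed_in`, `country`, and `release_year`. The helper names `build_records`, `ingest`, and `open_collection`, the collection name, and the wording of the document string are all hypothetical.

```python
def build_records(rows):
    """Turn dataset rows into parallel lists of ids, documents, and metadata."""
    needed = ("show_id", "title", "type", "listed_in", "country", "release_year")
    ids, documents, metadatas = [], [], []
    for row in rows:
        # Skip rows with missing fields (None or empty string), a plain-Python
        # stand-in for the "drop NaN rows" step done with pandas.
        if any(row.get(key) in (None, "") for key in needed):
            continue
        ids.append(str(row["show_id"]))
        # Enriched information string: title, type, and category in one sentence.
        documents.append(
            f"{row['title']} is a {row['type']} in the category {row['listed_in']}."
        )
        # Metadata kept separate so it can be filtered on later.
        metadatas.append(
            {
                "type": row["type"],
                "country": row["country"],
                "release_year": int(row["release_year"]),
            }
        )
    return ids, documents, metadatas


def ingest(collection, rows):
    """Upsert the records into a Chroma collection and return its size."""
    ids, documents, metadatas = build_records(rows)
    collection.upsert(ids=ids, documents=documents, metadatas=metadatas)
    return collection.count()


def open_collection(name="netflix"):
    """Create or fetch an in-memory collection (requires `pip install chromadb`)."""
    import chromadb  # imported lazily so the helpers above work without it

    client = chromadb.Client()
    return client.get_or_create_collection(name=name)
```

With Chroma installed, `ingest(open_collection(), rows)` would load the rows, after which `collection.peek()` and `collection.get(ids=[...])` can be used to check what was stored. Using `upsert` rather than `add` means re-running the notebook cell updates existing ids instead of failing on duplicates.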

🔧 Tools & Technologies Used:

Python
Jupyter Notebook
Chroma DB
Netflix Movies and TV Shows CSV

👨‍💻 Follow Along: Grab your dataset and open up your Jupyter notebook! This tutorial is designed for viewers to code along and build their own search system by the end of the video.

💡 Why This Video?

Real-World Application: Learn how to handle real datasets and build functionalities that are highly sought after in the tech industry.
Interactive Learning: Step-by-step coding instructions make it easy for you to follow along and learn at your own pace.
Advanced Techniques: Explore how enriched metadata can transform your search capabilities and make your data projects more dynamic.

Chapters
(00:00) Recap and Scope
(02:52) Downloading Netflix Dataset
(03:15) Reading CSV with Pandas
(04:51) Dropping NaN rows
(07:00) Creating Enriched Netflix Info
(09:30) get_collection in ChromaDB
(10:45) upsert usage
(11:40) Querying the ingested documents
(17:20) Why do we need Metadata?
(19:00) Adding Metadata to documents
(22:28) Where clause query
(24:50) AND logical operator in Query
(26:31) gte operator in ChromaDB queries
(28:05) OR logical operator in Query
(29:00) Deleting documents in ChromaDB
(29:35) Deleting collection using delete_collection
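
The query and clean-up chapters above can be sketched roughly as follows. This is an illustration under assumptions rather than the code shown in the video: the metadata keys `type`, `country`, and `release_year` and all helper names are hypothetical, while `$and`, `$or`, `$eq`, and `$gte` are Chroma's filter operators for the `where` argument.

```python
def movies_only():
    """Simple where clause: exact match on one metadata field."""
    return {"type": "Movie"}


def movies_released_since(year):
    """AND of two conditions, using $gte for 'release year >= year'."""
    return {
        "$and": [
            {"type": {"$eq": "Movie"}},
            {"release_year": {"$gte": year}},
        ]
    }


def from_either_country(first, second):
    """OR of two conditions on the same metadata field."""
    return {
        "$or": [
            {"country": {"$eq": first}},
            {"country": {"$eq": second}},
        ]
    }


def search(collection, text, where=None, n_results=5):
    """Semantic query over the documents, optionally narrowed by metadata."""
    return collection.query(query_texts=[text], n_results=n_results, where=where)


def clean_up(client, collection, name, ids):
    """Delete specific documents, then drop the whole collection."""
    collection.delete(ids=ids)
    client.delete_collection(name=name)
```

For example, `search(collection, "crime drama", where=movies_released_since(2018))` would restrict the similarity search to movies released in 2018 or later. The filter only sees values stored as metadata, which is why the release year needs to be ingested as a number rather than left inside the document text.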

👍 Like, Comment, and Subscribe: If you find this tutorial helpful, please like, comment, and subscribe for more content like this! Your support helps me create more tutorials that help you enhance your skills.