Fast intro to multi-modal ML with OpenAI's CLIP
OpenAI's CLIP is a "multi-modal" model capable of understanding the relationships and concepts shared between text and images. As we'll see, CLIP is very capable and, when used via the Hugging Face library, could not be easier to work with.
📕 Article:
📖 Friend Link (free access):
🤖 70% Discount on the NLP With Transformers in Python course:
🎉 Subscribe for Article and Video Updates!
👾 Discord:
00:00 Intro
00:15 What is CLIP?
02:13 Getting started
05:38 Creating text embeddings
07:23 Creating image embeddings
10:26 Embedding a lot of images
15:08 Text-image similarity search
21:38 Alternative image and text search
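
For readers who want a feel for the "Text-image similarity search" chapter before watching, here is a minimal sketch in plain Python. It assumes the text and image embeddings have already been produced by CLIP; the three-dimensional vectors and file names below are made-up placeholders, not real model output, and the helper names `cosine_similarity` and `top_k` are hypothetical. The scoring shown is cosine similarity, a common choice for comparing embeddings, and not necessarily the exact scoring used in the video.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_emb, image_embs, k=2):
    """Return the k (name, score) pairs most similar to the query embedding."""
    scored = [
        (name, cosine_similarity(query_emb, emb))
        for name, emb in image_embs.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Placeholder vectors standing in for CLIP image embeddings (assumption).
image_embs = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Placeholder vector standing in for the embedding of a text query (assumption).
text_emb = [0.8, 0.2, 0.1]

results = top_k(text_emb, image_embs, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because CLIP places text and images in the same vector space, the same function works in the other direction too: pass an image embedding as the query and a dictionary of text embeddings as the candidates.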
OpenAI CLIP Explained | Multi-modal ML
All Machine Learning Models Explained in 5 Minutes | Types of ML Models Basics
Multi-Modal ML With Financial Text and Tabular Data
Explained In A Minute: Neural Networks
Deep Dive into Multimodal Embeddings Part 1&2
K Nearest Neighbors | Intuitive explained | Machine Learning Basics
Transformers, explained: Understand the model behind GPT, BERT, and T5
Multimodal Generative AI for Precision Health
PyTorch in 100 Seconds
Multimodal Machine Learning at Scale: Democratizing AI for Academic Research
Multimodal AI from First Principles - Neural Nets that can see, hear, AND write.
LLM Explained | What is LLM
MiniGPT-4 - Multimodal model handling images and text
Building Multi-Modal Search with Vector Databases
Hands-on with Multi-Modal Machine Learning and Predicting Customer Reviews
Michał Nowicki - Multi-Model Mobile Robot Localization | ML in PL 23
Go in 100 Seconds
Multimodal Image-text Classification
Foundation Models: An Explainer for Non-Experts
Random Forest Algorithm Clearly Explained!
Transformer combining Vision and Language? ViLBERT - NLP meets Computer Vision
WACV18: Fast Self-Attentive Multimodal Retrieval
Optimizing FastAPI for Concurrent Users when Running Hugging Face ML Models