Florence-2: Fine-tune Microsoft’s Multimodal Model

Learn how to fine-tune Microsoft's Florence-2, a powerful open-source Vision Language Model, for custom object detection tasks. This in-depth tutorial guides you through setting up your environment in Google Colab, preparing datasets, and optimizing the model using LoRA.
Chapters:
- 00:00 Introduction: Unlock the Power of Florence-2
- 01:09 Getting Started: Prepare for VLM Fine-Tuning
- 03:55 Florence-2 in Action: Explore Pre-trained Capabilities
- 07:00 Dataset Deep Dive: PyTorch Data Loading for Florence-2
- 13:02 LoRA: Optimize Your VLM Training
- 14:21 Fine-Tuning: Unleash Florence-2's Custom Object Detection
- 17:30 Model Evaluation: Measure Your VLM's Success
- 21:37 Florence-2 vs Other Computer Vision Models
- 24:09 Conclusion and Next Steps
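As a rough illustration of the LoRA chapter above: LoRA freezes a pretrained weight matrix W and trains only a low-rank update, so the effective weight is W + (alpha / r) * B A. The sketch below is plain Python and is not taken from the tutorial. The helper names (`matmul`, `lora_effective_weight`, `trainable_params`) and the 768x768 layer size are assumptions chosen for illustration only; a real fine-tuning run would normally rely on an ML framework instead.

```python
# Minimal LoRA sketch in plain Python. All names and sizes here are
# illustrative assumptions, not code from the video or from any library.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)            # A has shape (r, d_in), B has shape (d_out, r)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with shape (d_out, d_in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, r):
    """Return (full fine-tuning params, LoRA adapter params) for one matrix."""
    return d_out * d_in, r * (d_out + d_in)

# Frozen 2x3 weight with a rank-1 adapter.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]    # shape (r=1, d_in=3)
B_zero = [[0.0], [0.0]]  # B starts at zero, so training begins from W unchanged
B_trained = [[1.0], [0.0]]  # hypothetical values after some training

W_start = lora_effective_weight(W, A, B_zero, alpha=2)
W_after = lora_effective_weight(W, A, B_trained, alpha=2)

# Parameter counts for one hypothetical 768x768 projection at rank 8.
full, lora = trainable_params(d_out=768, d_in=768, r=8)
```

With these assumed sizes the adapter trains 12,288 values instead of 589,824 for the single matrix, which is why LoRA is attractive on limited hardware such as a free Colab GPU.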
Resources:
OCR Using Microsoft's Florence-2 Vision Model on Free Google Colab
Florence 2 - The Best Small VLM Out There?
Florence-2 And Deepseek Coder v2 - Open Source LLM With Strong Vision And Logic Beats GPT4o
This free MIND BLOWING Workflow Just Changed Filmmaking
The next wave of AI Innovations for Startups by Microsoft & OpenAI
Segment Anything 2: Memory + Vision = Object Permanence — with Nikhila Ravi and Joseph Nelson
Microsoft Build Into Focus: AI | KEY06
Episode 67 - Nuance | Developing a Clinical Research Tool with Azure and Best of AI Show
Do Language Models Have a Critical Period for Language Acquisition? - ArXiv:2407.19325
This Embodied LLM is...
The AI Doctor Med-Palm M: Can it help or replace doctors?
Prompt Engineering: Prompt based learning in NLP
AN INTRODUCTION TO TRANSFER LEARNING IN NLP AND HUGGINGFACE
It's not just words: LLMs in Computer Vision
Snap4City una vista generale (ITA) parte 1, elemento 2 di 2, corso 2020
ActivityNet Event Dense-Captioning
CVPR #18541 - Workshop and Challenges for New Frontiers in Visual Language Reasoning
ChatGPT and Large Language Model: Achieving Human Like Conversational Intelligence
Generative Language Models in Molecular Discovery: Regression Transformer, GT4SD and Beyond
Google I/O 2023 Keynote - Pixel Fold, Pixel Tablet, PaLM 2
Technically Speaking (E13): Building a foundation for AI models
The Future of 24/7 Clean Energy driven by AI
Dialog - A Natural Language Generation Task