How can LLMs improve Vision AI? OCR, Image & Video Analysis
Combine vision and language in an AI model with the latest vision AI model in Azure Cognitive Services. Use natural language to fetch visual content in images and videos without needing metadata or location, generate automatic and detailed descriptions of images using the model’s knowledge of the world, and use a verbal description to search video content.
Cognitive Service for Vision AI combines large language models (LLMs) with computer vision and is part of the Azure Cognitive Services suite of pre-trained AI capabilities. It can carry out a variety of vision-language tasks, including automatic image classification, object detection, and image segmentation. Similar to GPT, Project Florence, the foundational language model used in this case, infuses deeper language skill into vision analytics to make training, inferencing, and interacting with your image and video content simpler using natural language.
Azure expert Matt McSpirit shares how to customize the model and use these capabilities in your own apps.
► QUICK LINKS:
00:00 - Introduction
00:48 - Project Florence
01:52 - Open-world recognition
03:19 - Dense captioning
04:23 - Run frame analysis
05:02 - Train a custom model
06:29 - Build custom apps
07:41 - Wrap up
► Link References:
► Unfamiliar with Microsoft Mechanics?
Microsoft Mechanics is Microsoft's official video series for IT. You can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft.
► To keep getting this insider knowledge, join us on social:
#LLM #CognitiveServices #OpenAI #Azure #chatgpt