Agent-S: Unleash The Power Of GUI Computer Use Agents!

In this video, I look at the paper "Agent-S" and how it handles GUI agents and the components that are needed to make that work.

For more tutorials on using LLMs and building agents, check out my Patreon

🕵️ Interested in building LLM Agents? Fill out the form below

👨‍💻Github:

⏱️Time Stamps:
00:00 Intro
00:34 Agent-S Paper
01:03 Example Task
02:08 How it Works
04:49 Experience Augmented Hierarchical Planning
05:38 2 Types of Memory
06:56 Agent Computer Interface
10:27 Paper, Site & Code

Comments

Businesses often have bespoke apps that have user documentation but no API. I can see Agent S being fabulous for this type of thing.

Avman

Thanks Sam. It'll be interesting when you can fine-tune this on your domain-specific apps, and to see what the fine-tuning process would look like.

kenchang

This sounds useful for automated testing: mimic how a user would behave, then interact with a web app or desktop app, or even just carry out workflow tasks as a chaos user.

steelwolf

Awesome explanation, Sam. Can you do more of these videos explaining papers? It really helps merge understanding between GA and scientific knowledge. Where do you find worthwhile papers, Hugging Face?

shiv_

I have done so many projects after getting a lot of knowledge from you. We need a new video on image generation models that can handle the text, facial, and body problems.

muhammadhasnain

Several months ago, when the Rabbit R1 device was announced, there was another wave of "large action models": an attempt at training or fine-tuning transformers to do the UI interaction stuff. I wonder where this eventually went? There were a few quite promising products.

alx

You mentioned something I strongly believe in. A generic solution is required. It's just a matter of time before website owners, apps, and platforms realize they need to create specific layers for AI agents and assistants. Creating weird solutions to communicate with apps only makes sense for now, while apps can't provide enough API data for AI applications. Website owners will likely have specific markdown with knowledge and instructions for AI, possibly developing a markup language for AI data. We can even include tools our websites or apps want AI to use. Like with robots.txt, website owners will define which parts AI can control. This isn't far off. Even for other products and services like books, music, or movies, authors can include that AI content layer. Until then, IMO we have patchwork solutions that aren't permanent but help us understand the system's needs, weaknesses, and strengths.

unclecode

Thanks Sam. I'm learning everything by myself and I need help in identifying worthy recent research papers to study. How do you know which ones are good?

arungnanaable

Very interesting, does this compete with Microsoft UFO?

bombala

Create a video on image generation models plz

muhammadhasnain

Funny that the day after this video came out, Anthropic published their computer use API

megaklis.vasilakis

24 hours later... Anthropic brings out computer use.

davidmetekingi

I would be super concerned to allow anything to run directly on my desktop. It could see passwords, cryptographic keys, modify the registry, destroy the system.

pensiveintrovert

I don't trust AI enough to give it access to the files and apps on my computer.

micbab-vgmu
