I Ran Advanced LLMs on the Raspberry Pi 5!

Honestly, I'm shocked...

Product Links (some are affiliate links)

Local Model Management

Mistral 7B Model

Hardware

For Text to Speech (WaveNet)

🚀 Dive into the fascinating world of small language models with our latest video! We're pushing the boundaries of tech by running various open-source LLMs like Orca and Phi on the new Raspberry Pi 5, a device that's both powerful and affordable.

🤖 Discover the capabilities of GPT-4 and its massive 1.7T parameters, and see how we creatively use a Raspberry Pi 5 to explore the potential of smaller, more accessible models. We're not just talking about theories; we're running live demos, showing you the models in action, and even making them 'talk' using WaveNet text-to-speech technology.

🔍 We're testing every major LLM available, including the intriguing Mistral 7B, and examining their speed and efficiency on compact hardware. This exploration covers a range of practical questions, from the possibility of accelerating performance with edge TPUs to the feasibility of running these models on a cluster of Raspberry Pis.

📡 Experience the implications of 'jailbroken' LLMs, the privacy of interactions, and the possibility of a future where the power of LLMs is harnessed locally on everyday hardware. Plus, we address some of your burning questions like, "Who was the second person to walk on the moon?" and "Can you write a recipe for dangerously spicy mayo?"

🛠️ Whether you're a tech enthusiast, a Raspberry Pi hobbyist, or simply curious about the future of AI, this video has something for you. We've included a step-by-step guide in the description for those who want to follow along, and we're exploring the potential of these models for commercial use and research.

✨ Join us on this journey of discovery and innovation as we demonstrate the power of language models on the Raspberry Pi 5. It's not just a tutorial; it's a showcase of capabilities that might just change the way you think about AI in everyday technology!

🔗 Check out our detailed guide and additional resources in the description below. Don't forget to like, share, and subscribe for more tech adventures!
Comments

Viewers should probably note that the actual text generation is much slower and the video is sped up massively (look at the timestamps). This is particularly true for the multimodal models like LLaVA, which can take a couple of minutes to produce that output. These outputs are also quite cherry-picked; a lot of the time, these quantized models give garbage outputs.

Not to mention most of the script of this video is AI generated...

Illusion_____

Clickbait I came here to check the display lol

slabua

Llama 2 got the 1952 POTUS question wrong. Harry S. Truman was POTUS in 1952. Eisenhower won the 1952 election, but wasn’t inaugurated until 1953. Small, but an important detail to note.

robfalk

Nice video!

A little correction at 10:41: privateGPT doesn't train a model on your documents, it does something called RAG (retrieval-augmented generation). Basically, it smartly searches through your docs to find the context relevant to your query and passes it on to the LLM for more factually correct answers!
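
For anyone curious what that retrieve-then-generate pattern looks like, here is a minimal sketch (not privateGPT's actual code): it ranks document chunks by naive word overlap instead of real embeddings, then passes the best matches to a local model through Ollama's default API endpoint. The model name, endpoint, and sample documents are assumptions based on a stock local install.

```python
import requests  # assumes a local Ollama server on the default port

OLLAMA_URL = "http://localhost:11434/api/generate"

def top_chunks(question, chunks, k=2):
    # Stand-in for embedding search: rank chunks by word overlap with the question.
    q_words = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q_words & set(c.lower().split())), reverse=True)
    return ranked[:k]

def ask_with_context(question, chunks, model="mistral"):
    context = "\n".join(top_chunks(question, chunks))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    r = requests.post(OLLAMA_URL, json={"model": model, "prompt": prompt, "stream": False}, timeout=600)
    return r.json()["response"]

docs = [
    "The relief valve opens at 150 psi.",                    # hypothetical manual snippets
    "Error E42 means the coolant sensor is disconnected.",
]
print(ask_with_context("What does error E42 mean?", docs))
```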

sudhamjayanthi

The fact that these can run on a Raspberry Pi is crazy. I always assumed you needed a pretty beefy GPU to do any of this.

garrettrinquest

Got about 14 LLMs running on my Pi 5. This is the vid that started my dive down the AI rabbit hole. You can have multiple LLMs installed in Ollama at once as long as only one is answering a prompt.
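
A rough illustration of that, assuming a stock Ollama install on the Pi: list whatever models are pulled via Ollama's local API, then prompt them one at a time. The question string is just one of the demo prompts from the video.

```python
import requests

BASE = "http://localhost:11434"  # Ollama's default local port

# List every model currently pulled onto the Pi.
models = [m["name"] for m in requests.get(f"{BASE}/api/tags").json()["models"]]
print("Installed:", models)

# Prompt each model sequentially; only one generates at a time, which keeps RAM usage sane.
for name in models:
    r = requests.post(
        f"{BASE}/api/generate",
        json={"model": name, "prompt": "Who was the second person to walk on the moon?", "stream": False},
        timeout=600,
    )
    print(f"{name}: {r.json()['response'][:100]}")
```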

AlwaysCensored-xpbe

Is it just me, or is bro recording this while a little baked?

flatujalok

You can run the 13B models with 8GB of RAM. Just add a swap file in Linux of, e.g., 10GB. It's slower, but they will still run with Ollama and other variants.
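
For context, a 4-bit-quantized 13B model needs very roughly 7-8 GB just for its weights, so the swap is what makes it fit alongside 8 GB of RAM. Here is a quick stdlib-only sanity check you could run before pulling one; the 8 GB threshold is just an illustrative figure.

```python
def meminfo_gb(key):
    """Read a value from /proc/meminfo (Linux) and return it in GB."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(key):
                return int(line.split()[1]) / 1024 / 1024  # kB -> GB
    return 0.0

ram = meminfo_gb("MemTotal")
swap = meminfo_gb("SwapTotal")
needed = 8.0  # rough, illustrative figure for a 4-bit 13B model

print(f"RAM {ram:.1f} GB + swap {swap:.1f} GB = {ram + swap:.1f} GB")
print("Should fit a 13B quantized model" if ram + swap >= needed else "Add more swap first")
```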

olealgoritme

Thanks for the video. I have also been experimenting with various LLMs on the Pi 5, locally. Best results with Ollama so far. I am also running these Pis on battery power for robotic, mobile use. I am pretty close to successfully integrating local speech-to-text, LLM, and text-to-speech using two Pi 5s, including animatronics. Fun stuff.

whitneydesignlabs

This is absolutely fascinating! Thank you so much for sharing. It was just one year ago that we were blown away by this multi-billion-dollar tech, and now it can run on a small Raspberry Pi. It's an amazing exploration you did here. Please keep it going.

sentinelaenow

You should mention they are quantized and pretty bad; not only that, but they can take several minutes to reply versus less than 10 seconds on a mid-range GPU.

RampagingCoder

Llama 2: you missed one question. At 7:21, the US president in 1952 was NOT Dwight David Eisenhower; it was Harry S. Truman. Eisenhower won the election in November 1952 and was then inaugurated on January 20, 1953.

TigerPaw

This presents a very interesting use case.

Is it possible to feed technical manuals into one of these models, and then ask them specific questions about the content of the manuals?

It would be really neat if you could take a picture of an error code from a machine, send that pic to the AI model, and then have it provide information about the errors or faults.
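
In principle, yes to both: the manuals side is the same RAG idea mentioned in an earlier comment, and the photo side is what a multimodal model like LLaVA (shown in the video) handles. A rough sketch of the picture half, assuming LLaVA is pulled into a local Ollama install; the image file name is just a placeholder.

```python
import base64
import requests

# Placeholder path to a photo of the machine's error display.
with open("error_panel.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

r = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava",
        "prompt": "This is the error screen of an industrial machine. "
                  "What error code is shown and what does it usually indicate?",
        "images": [img_b64],  # Ollama passes base64-encoded images to multimodal models
        "stream": False,
    },
    timeout=600,
)
print(r.json()["response"])
```

The extracted code could then be fed into a retrieval step over the manuals to pull up the matching troubleshooting section.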

dilboteabaggins

Wow, never thought a Pi could perform like this. I was thinking of trying this with a Jetson.

erniea

Great video, but it's a huge shame you didn't show how long each one takes to process before responding...

davidmiscavige

It would be fascinating to work out a way to make multiple small edge computers hosting LLMs work in synchrony. A cluster of Pi 5 SBCs could narrow the memory gap required to run larger models, providing more accurate responses if not measurably better performance. There would be a lot of tradeoffs for sure, since the bulk of these models currently seem to be created to run within a monolithic structure (composed of massively parallel GPU hardware) that does not lend itself as well to "node-based" distributed computing on consumer-grade processing and networking hardware. So I wonder whether the traffic running across the network meshing multiple processors would create bottlenecks, and whether the nodes could operate on a common data store to eliminate attempts to "parse" and distribute training data among them.

I have the feeling that the next step toward AGI will involve using generative models in "reflective layers" anyway, using adversarial models to temper and cross-check responses before they are submitted for output, and perhaps others "tuned to hallucinate" to form a primitive "imagination", which perhaps could form the foundation for "synthesizing" new "ideas", for deep analysis and cross-checking of assumed "inferences", and potentially for providing "insights" toward problem solving where current models fall short.

As one of my favorite YouTube white-paper PhDs always says, "What a time to be alive!"

Thanks for a great production!

mbunds

I watched this three times lol, I love this. Thank you for this. Do you think the Raspberry Pi 5 is the best single board for the job, or would the ZimaBoard compare just as well, if not better?

Also, since you repurposed a WiFi adapter, would you have an idea how to tear down old PCs and laptops and combine the hardware to get the VRAM needed for an upgrade like this? Probably more complex than it needs to be, but I've got a whole bunch of old computers with junk processors and low RAM by today's standards, and I feel like you could repurpose a lot of the stuff with a different board, or just flash Windows off 😂 and use the giant motherboard and maybe even part of the laptop screen or something, idk lol. A way to combine multiple processors or something to create a Frankenstein that works well lol.

Or another side project: a control box for a golf simulator. Basically just buttons mapped to a keyboard, with decent housing for the thing. Maybe your box is for an arcade emulator, or controls your smart home or sound setup, idk 🤷‍♂️

Derick

The scientific revolution in the area of advanced mathematics and algorithms is just amazing these days. ❤❤❤

ThomasConover

Oh, that’s just awesome. Edge AI. Just confirm if you would… the Google voice was not generated in real time with a webhook or API, right?
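
For context, if it had been live, the call would look roughly like the sketch below using Google Cloud's Text-to-Speech API (the WaveNet voices linked in the description). The voice name and output file are only illustrative, and the client needs Google Cloud credentials configured.

```python
from google.cloud import texttospeech  # pip install google-cloud-texttospeech

client = texttospeech.TextToSpeechClient()  # requires Google Cloud credentials

response = client.synthesize_speech(
    input=texttospeech.SynthesisInput(text="Hello from a Raspberry Pi 5."),
    voice=texttospeech.VoiceSelectionParams(
        language_code="en-US",
        name="en-US-Wavenet-D",  # illustrative WaveNet voice
    ),
    audio_config=texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    ),
)

# Save the synthesized speech so it can be played back on the Pi.
with open("reply.mp3", "wb") as out:
    out.write(response.audio_content)
```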

michaelzumpano

It would be useful if we could add more RAM via the Pi 5's M.2 slot so we could run the 13B models.

snopz