How I Ran an LLM on a 5-Year-Old Android Phone with LLaMA.cpp

In this video, I show you how to run large language models (LLMs) locally on your Android phone using LLaMA.cpp, a framework that simplifies LLM deployment. I demonstrate this by running an LLM on a Xiaomi Redmi 8, a five-year-old low-end phone with 4 GB of RAM and a Snapdragon 439 SoC. You will learn how to set up LLaMA.cpp, choose the right model for your device, and interact with the LLM through the LLaMA.cpp WebUI. By the end of this video, you will be able to run LLMs on any Android phone without relying on the cloud or an internet connection.
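For readers who prefer text, the workflow from the video can be sketched roughly like this. The package names, model URL, and binary name are assumptions based on a typical llama.cpp setup (newer releases name the server binary `llama-server` rather than `server`), so the exact commands may differ from what is shown on screen.

```shell
# Inside Termux (the F-Droid build is recommended over the Play Store one).
pkg update && pkg install -y git wget clang make

# Fetch and build llama.cpp from source, on the device itself.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make -j4

# Grab a small quantized model that fits in 4 GB of RAM; TinyLlama 1.1B
# at Q4_K_M quantization is under 1 GB. Note the "resolve" URL form.
wget -O tinyllama.gguf \
  "https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf?download=true"

# Start the HTTP server with the built-in WebUI, then open
# http://127.0.0.1:8080 in the phone's browser.
./server -m tinyllama.gguf -c 512 --host 127.0.0.1 --port 8080
```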

Comments
Author

Make sure to use the most recent version of LLaMA.cpp for optimal performance. If you encounter any issues, feel free to experiment with other available versions.

TechAceYoutube

You've really done it, mate! Something those huge companies couldn't pull off. Keep going strong, your journey's just begun!
-from Mohd Fahad

mbqszsb

Been searching for detailed tutorials like this, thanks!

bismomr

It works!!! Amazing!! You are really a genius. I have no idea how you figured out how to do this, but I'm really grateful that you shared it. I can hardly believe it, I have TinyLlama running on my phone.

Dominic-wezi

Thanks, man. I've been looking for a good guide on this.

DefaultFlame

Great tutorial! Phi-3 seems like it would shine in this scenario.

brando

Low effort, high reward. Great video!

DWJT_Music

Make sure you get the correct download link for the model, ending in `download=true`. Otherwise it doesn't download anything useful but still creates a file, and you'll only notice at the end, when nothing happens as you try to run it.
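To expand on this comment: Hugging Face file *page* links (containing `/blob/`) return an HTML page, which wget will happily save under a `.gguf` name. A hedged sketch of a safer download, assuming a TinyLlama GGUF like the one used in the video:

```shell
# Use the "/resolve/" URL form (optionally with "?download=true"),
# not the "/blob/" page URL, which returns HTML instead of the model.
URL="https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf?download=true"
wget -O tinyllama.gguf "$URL"

# Sanity check: a valid GGUF file begins with the magic bytes "GGUF";
# an accidentally downloaded HTML page begins with "<".
head -c 4 tinyllama.gguf
```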

VR_Wizard

TinyLlama is not accurate for multilingual tasks. Thanks for all the effort!

said.skopal

Does the phone or the `make` command install/use any additional software like CLBlast or OpenCL?
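For what it's worth, a plain `make` produces a CPU-only binary and does not install or link CLBlast/OpenCL; GPU backends are opt-in at build time. As an illustration only (the flag below is from older Makefile-based builds, and the OpenCL path has been reworked in newer versions):

```shell
# Default: CPU-only build, no extra GPU libraries pulled in.
make

# Older versions enabled the CLBlast/OpenCL backend explicitly, and only
# then did you need CLBlast and an OpenCL driver installed:
# make LLAMA_CLBLAST=1
```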

dillon

Hi, I installed llama.cpp b3416, the latest version available, and then the same TinyLlama model from this video. I opened the browser and it's working! But... I installed it on my old Samsung Galaxy Note 5! 😂 First question: "Hello, can you answer my question?" Ten minutes later, nothing had happened 😅. Tomorrow morning I'll check whether it has answered my question 😂. Thanks for sharing this amazing video.

NestorMucci-zndd

I get "illegal instruction" when I type "./server", before even loading a model; I haven't gotten to that stage yet. Please help. I am using an S23 with a Snapdragon 8 Gen 2 processor.
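A common cause of "illegal instruction", offered here as a hedged guess rather than a confirmed diagnosis: the binary was compiled with CPU features the running cores don't support (for example, a binary copied from another device, or auto-detected compile flags that don't match the SoC). Rebuilding from a clean tree on the device itself usually makes the compiler target the local CPU:

```shell
cd llama.cpp
make clean
make -j8

# If the crash persists, try disabling native-CPU optimizations; the
# exact option name varies by llama.cpp version, e.g. via CMake:
# cmake -B build -DGGML_NATIVE=OFF && cmake --build build
```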

MrAlamichhane

What models would you use instead of TinyLlama, for example for coding?

VexSpitta

Hi, how do I do the same with Llama 3 and Mistral 7B?
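Answering in general terms: the procedure is the same, only the GGUF file changes. One caveat worth adding: a 7B model at Q4 quantization is roughly 4 GB on disk and needs about as much RAM, so it won't fit comfortably on a 4 GB phone. The URL below is an illustrative example (TheBloke's Mistral 7B Instruct GGUF repo), not taken from the video:

```shell
# A 7B model -- expect this to be too heavy for a 4 GB device; on such
# phones, stay with 1-3B models or much more aggressive quantizations.
wget -O mistral-7b.gguf \
  "https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf?download=true"
./server -m mistral-7b.gguf -c 512
```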

alexandrunistoroiu

Thank you, but I have a question: what happens if I uninstall the Termux app...

skashiful

How do I set max_tokens? Whenever I stop the generation or reset, the next time I try to generate something it doesn't work, and I have to start a new session.

EDIT: I thought that was what was happening, but it just seems to stop randomly after a few prompts (using completion mode)
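For anyone hitting the same limit: the WebUI's generation length corresponds to the server's `n_predict` parameter, which can also be set explicitly through the server's `/completion` HTTP endpoint. A minimal sketch, assuming the server from the video is running on port 8080:

```shell
# n_predict caps the number of generated tokens; -1 means no limit.
curl -s http://127.0.0.1:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The three laws of robotics are", "n_predict": 64}'
```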

matheusazevedo

Bro, for the LLaVA model, which version do you recommend?

ngemuyu

How do I get an API key from this so I can use it in another app?
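A short answer, with the caveat that this depends on the llama.cpp version: the server doesn't issue API keys, it *is* the API. Recent builds expose an OpenAI-compatible endpoint, so apps that speak the OpenAI protocol can point their base URL at the phone and send any placeholder key (unless the server was started with `--api-key`). A sketch:

```shell
# OpenAI-compatible chat endpoint served locally by llama.cpp.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer no-key" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```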

mohamedali-endm

I'm having issues: I have a really powerful device, but I only get 0.5-4 tokens per second, and I don't think that's right for what my device is capable of.

vedforeal

Can I use hardware acceleration with Vulkan?
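llama.cpp does ship a Vulkan backend; whether it helps on Android depends on whether Termux can reach a working Vulkan driver, which is far from guaranteed. A build sketch (the CMake option is `GGML_VULKAN` in newer versions and was `LLAMA_VULKAN` in older ones):

```shell
# Requires the Vulkan headers and loader to be available at build time.
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j4
# The server binary then lands under build/bin/.
```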

blokkypixel