I used LLaMA 2 70B to rebuild GPT Banker...and it's AMAZING (LLM RAG)

👨‍💻 Sign up for the Full Stack course and use YOUTUBE50 to get 50% off:

🐍 Get the free Python course

Hopefully you enjoyed this video.

Learn how to use Llama 2 70B Chat for Retrieval Augmented Generation...for FINANCE! Albeit in a hella haphazard way. Oh, and we'll also build a Streamlit app while we're at it.
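
For a sense of the overall shape, here's a rough sketch of that pipeline using llama_index's quickstart API (module names vary by version). The "annual_report" folder and the prompt are placeholders, and library defaults are shown where the video actually wires in Llama 2 70B Chat:

```python
# Rough sketch of the RAG-plus-Streamlit shape, llama_index circa 0.8.x.
import streamlit as st
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Load every document in the folder and index it for retrieval
documents = SimpleDirectoryReader("annual_report").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

st.title("GPT Banker")
prompt = st.text_input("Ask a question about the report")
if prompt:
    # Retrieve relevant chunks and have the LLM answer over them
    response = query_engine.query(prompt)
    st.write(response.response)
```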

Oh, and don't forget to connect with me!

Happy coding!
Nick
Comments

first off, respect for the hustle and the in-depth breakdown of integrating llama with other tools

really shows how much work goes on behind the scenes

that said, not sure why everyone's so hyped about all these new models when sometimes simpler and older architectures can do the trick

but hey, if it's all about pushing boundaries and experimenting, you're killing it bro!

moondevonyt

You can't load Llama2-70b on a single A100 GPU in full or half precision. Full precision (float32) would require 70 billion × 4 bytes = 280 GB of GPU memory; loading it in float16 halves that to 140 GB. It finally worked because you loaded it in int8, which only requires 70 GB, while the A100 has 80 GB of GPU memory. To load it in full or half precision you would need multiple GPUs and would also need to leverage tensor parallelism, whereby you slice the tensors across multiple GPUs.

yudhiesh
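
For reference, a minimal sketch of the int8 load this comment describes, assuming the Hugging Face transformers + bitsandbytes stack (and approved access to the gated meta-llama weights):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-70b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# fp32: 70e9 params * 4 bytes ≈ 280 GB -> doesn't fit on one A100
# fp16: 70e9 params * 2 bytes ≈ 140 GB -> doesn't fit either
# int8: 70e9 params * 1 byte  ≈  70 GB -> fits in 80 GB
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place layers on available devices automatically
    load_in_8bit=True,   # bitsandbytes int8 quantization
)
```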

This video was great. You have created a format that is very entertaining to watch! 🙌 Subbed!

MikeAirforce
Автор

Always looking forward to your videos...
I have an MSc in AI, but I still learn from you 👏🏼

princechijioke

Yeah, nice work!
I've been playing around with RAG as well, so I can relate to all the roadblocks and pain points.
I'm trying to squeeze out as much as possible so I can have a decent RAG setup without any fancy GPU, just consumer-grade hardware running everything locally. It's been fun/painful.

splitpierre

Nicholas, I love your videos and your way of making learning about ML/AI fun! In your next video can you please show us how to fine-tune an LLM? Thanks for all the hard work you put into making these videos!

malice

I think "Amazing" falls short, the amount of knowledge, the fact that your using cutting edge Open source model and all of that in a really funny and light tone. Keep up the good work! I have a question, do you think is much harder to deploy that app into google cloud run compared with runpod?

juanpablopereira

Incredible stuff... thank you Nick.

FunCodingwithRahul

Please make a video on OCR for past question papers: extract the questions, pull out keywords, analyse ten years of papers, and predict upcoming questions.

ShahJahan_NNN

Love this style of video. Fantastic content as always mate. You've given me some ideas to try out. Thanks :)

dacoda

Huge thanks for your videos. Nowadays I code, demonstrate, and perhaps lead AI, ML, DL, and RL development in a 1300+ employee engineering and consulting company.

I am combining technical analysis tools (FEM, CFD, MBS…) with AI to generate new digital business cases.

projecttitanium-slowishdriver

My computer is currently training a LoRA on Stable 7B for natural language to Python (30k examples) and SQL (30k). I also included 30k Orca questions so it doesn't lose its abilities as a language model, and 20k sentiment-analysis examples for news headlines. I would love to try this model with this as soon as it's done training.

Nick_With_A_Stick
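
For anyone wanting to try a similar setup, here's a hedged sketch of a LoRA configuration using the peft library. The base model id and target_modules are placeholders that depend on the architecture, and the dataset mixing described in the comment happens at the data level, not in this config:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder base model; swap in whichever 7B checkpoint you're adapting
base = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-base-alpha-7b")

config = LoraConfig(
    r=16,                 # adapter rank
    lora_alpha=32,        # scaling factor
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # architecture-dependent
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only a tiny fraction of 7B trains
```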

Great tutorial! Can you also do a tutorial on setting up RunPod to host the application? Found that part a bit confusing and would love a more thorough walkthrough. Thanks for all you do!

wayallen

2:40 😂
Thanks Nick, the video is awesome! 🤘🏽🤘🏽🤘🏽

kevynkrancenblum

'How to start a farm with no experience' - Hahaha, man, I just want to say that I love your sense of humour. Also, your videos are really useful for me, I'm an English teacher and I'm trying to build useful tools for my students. Thanks for your content.

Bliss_

It is possible that when you tried to load the PDF with SimpleDirectoryReader it looked like pages were being skipped because of the chunk size / embedding model combination you selected. The embedding model (all-MiniLM-L6-v2) can only encode a few hundred tokens per input, while the chunk size you specified was 1024, so maybe, just maybe, it was skipping pages because it couldn't fit the whole chunk into the embedding model.

hebjies
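
If that diagnosis is right, the fix is to keep chunks within the embedder's input window. A hedged sketch using llama_index's ServiceContext API (circa 0.8; newer versions use Settings instead), with a placeholder data folder:

```python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings import HuggingFaceEmbedding

# all-MiniLM-L6-v2 truncates long inputs, so 1024-token chunks
# would be mostly cut off before embedding
embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
service_context = ServiceContext.from_defaults(
    chunk_size=256,          # stay within the embedder's window
    embed_model=embed_model,
)

documents = SimpleDirectoryReader("annual_report").load_data()
index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)
```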

Taking the viewers along the development and debugging ride is a cool style

ShaneZarechian

Love your videos Nicholas. Watching this with my morning coffee, a few chuckles, and a bunch of "ooohhh" moments. Your vid bridged a bunch of gaps in my knowledge.

Gonna be implementing my own RAG now 😎👍

richardbeare

Really great content; you might have the most effective style I've ever seen. Well done. I can't remember which video it was where you spoke about your hardware setup. It's cloud-based, isn't it?

ba

Nick this is insanely good, thank you for the effort

shipo