Deploy FULLY PRIVATE & FAST LLM Chatbots! (Local + Production)

In this video, I'll show you how to deploy and run large language model (LLM) chatbots locally. The steps also apply to a production environment, so the tutorial is production-ready. By the end, you will be running an LLM like Falcon-7B (or 40B, or any other LLM) locally, and you will have deployed a chat interface so you can talk to the local LLM!
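
The tutorial appears to pair a model server with a separate chat front end (commenters below mention "text-gen" and "chat-UI", i.e. Hugging Face's text-generation-inference and chat-ui). A minimal sketch of serving Falcon-7B that way; the image tag, port mapping, and cache path are assumptions, so use the exact command from the video:

    # Serve Falcon-7B-Instruct with the text-generation-inference container.
    # --gpus all needs the NVIDIA Container Toolkit; the $PWD/data mount caches
    # the downloaded weights so later runs skip the download.
    docker run --gpus all --shm-size 1g -p 8080:80 \
      -v $PWD/data:/data \
      ghcr.io/huggingface/text-generation-inference:latest \
      --model-id tiiuae/falcon-7b-instruct

Once the server is up, a chat front end such as chat-ui can be pointed at http://localhost:8080.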

Please subscribe and like the video to help keep me motivated to make awesome videos like this one. :)

Follow me on:

Comments

"Easy" is highly subjective here; "easy if you are a Docker expert" makes more sense. On Windows you have to go through about 20 steps to get WSL 2 working, then Docker, then permissions, then set up the CUDA toolkit in the Docker Linux distro, then test that it is operating and make sure Docker can connect to your NVIDIA GPU if using CUDA. Only then can you start the download and set up text-gen, and after that the chat-UI. So plan to spend a good 5+ hours on this. It's exciting, but a lot of new people are learning this, so details are important.

prestonmccauley
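
As the comment above suggests, it saves hours to confirm the GPU plumbing before pulling the large images. A minimal check, assuming the NVIDIA Container Toolkit is installed (the CUDA image tag here is an assumption; any recent tag works):

    # If Docker can reach the GPU, this prints the same table as nvidia-smi
    # on the host; if it errors, fix the container toolkit setup first.
    docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi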

This is amazing! How do we build something like this for personal data, maybe using llama-index or something similar, for any kind of internal data but at the same speed as this? I am not able to get it to this speed.

vigneshpadmanabhan

Going through this now; I had to add --platform linux/amd64 to the docker command to get it to run on my Mac M1. This MacBook gives me loads of issues...

pancham_b
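
For anyone hitting the same issue: the flag mentioned above makes Docker run the x86_64 image under emulation on Apple Silicon. Illustrative only; emulated CPU inference is very slow, and --gpus all does not apply on a Mac:

    docker run --platform linux/amd64 -p 8080:80 \
      -v $PWD/data:/data \
      ghcr.io/huggingface/text-generation-inference:latest \
      --model-id tiiuae/falcon-7b-instruct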

Thanks for making this video. Can I ask what GPUs you are using?

MichaelTanOfficialChannel

Please do a tutorial on multiple instance learning.

nikhilthapa

Good video, but for a beginner it all went over my head. Could anyone post a link to a complete step-by-step guide? I have Visual Studio Code downloaded, but I cannot figure out how he got to the window where his terminal is. Sorry, I'm from the non-tech side.

qjmyxfn

Is there any way to train an LLM to get insights from a tabular dataset?

Knight-Walker

What are the minimum system specifications for running this?

encnxyg

Dear Abhishek, this is a really amazing video, but most of these things are done using GPUs. However, I don't have a machine with a GPU; my only options are Google Colab or Kaggle notebooks. Could you please make a video on building such a chatbot using Google Colab or Kaggle notebooks? Thanks.

drsohailahmed

I have attempted to download the docker image multiple times. Despite a fast internet connection, it failed to download the model files, so I am downloading the model files directly from huggingface. Where do I put the model files? I am downloading falcon-7b-instruct and ...02-of-00002.bin. Is it OK to put them in $PWD/data, or should I create a subfolder?

nikoG
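
On the model-files question above: if the container is started with -v $PWD/data:/data, it looks for weights in the standard Hugging Face hub cache layout under that folder (models--tiiuae--falcon-7b-instruct/...), not for bare .bin files. One way to pre-download into that layout; the env var and CLI call assume a recent huggingface_hub, so treat this as a sketch:

    pip install -U "huggingface_hub[cli]"
    # Point the hub cache at the mounted folder and fetch the whole repo; this
    # creates data/models--tiiuae--falcon-7b-instruct/... as the server expects.
    HUGGINGFACE_HUB_CACHE=$PWD/data huggingface-cli download tiiuae/falcon-7b-instruct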

Does this chat UI already have a 'continue message' feature like ChatGPT does when it passes the 2048-token limit? If not, is it possible to use LangChain to add it, or to improve the model with a vector DB or other options?

odev
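
On the token-limit question above: whatever the chat UI offers, the underlying server's REST API lets each request set its own generation length, so a 'continue' can be implemented by resending the conversation so far as the new prompt. A sketch against text-generation-inference's documented /generate endpoint (the port and prompt are placeholders):

    curl http://localhost:8080/generate \
      -X POST \
      -H 'Content-Type: application/json' \
      -d '{"inputs": "<conversation so far>", "parameters": {"max_new_tokens": 512}}'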

How can I train this with my own documents?

shizzm

Also, would that be usable on an Intel Mac?

RG-ikkw

Hey, what's the configuration of your local machine?!

sajeevyadav

What are the hardware requirements for this?

Vexxter

Have you tried connecting the self-hosted model to the internet through LangChain? I'm trying to build a private chatbot to help me plan my holiday.

sugiantolauw

Is there a doc specifying which models can be used? You've shown Llama 2, but the name used was that of some model on Hugging Face. How does one know which models are supported, and where can one find the list of all supported model names?

gunnvant
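
On the supported-models question above: text-generation-inference lists its supported architectures in its README, and a running server reports what it actually loaded via the /info endpoint:

    # Returns JSON that includes the model_id the server is serving.
    curl http://localhost:8080/info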

You should also create YouTube Shorts; it would help the growth of this channel.

dataflex

Can I do all of this in Colab? Please help me with it.

towfiq