Deploy FULLY PRIVATE & FAST LLM Chatbots! (Local + Production)

In this video, I'll show you how to deploy and run large language model (LLM) chatbots locally. The steps also apply to a production environment, so the tutorial is production-ready. By the end, you will be running an LLM like Falcon-7B (or 40B, or any other LLM) locally, and you will have deployed a chat interface so you can talk to the local LLM!
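
The tutorial appears to pair a model server with a separate chat front end (commenters below mention "text-gen" and "chat-UI", i.e. Hugging Face's text-generation-inference and chat-ui). A minimal sketch of serving Falcon-7B that way; the image tag, port mapping, and cache path are assumptions, so use the exact command from the video:

    # Serve Falcon-7B-Instruct with the text-generation-inference container.
    # --gpus all needs the NVIDIA Container Toolkit; the $PWD/data mount caches
    # the downloaded weights so later runs skip the download.
    docker run --gpus all --shm-size 1g -p 8080:80 \
      -v $PWD/data:/data \
      ghcr.io/huggingface/text-generation-inference:latest \
      --model-id tiiuae/falcon-7b-instruct

Once the server is up, a chat front end such as chat-ui can be pointed at http://localhost:8080.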

Please subscribe and like the video to help keep me motivated to make awesome videos like this one. :)

Follow me on:

Comments

"Easy" is highly subjective here; "easy if you are a Docker expert" makes more sense. On Windows you have to go through about 20 steps to get WSL 2 working, then Docker, then permissions, then set up the CUDA toolkit in the Docker Linux distro, then test that it is operating and make sure Docker can connect to your NVIDIA GPU if using CUDA. Only then can you start the download and set up text-gen, and after that the chat-UI. So plan to spend a good 5+ hours on this. It's exciting, but a lot of new people are learning this, so details are important.

prestonmccauley
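
As the comment above suggests, it saves hours to confirm the GPU plumbing before pulling the large images. A minimal check, assuming the NVIDIA Container Toolkit is installed (the CUDA image tag here is an assumption; any recent tag works):

    # If Docker can reach the GPU, this prints the same table as nvidia-smi
    # on the host; if it errors, fix the container toolkit setup first.
    docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi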

This is amazing! How do we build something like this for personal data, maybe using llama-index or something similar, for any kind of internal data but at the same speed as this? I am not able to get it to this speed.

vigneshpadmanabhan

Going through this now; I had to add --platform linux/amd64 to the docker command to get it to run on my Mac M1. This MacBook gives me loads of issues...

pancham_b
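
For anyone hitting the same issue: the flag mentioned above makes Docker run the x86_64 image under emulation on Apple Silicon. Illustrative only; emulated CPU inference is very slow, and --gpus all does not apply on a Mac:

    docker run --platform linux/amd64 -p 8080:80 \
      -v $PWD/data:/data \
      ghcr.io/huggingface/text-generation-inference:latest \
      --model-id tiiuae/falcon-7b-instruct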

Thanks for making this video. Can I ask what GPUs you are using?

MichaelTanOfficialChannel

Please do a tutorial on multiple instance learning.

nikhilthapa

Good video, but for a beginner it all went over my head. Could anyone post a link to a complete step-by-step guide? I have Visual Studio Code downloaded, but I cannot figure out how he got to the window where his terminal is. Sorry, I'm from the non-tech side.

qjmyxfn

Is there any way to train an LLM to get insights from a tabular dataset?

Knight-Walker

What are the minimum system specifications for running this?

encnxyg

Dear Abhishek, this is a really amazing video, but most of these things are done using GPUs. However, I don't have a machine with a GPU; my only options are Google Colab or Kaggle notebooks. Could you please make a video on building such a chatbot using Google Colab or Kaggle notebooks? Thanks.

drsohailahmed

I have attempted to download the docker image multiple times. Despite a fast internet connection, it failed to download the model files, so I am downloading the model files directly from huggingface. Where do I put the model files? I am downloading falcon-7b-instruct and ...02-of-00002.bin. Is it OK to put them in $PWD/data, or should I create a subfolder?

nikoG
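
On the model-files question above: if the container is started with -v $PWD/data:/data, it looks for weights in the standard Hugging Face hub cache layout under that folder (models--tiiuae--falcon-7b-instruct/...), not for bare .bin files. One way to pre-download into that layout; the env var and CLI call assume a recent huggingface_hub, so treat this as a sketch:

    pip install -U "huggingface_hub[cli]"
    # Point the hub cache at the mounted folder and fetch the whole repo; this
    # creates data/models--tiiuae--falcon-7b-instruct/... as the server expects.
    HUGGINGFACE_HUB_CACHE=$PWD/data huggingface-cli download tiiuae/falcon-7b-instruct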

Does this chat UI already have a 'continue message' feature like ChatGPT does when it passes the 2048-token limit? If not, is it possible to use LangChain to add it, or to improve the model with a vector DB or other options?

odev
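
On the token-limit question above: whatever the chat UI offers, the underlying server's REST API lets each request set its own generation length, so a 'continue' can be implemented by resending the conversation so far as the new prompt. A sketch against text-generation-inference's documented /generate endpoint (the port and prompt are placeholders):

    curl http://localhost:8080/generate \
      -X POST \
      -H 'Content-Type: application/json' \
      -d '{"inputs": "<conversation so far>", "parameters": {"max_new_tokens": 512}}'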

How can I train this with my own documents?

shizzm

Also, would that be usable on an Intel Mac?

RG-ikkw

Hey, what's the configuration of your local machine?!

sajeevyadav

What are the hardware requirements for this?

Vexxter

Have you tried connecting the self-hosted model to the internet through LangChain? I'm trying to build a private chatbot to help me plan my holiday.

sugiantolauw

Is there a doc specifying which models can be used? You've shown Llama 2, but the name used was that of some model on Hugging Face. How does one know which models are supported, and where can one find the list of all supported model names?

gunnvant
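
On the supported-models question above: text-generation-inference lists its supported architectures in its README, and a running server reports what it actually loaded via the /info endpoint:

    # Returns JSON that includes the model_id the server is serving.
    curl http://localhost:8080/info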

You should also create YouTube Shorts; it would help the growth of this channel.

dataflex

Can I do all of this in Colab? Please help me with it.

towfiq