Using the Chat Endpoint in the Ollama API

Comments

Thanks for this awesome tutorial. I took it as a reference and built a user_id-based map to keep history in an in-memory database.
This let me keep history for each user.
{
"user_1" : [{}],
"user_2": [{}]
}
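
For illustration, a minimal Python sketch of that per-user history map against Ollama's /api/chat endpoint; the plain dict standing in for the in-memory database and the llama3 model name are assumptions, not from the comment:

import requests

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama endpoint

# A plain dict stands in for the in-memory database: one message list per user_id.
histories: dict[str, list[dict]] = {}

def chat(user_id: str, text: str, model: str = "llama3") -> str:
    history = histories.setdefault(user_id, [])
    history.append({"role": "user", "content": text})

    # Send the full history so the model keeps per-user context.
    resp = requests.post(OLLAMA_URL, json={"model": model, "messages": history, "stream": False})
    resp.raise_for_status()
    reply = resp.json()["message"]

    history.append(reply)  # remember the assistant's turn as well
    return reply["content"]

# Each user gets an independent conversation.
print(chat("user_1", "Hi, my name is Ada."))
print(chat("user_2", "Hi, my name is Bob."))
print(chat("user_1", "What is my name?"))  # should recall "Ada" for user_1 only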

nofoobar

Thank you sir for adding this QoL. Super helpful indeed!

Psychopatz

Thanks for building this terrific server, the models, and the tools!

dr.mikeybee

That awkward silence at the end though :D

RanaMuhammadWaqas

For some reason, I'm unable to hit the endpoint from another computer on the same network.
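
A common cause is that the Ollama server only listens on 127.0.0.1 unless it is started with OLLAMA_HOST=0.0.0.0 (and port 11434 is open in the firewall). A hedged Python sketch of the client side, with 192.168.1.50 as a placeholder for the serving machine's LAN address:

import requests

# Assumption: the machine running Ollama was started with OLLAMA_HOST=0.0.0.0,
# since by default the server only binds to 127.0.0.1 and is unreachable from the LAN.
OLLAMA_URL = "http://192.168.1.50:11434/api/chat"  # placeholder LAN address

resp = requests.post(
    OLLAMA_URL,
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "Hello from another machine"}],
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])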

gears

I actually had no idea you were part of the Ollama team too. That's super cool!

HistoryIsAbsurd

Hello 👋 Can we connect our local Ollama with a self-hosted n8n server?

GrecoFPV

Very interesting, I had no idea. What possible roles are there besides "user"? Do we just make them up, or is there a predefined set?
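
For context, the roles are a predefined set rather than free-form: the chat endpoint understands "system", "user", and "assistant" (newer releases also accept "tool" for tool calls). A short Python sketch, with the llama3 model name assumed:

import requests

# A conversation mixing the predefined roles: a system prompt, prior turns
# supplied as history, and the new user question.
messages = [
    {"role": "system", "content": "You are a terse assistant that answers in one sentence."},
    {"role": "user", "content": "What is the capital of Chile?"},
    {"role": "assistant", "content": "Santiago."},  # earlier assistant turn, passed back as history
    {"role": "user", "content": "And roughly how many people live there?"},
]

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={"model": "llama3", "messages": messages, "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])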

chrisBruner

Is there some way that you can help me with my project? Thanks from Chile, Claudio

claudioguendelman

Hi Matt! Thanks for the awesome work! Is there a way to include vLLM in Ollama?

dextersantamaria

Is there an example of a chat UI application that uses the Ollama inference endpoint and is then deployed in the cloud (AWS, GCP, etc.)? I have managed to create the app and it runs locally, but I'm struggling to deploy it to the cloud. Specifically, I'm stuck on creating an appropriate Dockerfile, as it seems there need to be two deployments: one for the Ollama inference endpoint and one for the UI. An example showing how it's done would be awesome!
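
Not a full deployment recipe, but one common pattern is to run Ollama and the UI as two services and let the UI discover the inference endpoint through configuration rather than hard-coding localhost. A Python sketch in which OLLAMA_BASE_URL is a made-up environment variable name:

import os
import requests

# Assumption: the UI and the Ollama server are two separate deployments
# (e.g. two containers), and the UI reads the endpoint from the environment.
OLLAMA_BASE_URL = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")

def ask(prompt: str, model: str = "llama3") -> str:
    resp = requests.post(
        f"{OLLAMA_BASE_URL}/api/chat",
        json={"model": model, "messages": [{"role": "user", "content": prompt}], "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]

if __name__ == "__main__":
    # Locally this defaults to the machine's own Ollama; in the cloud, point
    # OLLAMA_BASE_URL at whatever service or container runs `ollama serve`.
    print(ask("Say hello"))

In the cloud, the second deployment can be whatever runs `ollama serve`; the official ollama/ollama Docker image is one way to provide it.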

PrashantSaikia

Separate question: what's that browser? It has a very nice interface. TIA

ilteris

When I use Ollama from the terminal with the llama3 model, it works very fast, almost instantly. However, when I try to make a request to localhost from the same machine using curl, it is incredibly slow. Why could this be?

Ramirola

If Ollama itself is not multithreaded and async, how much can a wrapper like this help?

I have 2 laptops, A and B. 'A' has a database with more than 10k records in a table and code written in Scala to fetch data from it. 'B' has Ollama running with the llama3:latest model loaded. I fetch data from the database on 'A' and send it to Ollama's chat and generate endpoints on 'B'. At first, Ollama responds to a chat or generate request within a few hundred milliseconds (100, 125, 200, 300 ms), but that's only in the beginning. Later the response time grows to around 10 minutes per request by the time 2k requests have been sent, with 8k still to go.

From this behaviour, it doesn't look like Ollama supports concurrency. I created a Python script with Flask to achieve async behaviour, but Ollama behaved the same. Do you know how to solve this problem? Or is there no problem and hence no solution? OLLAMA_NUM_PARALLEL was set to 4.
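
For what it's worth, a sketch of bounding client-side concurrency with asyncio and httpx (the httpx dependency, the placeholder address, and the batch of prompts are assumptions); server-side parallelism still depends on OLLAMA_NUM_PARALLEL and on the model fitting in memory, and requests beyond that limit simply queue up, which is one way the later requests end up waiting minutes:

import asyncio
import httpx

OLLAMA_URL = "http://192.168.1.60:11434/api/chat"  # placeholder for laptop B
CONCURRENCY = 4  # keep in line with OLLAMA_NUM_PARALLEL on the server

async def ask(client: httpx.AsyncClient, sem: asyncio.Semaphore, record: str) -> str:
    # The semaphore caps in-flight requests so the server-side queue
    # does not grow without bound.
    async with sem:
        resp = await client.post(
            OLLAMA_URL,
            json={"model": "llama3",
                  "messages": [{"role": "user", "content": record}],
                  "stream": False},
            timeout=600,
        )
        resp.raise_for_status()
        return resp.json()["message"]["content"]

async def main(records: list[str]) -> list[str]:
    sem = asyncio.Semaphore(CONCURRENCY)
    async with httpx.AsyncClient() as client:
        return await asyncio.gather(*(ask(client, sem, r) for r in records))

if __name__ == "__main__":
    sample = [f"Summarise record {i}" for i in range(10)]
    for answer in asyncio.run(main(sample)):
        print(answer)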

niteshapte

Why does the chat API respond all at once, even with streaming turned on?
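
One thing worth checking is how the response body is read: with "stream": true the chat endpoint sends newline-delimited JSON chunks, and a client that buffers the whole body only sees the text once generation has finished. A Python sketch using requests (model name assumed):

import json
import requests

with requests.post(
    "http://localhost:11434/api/chat",
    json={"model": "llama3",
          "messages": [{"role": "user", "content": "Tell me a short story."}],
          "stream": True},
    stream=True,  # tell requests not to buffer the whole body
    timeout=300,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)  # one JSON object per streamed line
        print(chunk.get("message", {}).get("content", ""), end="", flush=True)
        if chunk.get("done"):
            break
    print()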

MavVRX

Do you have a GitHub repo for this video?

briannezhad

Hey, can you post a GitHub link to the code?

sampriti