All posts

Intro to Burr: A State Machine for LLM apps

Llama 3.2-vision: The best open vision model?

Moonshine: Real-Time Speech-To-Text on your laptop

NuExtract: An LLM that extracts information

Using LLMs on the command line

Ollama: Running Hugging Face GGUF models just got easier!

The fastest way to run OpenAI Whisper Turbo on a Mac

Ollama: How to send multiple prompts to vision models

Running OpenAI Whisper Turbo on a Mac

An intro to rerankers: A uniform API for reranking models

DuckDB dynamic column selection gets even better

Ollama and LanceDB: The best combination for Local RAG?

Searching images on my laptop with LanceDB

Rewriting RAG Queries with OpenAI Structured Outputs

DuckDB function chaining: The simpler SQL you didn't know you needed

Why OpenAI's new Structured Outputs feature is awesome!

What Are Matryoshka Embeddings?

How to evaluate retrieval in RAG pipelines

Hybrid Search for RAG in DuckDB (Reciprocal Rank Fusion)

Full-Text Search vs Vector Search (RAG with DuckDB)

Search-Based RAG with DuckDB and GLiNER

Local RAG with llama.cpp

A UI to quantize Hugging Face LLMs

Mistral 7B Function Calling with llama.cpp