All publications

Intro to burr: A State Machine for LLM apps

Llama 3.2-vision: The best open vision model?

Moonshine: Real-Time Speech-To-Text on your laptop

NuExtract: An LLM that extracts information

Using LLMs on the command line

Ollama: Running Hugging Face GGUF models just got easier!

The fastest way to run OpenAI Whisper Turbo on a Mac

Ollama: How to send multiple prompts to vision models

Running OpenAI Whisper Turbo on a Mac

An intro to rerankers: A uniform API for reranking models

DuckDB dynamic column selection gets even better

Ollama and LanceDB: The best combination for Local RAG?

Searching images on my laptop with LanceDB

Rewriting RAG Queries with OpenAI Structured Outputs

DuckDB function chaining: The simpler SQL you didn't know you needed

Why OpenAI's new Structured Outputs feature is awesome!

What Are Matryoshka Embeddings?

How to evaluate retrieval in RAG pipelines

Hybrid Search for RAG in DuckDB (Reciprocal Rank Fusion)

Full-Text Search vs Vector Search (RAG with DuckDB)

Search-Based RAG with DuckDB and GLiNER

Local RAG with llama.cpp

A UI to quantize Hugging Face LLMs

Mistral 7B Function Calling with llama.cpp