All publications

Pandas versus Polars: A Quick Comparison of Single Node DataFrame Alternatives

Physics of Language Models - Extracting Knowledge

Improving Data Quality in Multimodal Models (Molmo and PixMo)

Are you Smarter than an AI Language Model like GPT-4?

Why Logloss is a better loss function than Mean Squared Error

4 Techniques for Dimensionality Reduction: PCA, Autoencoder, t-SNE, and UMAP

Why you need an Evaluation Application for LLMs, such as Braintrust

Oasis: A New Generative AI Gaming Engine, watch it power Minecraft

Training Kolmogorov-Arnold Networks (KAN) using PyTorch and Nixtla on M3/M4 Time Series Datasets

Kolmogorov-Arnold Networks (KAN) for Time Series AND HiPPO-KAN

NotebookLlama - An open source version of NotebookLM

Speed up XGBoost using Hist split method (faster than Exact, and Approx)

MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical

The Benefits of Quantization: Research from Neural Magic

Fixing Imbalanced Data in Machine Learning

Selecting and Speeding up your Sentence Transformer Models

Feature Selection Methods for Machine Learning, plus Feature Selection Curves

Embeddings, Context, and the Static Embeddings in Sentence Transformers

ColPali: Bringing Vision Language Models to Document Retrieval

Start using Llama 3.2 Vision Models with Hugging Face Transformers (on Snowflake)

Text Similarity Techniques: Lexical, Semantic, and Hashing

Practical Lessons in Building Generative AI: RAG and Text to SQL

DSBench: How Far are Data Science Agents Becoming Data Science Experts

Feature Selection with Boruta, MRMR, and Recursive Feature Elimination