Blog

Insights on custom AI models, open-source ML, and the future of specialized intelligence.

Oumi OSS v0.8: Deploy, MCP, and Batch Inference Everywhere

Case Study: Kaizen Gaming builds a specialized text-to-SQL model for sports analytics that beats Kimi-K2

With Oumi, Kaizen fine-tuned a 3B model that matched or exceeded frontier-scale alternatives — at a fraction of the cost — by treating rule compliance as a fine-tuning problem, not a prompting problem

Case Study: Wired Informatics trains a specialised clinical-NLP model on open weights that keeps sensitive data safe

With Oumi, a healthcare AI team moved from 84.5% to 88.9% concept-validity precision on messy clinical text — without sending a single patient record to a third-party API

Case Study: Original Voices builds a voice-authentic AI on proprietary persona data with 31% higher authenticity

With Oumi’s technology, Original Voices’s personal-twin product moved from 52% to 83% authenticity pass rate — driven entirely by better evaluation, not more data.

Oumi can now host your Custom AI Models with a single click (and other news)

Updates for Week of May 11, 2026

Case Study: Aurasell builds an 8B model for extracting information from websites, outperforming Sonnet 4.5 by 8%

With Oumi’s technology, Aurasell's custom AI beat Anthropic's Sonnet 4.5 at extracting information from webpages by 8% in coverage and 12% in groundedness

Case Study: Ada builds a real-time guardrail for their agents with 50% fewer false positives

With Oumi's technology, Ada's custom guardrail model beat GPT-4.1 Mini by 4% in accuracy while slashing latency and the false-positive rate

Case Study: DMG achieves 6% higher quality and 100x lower costs for invoice validation

With Oumi’s technology, DMG’s custom AI beat GPT-5.2 by 6% in accuracy and 6% in validity at 100x lower cost

Building Customer Support with a sub-1B Small Language Model that Beats GPT-5.4 (Part 1)

Small Language Models occupy a sweet spot in speed and cost between frontier LLMs and BERT models for a wide range of NLP tasks

Oumi’s Study Finds 50% of AI Overviews Untrustworthy

A not-so-simple study of SimpleQA and AI search results

The Era of General Purpose AI Is Over

General-purpose models were built for the average.

The Case for Specialized Intelligence

Our vision for AI at Oumi.ai

February 5, 2026

Lambda and Oumi partner for end-to-end custom model development

Enterprises can now build and deploy custom models for their specific use cases 100x faster, with 10x better cost efficiency, and superior accuracy

January 5, 2026

Wrapping Up 2025, Looking Ahead to 2026

That’s a wrap on 2025. Hello 2026. 🎉

November 20, 2025

Oumi v0.5.0: Data Synthesis, OpenEnv, Hyper-param Tuning

Major new features!

October 31, 2025

DCVLR Competition Results: Data Curation for Vision-Language Reasoning

October 23, 2025

Less (Data) is More for Fine-tuning

1000 Samples or Less for Amazing Fine-tuning

October 21, 2025

Why Less is More for Fine-Tuning

What is the evidence for successful fine-tuning with small data?

Watch on Substack

October 16, 2025

Small Fine-tuned Models are All You Need

But the devil is in the details—how can you get them right?

October 9, 2025

Hours, Not Months – The Custom AI Era is Now

September 14, 2025

OpenAI Just Dropped Two Massive Open-weight Models — But How Do We Separate The Reality From The Hype?

Evaluating GPT-OSS-20B and GPT-OSS-120B with LLM-as-a-Judge — strong on truthfulness, but overly conservative refusals hold them back

September 14, 2025

Training Frontier Reasoning VLMs for the 2025 NeurIPS DCVLR Workshop with Oumi

Baseline data curation strategies for the NeurIPS DCVLR competition — synthesis, filtration, and a 37.6% improvement on reasoning benchmarks

September 14, 2025

Compete to Curate Smarter Vision-Language Data—And Win Big at NeurIPS 2025

The NeurIPS 2025 DCVLR competition challenges teams to curate compact, high-impact datasets that improve visual reasoning in small models
