ThinkNimble Research

đŸŒ± Seed
Early thoughts and rough ideas
đŸ€ AI Supported Learn more about our AI attribution policy

Open & Offline Models

Overview

ThinkNimble has a strong interest in open and offline-capable models. As AI moves from toy to infrastructure, the distinction between “open weight” (just the model) and “fully open” (weights + training data + process) becomes security-critical. This note tracks the landscape.


OLMo 3: The Fully Open Benchmark

Ai2’s OLMo 3 ships what most “open” models withhold: the full training data (Dolma 3, ~9.3T tokens), documentation of the training process, and intermediate checkpoints - not just the final weights. That transparency matters for security: Anthropic showed that as few as 250 poisoned documents can backdoor an LLM regardless of its size, which makes auditable training data a security requirement rather than a nicety. OLMo 3 also reaches performance comparable to its competitors while training on roughly 6x fewer tokens. Other entrants in the “fully open” category - Stanford’s Marin, Swiss AI’s Apertus - keep it a small field for now.
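To put the poisoning finding in perspective, a quick back-of-the-envelope calculation shows how small a share of a Dolma 3-scale corpus 250 documents represent. The per-document token count below is an assumption chosen for illustration, not a figure from either paper:

```python
# Rough scale of the poisoning threat: 250 documents vs. a ~9.3T-token corpus.
# ASSUMPTION: ~1,000 tokens per poisoned document (illustrative only).
corpus_tokens = 9.3e12    # Dolma 3, ~9.3T tokens
poisoned_docs = 250       # Anthropic's reported backdoor threshold
tokens_per_doc = 1_000    # assumed average document length

poisoned_fraction = poisoned_docs * tokens_per_doc / corpus_tokens
print(f"Poisoned share of corpus: {poisoned_fraction:.2e}")
# A fraction this small is invisible to spot checks - which is exactly
# why releasing the full training data, so anyone can audit it, matters.
```

Even with generous assumptions about document length, the poisoned share is on the order of hundredths of a millionth of the corpus.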


Connections