ATOME LM

First Language Model That Runs as Firmware.

Atome LM runs entirely on $2–$5 microcontroller chips. No cloud. No heap. No overhead. Fully air-gapped.

Runtime Size: 2.6 KB
Peak RAM: 411 KB
Framework: 944K Params
Target Silicon: M0+ → STM32F4+

The industry builds bigger. We build smaller.

While traditional models require massive datacenters and gigabytes of memory, Atome LM is designed from the ground up to execute within the strict constraints of embedded hardware. We replaced the heap with static allocation and rewrote the inference engine in pure C.
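
As a rough illustration of what replacing the heap with static allocation can look like, the sketch below keeps every working buffer in one statically sized struct, so the RAM cost is a compile-time constant. All names and dimensions here (`demo_ctx`, `DEMO_DIM`, `DEMO_MAX_SEQ`) are hypothetical and chosen only for the example; they are not taken from the Atome LM source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical dimensions, chosen only for illustration. */
#define DEMO_DIM      64   /* hidden width */
#define DEMO_MAX_SEQ  32   /* longest context kept in RAM */

/* All working memory lives in one statically sized struct: no malloc, no free. */
typedef struct {
    int8_t  activations[DEMO_DIM];
    int8_t  cache[DEMO_MAX_SEQ][DEMO_DIM];
    int32_t accum[DEMO_DIM];
    size_t  seq_len;
} demo_ctx;

static demo_ctx g_ctx;  /* zero-initialized static storage; size known at link time */

/* RAM used by the context is a compile-time constant. */
size_t demo_ctx_bytes(void) { return sizeof(demo_ctx); }

void demo_reset(void) { memset(&g_ctx, 0, sizeof g_ctx); }

/* Append one token's vector to the cache. Returns 0 on success, -1 when full. */
int demo_push(const int8_t *vec) {
    assert(vec != NULL);
    if (g_ctx.seq_len >= DEMO_MAX_SEQ) return -1;  /* bounded: never grows */
    memcpy(g_ctx.cache[g_ctx.seq_len], vec, DEMO_DIM);
    g_ctx.seq_len++;
    return 0;
}

size_t demo_seq_len(void) { return g_ctx.seq_len; }
```

Because nothing is allocated at run time, a full cache is reported as an error code instead of an out-of-memory condition, and the linker map shows the whole RAM budget before the firmware ever runs.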

Metric           | Cloud LLM          | ATOME LM
Memory Footprint | ~24 GB+            | < 512 KB
Delivery         | REST API / Network | On-Chip Instruction
Dependencies     | Complex OS Stack   | Pure C99

Four pillars of firmware execution

01. Compiler Contract

PROVABLE STACK PARITY

02. Footprint

2.6 KB RUNTIME ENGINE

03. Memory

ZERO HEAP ALLOCATION

04. Evaluation

SCRIPT-DRIVEN BENCHMARKS

How it works.

Tokenizer
Custom BPE implementation operating directly on flash memory chunks. No intermediate string allocations.
Routing Block
Statically compiled execution graphs tailored to specific ARM Cortex instruction sets for maximum throughput.
Quantization
Extreme sub-byte quantization allowing 944K parameters to reside completely within embedded flash storage.
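
To make the quantization step concrete, here is a minimal sketch of a dot product over sub-byte weights. It assumes signed 4-bit weights packed two per byte. The page does not state the actual bit width or packing, so that layout and the helper names (`pack_i4`, `dot_i4_i8`) are assumptions for illustration only. Under that assumption, 944K parameters would occupy about 472 KB of flash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 4-bit two's-complement nibble to int8_t (range -8..7). */
static int8_t nibble_to_i8(uint8_t n) {
    return (int8_t)((n & 0x8u) ? (int)n - 16 : (int)n);
}

/* Pack two signed 4-bit weights into one byte: lo in bits 0-3, hi in bits 4-7. */
uint8_t pack_i4(int lo, int hi) {
    assert(lo >= -8 && lo <= 7 && hi >= -8 && hi <= 7);
    return (uint8_t)((((unsigned)hi & 0xFu) << 4) | ((unsigned)lo & 0xFu));
}

/* Dot product of n packed int4 weights with int8 activations; n must be even.
   On a microcontroller, `packed` would typically be a const array that the
   linker places in flash, so the weights never need to be copied into RAM. */
int32_t dot_i4_i8(const uint8_t *packed, const int8_t *x, size_t n) {
    int32_t acc = 0;
    assert(packed != NULL && x != NULL && n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        uint8_t b = packed[i / 2];
        acc += (int32_t)nibble_to_i8((uint8_t)(b & 0xFu)) * x[i];
        acc += (int32_t)nibble_to_i8((uint8_t)(b >> 4)) * x[i + 1];
    }
    return acc;
}
```

A real engine would also apply a per-tensor or per-row scale to the accumulator; that step is omitted here to keep the unpacking logic visible.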

Real numbers, real chips.

Each row below represents a real model size, compiled and measured on specific microcontroller classes. We measure the peak RAM footprint during the heaviest inference step.

Chips (price · RAM)
STM32F103: $2.50 · 20 KB
RP2040 PICO: $1.00 · 264 KB
STM32F411: $4.00 · 128 KB
STM32F7: $8.00 · 320 KB
ESP32-S3: $3.00 · 512 KB

Config                     | Used For                         | Peak RAM
nano · 1.7 K params        | Footprint demo                   | 4.2 KB
byte_small · 7 K params    | Tiny keyword router              | 14.5 KB
classifier · 60 K params   | The three working prototypes     | 42 KB
tinystories · 60 K params  | Story-shaped text generation     | 22 KB
mid · 477 K params         | Mid-range domain LM              | 105 KB
prod_1m · 944 K params     | Production-class coherent prose  | 215 KB
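
One way to read the table is to compare each config's peak RAM against each chip's SRAM. The sketch below does that with the figures listed above. The `reserve_kb` parameter is a hypothetical allowance for the stack and the rest of the firmware, and the plain comparison is a simplification for illustration, not the measurement method used for the table.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *name; double ram_kb; }  chip_t;
typedef struct { const char *name; double peak_kb; } config_t;

/* SRAM sizes and peak-RAM figures copied from the table above. */
static const chip_t CHIPS[] = {
    {"STM32F103", 20.0}, {"RP2040 PICO", 264.0}, {"STM32F411", 128.0},
    {"STM32F7", 320.0},  {"ESP32-S3", 512.0},
};
static const config_t CONFIGS[] = {
    {"nano", 4.2},         {"byte_small", 14.5}, {"classifier", 42.0},
    {"tinystories", 22.0}, {"mid", 105.0},       {"prod_1m", 215.0},
};

#define N_CHIPS   (sizeof CHIPS / sizeof CHIPS[0])
#define N_CONFIGS (sizeof CONFIGS / sizeof CONFIGS[0])

/* Returns 1 when peak RAM plus the reserve fits in the chip's SRAM, else 0. */
int fits(const config_t *cfg, const chip_t *chip, double reserve_kb) {
    assert(cfg != NULL && chip != NULL && reserve_kb >= 0.0);
    return cfg->peak_kb + reserve_kb <= chip->ram_kb;
}

/* Counts how many of the listed chips can hold the given config. */
size_t count_fitting_chips(const config_t *cfg, double reserve_kb) {
    size_t n = 0;
    for (size_t i = 0; i < N_CHIPS; i++) {
        n += (size_t)fits(cfg, &CHIPS[i], reserve_kb);
    }
    return n;
}
```

With `reserve_kb` set to 0, `prod_1m` clears three of the five chips (RP2040 PICO, STM32F7, ESP32-S3). A realistic reserve shrinks that margin, which is why the mid config on the 128 KB STM32F411 deserves a closer look before committing to hardware.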

Read this honestly.

The classifier row (60 K params, 42 KB peak RAM) is what most embedded AI currently targets. Its peak is well under the SRAM of every chip listed above except the 20 KB STM32F103. The prod_1m model (944 K params) needs 215 KB of SRAM, which pushes it into the higher end of Cortex-M. Future milestones like our Q15 fixed-point path should drive these numbers down, but these are the hard limits today.
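
For readers unfamiliar with Q15, the sketch below shows the general format: a 16-bit signed integer representing a value in [-1, 1) as `q / 32768`, which takes half the storage of a 32-bit float. This illustrates the standard Q15 convention only; the helper names are hypothetical and nothing here is taken from the planned Atome LM implementation.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t q15_t;  /* value = q / 32768, range [-1, 1) */

/* Convert a float to Q15 with round-to-nearest, saturating at both ends.
   Intended as a host-side helper for preparing constants. */
q15_t q15_from_float(float x) {
    float scaled = x * 32768.0f;
    if (scaled >= 32767.0f)  return INT16_MAX;
    if (scaled <= -32768.0f) return INT16_MIN;
    return (q15_t)(scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f);
}

/* Saturating Q15 multiply with rounding.
   Assumes >> on a negative int32_t is an arithmetic shift, which C99 leaves
   implementation-defined but GCC and Clang provide. */
q15_t q15_mul(q15_t a, q15_t b) {
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product */
    p = (p + (1 << 14)) >> 15;            /* add half an LSB, return to Q15 */
    if (p > INT16_MAX) p = INT16_MAX;     /* only (-1) * (-1) can overflow */
    return (q15_t)p;
}
```

For example, `q15_mul(q15_from_float(0.5f), q15_from_float(0.5f))` yields 8192, the Q15 encoding of 0.25.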

APPLICATIONS

What it's built for.

Atome LM is not a general-purpose chatbot. It's a narrow specialist that you fine-tune on the data your product cares about. Below are the categories where "tiny model, fully offline, runs anywhere" is the right trade-off.

Smart Lightbulbs

Voice commands handled on the device, with no network round trip and no audio sent to the cloud.

Kids' Toys

Interactive, conversational play that remains completely private and offline.

Bedtime Story Devices

Safe, local generation of stories without requiring internet connectivity.

Automobiles

Voice controls that keep working in dead zones, because nothing depends on a network connection.

Watches & Wearables

Natural language interfaces processing directly on minimal hardware.

Medical Wearables

Continuous health monitoring ensuring patient data never leaves the device.

Industrial Sensors

Real-time anomaly detection and reporting at the absolute edge.

Hearing Aids

Intelligent, local audio processing tailored to specific acoustic environments.

What Atome LM is NOT for.

A general chatbot. Open-ended question answering. Knowledge retrieval. Code generation. Free-form translation. Each of those needs 100 M+ parameters and lives in a datacenter. Atome LM is the opposite bet: deliberately narrow, deliberately fine-tuned, deliberately on the device.