First Language Model That Runs as Firmware.

Atome LM runs entirely on $2–$5 microcontroller chips. No cloud. No heap. No overhead. Fully air-gapped.


2.6 KB

Runtime Size

411 KB

Peak RAM

944K Params

Framework

M0+ → STM32F4+

Target Silicon

The industry builds bigger. We build smaller.

While traditional models require massive datacenters and gigabytes of memory, Atome LM is designed from the ground up to execute within the strict constraints of embedded hardware. We replaced the heap with static allocation and rewrote the inference engine in pure C.
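The swap from heap to static allocation can be sketched like this. It is a minimal illustration under stated assumptions, not Atome LM's actual source: the names `arena_words`, `arena_alloc`, `arena_reset` and the 4 KB `ARENA_BYTES` budget are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical budget; a real build would derive this from the model config. */
#define ARENA_BYTES 4096u

/* All working memory is one buffer whose size is fixed at link time.
   Declared as 32-bit words so the base address is 4-byte aligned. */
static uint32_t arena_words[ARENA_BYTES / 4u];
static size_t   arena_used = 0;

/* Bump allocator: return the next 4-byte-aligned slice, or NULL when full. */
void *arena_alloc(size_t n)
{
    size_t start = (arena_used + 3u) & ~(size_t)3u;  /* round up to 4 bytes */
    if (start > ARENA_BYTES || n > ARENA_BYTES - start) {
        return NULL;                                 /* request does not fit */
    }
    arena_used = start + n;
    return (uint8_t *)arena_words + start;
}

/* Release everything at once, for example between inference calls. */
void arena_reset(void)
{
    arena_used = 0;
}
```

Because the buffer is an ordinary static array, the linker map shows its full size before the firmware is flashed, and an oversized request fails with `NULL` rather than fragmenting a heap.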

Metric	Cloud LLM	Atome LM
Memory Footprint	24 GB+	< 512 KB
Delivery	REST API / Network	On-Chip Instruction
Dependencies	Complex OS Stack	Pure C99

Four pillars of firmware execution

01. Compiler Contract

PROVABLE STACK PARITY

02. Footprint

2.6 KB RUNTIME ENGINE

03. Memory

ZERO HEAP ALLOCATION

04. Evaluation

SCRIPT-DRIVEN BENCHMARKS

How it works.

Tokenizer

Custom BPE implementation operating directly on flash memory chunks. No intermediate string allocations.
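A rough sketch of BPE encoding with no intermediate string allocations is shown below. The merge table, the token IDs, and the names `bpe_merge_t`, `MERGES`, and `bpe_encode` are made up for the example and are not Atome LM's real vocabulary or API. The table is `static const`, which embedded toolchains typically place in flash, and merging happens in place in a caller-provided buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* One merge rule: adjacent tokens (left, right) become token `out`. */
typedef struct { uint16_t left, right, out; } bpe_merge_t;

/* Hypothetical rules in priority order; const data normally lives in flash. */
static const bpe_merge_t MERGES[] = {
    { 'a', 'b', 256 },   /* "ab"  -> 256 */
    { 256, 'c', 257 },   /* "abc" -> 257 */
    { 'c', 'a', 258 },   /* "ca"  -> 258 */
};
#define N_MERGES (sizeof MERGES / sizeof MERGES[0])

/* Encode `len` bytes of `text` into `tok` (capacity `cap`).
   Returns the token count, or -1 if the buffer is too small. */
int bpe_encode(const char *text, size_t len, uint16_t *tok, size_t cap)
{
    if (len > cap) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        tok[i] = (uint8_t)text[i];          /* start from raw bytes */
    }
    size_t n = len;
    for (size_t m = 0; m < N_MERGES; m++) {
        size_t w = 0;                       /* write index, never passes r */
        for (size_t r = 0; r < n; ) {
            if (r + 1 < n && tok[r] == MERGES[m].left
                          && tok[r + 1] == MERGES[m].right) {
                tok[w++] = MERGES[m].out;   /* merge the pair in place */
                r += 2;
            } else {
                tok[w++] = tok[r++];
            }
        }
        n = w;
    }
    return (int)n;
}
```

With these rules, the input `"abcab"` becomes the two tokens 257 and 256. The only working memory is the caller's `tok` buffer.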

Routing Block

Statically compiled execution graphs tailored to specific ARM Cortex instruction sets for maximum throughput.
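One way to picture a statically compiled execution graph is a `const` table of kernel pointers fixed at build time. This is a toy sketch only: the kernels `op_double` and `op_relu`, the table `GRAPH`, and `graph_run` are hypothetical stand-ins, and the per-core tailoring described above (choosing kernel variants for a given Cortex instruction set) is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Every kernel shares one signature so the graph can be a flat table. */
typedef void (*op_fn)(int32_t *x, size_t n);

/* Toy kernel: multiply each element by two. */
static void op_double(int32_t *x, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        x[i] *= 2;
    }
}

/* Toy kernel: clamp negative elements to zero. */
static void op_relu(int32_t *x, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (x[i] < 0) {
            x[i] = 0;
        }
    }
}

/* The graph is fixed at compile time; nothing is built or parsed at runtime. */
static const op_fn GRAPH[] = { op_double, op_relu, op_double };

/* Run every kernel in order over the same buffer. */
void graph_run(int32_t *x, size_t n)
{
    for (size_t k = 0; k < sizeof GRAPH / sizeof GRAPH[0]; k++) {
        GRAPH[k](x, n);
    }
}
```

Since the table is constant, it can sit in flash alongside the weights and costs no RAM beyond the activation buffer it operates on.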

Quantization

Sub-byte quantization, meaning fewer than 8 bits per weight, allows all 944K parameters to reside entirely within embedded flash storage.
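The page does not state the bit width, so the sketch below assumes 4-bit weights purely for illustration. At 4 bits per weight, 944,000 parameters pack into 472,000 bytes; at 2 bits, 236,000 bytes. The names `packed_bytes`, `nibble_to_int`, and `dot_q4`, and the low-nibble-first packing order, are hypothetical choices.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to store n_params weights at `bits` bits each, rounded up. */
uint32_t packed_bytes(uint32_t n_params, uint32_t bits)
{
    return (n_params * bits + 7u) / 8u;
}

/* Sign-extend a 4-bit two's-complement nibble to the range -8..7. */
static int8_t nibble_to_int(uint8_t nib)
{
    return (int8_t)((nib & 0x8u) ? (int)nib - 16 : (int)nib);
}

/* Dot product of n packed 4-bit weights (two per byte, low nibble first)
   with n int8 activations. Weights are read straight from the packed array,
   so they never need to be expanded into RAM. */
int32_t dot_q4(const uint8_t *packed, const int8_t *x, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t byte = packed[i / 2];
        uint8_t nib  = (i % 2 == 0) ? (uint8_t)(byte & 0x0Fu)
                                    : (uint8_t)(byte >> 4);
        acc += (int32_t)nibble_to_int(nib) * x[i];
    }
    return acc;
}
```

A production kernel would also apply a per-tensor or per-row scale factor after accumulation; that step is omitted here to keep the unpacking logic visible.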

Real numbers, real chips.

Each row below represents a real model size, compiled and measured on specific microcontroller classes. We measure the peak RAM footprint during the heaviest inference step.

Config	Used For	Peak RAM	STM32F103 $2.50 · 20 KB SRAM	RP2040 PICO $1.00 · 264 KB SRAM	STM32F411 $4.00 · 128 KB SRAM	STM32F7 $8.00 · 320 KB SRAM	ESP32-S3 $3.00 · 512 KB SRAM
nano · 1.7 K params	Footprint demo	4.2 KB	✓	✓	✓	✓	✓
byte_small · 7 K params	Tiny keyword router	14.5 KB	RAM	✓	✓	✓	✓
classifier · 60 K params	The three working prototypes	42 KB	no	✓	✓	✓	✓
tinystories · 60 K params	Story-shaped text generation	22 KB	RAM	✓	✓	✓	✓
mid · 477 K params	Mid-range domain LM	105 KB	no	✓	RAM	✓	✓
prod_1m · 944 K params	Production-class coherent prose	215 KB	no	RAM	no	✓	✓
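A common way to obtain a peak-RAM figure on a microcontroller is a high-water-mark check: paint the working region with a sentinel byte, run the heaviest inference step, then scan for the highest byte that changed. The sketch below illustrates that general technique and is not necessarily how the numbers in the table were measured. The names `ram_paint` and `ram_peak` and the `0xA5` sentinel are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAINT 0xA5u  /* sentinel byte; the value is an arbitrary choice */

/* Fill the whole region with the sentinel before running inference. */
void ram_paint(uint8_t *region, size_t size)
{
    memset(region, PAINT, size);
}

/* Return the number of bytes from region[0] up to and including the highest
   byte that no longer holds the sentinel. Assumes usage grows upward from
   region[0]. If the topmost bytes written happen to equal PAINT, the result
   undercounts by those bytes. */
size_t ram_peak(const uint8_t *region, size_t size)
{
    size_t i = size;
    while (i > 0 && region[i - 1] == PAINT) {
        i--;
    }
    return i;
}
```

On a real target the same idea is usually applied to the stack as well, which grows downward on Cortex-M, so the scan direction is reversed there.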

Read this honestly.

The classifier row is what most embedded AI currently targets. It fits every chip in the table except the 20 KB STM32F103. The 944K-parameter model needs far more SRAM (215 KB peak), which pushes it into the higher end of Cortex-M and onto the ESP32-S3. Future milestones like our Q15 fixed-point path will continue to drive these numbers down, but these are the hard limits today.
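For readers unfamiliar with Q15: it stores a fractional value as a signed 16-bit integer q that represents q / 32768, so every activation costs 2 bytes instead of the 4 bytes of a 32-bit float. The sketch below shows the standard Q15 multiply with rounding and saturation. It illustrates the number format only and makes no claim about how the planned Atome LM path is implemented; `sat16` and `q15_mul` are hypothetical names.

```c
#include <stdint.h>

/* Clamp a 32-bit value to the int16 range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Multiply two Q15 values. The 32-bit product is in Q30, so it is rounded
   and shifted right by 15 to return to Q15. Right-shifting a negative value
   is implementation-defined in C; this assumes an arithmetic shift, which is
   what GCC and Clang provide. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product, fits in 32 bits */
    p += 1 << 14;                         /* add half an LSB to round */
    return sat16(p >> 15);                /* back to Q15, clamped */
}
```

For example, `q15_mul(16384, 16384)` multiplies 0.5 by 0.5 and returns 8192, which is 0.25 in Q15. The one overflow case, -1.0 times -1.0, saturates to 32767.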

APPLICATIONS

What it's built for.

Atome LM is not a general-purpose chatbot. It's a narrow specialist that you fine-tune on the data your product cares about. Below are the categories where "tiny model, fully offline, runs anywhere" is the right deal.

Smart Lightbulbs

Voice commands with no network round trip and no audio ever sent to the cloud.

Kids' Toys

Interactive, conversational play that remains completely private and offline.

Bedtime Story Devices

Safe, local generation of stories without requiring internet connectivity.

Automobiles

Reliable voice controls that keep working even in dead zones.

Watches & Wearables

Natural-language interfaces that run directly on minimal hardware.

Medical Wearables

Continuous health monitoring that keeps patient data on the device.

Industrial Sensors

Real-time anomaly detection and reporting at the absolute edge.

Hearing Aids

Intelligent, local audio processing tailored to specific acoustic environments.

What Atome LM is NOT for.

A general chatbot. Open-ended question answering. Knowledge retrieval. Code generation. Free translation. Each of those needs 100M+ parameters and lives in a datacenter. Atome LM is the opposite bet: deliberately narrow, deliberately fine-tuned, deliberately on the device.