Cache
A shared cache server for ESP32, STM32, and RP2040 projects. Cuts generation time with a four-level cascade, from a byte-exact cache match (L0) to the Claude API (L3).
One server handles caching, discovery, and generation — no cloud required.
Servers announce themselves on the LAN. No config files, no hostnames — they just appear.
Each node generates a keypair and CSR on first boot and requests a cert from the MakeGPT CA.
L0 cache backed by local S3-compatible storage. Byte-exact hits return in milliseconds.
L3 calls the Claude API when no local cache or model can answer — fully logged and billed to your key.
Each request waterfalls through levels, stopping at the first hit.
L0: Exact-match lookup in MinIO. Fastest possible hit.
L1: Semantic similarity via Qdrant. Handles near-duplicate prompts.
L2: Fine-tuned model via MLX on Apple Silicon. No internet needed.
L3: Full generation via Anthropic. Result is written back to L0–L2.
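The cascade can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the server's actual code: the in-memory dict stands in for MinIO, the semantic and local-model levels are stubs that always miss, and the function names (`prompt_key`, `lookup`, `l0_exact`, and so on) and the SHA-256 key scheme are hypothetical. Only the write-back to the exact-match store is shown.

```python
import hashlib
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for the exact-match object store (MinIO in the text).
exact_store: Dict[str, str] = {}


def prompt_key(prompt: str) -> str:
    """Derive a key from the exact prompt bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def l0_exact(prompt: str) -> Optional[str]:
    """Byte-exact lookup: hit only if the identical prompt was stored before."""
    return exact_store.get(prompt_key(prompt))


def l1_semantic(prompt: str) -> Optional[str]:
    """Stub for the semantic-similarity level; always misses in this sketch."""
    return None


def l2_local_model(prompt: str) -> Optional[str]:
    """Stub for the local fine-tuned model; always misses in this sketch."""
    return None


def l3_remote(prompt: str) -> Optional[str]:
    """Stub for full remote generation; returns a placeholder string."""
    return "generated: " + prompt


# Levels in cascade order; each returns a result or None on a miss.
LEVELS: List[Tuple[str, Callable[[str], Optional[str]]]] = [
    ("L0", l0_exact),
    ("L1", l1_semantic),
    ("L2", l2_local_model),
    ("L3", l3_remote),
]


def lookup(prompt: str) -> Tuple[str, str]:
    """Walk the levels in order and stop at the first hit."""
    for name, level in LEVELS:
        result = level(prompt)
        if result is not None:
            if name == "L3":
                # Write the fresh result back so the next identical
                # request is served by the exact-match level.
                exact_store[prompt_key(prompt)] = result
            return name, result
    raise LookupError("no level produced a result")
```

Calling `lookup` twice with the same prompt shows the effect of the write-back: the first call falls through to L3, and the second is answered by L0.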