How It Works¶

ProgramAsWeights (PAW) compiles natural language specifications into neural programs: small, local functions that combine a textual program with learned adapter weights.

Overview¶

The system turns a written spec into a runnable artifact. Compilation is a fixed pipeline; at runtime the SDK loads a shared base model, applies a program-specific adapter, and runs inference through llama.cpp.

The compilation pipeline¶

Compilation has three stages.

1. Pseudo-program generation¶

An untrained 4B-parameter instruction model (Qwen3-4B-Instruct) generates a pseudo-program: structured text that includes a task description and illustrative examples derived from your spec.

2. LoRA extraction¶

A trained 4B compiler model reads the original spec together with the pseudo-program and produces internal representations. A LoRA mapper maps those hidden states into LoRA adapter weights that encode the desired behavior.

3. Bundling¶

The LoRA weights are quantized to Q4_0 GGUF format (on the order of 23 MB) and packaged with the pseudo-program into a .paw file. That bundle is what the SDK downloads and caches.

Runtime behavior¶

When you run a program locally:

The SDK loads the standard interpreter (Qwen3 0.6B, about 594 MB, downloaded once).
It applies the Q4_0 LoRA adapter from the bundle.
The pseudo-program is prepended as a prompt prefix; user input follows.
Inference uses llama.cpp (CPU or GPU backends as configured).

Discrete plus continuous¶

Two mechanisms work together:

The pseudo-program supplies discrete instructions and structure (what the task is, how examples look).
The LoRA adapter supplies continuous behavioral tuning aligned to that task.

Either part alone is weaker than the combination; PAW is designed around this joint design.

Deterministic identity and caching¶

Content-addressable IDs: For a given specification and compiler version, the resulting program ID is deterministic. The same inputs yield the same identifier.

Caching: Repeated compiles of the same spec resolve quickly: the service can skip redundant pseudo-program generation and reuse cached program artifacts where applicable, so you do not pay full compilation cost on every identical request.