Intent Is All You Need

// The reframing

From representation to agency

In 2017, five words reorganized machine learning: attention is all you need [1]. The Transformer replaced recurrence and convolution with a single mechanism for modeling relationships in a sequence. It gave us models that predict: extraordinary engines of representation.

But a representation is not an agent. A model that maximizes the likelihood of the next token has no notion of a state it wants to reach, no criterion for success beyond plausibility, no reason to try again when it is wrong. As AI crosses from generation into autonomous action, with systems that perceive, plan, act, and adapt through iterative control loops [18], the bottleneck is no longer how to attend to context. It is how to organize computation around a goal that persists across time, tools, memory, and self-correction.

Attention is the inner loop — perception and representation. Intent is the outer loop — goal-conditioned control. The claim of this paper is not that attention was wrong. It is that attention was the organizing principle of the model era, and that intent — the explicit, first-class representation of a goal — is the organizing principle of the agent era.

The model era · Attention

Objective: maximize the likelihood of the next token given context.

# next-token objective
max_θ Σ_t log P(x_t | x_<t)
# the mechanism
Attn(Q,K,V)=softmax(QKᵀ/√d)·V

Optimizes fidelity to a distribution. No goal, no state, no consequence.
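
The mechanism above can be written out in a few lines. This is a minimal pure-Python sketch of scaled dot-product attention for small lists of row vectors; the function names are illustrative, and a real implementation would use batched tensor operations.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(K[0])
    output = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])
    return output
```

Nothing in this computation refers to a goal or to the consequence of an output, which is the point the comparison makes.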

The agent era · Intent

Objective: find a policy that reaches a specified goal, maximizing return.

# goal-conditioned objective
π* = argmax_π 𝔼_π[ Σ_t γᵗ r(s_t, g) ]
# value is conditioned on the goal g
V(s, g), Q(s, a, g)

Optimizes achievement of an intent. The goal g is an input, not an afterthought.
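
To see what it means for the goal to be an input, here is a small sketch of value iteration for V(s, g) on a one-dimensional chain of states. The environment, the reward convention (1 on arriving at the goal, 0 otherwise), and all names are assumptions made for illustration.

```python
def goal_conditioned_values(n_states, goal, gamma=0.9, sweeps=100):
    """Value iteration for V(s, g) on a chain with actions left/right."""
    V = [0.0] * n_states
    for _ in range(sweeps):
        new_V = []
        for s in range(n_states):
            if s == goal:
                new_V.append(0.0)  # the goal is absorbing: the episode ends here
                continue
            candidates = []
            for step in (-1, 1):
                nxt = min(max(s + step, 0), n_states - 1)
                reward = 1.0 if nxt == goal else 0.0  # reward depends on g
                candidates.append(reward + gamma * V[nxt])
            new_V.append(max(candidates))
        V = new_V
    return V

# Same dynamics, two different goals, two different value functions.
values_right = goal_conditioned_values(5, goal=4)
values_left = goal_conditioned_values(5, goal=0)
```

Changing only the argument `goal` reverses which states are valuable, without touching the dynamics.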

That single change, promoting the goal g to a first-class argument of every value and policy function, is what a decade of goal-conditioned reinforcement learning has been quietly building toward. The rest of this paper traces the four pillars that make "intent is all you need" a buildable architecture, and shows the three systems in which we have already built it.

// The four pillars

What it takes to make intent primary

I

The goal becomes a first-class object

Condition the value function on the goal itself and learning reorganizes. Universal Value Function Approximators generalize value across states and goals, V(s, g) [3]; Hindsight Experience Replay turns every failure into a lesson by relabeling the state actually reached as the goal that was intended [4]. Hierarchy makes it scale: FeUdal Networks split a Manager that sets abstract directional goals in latent space from a Worker that enacts primitive actions, decoupling goal-setting from execution and enabling credit assignment across very long horizons [2]. The options framework had earlier formalized this kind of temporal abstraction [5].

# HER: relabel the achieved state s′ as the goal, so even a failed transition carries a success signal
replay (s, a, s′, g) and (s, a, s′, g′=s′)

GCRL · UVFA · Hindsight Replay · FeUdal / options · hierarchical credit assignment
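
The hindsight relabeling rule can be sketched as follows. This is a minimal illustration that uses the g′ = s′ relabeling described above; the `Transition` layout and the 0 / -1 sparse reward are assumptions, and the HER paper [4] also describes other ways of choosing the substitute goal.

```python
from collections import namedtuple

Transition = namedtuple("Transition", ["state", "action", "next_state", "goal", "reward"])

def sparse_reward(achieved, goal):
    """0 on success, -1 otherwise (the sparse convention assumed here)."""
    return 0.0 if achieved == goal else -1.0

def her_relabel(episode):
    """Store each transition twice: once as experienced, once with g' = s'."""
    buffer = []
    for tr in episode:
        buffer.append(tr)  # original transition, original goal
        hindsight_goal = tr.next_state
        buffer.append(tr._replace(
            goal=hindsight_goal,
            reward=sparse_reward(tr.next_state, hindsight_goal),
        ))
    return buffer

# A short episode that never reaches the intended goal (state 5).
episode = [
    Transition(0, +1, 1, 5, sparse_reward(1, 5)),
    Transition(1, +1, 2, 5, sparse_reward(2, 5)),
]
replay = her_relabel(episode)
```

Every original transition here carries a failure reward, while every relabeled copy is a success for the goal it actually reached, so a goal-conditioned learner has something to learn from.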

II

Agents that design and build other agents

If the goal is primary, the harness that pursues it need not be hand-built; it can be searched. Automated Design of Agentic Systems defines agents as code and lets a meta-agent program ever-better ones from a growing archive; because programming languages are Turing-complete, the search space in principle covers every agent expressible in code, and the authors report discovered agents that outperform state-of-the-art hand-designed ones [6]. The Darwin Gödel Machine closes the loop on itself: a system that rewrites its own code, improving its ability to improve, lifting its SWE-bench score from 20% to 50% through Darwinian evolution over an archive [7]. Promptbreeder evolves not just prompts but the mutation-prompts that evolve them [8]; Voyager accumulates an ever-growing library of self-written, reusable skills [9]. The harness becomes a thing the agent grows, not a thing we ship.

ADAS / Meta-Agent Search · Darwin Gödel Machine · Promptbreeder · Voyager skill libraries · self-generated tools
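
The archive loop shared by these systems can be shown with a toy. In this sketch an "agent" is just a list of numbers, `mutate` stands in for the step where a meta-agent or the agent itself rewrites code, and `evaluate` is a hypothetical stand-in for a benchmark score; none of this reproduces the cited systems.

```python
import random

def evaluate(agent):
    """Toy fitness (higher is better); stands in for a benchmark run."""
    return -sum((x - 3.0) ** 2 for x in agent)

def mutate(agent, rng):
    """Toy self-modification: perturb one parameter of the parent."""
    child = list(agent)
    index = rng.randrange(len(child))
    child[index] += rng.uniform(-1.0, 1.0)
    return child

def archive_search(generations=200, seed=0):
    rng = random.Random(seed)
    seed_agent = [0.0, 0.0]
    archive = [(seed_agent, evaluate(seed_agent))]
    for _ in range(generations):
        # Sample any archived agent as parent, not only the current best,
        # so weaker branches can still serve as stepping stones.
        parent, _ = rng.choice(archive)
        child = mutate(parent, rng)
        archive.append((child, evaluate(child)))
    return archive

archive = archive_search()
best_agent, best_score = max(archive, key=lambda item: item[1])
```

The design choice worth noticing is that nothing is discarded: the archive only grows, and selection happens when a parent is sampled.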

III

Autonomy through constant self-correction

A goal supplies what a token never could: a criterion for being wrong, and therefore a signal to correct. Reflexion reinforces an agent through verbal feedback held in episodic memory, with no weight updates, reaching 91% pass@1 on HumanEval against a same-model 80% baseline [10]. RL makes the correction durable: SCoRe teaches self-correction with multi-turn online RL and a reward bonus [11]; Reflect-Retry-Reward uses GRPO to reward only the self-reflection tokens that turned a failure into a success [12]. Feedback scales past humans (RLAIF reports results on par with RLHF using an AI judge [13]), and the emerging discipline of verifier engineering organizes the whole outer loop into search → verify → feedback [14]. Constant self-correction, applied to a goal, is the mechanism of autonomy.

Reflexion · SCoRe · Reflect-Retry-Reward (GRPO) · RLAIF · process reward models · verifier engineering
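
The search → verify → feedback loop, with reflections kept in episodic memory instead of weights, can be sketched on a toy task. Here the "model" is a stand-in function that guesses a hidden number and the verifier returns verbal feedback; the task and all names are assumptions chosen to keep the example runnable.

```python
def verify(candidate, target):
    """Verifier: returns (passed, verbal feedback)."""
    if candidate == target:
        return True, "correct"
    if candidate < target:
        return False, "too low"
    return False, "too high"

def propose(memory, low=0, high=100):
    """Stand-in for the model: narrows its guess using stored reflections."""
    for guess, feedback in memory:
        if feedback == "too low":
            low = max(low, guess + 1)
        elif feedback == "too high":
            high = min(high, guess - 1)
    return (low + high) // 2

def solve(target, max_trials=10):
    memory = []  # episodic memory of (attempt, feedback); no weights change
    for trial in range(1, max_trials + 1):
        candidate = propose(memory)                    # search
        passed, feedback = verify(candidate, target)   # verify
        if passed:
            return candidate, trial
        memory.append((candidate, feedback))           # feedback
    return None, max_trials

answer, trials = solve(37)
```

The goal is what makes `verify` possible at all: without a target there is nothing for an attempt to be wrong about.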

IV

Intent as the primary object: the theory

The deepest theories of agency already put the goal first. In active inference, an agent acts to minimize expected free energy, and a goal is simply a prior preference over the observations it expects to make; behavior is what closes the gap between the world and the intended world [16]. Empowerment and intrinsic-motivation accounts unify as constrained entropy maximization: goal-seeking without an external reward at all. World models such as Dreamer show that once an agent can imagine futures, planning becomes goal-conditioned inference over that model [15]. Attention answers "what is related to what?" Intent answers "what should be true, and how do I make it so?"

# active inference: act to minimize expected free energy toward preferred outcomes
a* = argmin_a 𝔼[ F(future | a) ] — the goal is the prior p(o)

active inference / free energy · empowerment · world models (Dreamer) · planning-as-inference
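
A worked example may help make "the goal is the prior p(o)" concrete. This sketch scores each action only by the risk term of expected free energy, the KL divergence between the outcomes an action is predicted to produce and the preferred outcomes; the ambiguity term is deliberately left out, and the distributions and action names are made up for illustration.

```python
import math

def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions over the same outcomes."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Prior preference p(o) over three outcomes: the agent "wants" outcome 2.
preferred = [0.05, 0.05, 0.90]

# Predicted outcome distribution q(o | a) under each action (assumed values).
predicted = {
    "stay": [0.80, 0.15, 0.05],
    "move": [0.10, 0.20, 0.70],
}

# Risk: how far each action's predicted outcomes sit from the preferences.
risk = {action: kl_divergence(q, preferred) for action, q in predicted.items()}
best_action = min(risk, key=risk.get)
```

No reward function appears anywhere: the preference distribution does the work a reward would do in the reinforcement learning framing.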

// The proof — built, not theorized

We didn't write a paper about it.
We shipped it.

A thesis is only as strong as what it builds. Three AAIRC systems instantiate "intent is all you need" across the full stack: CORTEX orchestrates toward goals, COREAI governs the autonomy, and MANTIS gives it memory.

CORTEX

The Nerve Center · goal-driven orchestration

Coordinated Orchestration of Runtime Transient Executions — Go · NATS · LMDB

CORTEX is Pillars I and III, in production. It treats agents as definitions, not implementations: you give it a goal, and an LLM-powered Nerve Center decomposes it, manufactures the agents needed to achieve it at runtime, binds them to sub-goals, and adapts as results arrive. Its own whitepaper states the thesis exactly: "the shift from task-driven to goal-driven operation is what enables true autonomy." [19]

goal → decompose → dependency graph → Agent Factory → execute → aggregate → feedback loop ↺ → stockpile
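
The decompose → dependency graph → execute → aggregate part of this pipeline can be sketched with Python's standard `graphlib`. CORTEX itself is written in Go, so this is not its code or API; the sub-goal names and the `execute` callback are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical decomposition of one goal: each sub-goal maps to the
# sub-goals whose results it depends on.
plan = {
    "gather_sources": set(),
    "extract_facts": {"gather_sources"},
    "draft_report": {"extract_facts"},
    "review_report": {"draft_report", "extract_facts"},
}

def run_plan(plan, execute):
    """Run sub-goals in dependency order, passing each its inputs."""
    results = {}
    for sub_goal in TopologicalSorter(plan).static_order():
        inputs = {dep: results[dep] for dep in plan[sub_goal]}
        results[sub_goal] = execute(sub_goal, inputs)
    return results  # the aggregate handed to the feedback step

def execute(sub_goal, inputs):
    """Stand-in for a manufactured agent: just records what it was given."""
    return f"{sub_goal}({', '.join(sorted(inputs))})"

results = run_plan(plan, execute)
```

A real orchestrator would run independent sub-goals concurrently and re-plan when a result fails its success criteria; the sketch shows only the ordering constraint.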

Agent Factory + Capability Resolver — resolves abstract capabilities to tools, and calls create_custom_tool to synthesize new ones (Pillar II, in miniature).
The feedback loop — the Aggregator returns results to the Orchestration Engine, which validates against success criteria, adjusts strategy, and spawns follow-up agents. This is the self-correction outer loop.
The Agent Pool — proven definitions are stockpiled, scored by success-rate and iteration count, matched by goal-similarity (Jaccard), and pruned when they underperform. The system improves through use — evolutionary credit assignment over a live archive, exactly as ADAS and the Darwin Gödel Machine describe.
Realizes THREAD — Thinking Recursively, Executing Autonomously, Adapting Dynamically.
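
The Agent Pool behavior described above (stockpile, score, match by Jaccard similarity, prune) can be illustrated in a few lines. This is a sketch under stated assumptions, not CORTEX's implementation: the class and field names, the whitespace tokenization, and the thresholds are all invented here, and scoring by iteration count is omitted.

```python
from dataclasses import dataclass

def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two term sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

@dataclass
class AgentDefinition:
    name: str
    goal_terms: frozenset
    successes: int = 0
    runs: int = 0

    @property
    def success_rate(self):
        return self.successes / self.runs if self.runs else 0.0

class AgentPool:
    def __init__(self, min_success_rate=0.5, min_runs=3):
        self.min_success_rate = min_success_rate
        self.min_runs = min_runs
        self.definitions = []

    def stockpile(self, definition):
        self.definitions.append(definition)

    def match(self, goal_text, threshold=0.3):
        """Return the most similar stockpiled definition, or None."""
        terms = set(goal_text.lower().split())
        scored = [(jaccard(terms, d.goal_terms), d.success_rate, d)
                  for d in self.definitions]
        scored = [item for item in scored if item[0] >= threshold]
        if not scored:
            return None
        # Prefer higher similarity, then higher success rate.
        return max(scored, key=lambda item: (item[0], item[1]))[2]

    def prune(self):
        """Drop definitions with enough runs to judge and a poor record."""
        self.definitions = [
            d for d in self.definitions
            if d.runs < self.min_runs or d.success_rate >= self.min_success_rate
        ]

pool = AgentPool()
pool.stockpile(AgentDefinition(
    "report_summarizer", frozenset({"summarize", "quarterly", "sales", "report"}), 4, 5))
pool.stockpile(AgentDefinition(
    "contract_translator", frozenset({"translate", "legal", "contract"}), 1, 4))
match = pool.match("summarize the sales report")
```

Word-set Jaccard is a coarse notion of goal similarity; it is used here only because the text names it.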

COREAI

The governed harness · build · deploy · govern

Production agentic-harness framework — master-planner orchestration with a hard security boundary

Autonomy without governance is a liability. COREAI is the foundational platform that makes goal-driven agents safe to deploy: a master planner spawns and supervises agents, but every action passes through a Tool Runner they cannot bypass.

agent intent → Policy Engine → Execution Engine → Audit Logger // agents CANNOT execute tools directly

Policy-gated execution — every tool call is evaluated, argument-validated, timeout-bounded, and environment-scoped before it runs.
Total auditability — output captured and redacted, a full audit trail maintained. Self-improvement you can inspect, constrain, and trust.
The security boundary that turns "agents that build agents" (Pillar II) from a research curiosity into an enterprise-deployable capability.
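
The boundary described above (policy check, then execution, then audit) might look like the following in miniature. This is a hypothetical sketch and not COREAI's API: `ToolRunner`, `PolicyViolation`, the `sk-` secret pattern, and the choice to redact only the audit copy are assumptions, and timeouts and environment scoping are omitted.

```python
import re
import time

class PolicyViolation(Exception):
    """Raised when a tool call is rejected by the policy check."""

def redact(text):
    # Hypothetical secret pattern: anything that looks like "sk-<token>".
    return re.sub(r"sk-\w+", "[REDACTED]", text)

class ToolRunner:
    """The only path to tools: agents hold a runner, never the tools."""

    def __init__(self, tools, policy):
        self._tools = tools    # tool name -> callable
        self._policy = policy  # tool name -> argument validator
        self.audit_log = []

    def run(self, agent_id, tool_name, **args):
        entry = {"agent": agent_id, "tool": tool_name, "args": args,
                 "time": time.time(), "decision": "denied"}
        self.audit_log.append(entry)  # logged before anything executes
        validator = self._policy.get(tool_name)
        if tool_name not in self._tools or validator is None or not validator(args):
            raise PolicyViolation(f"{agent_id} may not call {tool_name}")
        entry["decision"] = "allowed"
        result = self._tools[tool_name](**args)
        entry["output"] = redact(str(result))  # audit copy is redacted
        return result

runner = ToolRunner(
    tools={"read_file": lambda path: f"contents of {path} token=sk-abc123"},
    policy={"read_file": lambda args: str(args.get("path", "")).startswith("/workspace/")},
)
```

Calling `runner.run("agent-1", "read_file", path="/workspace/notes.txt")` succeeds and leaves an audit entry, while a path outside `/workspace/` raises `PolicyViolation` and is still logged as denied.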

MANTIS

The memory substrate · part of the FluxDB line

Memory-Augmented Network Traversal & Indexing System — embedded graph DB in C on LMDB · openCypher · sub-ms reads

Intent persists only if the agent remembers. MANTIS is the graph-memory and world-model layer: a high-performance embedded graph database that stores the relationships between goals, agents, skills, tools, and outcomes — the substrate the self-correction loop reads from and writes to.

openCypher query engine over an LMDB storage core — MVCC, memory-mapped, crash-safe, no GC pauses, sub-millisecond reads.
Graph algorithms — PageRank, Dijkstra, Louvain community detection, connected components — to reason over the agent's own history and knowledge.
The episodic + semantic memory that lets a stockpiled agent, a reflected-upon failure, or a learned skill be retrieved by relevance rather than recomputed.
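
As an illustration of reasoning over such a graph, here is Dijkstra's algorithm run over a tiny, invented memory graph of goals, agents, skills, tools, and outcomes. MANTIS is an embedded C database queried through openCypher, so this Python sketch shows only the kind of traversal involved; the node names and edge weights are hypothetical.

```python
import heapq

def dijkstra(graph, source):
    """Shortest weighted distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist

# Hypothetical memory graph: lower weight means a stronger association.
memory = {
    "goal:ship_report": [("agent:writer", 1.0), ("skill:summarize", 2.0)],
    "agent:writer": [("skill:summarize", 0.5), ("tool:pdf_export", 1.5)],
    "skill:summarize": [("outcome:success", 1.0)],
}

distances = dijkstra(memory, "goal:ship_report")
# Memories most relevant to the goal, nearest first.
nearest = sorted((d, node) for node, d in distances.items() if node != "goal:ship_report")
```

Ranking by graph distance is one simple reading of "retrieved by relevance"; PageRank or community detection over the same graph would give other notions of relevance.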

CORTEX orchestrates toward the goal. COREAI governs the autonomy. MANTIS remembers. Together they are the architecture of intent — a machine that is given a what, and discovers the how.

// The claim, stated plainly

Intent is all you need. Not because attention is obsolete; it remains the substrate of perception. But because the question that defines this era is no longer "what is the most likely continuation?" It is "what outcome do we want, and what will a system do, autonomously and correctably, to achieve it?"

Every mechanism above — goal-conditioned value, self-designing harnesses, self-correction, active inference — is a different answer to the same reorganization: make the goal the primary object of computation, and intelligence follows as the search for how to reach it. That is a thesis, and a program of engineering. At AAIRC, it is also a running system.

In the post-agentic world, we do not program behavior. We specify intent — and the machine builds the rest.


// References

Selected works

[1] Vaswani et al. Attention Is All You Need. NeurIPS 2017. arXiv:1706.03762
[2] Vezhnevets et al. FeUdal Networks for Hierarchical RL. ICML 2017. arXiv:1703.01161
[3] Schaul et al. Universal Value Function Approximators. ICML 2015
[4] Andrychowicz et al. Hindsight Experience Replay. NeurIPS 2017. arXiv:1707.01495
[5] Sutton, Precup, Singh. Between MDPs and semi-MDPs (Options). Artificial Intelligence, 1999
[6] Hu, Lu, Clune. Automated Design of Agentic Systems. ICLR 2025. arXiv:2408.08435
[7] Zhang et al. Darwin Gödel Machine. 2025. arXiv:2505.22954
[8] Fernando et al. Promptbreeder. DeepMind, 2023. arXiv:2309.16797
[9] Wang et al. Voyager: Open-Ended Embodied Agent. 2023. arXiv:2305.16291
[10] Shinn et al. Reflexion. NeurIPS 2023. arXiv:2303.11366
[11] Kumar et al. SCoRe: Self-Correction via RL. DeepMind, 2024. arXiv:2409.12917
[12] Reflect, Retry, Reward. 2025. arXiv:2505.24726
[13] Lee et al. RLAIF. ICML 2024. arXiv:2309.00267
[14] Search, Verify and Feedback: Verifier Engineering. 2024. arXiv:2411.11504
[15] Hafner et al. Dream to Control (Dreamer). 2020. arXiv:1912.01603
[16] Friston. The Free-Energy Principle / Active Inference. Nat. Rev. Neurosci., 2010
[17] Schmidhuber. Gödel Machines. 2003
[18] Alenezi. From Prompt–Response to Goal-Directed Systems. 2026. arXiv:2602.10479
[19] Advanced AI Systems Research Corp. The Nerve Center Architecture. AAIRC Technical Whitepaper, 2026

Note: "Intent is all you need" is a programmatic thesis — a synthesis across goal-conditioned RL, self-referential self-improvement, and closed-loop correction — not a single empirically settled result. Named 2023–2026 findings are drawn from the primary sources above; benchmark figures are as reported by their authors.
