Deep lesson

Agent Architecture

Learn how a model becomes a working system.

Working definition

Agent architecture is the design of the full operating system around a model: the instructions it receives, the tools it can call, the state it preserves, the permissions that constrain it, and the verification loop that decides whether the work is actually done.

By the end, you should be able to diagram an agent harness, explain what belongs outside the model, and design a small agent loop that can be tested.

Build a one-page architecture spec for your personal agent control plane.

How it works

The agent loop is a system, not a prompt

Read the diagram from left to right. A useful agent does not simply answer once. It receives intent, plans through a harness, acts in an environment, checks evidence, and either ships an artifact or loops with better context.
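The loop just described can be sketched in a few lines. Everything here is a hypothetical illustration: `plan`, `act`, and `verify` are placeholder callables, not a real framework API.

```python
def run_agent(intent, plan, act, verify, max_loops=3):
    """Loop: plan -> act -> verify; ship the artifact or retry with feedback."""
    context = {"intent": intent, "feedback": None}
    for _ in range(max_loops):
        step = plan(context)             # harness decides the next action
        artifact = act(step)             # environment: tools actually run
        ok, evidence = verify(artifact)  # verifier checks acceptance criteria
        if ok:
            return artifact, evidence    # ship, with proof of completion
        context["feedback"] = evidence   # loop again with better context
    raise RuntimeError("Gave up: verification never passed")
```

The structural point: shipping is gated by `verify`, not by the model's own confidence, and a failed check feeds evidence back into the next planning pass.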

What the user wants

Intent

The work starts as a desired outcome, not a vague topic. Strong intent says what should exist when the agent is finished.

Concrete example: “Build a visual lesson page for two YouTube videos.”


Mental model

Understand the system before watching more videos.

Quick reading pass

Read these four ideas as the vocabulary for agent architecture. They are the labels you should use when a video explains a tool, habit, or workflow.

Before pressing play, try to predict where each idea appears in the system. That makes the video active instead of passive.

After each video, rewrite one card in your own words. If you cannot simplify it, the concept is not yours yet.

01

Model

The reasoning engine. It predicts and decides, but it does not inherently remember, browse, edit files, or verify work.

Learning move: pause when this shows up, name it, then write the practical rule it implies.
02

Harness

The shell around the model: tools, memory, permissions, prompts, routing, state, and the rules for what happens next.

Learning move: pause when this shows up, name it, then write the practical rule it implies.
03

Environment

The real world the agent can touch: browser, filesystem, APIs, terminals, design tools, calendars, queues, and documents.

Learning move: pause when this shows up, name it, then write the practical rule it implies.
04

Verifier

The feedback layer: tests, browser checks, citations, screenshots, logs, human review, and acceptance criteria.

Learning move: pause when this shows up, name it, then write the practical rule it implies.

Two-video prototype

Study these first. Slowly, actively, and with an artifact at the end.

Video 01

Agent Harness vs Everything Else

Use this as the foundation video. Do not watch it as “AI news”; watch it to separate model capability from system design.

Definition
Harness means the programmable wrapper around the model: tool access, context, state, permissions, retries, and verification.
Read
The practical insight is that most agent failures are not model failures. They are harness failures: no clear state, too many tools, no acceptance criteria, no browser check, no record of what changed, or no way to recover after an interruption.
Visualize
As you watch, place every feature into one of four boxes: model, harness, environment, or verifier. If a feature does not fit, it is probably being described too vaguely.
Do
After the video, draw your own agent harness for one task you actually care about. Include the tool boundary and the evidence that proves completion.
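One way to see the “programmable wrapper” idea is that the model call itself is a single line and everything else is harness. A minimal sketch, assuming hypothetical `call_model` and `check` callables:

```python
def harness(call_model, check, prompt, state, max_retries=2):
    """Wrap a bare model call with state, retries, and verification."""
    for attempt in range(max_retries + 1):
        output = call_model(prompt, state)  # model: predicts text
        ok, evidence = check(output)        # verifier: acceptance criteria
        state["history"].append(            # state: record of what changed
            {"attempt": attempt, "output": output, "evidence": evidence}
        )
        if ok:
            return output
        # retry with feedback instead of starting from nothing
        prompt = f"{prompt}\n\nPrevious attempt failed: {evidence}"
    raise RuntimeError("Harness exhausted retries without passing verification")
```

Note that every harness failure listed above maps to a missing line here: no `state` record, no `check`, no retry path.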
Video 02

What Comes After the Harness

Use this as the “next layer” video. The goal is to understand why single-agent loops become orchestration systems.

Definition
Orchestration is the layer that routes work, preserves state, decomposes tasks, evaluates outputs, and decides when to continue, retry, branch, or stop.
Read
When a task grows beyond one clean loop, the system needs memory of what happened, ownership of substeps, recovery after failure, and a way to compare candidate outputs. This is where agent work starts to look less like chat and more like software operations.
Visualize
Look for every moment where the agent needs a decision outside raw model text: choose a tool, split a task, evaluate a result, resume a session, or escalate to the user.
Do
Take the same task from video one and split it into three orchestration states: research, build, verify. Define what evidence moves the task from one state to the next.
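The three states and their evidence gates can be written down as a tiny transition table. The evidence names below are illustrative assumptions, not a standard vocabulary:

```python
# state -> (next state, evidence required to advance)
TRANSITIONS = {
    "research": ("build", {"sources_collected"}),
    "build": ("verify", {"artifact_created"}),
    "verify": ("done", {"checks_passed"}),
}

def advance(state, evidence):
    """Move to the next state only when the required evidence is present."""
    next_state, required = TRANSITIONS[state]
    if required <= evidence:   # set containment: all required evidence exists
        return next_state
    return state               # stay put: missing evidence means not done
```

This is the chat-to-software-operations shift in miniature: progress is a function of recorded evidence, not of how confident the last message sounded.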

Put it into practice

Give this prompt to Codex and make the lesson concrete.

Use this in Codex when you have a local folder where it can create a small prototype page or markdown artifact.

Codex

Build a personal agent harness diagram from a real workflow

I want to understand agent harness architecture by building a practical artifact, not just reading about it.

Create a polished one-page explainer in this workspace that teaches the difference between:
- model
- harness
- tools/environment
- memory/state
- permissions
- verification
- final artifact

Use one concrete workflow as the example: "turn a saved YouTube video into a rich learning lesson."

Requirements:
- Start by inspecting the project structure and choosing the simplest place to add this artifact.
- Include a visual system diagram with the six parts of the agent loop.
- Include short plain-English definitions.
- Include a "failure modes" section that explains what breaks when each part is missing.
- Include a "build checklist" I could use when configuring an agent.
- Make it elegant, readable, light-mode, and not generic.
- Verify it locally and tell me the URL or file path to view it.

Do not just summarize the topic. Make something I can learn from and reuse.

Guided watch sequence

Watch with a job to do.

01

Agent Harness vs Everything Else

Anchor the harness concept.

02

What Comes After the Harness

Understand orchestration and the missing layer.

03

Creating Your Own Agentic OS

Turn architecture into a personal operating system.

Deep read

The ideas you should be able to explain out loud.

01

The harness is the product

A plain model can answer. A harness can act. The most important design decision is not which model is smartest, but what state the system keeps, what tools it exposes, and how it proves work was completed.

02

Agents need contracts, not vibes

Every useful agent loop has an implicit contract: input, permissions, tools, expected output, verification, and next action. If any part is vague, the agent fills the gap with guesswork.
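The contract can be made explicit instead of implicit. A sketch, with illustrative field names rather than any standard schema:

```python
from dataclasses import dataclass

@dataclass
class AgentContract:
    input: str              # what the agent receives
    permissions: list       # what it may touch
    tools: list             # what it may call
    expected_output: str    # what should exist when it is done
    verification: str       # how completion is proven
    next_action: str        # what happens after verification

    def gaps(self):
        """Any empty field is a place the agent will fill with guesswork."""
        return [name for name, value in vars(self).items() if not value]
```

Running `gaps()` before launching the loop is the "contract, not vibes" discipline in one method call.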

03

Verification is architecture

Verification should not be an afterthought. Browser checks, test runs, citations, screenshots, diffs, and acceptance criteria are the rails that let an agent work with less supervision.
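Treating verification as data makes the point concrete: checks are declared up front, run, and reported, not improvised at the end. The artifact shape and check names below are assumptions for illustration:

```python
def verify(artifact):
    """Run declared acceptance checks; return (passed, list of failures)."""
    checks = {
        "file_exists": bool(artifact.get("path")),
        "has_citations": len(artifact.get("citations", [])) > 0,
        "tests_passed": artifact.get("test_failures", 1) == 0,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return (not failed, failed)
```

Because the failures come back as names, they can flow straight into the retry prompt: these are the rails that allow less supervision.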


Misconceptions

What to stop believing.

A better model automatically makes a better agent.

The model matters, but harness design determines whether the system can act safely and repeatably.

More tools always help.

Every tool increases surface area. Strong agents have the right tools with clear permissions.

Memory means saving everything.

Useful memory is compressed, curated, and tied to future decisions.

Practice studio

Learning only counts when you make something.

01

Architecture canvas

Define one agent you actually want: purpose, inputs, tools, memory, risks, and verifier.

A single-page diagram and checklist.
02

Tool boundary audit

Pick five tools your agent could use and decide whether each should be read-only, write-with-approval, or autonomous.

A permission matrix.
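The matrix for this audit can be as small as an enum and a dict. The tool names and their assignments below are hypothetical examples, not recommendations:

```python
from enum import Enum

class Permission(Enum):
    READ_ONLY = "read-only"
    WRITE_WITH_APPROVAL = "write-with-approval"
    AUTONOMOUS = "autonomous"

# Hypothetical matrix for a lesson-building agent.
PERMISSIONS = {
    "web_search": Permission.READ_ONLY,
    "filesystem_read": Permission.READ_ONLY,
    "filesystem_write": Permission.WRITE_WITH_APPROVAL,
    "shell": Permission.WRITE_WITH_APPROVAL,
    "screenshot": Permission.AUTONOMOUS,
}

def is_allowed(tool, wants_write, approved=False):
    """Deny unknown tools; gate writes by the tool's permission level."""
    perm = PERMISSIONS.get(tool)
    if perm is None:
        return False                # unknown tool: denied by default
    if not wants_write:
        return True                 # reads are allowed for listed tools
    if perm is Permission.READ_ONLY:
        return False
    if perm is Permission.WRITE_WITH_APPROVAL:
        return approved
    return True                     # AUTONOMOUS may write unattended
```

The deny-by-default branch is the "surface area" misconception made executable: a tool that is not in the matrix simply does not exist for the agent.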
03

Failure rehearsal

Write three ways your agent could go wrong and the signal that would catch each failure.

A verification table.

Recall check

Can you answer without rewatching?

Source shelf

Use videos as a doorway, then verify with primary sources.

Docs · OpenAI Agents SDK: agents

Read this for the basic object model: instructions, tools, handoffs, guardrails, and structured outputs.

openai.github.io/openai-agents-python/agents/

Docs · OpenAI Agents SDK: tracing

Use this to understand why observability is part of agent architecture.

openai.github.io/openai-agents-python/tracing/

Docs · OpenAI Agents SDK: guardrails

Good follow-up for thinking about boundaries, tripwires, and tool-level checks.

openai.github.io/openai-agents-python/guardrails/

Docs · OpenAI Agents SDK: handoffs

Explains delegation between specialized agents and what context gets forwarded.

openai.github.io/openai-agents-python/handoffs/

Reading · Model Context Protocol

Useful for understanding how external tools and context servers become part of the agent environment.

modelcontextprotocol.io/introduction

Podcast · Latent Space: The AI Engineer Podcast

Best ongoing podcast lane for agent tooling, AI engineering, codegen, infra, and model shifts.

www.latent.space/podcast

Podcast · Practical AI podcast archive

Older but still useful practical conversations on agents, AI engineering, and production concerns.

changelog.com/practicalai/

Watch next

Excellent related videos that expand the lesson.

Use these after the first two videos. They broaden the idea without losing the thread: architecture, workflow, tooling, review, and operating discipline.

Personal control plane

Creating Your Own Agentic OS is Easy

Turns the harness idea into a personal operating model with workspace, tools, memory, and recurring execution.

Agent interface layer

Hermes + Open WebUI Just Changed AI Agents Forever

Shows why a chat surface is useful, but also why the real value depends on tools, state, and verification.

Multi-agent caution

Hermes + Agent Swarms Just Changed AI Agents Forever

Good counterpoint for orchestration: parallel agents only help when ownership, outputs, and checks are clear.