Codex + Claude Workflows / Applied

The Real Reason You Keep Hitting Claude Limits (And How to Fix It)

Design sessions around context economy: fewer wandering prompts, clearer work packets, and better reuse of artifacts.

Nick Puru | AI AutomationLongformTranscript found

Quick learning frame

Read this before watching.

Coding-agent workflow is the loop of inspect, plan, edit, verify, summarize, and route the next task to the right tool.

Limit management is really workflow design.

Skill you build: Actively managing a Claude Code session's context lifecycle so you do more work per token and stop hitting weekly limits prematurely.

Watch for the shift from claim to mechanism. The learning value is the point where the transcript reveals a repeatable action, tool boundary, context move, review habit, or artifact.

Concept diagram

Where this video fits.

01Inspect

02Plan

03Edit

04Verify

05Review

06Route

Deep lesson

Turn this video into working knowledge.

2,708 cleaned transcript words reviewed across 780 timed caption segments.

Thesis

The Real Reason You Keep Hitting Claude Limits (And How to Fix It) teaches a practical codex + claude workflows move: Design sessions around context economy: fewer wandering prompts, clearer work packets, and better reuse of artifacts.

The goal is not to remember the video. The goal is to extract the operating principle, tie it to timestamped evidence, test how far the claim transfers, and make something reusable.

0:24

Context is working memory

“context rot. Once you actually see how it works, I promise you will not use Claude the same way again. Now, I run Claude across three different businesses. I've been hitting limits. My team and I, we've been...”

Context isn't a file cabinet Claude pulls from on demand; it's working memory that reloads in full on every message, so by message 30 roughly 98% of paid tokens just re-read the past and that single message can cost more than your first 15 combined. Open a real session, run /context on a fresh start, and note how many tokens (often 40-70k) are already loaded before you type anything.

4:00

Four stacking cost drivers

“features that you guys are using, the models that you pick, the length of your sessions, they're telling you it's right up front, you have to manage this actively. And most people just aren't. And then they complain...”

Four forces compound your burn: context rot (Chroma's test of 18 models showed retrieval accuracy falling from 92% at 256k tokens to 78% at 1M on the same question), weekday 5-11am Pacific throttling, extended thinking billed as output tokens at ~5x input cost, and prompt caching that expires after ~5 minutes. List which of the four apply to your own workflow today, then check /config to see if extended thinking is silently on.

9:16

Session handoff over compact

“for everything else, Claude, it's plenty smart without the thinking tax. So, you can literally cut your burn on simple tasks by a third just from this one simple change. Habit number five, sub-agents. So, anytime that I'm...”

Around 60% of the window, instead of compacting a rotted state, ask Claude for a full handoff summary (where we started, what we decided, what's open, which files matter) then /clear and paste it into a fresh session to keep the knowledge and dump the rot, which can roughly double session length for the same work. On your next long task, request a handoff summary at ~60%, clear, paste it into a new session, and compare token burn against your usual auto-compact flow.

01

Inspect

Start with this video's job: Design sessions around context economy: fewer wandering prompts, clearer work packets, and better reuse of artifacts. Treat "Inspect" as the outcome you are trying to make visible, not a topic label. Anchor it to 0:24, where the video says: “context rot. Once you actually see how it works, I promise you will not use Claude the same way again. Now, I run Claude across three different businesses. I've been hitting limits. My team and I, we've been...”

02

Plan

Use "Plan" to locate the part of the codex + claude workflows workflow the video is demonstrating. Ask what changes in your real setup if this claim is true. Anchor it to 4:00, where the video says: “features that you guys are using, the models that you pick, the length of your sessions, they're telling you it's right up front, you have to manage this actively. And most people just aren't. And then they complain...”

03

Edit

Turn "Edit" into the reusable artifact for this lesson: A routing matrix for when to use Codex, Claude, browser checks, or manual review. This is where watching becomes something you can inspect and reuse.

04

Verify

Use "Verify" as the application surface. Decide whether the idea touches a browser flow, a local file, a model choice, a source document, a UI, or a review step.

05

Review

Use "Review" to prove the lesson. The evidence should connect back to the video title, transcript anchors, and a concrete output, not a generic best-practice claim.

06

Route

Use "Route" to carry the idea forward: save the prompt, checklist, diagram, or operating rule that would make the next agent run better.

Example

Source-backed work packet

Convert the video into a scoped task that includes the transcript claim, target workflow, acceptance criteria, and proof. The output should be a routing matrix for when to use codex, claude, browser checks, or manual review..

Example

Claim vs. demo brief

Separate what the speaker claims, what the demo actually proves, and what still needs outside verification before you adopt the workflow.

Example

Teach-back module

Transform the lesson into a definition, a mechanism diagram, one misconception, one practice exercise, and a check-for-understanding question.

Do not learn it wrong

Treating the title as the lesson without checking what the transcript actually says.
Letting the prompt drift into generic advice that could apply to any video in the playlist.
Copying the tool setup without identifying the operating principle that transfers to your own stack.
Skipping the artifact, which means the learning never becomes operational or inspectable.

Transcript-derived moments

Use timestamps to study the actual video.

Problem frame

“context rot. Once you actually see how it works, I promise you will not use Claude the same way again. Now, I run Claude across three different businesses. I've been hitting limits. My team and I, we've been...”

Working mechanism

“features that you guys are using, the models that you pick, the length of your sessions, they're telling you it's right up front, you have to manage this actively. And most people just aren't. And then they complain...”

Transfer moment

“for everything else, Claude, it's plenty smart without the thinking tax. So, you can literally cut your burn on simple tasks by a third just from this one simple change. Habit number five, sub-agents. So, anytime that I'm...”

Quality check

Do not count this as learned until these are true.

01

State the transcript-backed claim in your own words: Design sessions around context economy: fewer wandering prompts, clearer work packets, and better reuse of artifacts.

02

Explain the practical stakes without hype: Limit management is really workflow design.

03

Map the idea onto the Inspect -> Plan -> Edit -> Verify -> Review -> Route sequence and name the weakest link.

04

Produce the artifact and include the evidence that proves it: A routing matrix for when to use Codex, Claude, browser checks, or manual review.

Put it into practice

Give this grounded prompt to Codex or Claude after watching.

You are helping me turn one specific YouTube video into real, durable learning.

Source video:
- Title: The Real Reason You Keep Hitting Claude Limits (And How to Fix It)
- URL: https://www.youtube.com/watch?v=r_CLYDdBdmM
- Topic: Codex + Claude Workflows
- My current learning frame: Run a real coding task in Claude Code while applying the five habits (manual /compact at 50%, /clear between unrelated tasks, a 60% session handoff, killing extended thinking by default, and offloading file-heavy work to a Haiku sub-agent) and use /cost plus /context to measure the difference.
- Why this matters: Limit management is really workflow design.

Transcript anchors from this exact video:
- 0:24 / Evidence 1: "context rot. Once you actually see how it works, I promise you will not use Claude the same way again. Now, I run Claude across three different businesses. I've been hitting limits. My team and I, we've been..."
- 2:14 / Evidence 2: "burning more tokens than your first 15 combined. Okay. So, now we know that context is working memory and it reloads at every single message. That right there, that is half the problem. So, here's the other half."
- 4:00 / Evidence 3: "features that you guys are using, the models that you pick, the length of your sessions, they're telling you it's right up front, you have to manage this actively. And most people just aren't. And then they complain..."
- 5:33 / Evidence 4: "sessions that you thought were probably cheap, they may not be cheap whatsoever. And the models that you thought you weren't using, you actually are. So, it's effectively the closest thing to an X-ray for your context. And..."
- 7:17 / Evidence 5: "file for a different feature. Now, Claude, it was just hauling three jobs worth of context around the whole time. Now, the second that I actually switch jobs, switch tasks, I just do {slash}clear. And people always panic..."
- 9:16 / Evidence 6: "for everything else, Claude, it's plenty smart without the thinking tax. So, you can literally cut your burn on simple tasks by a third just from this one simple change. Habit number five, sub-agents. So, anytime that I'm..."
- 11:01 / Evidence 7: "go ahead and read it if you guys are interested in. If you guys only take one thing from this video, take this. Stop treating Claude as a infinite chat window. You need to treat it as a..."

Your task:
1. Use the transcript anchors above as the primary source packet. If you add outside context, label it clearly as outside context and keep it secondary.
2. Create a source-check table with columns: timestamp, claim, what the demo proves, confidence, and what still needs verification.
3. Extract the actual teachable claims from the video. Do not invent claims that are not supported by the title, lesson frame, or transcript anchors.
4. Build a reusable learning artifact: A routing matrix for when to use Codex, Claude, browser checks, or manual review.
5. Include:
- a plain-English definition of the core idea
- a diagram or structured model using this sequence: Inspect -> Plan -> Edit -> Verify -> Review -> Route
- 3 concrete examples that apply the video idea to real agentic work
- 2 failure modes the video helps prevent
- a checklist I can use the next time I run Codex or Claude
- one practical exercise with a clear done signal
6. Add a "learning transfer" section: what changes in my workflow tomorrow if I actually learned this?
7. Add a "source check" section that cites which transcript anchor supports each major takeaway.

Quality bar:
- Make this specific to "The Real Reason You Keep Hitting Claude Limits (And How to Fix It)", not a generic Codex + Claude Workflows essay.
- Prefer operational examples, failure modes, and reusable artifacts over broad definitions.
- Call out uncertainty instead of smoothing over weak evidence.
- If evidence is weak, say what transcript segment or timestamp needs review instead of guessing.
- Finish with a concise artifact I could paste into my learning app.

Misconceptions

What to stop believing.

One agent should do every task.

Different tools have different strengths. Routing is part of the workflow.

More context is always better.

Relevant context helps; stale context causes drift and cost.

Practice studio

Learning only counts when you make something.

01

Transcript evidence map

Separate what the video actually says from what you already believe about the topic.

3 source-backed takeaways with timestamps, confidence, and a transfer note.

02

One useful artifact

Apply the video to a real workflow and produce a routing matrix for when to use codex, claude, browser checks, or manual review..

A reusable artifact with a done signal and one verification step.

03

Teach-back card

Explain the lesson to someone who has not watched the video yet.

A 90-second explanation, one diagram, one example, and one misconception to avoid.

Recall check

Answer first, then reveal — without rewatching.

The video argues context is not 'storage' but working memory that reloads every message. Using his surgeon analogy and the message-30 figure, what does that mean for your token cost?

Name the four stacking cost drivers, and give the specific context-rot accuracy numbers from Chroma's study.

Why does he recommend a session handoff at ~60% of the window instead of letting Claude auto-compact, and what does the handoff summary capture?

Source shelf

Use the video as a doorway, then verify with primary sources.

ReadingOpenAI Codexopenai.com/codex/ReadingClaude Code Overviewdocs.anthropic.com/en/docs/claude-code/overview