context-awareness

Hooks that tell Claude how much of its context window it has burned, how fast, and roughly how many tool calls remain — so it can plan instead of guess.

Claude Code shows you a context budget in the status line. Claude itself does not see it. Ask the model how full its context is and the answer is mostly a guess based on conversation length. This plugin closes that loop — quietly injecting the same percentage the status line already knows about, plus a burn rate and a remaining-tool-calls estimate, into the system reminders Claude actually reads.

[01] The blind spot

why this exists

Long-context models are good at reasoning across what they have already read. They are not good at reasoning about how much room is left. The status line in Claude Code surfaces this perfectly to the human; the model behind the status line never sees it.

Ask it directly and you get a softly-confident answer that mostly tracks turn count. Look at what arrives in each request — branch name, working directory, recent files — and the budget is not in there. So the model spends context the way a contractor would spend money with no visible balance: fine for an hour, increasingly reckless after that.

[02] The notifications

what claude actually sees

Two hook events do the work: UserPromptSubmit (rich, before each response) and PostToolUse (brief, after tool calls — throttled, see below). Both inject a single line of system-reminder text. Format is stable: starts with [context:, ends with ], fields pipe-delimited, fields appear only once enough data exists to compute them.

Turn 1

[context: 12% used (very small) | burn rate: calculating...]

Turn 2

[context: 14% used (very small) | +2.0% last turn | burn rate: calculating...]

Turn N

[context: 32% used (small) | +3.2% last turn | ~0.38%/tool | ~179 tool calls remaining]

Tool

[context: 33% used (small)]

The fields, once computed: current percentage and tier label, last-turn delta, session-average burn rate as percent-per-tool-call, and a projected remaining tool calls figure capped at >1000. According to the model itself, the last one carries most of the weight:

The "remaining tool calls" metric is what actually guides my decisions — it's concrete and immediate, whereas the percentage and burn rate require mental calculation. — Claude, on its own behavior

Throttling

The brief PostToolUse form is throttled by tier: when context pressure is low the notification carries little new information and the injection just costs tokens. The tracking counters still increment on every tool call, so projections stay accurate — only the emission to Claude is suppressed.

tier	range	emit every
very small	0–25%	10th call
small	25–50%	5th call
medium	50–70%	3rd call
large & up	≥ 70%	every call

The UserPromptSubmit rich form is not throttled — it fires every turn regardless of tier, since prompt boundaries are the moment Claude is about to plan the next chunk of work and the budget readout is most useful.

[03] How it works

three stages, two files

The used_percentage field comes from Claude Code's status line JSON. Status lines are not part of the plugin API, so the install step patches the user's settings.json to insert a transparent pass-through wrapper that captures the percentage on every tick, persists it to disk, then forwards stdin to whatever status line was there before. The hook reads from that file. Compaction-resilient by construction.

read order: top → bottom · all data flows downstream of the status line

[04] The tier system

how behavior changes with pressure

The percentage carries a tier label, and the tier label carries a behavior change. These are injected once at session start as guidelines — they tune Claude's stance from normal through checkpoint everything as the budget runs down.

level	range	behavior
very small	0–25%	Normal operation.
small	25–50%	Normal, but prefer `Read` with `offset`/`limit` on large files.
medium	50–70%	Be concise. `Grep` before reading. Delegate when possible. Flag if a task may trigger compaction.
large	70–80%	Be brief. Commit frequently. Maintain a status summary capturing progress and next steps.
very large	80–90%	Judgment likely degrading. Checkpoint before and during multi-part tasks. Suggest a fresh session for substantial new work.
critical	90–100%	Compaction imminent. Wrap up, commit, avoid impactful decisions. Warn the user before accepting new work.

Rough tool-call costs (in claude's own ledger)

Read a file	1 call
Grep + read section	2 calls
Edit a file	1–2 calls
Small feature	5–15 calls
Explore unfamiliar code	5–10 calls
Run tests + fix failures	3–10 calls
Subagent delegation	1 call (main window)

[05] Configuration

what claude sees, and how it reacts

Defaults ship sensible. Run /context-awareness:config to view or change settings interactively, or edit $CLAUDE_PLUGIN_DATA/config.yaml directly. The used_percentage from the status line is already adjusted to the user's auto-compaction threshold, so 100% always means "100% of the usable context window."

Providers · which signals get injected

provider	default	signal
context	enabled	Window usage %, burn rate, remaining tool calls.
cost	disabled	Session spend in USD against a user-set budget.
rate_limits	disabled	5-hour and 7-day API rate limit usage.

Effort mode · session-scoped, optional

Controls whether the plugin auto-adjusts thinking effort as pressure rises. Cycled via /context-awareness:config change.

mode	behavior
off	Default. No adjustment — effort stays where you left it.
adaptive	Downshifts at 50% (medium) and 70% (low).
planMode	High effort for planning, low for execution.

Custom guidelines

The tier-based instructions are replaced wholesale by pointing guidelines at a file path. They are not merged.

guidelines: /path/to/my-guidelines.md

[06] Honest limitations

what this plugin cannot do for you

status line caveat Status lines are not part of the plugin API. The install step patches your settings.json to insert a transparent pass-through proxy. If the plugin is uninstalled, the wrapper entry remains in settings.json — it will silently no-op rather than error, but it is a residue.

The notifications are advisory. They make the budget visible; they do not enforce anything. Claude can still spend recklessly — it just no longer has the excuse of not knowing.

Burn rate is a session average. After compaction the percentage drops and the running estimate is briefly inaccurate. Allow a few tool calls to recalibrate.

The remaining-tool-calls figure is a projection, not a guarantee. A single subagent return or a large file read can swing it. Treat it as a planning aid, not a contract.

The interesting part is not that Claude now sees a number. It is what changes when it does. With the budget visible, "should I read this whole file or grep first" stops being a stylistic choice and becomes an arithmetic one. Delegation stops being a hedge and becomes a strategy. Plan instead of guess is not a slogan — it is what happens automatically when planning has a unit of currency.