AI coding tools like Claude Code can dramatically speed up development, but costs grow quickly if context and token usage aren’t managed, especially when you rely on hosted models rather than locally running ones. Most token waste comes from long conversations, large logs, and unnecessary files being loaded into context.

I’ve collated a short checklist, covering both one-time setup and daily workflow habits, to reduce token usage, lower cost, and keep Claude responses fast and reliable. Taken together, these steps can cut your Claude Code costs by roughly 30–50%.

Claude Code Token Optimisation Cheat Sheet

Checklist: Setup and Workflow

| Action | Setup / Example | Why |
| --- | --- | --- |
| Output compression (RTK) | `brew install rtk`; `rtk gain`; `rtk init --global`. Ref: rtk-ai/rtk | Free CLI proxy that compresses outputs for Claude Code, Cursor, or Aider; cuts tokens in console/CLI or VS Code (not web). |
| Language server | e.g. `npm install -g pyright` or `npm install -g typescript-language-server`, then in Claude open `/plugin` and select it | Gives Claude type info, definitions, and references without reading large files. |
| ccusage | `npx ccusage@latest`; `npx ccusage daily --project myapp --breakdown` | Shows where tokens are spent by project, model, and time period. |
| Claude monitor | `pip install claude-monitor`; `claude-monitor --plan pro --refresh-rate 5` | Live visibility into token usage while you work. |
| .claudeignore | Add `node_modules/`, `logs/`, `dist/`, `build/`, `coverage/` | Stops Claude from scanning large generated folders. |
| CLAUDE.md + skills | Keep CLAUDE.md small (`wc -l CLAUDE.md`); move large workflows from `.claude/rules/` to `.claude/skills/` | Reduces always-loaded context; large instructions become on-demand. |
| Quick reference file | Create `docs/QUICK_REF.md` with commands like `npm run dev`, `npm run test`, `npm run build` | Lets Claude use a small cheat sheet instead of large docs. |
| Hooks | `.claude/hooks/session-start.sh`: `git branch --show-current`; `.claude/hooks/pre-tool.sh`: `grep "ERROR" logs/app.log` | Injects useful context and filters noisy logs before they reach Claude. |
| Session management | `/clear`, `/compact`, `/rename <name>`, `/resume <name>`, `/cost`, `/model haiku`, `/mcp disable <tool>` | Keeps context small, reduces cost, avoids carrying stale conversations. |
| Prompt and workflow habits | e.g. “Fix race in src/auth.ts lines 42–60”; “Done when: tests pass, no lint errors”; use subagents for multi-file work; use Shift + Tab before implementation | Reduces unnecessary reading, follow-ups, and large-context investigations. |
| Scripts for repeated tasks | `format.sh`: `npm run lint && npm run format && npm run typecheck` | Moves repeated work out of chat and into local automation. |
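
As a rough illustration of the Hooks row, the two scripts could be sketched as the shell functions below. The function names `session_context` and `filter_errors`, the 20-line cap, and the "unknown" fallback are assumptions made for this example, not part of any tool. How the scripts are registered with Claude Code is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helpers mirroring the two hook scripts from the checklist.
# Wiring them into Claude Code's hook configuration is assumed, not shown.

# session-start: print a one-line summary of the current git branch.
session_context() {
  local branch
  # Outside a git repository the command fails, so fall back to empty.
  branch="$(git branch --show-current 2>/dev/null)" || branch=""
  echo "Current branch: ${branch:-unknown}"
}

# pre-tool: keep only ERROR lines from a log, capped at the most recent N.
filter_errors() {
  local log_file="${1:-logs/app.log}"
  local max_lines="${2:-20}"   # assumed cap; tune to taste
  # A missing log file yields no output rather than an error message.
  [ -f "$log_file" ] || return 0
  # grep exits non-zero when nothing matches; that is not a failure here.
  grep "ERROR" "$log_file" | tail -n "$max_lines" || true
}
```

Calling `filter_errors logs/app.log 50` would pass along at most the last 50 error lines instead of the whole log, which is the point of the hook: the filtering happens locally, before anything reaches the model's context.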

The Main Causes of Token Waste

| Source | Typical Fix |
| --- | --- |
| Large CLI outputs | Use RTK compression |
| Large generated files | Configure .claudeignore |
| Long conversations | Use /clear and /compact regularly |
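
For the generated-files fix, a minimal sketch of creating the ignore file from the shell might look like this. It assumes `.claudeignore` takes one path pattern per line, in the style of `.gitignore`, and that your Claude Code setup honours the file as the checklist describes. The function name `write_claudeignore` is hypothetical.

```shell
#!/usr/bin/env bash
# Write a .claudeignore containing the folders listed in the checklist.
# Assumption: one pattern per line, similar to .gitignore.

write_claudeignore() {
  local target_dir="${1:-.}"   # project root; defaults to current directory
  # Note: this overwrites any existing .claudeignore in target_dir.
  cat > "$target_dir/.claudeignore" <<'EOF'
node_modules/
logs/
dist/
build/
coverage/
EOF
}
```

Running `write_claudeignore .` from the project root produces the five-line file. Add further generated or vendored folders as your project needs them.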

Optimising these three areas alone typically reduces token usage by 30–50% while making Claude responses faster and more reliable.