How to reduce Claude Code token usage by 60% or more

rtk is an open source tool that cuts unnecessary bloat from your Claude Code session. This post covers how it works, plus a few more tips for keeping token usage low.

You might notice that command outputs quickly bloat your context window, even though they aren’t necessarily helpful or relevant to your project.

Claude Code captures all the logs and outputs produced in its session, which can be helpful if you’re debugging something specific. Most of the time, though, it’s an inefficient use of Claude tokens. Many of these outputs are redundant, or contain whitespace and special characters that quickly add up.

Using rtk to filter Claude Code context

Conveniently, there’s an open source project that solves the context bloat problem. Rust Token Killer (or rtk) is a CLI tool that pre-filters command outputs before Claude Code (CC) ingests them. (See it on GitHub). It intercepts commands with a PreToolUse hook that rewrites bash commands (e.g. when CC runs cargo test, the command is automatically swapped for rtk cargo test).
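To make the rewrite step concrete, here is a minimal sketch of the idea in Python. It is not rtk's actual code (rtk is written in Rust), and it leaves out the hook's input/output handling entirely. The function name `rewrite_command` and the `WRAPPED_COMMANDS` allow-list are hypothetical, chosen only for illustration.

```python
import shlex

# Hypothetical allow-list of programs whose output should be pre-filtered.
# The set of commands rtk actually supports may differ.
WRAPPED_COMMANDS = {"cargo", "git"}


def rewrite_command(command: str, wrapper: str = "rtk") -> str:
    """Prefix a simple shell command with the wrapper if its program is allow-listed."""
    try:
        parts = shlex.split(command)
    except ValueError:
        # Unbalanced quotes or similar: leave the command untouched.
        return command
    if not parts:
        return command
    program = parts[0]
    # Skip commands that are already wrapped or that we do not know how to filter.
    if program == wrapper or program not in WRAPPED_COMMANDS:
        return command
    return f"{wrapper} {command}"


if __name__ == "__main__":
    print(rewrite_command("cargo test"))
    print(rewrite_command("ls -la"))
```

This sketch only looks at the first word of the command, so compound commands (for example, ones joined with && or pipes) pass through unchanged. A real hook would need to decide how to handle those.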

It strips whitespace, boilerplate, and comments from those outputs. It also groups related log lines and removes duplicate messages.
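As a rough illustration of this kind of filtering, the Python sketch below collapses whitespace, drops blank lines, and merges consecutive duplicate lines into one line with a count. It is a simplified stand-in, not rtk's real filtering logic, and the function name `filter_output` is made up for this example.

```python
import re


def filter_output(raw: str) -> str:
    """Collapse whitespace, drop blank lines, and merge consecutive duplicates."""
    kept = []  # list of (line_text, repeat_count) pairs
    for line in raw.splitlines():
        # Collapse runs of whitespace into a single space and trim the ends.
        text = re.sub(r"\s+", " ", line).strip()
        if not text:
            continue
        if kept and kept[-1][0] == text:
            # Same message as the previous line: bump its count instead of repeating it.
            kept[-1] = (text, kept[-1][1] + 1)
        else:
            kept.append((text, 1))
    return "\n".join(t if n == 1 else f"{t} (x{n})" for t, n in kept)


if __name__ == "__main__":
    sample = "warning: unused   variable\nwarning: unused variable\n\n   ok  \n"
    print(filter_output(sample))
```

Only adjacent duplicates are merged here, which keeps the original ordering of messages intact. Grouping non-adjacent duplicates would save more tokens but loses some of the sequence information.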

For example, this is the git push output that CC takes in without rtk:

Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.23 KiB | 1.23 MiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:user/repo.git
   abc1234..def5678  main -> main

With rtk, it looks like this:

ok ✓ main
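If you want a ballpark figure for how much a before/after pair like this saves, a small Python sketch can estimate it. It assumes an average of roughly four characters per token, which is only a rule of thumb. Real tokenizers will give different counts, so treat the result as an approximation. The helper names are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate: character count divided by an assumed average."""
    if not text:
        return 0
    return max(1, math.ceil(len(text) / chars_per_token))


def savings_ratio(before: str, after: str) -> float:
    """Fraction of estimated tokens removed by filtering (0.0 if there was nothing to save)."""
    before_tokens = estimate_tokens(before)
    if before_tokens == 0:
        return 0.0
    return 1 - estimate_tokens(after) / before_tokens


if __name__ == "__main__":
    # Paste the raw and filtered command outputs here to compare them.
    raw_output = "Enumerating objects: 5, done.\nCounting objects: 100% (5/5), done.\n"
    filtered_output = "ok"
    print(f"estimated savings: {savings_ratio(raw_output, filtered_output):.0%}")
```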

Reducing token burn rate by configuring Claude

Claude output tokens are significantly more expensive than input tokens, so you might want to address output first. There isn’t one single high-impact way to cut down on tokens burned, but you can tweak your CC settings to reduce Claude’s verbosity and adjust your general usage habits.

You can start by adding specific instructions to your system prompts that guide Claude to be less wordy. You’ll also see improvements if you clearly instruct Claude on how to complete a task (it burns significant tokens on its own guesswork and research). Defining your project’s architecture in your CLAUDE.md will also limit the number of tokens Claude burns analyzing the project’s structure.

And while MCP servers can be great, they are massively token-inefficient. In many cases, the better option is to use the CLI equivalent and instruct CC on how to use it with a Markdown file.

Toggling off the auto-compact buffer will also give you back a decent amount of tokens.

Tracking your Claude Code token usage

Check out this guide on tracking Claude Code tokens.

You can run the /cost command within a Claude Code session to see token usage for your current session, expressed as the monetary value of the tokens you’ve used. Alternatively, run /context to see what is filling your current context window, with a token breakdown by category.

We’ve found that third-party tools are great, because they give even more info than stock Claude Code. ccusage is a lightweight CLI tool that reads and analyzes CC’s local JSONL files to get token metrics. You can download and run it in one step with npx ccusage@latest.
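To give a feel for what reading token metrics from JSONL involves, here is a minimal Python sketch that totals token counts across records. It is not ccusage's implementation. The record layout is an assumption: each line is taken to be a JSON object that may contain a "message" object with a "usage" object holding "input_tokens" and "output_tokens". Check your own log files before relying on these field names.

```python
import json
from collections import Counter
from typing import Iterable

# Assumed field names inside each record's message.usage object.
USAGE_KEYS = ("input_tokens", "output_tokens")


def sum_usage(lines: Iterable[str]) -> Counter:
    """Total the assumed usage fields across JSONL lines, skipping anything malformed."""
    totals: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # partial or corrupt line
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # records without usage data are ignored
        for key in USAGE_KEYS:
            value = usage.get(key)
            if isinstance(value, int):
                totals[key] += value
    return totals


if __name__ == "__main__":
    sample = [
        '{"message": {"usage": {"input_tokens": 10, "output_tokens": 5}}}',
        '{"type": "summary"}',
    ]
    print(dict(sum_usage(sample)))
```

Because the function accepts any iterable of strings, you can pass it an open file object directly to process a log file line by line without loading it all into memory.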

Claude-Code-Usage-Monitor tracks live token consumption and burn rate, and estimates when you’re likely to hit your limits.
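The arithmetic behind a burn rate and a time-to-limit estimate can be sketched in a few lines of Python. This is a simple linear extrapolation under the assumption that usage continues at the same average pace. It is not how Claude-Code-Usage-Monitor necessarily computes its predictions, and the function names and example numbers are hypothetical.

```python
from typing import Optional


def burn_rate(tokens_used: int, elapsed_minutes: float) -> float:
    """Average tokens consumed per minute over the elapsed window."""
    if elapsed_minutes <= 0:
        raise ValueError("elapsed_minutes must be positive")
    return tokens_used / elapsed_minutes


def minutes_until_limit(tokens_used: int, token_limit: int, rate: float) -> Optional[float]:
    """Minutes left before the limit at a constant rate, or None if the rate is not positive."""
    remaining = token_limit - tokens_used
    if remaining <= 0:
        return 0.0  # already at or past the limit
    if rate <= 0:
        return None  # no consumption, so no meaningful estimate
    return remaining / rate


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    rate = burn_rate(tokens_used=30_000, elapsed_minutes=60)
    print(rate, minutes_until_limit(30_000, 100_000, rate))
```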

Environments for Claude Code

When you’re developing with Claude Code, you’ll want to make sure that every new feature is tested extensively in a secure, isolated environment. Ephemeral environments are fast enough to keep up with CC: you can spin up an environment automatically based on a branch/PR, run tests, do QA, push patches, and then merge once you’ve determined it’s ready.

Shipyard is a plug-and-play ephemeral environment solution for devs using Claude Code. Claude can interact with the environments on its own via MCP/CLI (pull logs, get each live URL, visit the environments with Playwright MCP, etc). Try it free for 30 days and see how much faster your dev/test loop gets.

