Comparison

Caveman and prompt caching save different things.

Caveman-style rules mainly target visible output. Prompt caching targets repeated input context. For AI coding work, both can matter, but they should not be measured as the same saving.

Output compression is easiest to see.

Shorter agent responses feel better immediately. The saving is visible, but it may be a smaller share of the total bill.

Input discipline is often the bigger leak.

Long memory files, repeated project summaries, and oversized tool outputs can consume more context than the final answer. A compact cross-tool memory pack helps reduce this waste.

Measure both.

Use the simulator to estimate visible output reduction, then keep bill impact separate so you do not make false claims.

Run the simulator