How can I use Claude Code (Opus) efficiently: fewer tokens, but more output?

### Token-Saving Hacks for Generating Long Lists of Approaches/Ideas with LLMs (Like Claude Opus 4.5)


Here’s a long list of practical hacks to minimize token consumption while still generating extensive lists of approaches, ideas, or options. The focus is on prompt engineering, workflow tweaks, and structural optimizations, aiming to "consume less but create many." I’ve drawn from common 2026 LLM power-user patterns (e.g., via Claude Code/API, but broadly applicable). They’re ranked roughly by ease of implementation, from simplest to most involved. Each includes why it saves tokens and a quick example.


1. **Batch sub-queries in a single prompt**: Instead of multiple calls, pack 5–10 related mini-prompts into one. Opus counts the whole input/output as one pass, so you get a mega-list without repeated context reloading.  

   *Savings*: Cuts overhead by 30–50% per item.  

   *Example*: "Generate 20 marketing strategies: First 5 for B2B SaaS; next 5 for e-commerce; next 5 for nonprofits; last 5 hybrid. Keep each to 1–2 sentences."


2. **Use numbered/bulleted skeletons upfront**: Prompt the LLM to output in a strict, terse format (e.g., "1. Idea: [brief desc]. Pros: [1-2]. Cons: [1-2]."). This forces concise responses, reducing output tokens while still yielding 50+ items.  

   *Savings*: Output shrinks by 40–60% without losing depth.  

   *Example*: "List 30 UI design approaches for a dashboard. Format: #. [Name]: [20-word desc]."


3. **Leverage few-shot examples with abbreviations**: Provide 2–3 abbreviated examples in the prompt, then ask for "50 more like these." The LLM infers the pattern, generating volume with minimal guidance.  

   *Savings*: Reduces input tokens by reusing patterns.  

   *Example*: "Ex1: Solar panel hack - mirrors for focus. Ex2: Wind turbine - urban kites. Generate 40 more green energy hacks, abbreviated like above."


4. **Chain partial lists recursively (low-depth)**: Ask for "first 20 approaches," then in the next prompt: "Continue from #21 with 20 more, no recap." Keep chains short (2–4 prompts) to build 100+ items.  

   *Savings*: Avoids re-sending full context each time.  

   *Example*: Prompt 1: "List first 25 plot twists for sci-fi novel." Prompt 2: "Next 25, starting at 26."


5. **Prompt for categories first, then expand selectively**: Get a high-level list of 10–15 categories (cheap), then only drill into 3–5 with "generate 10 approaches each." Total: 50+ detailed items for less overall.  

   *Savings*: Skips exhaustive expansion on everything.  

   *Example*: "Categorize 15 ways to optimize code. Then detail 10 approaches for the top 4 categories."


6. **Use token-efficient delimiters and compression**: Structure prompts with | or ; separators for lists, and instruct "use shorthand: no full sentences, just key phrases." Compresses output density.  

   *Savings*: 20–40% output reduction.  

   *Example*: "50 fitness routines: Name|Duration|Equipment|Benefits (comma sep)."


7. **Generate in waves with self-summarization**: Prompt for 20 ideas, then "summarize the above in 50 words and generate 20 more unique ones." Builds iteratively without full re-input.  

   *Savings*: Summary acts as cheap context.  

   *Example*: For recipe ideas: Wave 1 gets 20; wave 2 summarizes + adds 20 new.


8. **Exploit parallel API calls**: With the Claude API, fire concurrent requests for sub-lists (e.g., 5 calls of 10 items each = 50 total). Each request's context stays tiny.  

   *Savings*: Parallelism means no sequential token buildup.  

   *Example*: Script: async calls for "10 SEO tactics for [niche1]", "[niche2]", etc.
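    A minimal asyncio sketch of that fan-out. The actual API call is stubbed (a real version would use something like the Anthropic SDK's `AsyncAnthropic().messages.create`); the niches and stub output are invented so the structure runs on its own:

    ```python
    import asyncio

    NICHES = ["local bakeries", "indie games", "B2B SaaS", "fitness apps", "podcasts"]

    async def call_model(prompt: str) -> str:
        # Placeholder for a real API call; simulated so the fan-out
        # pattern is runnable without credentials.
        await asyncio.sleep(0)  # stand-in for network latency
        return f"[10 tactics for {prompt.split(':', 1)[1].strip()}]"

    async def fan_out() -> list[str]:
        prompts = [f"List 10 SEO tactics for: {n}" for n in NICHES]
        # gather() fires all requests concurrently; each one carries
        # only its own tiny prompt, with no shared growing context.
        return await asyncio.gather(*(call_model(p) for p in prompts))

    results = asyncio.run(fan_out())
    print(len(results))  # 5 sub-lists, ~50 items total
    ```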


9. **Prompt for "variations on a theme" loops**: Start with 5 core ideas, then "generate 10 variations each, numbered as 1a-1j, etc." Gets 50+ with minimal new invention.  

   *Savings*: Builds on existing, reducing creative compute.  

   *Example*: "5 base app monetization models, then 10 tweaks per model."


10. **Use negative prompts to focus output**: Add "avoid: verbose explanations, examples, pros/cons unless asked." Forces bare lists, expandable later.  

    *Savings*: Trims fluff by 50%.  

    *Example*: "100 keyword ideas for 'AI tools'. Just list, no descriptions."


11. **Hybrid human-LLM filtering**: Generate a bloated list (e.g., 200 raw ideas) in one go, then manually cull to 50 and prompt "refine these 50." Offloads to you.  

    *Savings*: One big generation vs. many small.  

    *Example*: "Brainstorm 150 startup pivots, raw list only."


12. **Template-based generation**: Pre-define a template in prompt (e.g., "[Adjective] [Noun] [Verb] for [Goal]"), then "generate 40 filled templates."  

    *Savings*: Pattern-matching is token-cheap.  

    *Example*: For slogans: Template "[Fun] [Animal] [Action] your [Product]."


13. **Incremental specificity**: Start broad ("50 high-level strategies"), then one follow-up: "Add details to #1-10 only." Scale as needed.  

    *Savings*: Defers depth.  

    *Example*: "50 business models, summaries first; detail top 15 later."


14. **Combine with external tools (e.g., regex post-processing)**: Generate a semi-structured list, then use code to expand/extract (zero LLM tokens post-gen).  

    *Savings*: Offloads to code.  

    *Example*: Prompt for "50 phrases", then Python script permutations for 500+.
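    A quick illustration of the multiplier (the word lists are made up): ask the model for a few structured fragments once, then let `itertools.product` do the combinatorial inflation at zero token cost:

    ```python
    from itertools import product

    # Pretend the model returned these semi-structured fragments.
    adjectives = ["fast", "simple", "private"]
    nouns = ["search", "notes", "backups"]
    suffixes = ["for teams", "for solo devs", "on mobile"]

    # Code expands 9 fragments into 27 phrases; no LLM tokens spent.
    phrases = [f"{a} {n}" for a, n in product(adjectives, nouns)]
    expanded = [f"{p} {s}" for p, s in product(phrases, suffixes)]
    print(len(expanded))  # 3 * 3 * 3 = 27
    ```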


15. **Prompt for "clustered" lists**: "Group 40 approaches into 8 clusters of 5, with cluster titles." Gets organization + volume cheaply.  

    *Savings*: Clustering reuses ideas implicitly.  

    *Example*: "40 productivity hacks, clustered by time of day."


16. **Use continuation prompts in the API**: When a long output gets truncated, prompt "continue from the last item" with a short snippet instead of re-sending everything.  

    *Savings*: Avoids full re-prompts.  

    *Example*: If cut at #30, next: "Resume from #31: [last sentence]. Generate to #60."
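    Sketched as a loop, with the model call stubbed. A real Anthropic API response does report a stop reason (e.g. `stop_reason == "max_tokens"` when output was cut off), which is what the loop keys on; the 30-items-per-response limit here is simulated:

    ```python
    def call_model(prompt: str, state: dict) -> tuple[str, str]:
        # Placeholder for a real API call; simulates a model that can
        # only emit 30 numbered items per response before truncating.
        start = state["next"]
        end = min(start + 29, 60)
        state["next"] = end + 1
        chunk = "\n".join(f"{i}. idea" for i in range(start, end + 1))
        return chunk, ("max_tokens" if end < 60 else "end_turn")

    state = {"next": 1}
    items: list[str] = []
    prompt = "List 60 plot twists, numbered."
    reason = "max_tokens"
    while reason == "max_tokens":
        chunk, reason = call_model(prompt, state)
        items.extend(chunk.splitlines())
        # Re-anchor with only the last line, not the whole transcript.
        prompt = f"Resume from the item after: {items[-1]}"
    print(len(items))  # 60
    ```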


17. **Few-token triggers for expansion**: Prompt once for 100 short items, then use tiny prompts like "Elaborate #5, #17, #42" in batches.  

    *Savings*: Micro-prompts are cheap.  

    *Example*: Base list first, then selective depth.


18. **Leverage model’s internal knowledge for bootstrapping**: Ask "recall 20 famous examples from history/books, then adapt each to my scenario." Builds on pre-trained data.  

    *Savings*: Less invention needed.  

    *Example*: "20 historical inventions, adapt to modern AI apps."


19. **Batch synonyms/antonyms for inflation**: Generate a core list of 10, then "for each, add 5 synonyms as new approaches." Instant 60.  

    *Savings*: Rephrasing is low-effort for LLM.  

    *Example*: "10 core diets, then 5 variant names/descriptions each."


20. **Prompt for "matrix" structures**: "Create a 10x5 matrix of [rows: niches] x [columns: tactics]." Yields 50 cells, each a mini-approach.  

    *Savings*: Tabular forces brevity.  

    *Example*: Rows: industries; Columns: growth hacks.


21. **Self-referential prompting**: "Generate 10 ideas, then use them to inspire 10 more, then 10 from those." Chain within one prompt.  

    *Savings*: Single call for multiples.  

    *Example*: "Step 1: 15 base. Step 2: 15 derived. Step 3: 15 hybrids."


22. **Compress input with summaries**: If the context is heavy, summarize the prior conversation in ~100 words and prompt against that instead.  

    *Savings*: Shrinks input by 70%.  

    *Example*: "Based on this summary [paste], list 40 extensions."


23. **Use "random sampling" illusion**: Prompt "simulate generating 1000 ideas and pick the top 50 diverse ones." Gets quality volume without actual scale.  

    *Savings*: LLM fakes the breadth cheaply.  

    *Example*: For game levels: "Top 30 from imagined 500."


24. **Parallel personas in one prompt**: "As Engineer: 10 ideas. As Designer: 10. As Marketer: 10. Merge uniques." 30+ with overlap savings.  

    *Savings*: Shared prompt space.  

    *Example*: Multi-role brainstorming.


25. **Post-generation augmentation**: Generate 20, then manually/regex add variations (e.g., "swap X with Y"). Zero tokens.  

    *Savings*: Human multiplier.  

    *Example*: Base list + find/replace for niches.
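    A tiny sketch of the find/replace multiplier (seed templates and niches are invented): three model-generated seeds become twelve variants via plain string substitution, for zero extra tokens:

    ```python
    # Pretend these three seeds came back from one small generation.
    base = [
        "Email course for {niche}",
        "Weekly {niche} newsletter",
        "{niche} starter checklist",
    ]
    niches = ["photographers", "plumbers", "teachers", "chefs"]

    # Substitution fans 3 seeds out to 12 variants, all offline.
    variants = [t.format(niche=n) for t in base for n in niches]
    print(len(variants))  # 12
    ```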


26. **Prompt for "evolutionary" lists**: "Start with idea A, mutate 5 times. From each mutation, mutate 3 more." Tree structure for 20+ cheaply.  

    *Savings*: Branching is efficient.  

    *Example*: For product features.


27. **Limit vocabulary/depth per item**: "Each approach: 10 words max, no examples." Stack 100+ easily.  

    *Savings*: Enforces minimalism.  

    *Example*: "50 taglines: under 8 words each."


28. **Combine with free tiers/alternates**: Use cheaper models (e.g., Sonnet) for initial lists, Opus only for refinement.  

    *Savings*: 50–80% cost shift.  

    *Example*: Sonnet generates 100 raw; Opus polishes 30.
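    A sketch of the routing pattern with both calls stubbed (the model names are illustrative shorthand, not exact API identifiers): the cheap model does the bulk generation, and the expensive one only ever sees the shortlist:

    ```python
    def run_model(model: str, prompt: str) -> list[str]:
        # Placeholder for real API calls: the cheap model returns a
        # big raw batch, the expensive one a small refined batch.
        n = 100 if "sonnet" in model else 30
        return [f"{model}: idea {i}" for i in range(1, n + 1)]

    raw = run_model("claude-sonnet", "Brainstorm 100 raw startup pivots, list only.")
    shortlist = raw[:30]  # human (or heuristic) cull to the keepers
    polished = run_model("claude-opus", "Refine these 30: " + "; ".join(shortlist))
    print(len(raw), len(polished))  # 100 30
    ```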


29. **Recursive summarization for volume**: "Generate 10 detailed, summarize each to 1-liner, then generate 5 new from each summary."  

    *Savings*: Summaries recycle cheaply.  

    *Example*: Idea expansion loop.


30. **Final hybrid: API looping with token counters**: Script a loop that tracks token usage, generates in ~10k-token chunks, and auto-continues until the list is done.  

    *Savings*: Precise control.  

    *Example*: While tokens < limit: append next 20.
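    A rough sketch of that loop. The model call is stubbed, and the counter is a crude chars/4 heuristic; a real API returns exact usage counts with each response, which you'd use instead:

    ```python
    def estimate_tokens(text: str) -> int:
        # Crude heuristic (~4 chars/token for English); real APIs
        # report exact input/output token usage per response.
        return max(1, len(text) // 4)

    def call_model(prompt: str, start: int) -> str:
        # Placeholder for a real call; emits the next 20 numbered items.
        return "\n".join(f"{i}. approach" for i in range(start, start + 20))

    BUDGET = 500  # stop once this many (estimated) tokens are spent
    used, items = 0, []
    while used < BUDGET:
        chunk = call_model("Continue the list.", len(items) + 1)
        used += estimate_tokens(chunk)
        items.extend(chunk.splitlines())
    print(len(items), used)
    ```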


These should give you a ton of mileage—mix and match for your projects. If you're targeting a specific domain (e.g., code, writing), I can tailor more!