A prompt-engineering playbook for code
The patterns that make code-gen prompts actually reliable.
Generic prompt-engineering advice works tolerably on code tasks. Code-specific patterns work much better. Here's the playbook experienced engineers converge on.
Provide the types
If your language has types, show them. Bad:
Write a function that processes orders.
Better:
Given this type:
type Order = { id: string; items: OrderItem[]; totalCents: number; status: "pending" | "paid" | "refunded" };
Write a function processOrder(order: Order) that marks the order paid and returns the new state.
The type is half the spec. Models use it for shape and correctness both.
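Given the Order type above, the typed prompt tends to converge on something like the following. This is a hypothetical sketch (the article doesn't show the output; OrderItem is invented here to make it self-contained):

```typescript
// Hypothetical OrderItem, invented so the sketch compiles on its own.
type OrderItem = { sku: string; quantity: number; priceCents: number };

type Order = {
  id: string;
  items: OrderItem[];
  totalCents: number;
  status: "pending" | "paid" | "refunded";
};

// Marks the order paid and returns the new state without mutating the input.
function processOrder(order: Order): Order {
  return { ...order, status: "paid" };
}
```

With the type in the prompt, the model knows the valid status values and the exact field names; without it, it has to guess both.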
Show a similar function
Models are pattern-matchers. If there's an existing function in your codebase that does something analogous, include it as a reference:
Here's how we currently handle refund events, for reference:
[paste function]
Write the equivalent handler for chargeback events. Same style and error handling.
This consistently outperforms "write a chargeback handler" with no context.
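To make the pattern concrete, here is a hypothetical pairing: the pasted reference function and the analog the model is asked to produce in the same style. All names and types here are invented for illustration:

```typescript
// Hypothetical event types, purely illustrative.
type RefundEvent = { orderId: string; amountCents: number };
type ChargebackEvent = { orderId: string; amountCents: number; reason: string };

// The existing handler, pasted into the prompt as the style reference.
function handleRefund(event: RefundEvent): { ok: boolean; error?: string } {
  if (event.amountCents <= 0) return { ok: false, error: "invalid amount" };
  // ...apply the refund...
  return { ok: true };
}

// What "same style and error handling" should produce: same return shape,
// same guard-clause validation, same error wording conventions.
function handleChargeback(event: ChargebackEvent): { ok: boolean; error?: string } {
  if (event.amountCents <= 0) return { ok: false, error: "invalid amount" };
  // ...apply the chargeback...
  return { ok: true };
}
```

The reference function pins down the return shape and validation style, which are exactly the axes an unanchored model varies on.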
Specify the file layout
Models guess wrong about where things should live in your codebase. Be explicit:
Add this function to src/server/handlers/payments.ts.
Export it from the index.
Add the corresponding test to src/server/handlers/payments.test.ts.
Ambiguous file placement is a frequent cause of "technically correct but wrong" AI edits.
Ask for tests first (or after, deliberately)
Two patterns:
- TDD style. "Write the test cases for this function first. Don't implement yet." Then implement, then iterate.
- Tests after. "Implement the function. Then write tests that would have caught any mistakes you made."
The second is unusual but valuable — it forces the model to reason about its own failure modes.
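Applied to a small helper, the tests-after pattern might surface edge cases like these. The function and its failure modes are hypothetical, invented to show the shape of the output:

```typescript
// Hypothetical function under test, inlined so the sketch is self-contained.
function clampCents(cents: number): number {
  return cents < 0 ? 0 : Math.round(cents);
}

// Tests written after implementation, targeting likely failure modes:
// negative input, fractional cents, and the zero boundary.
const cases: Array<[number, number]> = [
  [-50, 0],    // negative amounts clamp to zero
  [99.6, 100], // fractional cents round to the nearest integer
  [0, 0],      // boundary value stays put
];
for (const [input, expected] of cases) {
  if (clampCents(input) !== expected) {
    throw new Error(`clampCents(${input}) !== ${expected}`);
  }
}
```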
Constrain the output shape
For codegen, specify:
- Exact function signature.
- Exact file path.
- Whether to include JSDoc/docstrings.
- Error-handling style (throw, return error, Result type).
- Null vs undefined vs optional handling (especially in TypeScript).
Every unconstrained axis is an opening for the model to make a surprising choice.
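The error-handling constraint in particular is worth spelling out in the prompt rather than naming abstractly. A minimal sketch of what "return a Result, don't throw" might look like, with illustrative names (your codebase may already have its own Result type):

```typescript
// Illustrative Result shape; include yours in the prompt if one exists.
type Result<T, E = string> = { ok: true; value: T } | { ok: false; error: E };

// Example of the constrained style: errors come back as values, never thrown.
function parseCents(raw: string): Result<number> {
  const n = Number(raw);
  if (!Number.isInteger(n) || n < 0) {
    return { ok: false, error: `not a non-negative integer: ${raw}` };
  }
  return { ok: true, value: n };
}
```

Pasting a shape like this into the prompt is more reliable than writing "use a Result type," which the model may interpret as any of several conventions.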
Use the "context, task, constraint" triplet
<context>
[paste relevant existing code, types, helpers]
</context>
<task>
[one-paragraph description of what to do]
</task>
<constraints>
- No new dependencies
- Must handle null totalCents as zero
- Return the updated order; don't mutate the input
- Style follows existing code (see above)
</constraints>
This shape scales from simple to complex and keeps the model disciplined.
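The constraints in the triplet translate directly into checkable code properties. A hypothetical implementation satisfying the example constraints above (the type is trimmed for illustration):

```typescript
// Trimmed Order type; totalCents is nullable per the constraint list.
type Order = {
  id: string;
  totalCents: number | null; // constraint: null must be treated as zero
  status: "pending" | "paid" | "refunded";
};

// Satisfies the example constraints: no new dependencies, null totalCents
// treated as zero, input left unmutated, a new order returned.
function markPaid(order: Order): Order {
  return { ...order, totalCents: order.totalCents ?? 0, status: "paid" };
}
```

Because each constraint maps to a visible property of the output, you can verify compliance by inspection instead of re-reading the whole diff.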
Review patterns
When asking for code review:
Review this PR. Focus on:
1. Correctness — does it do what the description claims?
2. Error handling — what happens on unexpected input?
3. Type safety — are there any unsafe casts or any types?
Do not nitpick style. Do not suggest alternative approaches unless
the chosen approach is incorrect.
Vague "review this" gets you a wall of generic suggestions. Specific "focus on X, ignore Y" gets a useful review.
Debugging prompts
For "the code doesn't work":
[paste code]
Expected: <what should happen>
Got: <what actually happens, including any error message>
Find the bug. Don't rewrite the function; identify the exact line(s) that are wrong and explain why.
"Don't rewrite" is key — without it, the model will restructure everything and you can't see the actual fix.
Agent-style code tasks
For multi-file agent work:
Task: [clear goal]
Constraints:
- Preserve the public API unless the task says otherwise
- Update tests to match any changes
- Don't touch files outside <list of allowed paths>
Done when: [explicit completion criteria — tests pass, type check green, build succeeds]
"Done when" is the most skipped piece. Without it, the agent claims success when it feels like it's finished.