A research workflow from question to deliverable

A worked example, from question to deliverable. Real, not idealized. With the messy middle honestly represented.

The question

"I need to brief our leadership next Thursday on whether we should adopt an AI coding tool company-wide. Landscape, pros, cons, recommendation."

Day 1: scope and clarify

First move: refine the question.

What's our company size and tech stack?
What's the budget range?
What counts as success for this decision?
Are we considering a single tool or comparing tools?

Write the refined brief:

Research: Should [CompanyX, 200-engineer Python/TypeScript shop on GitHub] adopt a company-wide AI coding tool? Cover: top 3 candidate tools, productivity evidence from similar companies, concerns and risks, rough cost model, recommendation. 6-page report for leadership.

Now the research has direction.

Day 1 afternoon: initial scope

Run Perplexity:

"What are the top AI coding assistants in 2026? Productivity data from real companies."
"Known concerns about AI coding assistants in engineering teams."

Output: overview. Top tools (Copilot, Cursor, Claude Code, Windsurf), typical productivity claims (20-40% time savings on common tasks), concerns (security, code quality, team culture).

Day 2: depth

Run ChatGPT Deep Research:

"Detailed comparison of GitHub Copilot, Cursor, and Claude Code for enterprise deployment: features, pricing, security, case studies."

Output: 12-page report with 40+ citations. Covers the three tools, highlights feature matrices, cites several case studies.

Run Claude Research on the same question.

Same output shape, different synthesis. Notably flags where reported productivity gains vary by methodology.

Read both. Note where they agree (most things) and where they differ (specific gains on certain task types).

Day 2 evening: primary sources

From the AI reports' citations, identify primary sources:

GitHub's 2023-2024 Developer Productivity report.
A Microsoft-commissioned study on Copilot impact.
Independent studies by Atlassian and others.
Tool vendor case studies (treat with caution: these are marketing).

Download and read 5-6 primary sources. Take notes on methodology, sample, claimed effects.

Day 3 morning: talk to peers

AI research tools don't give you this. Email 4 engineering leaders at comparable companies:

"Have you rolled out an AI coding tool company-wide? What did you pick, what worked, what's your frank assessment one year in?"

Two reply within a day. Their answers are more valuable than any AI report. Note the patterns: tools chosen, adoption rate, unexpected issues.

Day 3 afternoon: synthesis

Open a doc. Structure:

Executive summary.
Landscape (top 3 tools).
Evidence of impact (with caveats on methodology).
Our context (why this might play differently for us).
Risks and concerns.
Cost model.
Recommendation.
Implementation notes.
References.

Draft each section. Move AI-generated content in; edit heavily for voice and specificity.
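The cost model section can start as a few lines of arithmetic before it becomes a chart. A minimal sketch follows; the 200-seat figure comes from the refined brief, while the seat price, loaded engineer cost, and share of time spent coding are hypothetical placeholders to swap for your own numbers. It shows the annual licence cost and the productivity gain at which the tool pays for itself.

```python
def annual_cost(seats: int, price_per_seat_month: float) -> float:
    """Licence cost per year for the whole team."""
    return seats * price_per_seat_month * 12


def annual_value(seats: int, loaded_cost: float, coding_share: float, gain: float) -> float:
    """Value of time saved per year, if `gain` applies only to coding time."""
    return seats * loaded_cost * coding_share * gain


def break_even_gain(seats: int, price_per_seat_month: float,
                    loaded_cost: float, coding_share: float) -> float:
    """Smallest productivity gain (as a fraction) that covers the licence cost."""
    return annual_cost(seats, price_per_seat_month) / (seats * loaded_cost * coding_share)


SEATS = 200               # from the refined brief
PRICE = 30.0              # hypothetical: price per seat per month
LOADED_COST = 180_000.0   # hypothetical: fully loaded cost per engineer per year
CODING_SHARE = 0.3        # hypothetical: fraction of time spent writing code

print(f"Annual licence cost: {annual_cost(SEATS, PRICE):,.0f}")
print(f"Break-even gain: {break_even_gain(SEATS, PRICE, LOADED_COST, CODING_SHARE):.2%}")
for gain in (0.05, 0.15, 0.25):
    value = annual_value(SEATS, LOADED_COST, CODING_SHARE, gain)
    print(f"Gain {gain:.0%}: value of time saved {value:,.0f}")
```

A model this simple ignores adoption rate, rollout effort, and review overhead, so treat its output as a first bound to argue about, not a forecast.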

Day 4: verify and sharpen

For every load-bearing claim:

Is it cited?
Did I verify against the primary source?
Is the source credible?
Is there counter-evidence I'm missing?

Mark claims that need more work. Research them specifically.
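One way to keep track of this is a small checklist per claim. The sketch below is illustrative only: the `Claim` class, its field names, and the two sample claims are hypothetical, chosen to mirror the four questions above. It lists the claims that still have open checks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Claim:
    """One load-bearing claim and the four checks applied to it."""
    text: str
    cited: bool = False
    verified_against_primary: bool = False
    source_credible: bool = False
    counter_evidence_checked: bool = False

    def open_checks(self) -> List[str]:
        """Names of the checks this claim has not passed yet."""
        checks = {
            "cited": self.cited,
            "verified against primary source": self.verified_against_primary,
            "source credible": self.source_credible,
            "counter-evidence checked": self.counter_evidence_checked,
        }
        return [name for name, done in checks.items() if not done]


def needs_work(claims: List[Claim]) -> List[Claim]:
    """Claims with at least one open check."""
    return [c for c in claims if c.open_checks()]


# Hypothetical placeholder claims, not taken from any real report.
claims = [
    Claim("Placeholder claim A", cited=True, verified_against_primary=True,
          source_credible=True, counter_evidence_checked=True),
    Claim("Placeholder claim B", cited=True),
]

for claim in needs_work(claims):
    print(f"{claim.text}: {', '.join(claim.open_checks())}")
```

A spreadsheet with the same four columns does the job equally well; the point is that every claim gets all four questions.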

For the "productivity gains" numbers, report the range across studies, not a single point. For example: "Reported gains range from 15% to 55% depending on task type and measurement method; rigorous studies suggest 20-30% is typical for code completion tasks."
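Turning study notes into a range is simple arithmetic. The sketch below assumes the notes were captured as (study, task type, reported gain) rows. The rows are hypothetical placeholders picked to match the range quoted above; they are not results from any actual study.

```python
from statistics import median

# Hypothetical placeholder rows: (study label, task type, reported gain in %).
studies = [
    ("Study A (placeholder)", "code completion", 20.0),
    ("Study B (placeholder)", "code completion", 30.0),
    ("Study C (placeholder)", "greenfield task", 55.0),
    ("Study D (placeholder)", "maintenance", 15.0),
]


def summarize(rows):
    """Return (min, median, max) of the reported gains."""
    gains = [gain for _, _, gain in rows]
    return min(gains), median(gains), max(gains)


def by_task(rows):
    """Return the (min, max) reported gain for each task type."""
    groups = {}
    for _, task, gain in rows:
        groups.setdefault(task, []).append(gain)
    return {task: (min(g), max(g)) for task, g in groups.items()}


low, mid, high = summarize(studies)
print(f"Reported gains range from {low:.0f}% to {high:.0f}% (median {mid:.0f}%)")
for task, (t_low, t_high) in by_task(studies).items():
    print(f"  {task}: {t_low:.0f}% to {t_high:.0f}%")
```

Grouping by task type keeps the spread visible in the brief: a wide overall range usually narrows once studies measuring different things are separated.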

Day 5: review

Get a skeptical colleague to read it. Their pushback ("where did this number come from?") maps to places I need to strengthen.

Revise. Tighten.

Day 6: final polish

Charts for the cost model.
Clear recommendation with rationale.
Acknowledge uncertainty where it exists.
References cleaned up.

Day 7: deliver

Ship the brief. Present it. Take questions.

Good question that comes up: "Your productivity gains range is wide — what would you need to see in our own pilot to validate?"

You have an answer because you understand where the numbers come from.

What made this work

AI tools accelerated the first 60% of the research — mapping the landscape, finding sources, initial synthesis.
Primary sources and peer conversations did the remaining 40% — the part that distinguished this brief from a generic AI-produced report.
Verification and skeptical review prevented embarrassments.
Structure and voice — the final deliverable is yours, not AI-generated.

What would have failed

Shipping the ChatGPT Deep Research output directly.
Skipping primary sources.
Not talking to peers.
Confident single-number claims without methodology.

The pattern

AI research tools are accelerators. They don't replace the parts of research that matter most: asking the right question, verifying important claims, applying your own context, engaging with pushback.

Use them for what they do well. Do the rest yourself.

