# Bug Report Writing

## Before you start

Gather the following from the user. If anything is missing, ask before proceeding:

1. **What broke?** — The specific behavior that is wrong (not "it doesn't work")
2. **Where?** — URL, screen, API endpoint, CLI command, or code path
3. **When did it start?** — First observed date, or the deploy/commit that introduced it
4. **Who is affected?** — All users, specific roles, certain browsers, percentage of requests
5. **How severe?** — Is there a workaround? Is data at risk? Is revenue impacted?

If the user says something vague like "search is broken," push back: "What did you search for, what did you expect to see, and what appeared instead?"

## Bug report template

### Title

Use the format: `[Component] Verb describing the failure — [trigger condition]`

**Good**: "[Checkout] Payment form submits twice when user double-clicks Pay button"
**Bad**: "Payment bug" or "Checkout issue" or "Double charge problem"

### Summary

2-3 sentences covering WHAT is broken, WHO it affects, and WHEN it started. Include impact metrics if available.

### Reproduction Steps

Write numbered steps that a developer with no context can follow to reproduce the bug. Each step must describe exactly one action.

Precision rules:
- State exact inputs, not "enter some data"
- Include the starting state ("logged in as admin," "empty database," "feature flag X enabled")
- Note timing if relevant ("within 500ms," "after waiting 30 seconds")
- Specify the reproduction rate: always, intermittent (X/10 attempts), or one-time
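Applied to the double-click payment example above, steps that follow these rules might look like this (the account setup, flag name, and network condition are illustrative, not real):

```
1. Log in as a non-admin user with a saved credit card (feature flag fast_checkout disabled).
2. Add any single item to the cart and proceed to /checkout.
3. On the payment step, double-click the Pay button within ~300ms.
4. Open the account's payments list.

Reproduction rate: 10/10 attempts on a throttled (Slow 3G) connection.
```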

### Expected vs Actual Behavior

State both explicitly. Never assume the reader knows what "correct" means.

```
Expected: Clicking Pay multiple times submits only one payment.
Actual: Each click submits an independent payment request. User is charged per click.
```

### Environment

Include every detail needed to reproduce: app version/commit, browser, OS, device, account type, feature flags, and API environment. The missing detail is always the one that matters.
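A filled-in environment block for the checkout example might read as follows (every version, commit, flag, and hostname below is invented for illustration):

```
App: web 2024.05.2 (commit a1b2c3d)
Browser/OS: Chrome 124 on macOS 14.4; also reproduced on Firefox 125
Device: desktop
Account: standard (non-admin), saved payment method on file
Feature flags: fast_checkout=off
API environment: staging (api.staging.example.com)
```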

### Severity Classification

Assign one level with justification:

| Severity | Definition | Response expectation |
|----------|-----------|---------------------|
| **Critical** | Data loss, security breach, complete feature outage, revenue loss | Drop everything. Fix or mitigate within hours. |
| **High** | Major feature degraded, no workaround, affects many users | Fix within the current sprint. |
| **Medium** | Feature partially broken, workaround exists, affects some users | Schedule in next 1-2 sprints. |
| **Low** | Cosmetic issue, minor inconvenience, affects few users | Backlog. Fix when convenient. |

Always include a justification, not just the label: "Critical — active revenue loss, no automated mitigation, requires manual refunds."

### Screenshots / Logs

Attach evidence: screenshots, screen recordings, log entries with timestamps, network request/response pairs, and verbatim error messages. Redact secrets, never diagnostics.
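The "redact secrets, never diagnostics" rule can be partly automated. A minimal sketch, assuming bearer-token auth (the log line and token format below are invented):

```shell
# Mask bearer tokens in a captured log excerpt before attaching it.
# The token pattern is an assumption -- adapt it to your auth scheme.
echo 'Authorization: Bearer sk_live_51abc123' \
  | sed -E 's/(Bearer )[A-Za-z0-9_]+/\1[REDACTED]/'
# prints: Authorization: Bearer [REDACTED]
```

The same substitution works on a whole log file, leaving timestamps, status codes, and payloads intact for the engineer reading the report.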

### Root Cause Analysis (post-investigation)

Fill this section after debugging. Leave it as a placeholder in the initial report. Cover three things:

- **Root cause**: The specific code path, configuration, or interaction that caused the failure
- **Fix**: What was changed and why it resolves the issue
- **Prevention**: What test, check, or safeguard will prevent recurrence
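Continuing the double-click payment example, a completed RCA might read like this (the cause and fix described are hypothetical):

```
Root cause: The Pay button's click handler fired a new payment request on
every click; neither the client nor the server enforced idempotency.
Fix: Disable the button after the first click and send an idempotency key;
the server now rejects duplicate keys.
Prevention: Integration test that double-clicks Pay and asserts exactly one
charge is created.
```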

## Good vs bad examples

**Bad**: "The page is slow sometimes."
**Good**: "The /dashboard page takes 8-12 seconds to load for accounts with >500 projects. Query `SELECT * FROM projects WHERE account_id = ?` runs a full table scan (see EXPLAIN output attached). Accounts with <50 projects load in <1s."

**Bad**: "Login doesn't work on mobile."
**Good**: "[Auth] Login form fails to submit on iOS Safari 17 — keyboard 'Go' button does not trigger form submission. Tapping the Login button directly works. Affects all iOS Safari users. Workaround: tap the Login button instead of using keyboard submit."

## Quality checklist

Before submitting a bug report, verify:

- [ ] Title identifies the component, the failure, and the trigger
- [ ] Reproduction steps are numbered, specific, and independently followable
- [ ] Expected and actual behavior are both stated explicitly
- [ ] Environment section includes version, browser/OS, account type, and feature flags
- [ ] Severity is assigned with a justification, not just a label
- [ ] Screenshots or logs are attached as evidence, not just described
- [ ] No assumptions are embedded — every claim is backed by an observation

## Common mistakes

- **Combining multiple bugs in one report.** "Login is slow AND the forgot-password link is broken" is two bugs. File them separately so they can be triaged, assigned, and resolved independently.
- **Reproduction steps that require telepathy.** "Do the thing that causes the error" is not a step. Write steps so that someone who has never seen the product can reproduce the issue.
- **Skipping the expected behavior.** "It shows an error" means nothing without "It should show the user's dashboard." Developers cannot fix what they do not know is wrong.
- **Severity inflation.** Not every bug is critical. Mislabeling severity erodes trust and causes real critical bugs to be deprioritized. Use the definitions above honestly.
- **Redacting too much.** Removing the error message, stack trace, or request payload "for brevity" removes the information engineers need most. Redact secrets, not diagnostics.