engineering

Codebase Exploration

Systematic methodology for understanding unfamiliar repositories — finding entry points, mapping architecture layers, tracing data flows, and identifying patterns and conventions used across the codebase.

onboarding · architecture · code-reading · codebase · exploration · reverse-engineering

Works well with agents

Code Explainer Agent · Codebase Onboarder Agent · Software Architect Agent · Tech Lead Agent

Works well with skills

Architecture Decision Record · Code Review Checklist · System Design Document
$ npx skills add The-AI-Directory-Company/(…) --skill codebase-exploration
codebase-exploration/
    • exploration-summary-nextjs-saas.md (10.5 KB)
    • framework-entry-points.md (9.1 KB)
  • SKILL.md (6.1 KB)
SKILL.md
# Codebase Exploration

## Before you start

Gather the following from the user. If anything is missing, ask before proceeding:

1. **What is the repository?** — URL or local path to the codebase
2. **What is the goal?** — Bug fix, feature addition, general understanding, onboarding, or audit
3. **What do you already know?** — Language, framework, or prior context (even partial)
4. **What is the scope?** — Entire repo, a specific subsystem, or a single feature flow
5. **What is the time budget?** — Quick orientation (30 min) or deep mapping (hours)

## Exploration procedure

### 1. Read the Project Manifest

Start with the files that declare what the project is and how it runs:

- `README.md`, `CONTRIBUTING.md`, `CLAUDE.md` — stated architecture, setup, conventions
- `package.json`, `Cargo.toml`, `pyproject.toml`, `go.mod` — language, dependencies, scripts
- `Dockerfile`, `docker-compose.yml`, `.env.example` — runtime environment and services
- CI config (`.github/workflows/`, `.gitlab-ci.yml`) — build steps reveal the dependency graph

Record: language, framework, build tool, test runner, deployment target.

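The manifest check above can be scripted in a few lines of shell. A toy `package.json` is created first so the commands run as-is; at a real repo root you would drop the fixture and point both queries at the actual files.

```shell
# Toy manifest so the queries below are runnable anywhere; on a real
# repo, skip this fixture and run the queries from the repo root.
mkdir -p manifest-demo
cat > manifest-demo/package.json <<'EOF'
{
  "name": "demo-app",
  "scripts": { "build": "tsc", "test": "vitest run", "dev": "next dev" }
}
EOF

# Which manifest file exists? That alone pins down the ecosystem.
ls manifest-demo/package.json manifest-demo/Cargo.toml \
   manifest-demo/pyproject.toml manifest-demo/go.mod 2>/dev/null || true

# Script names are the project's canonical commands -- record them.
grep -oE '"(build|test|start|dev|lint)"[[:space:]]*:' manifest-demo/package.json
```

The same pattern applies to `Cargo.toml` workspaces or `pyproject.toml` scripts; only the grep changes.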
### 2. Map the Directory Structure

Run a shallow tree (depth 2-3) and classify each top-level directory:

- **Entry points**: `src/index.*`, `app/`, `cmd/`, `main.*`
- **Configuration**: config files, env schemas, feature flags
- **Domain logic**: models, services, use-cases, controllers
- **Data access**: repositories, queries, migrations, ORM schemas
- **API surface**: routes, handlers, resolvers, RPC definitions
- **Shared utilities**: libs, helpers, utils, common
- **Tests**: test directories, fixture files, factories

Sketch a layer diagram: entry point -> routing -> handlers -> domain -> data access -> external services.

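As a sketch of that classification pass, the loop below labels top-level directories using the naming conventions listed above. The toy layout and the `case` patterns are assumptions; adjust both to the repo at hand.

```shell
# Toy layout so the loop runs anywhere; on a real repo, delete the
# mkdir line and run the rest from the repo root.
mkdir -p repo-demo/src repo-demo/migrations repo-demo/tests repo-demo/utils

# Shallow view (depth 1); `tree -L 2` gives the richer version if installed.
find repo-demo -maxdepth 1 -type d | sort

# First-pass role labels; the case patterns are common conventions only.
for d in repo-demo/*/; do
  name=$(basename "$d")
  case "$name" in
    src|app|cmd)      role="entry points / domain logic" ;;
    migrations|db)    role="data access" ;;
    tests|spec)       role="tests" ;;
    utils|lib|common) role="shared utilities" ;;
    *)                role="unclassified - open it" ;;
  esac
  printf '%-12s %s\n' "$name" "$role"
done | tee dir-roles.txt
```

Anything that lands in "unclassified" is exactly where manual reading should start.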
### 3. Trace the Primary Data Flow

Pick the most important user action (e.g., "user signs up", "order is placed") and trace it end-to-end:

1. Find the route or entry point that handles it
2. Follow the handler into service/domain logic
3. Identify every database query, API call, or side effect
4. Note the response path back to the caller
5. Record each file touched and its role in the flow

This single trace reveals naming conventions, error handling patterns, and the project's layering strategy.

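A minimal grep-driven version of this trace, run against two toy files standing in for a real route/handler split. On a real repo, point the same patterns at the actual tree; ripgrep's `rg -n` accepts them too.

```shell
# Two toy files so the trace below is runnable as-is.
mkdir -p trace-demo
cat > trace-demo/routes.js <<'EOF'
app.post('/signup', signupHandler); // entry point for "user signs up"
EOF
cat > trace-demo/handlers.js <<'EOF'
async function signupHandler(req) {
  const user = await db.users.insert(req.body); // side effect: DB write
  return { id: user.id };                       // response path
}
EOF

grep -rn "'/signup'" trace-demo       # 1. find the route
grep -rn "signupHandler" trace-demo   # 2. follow the handler across files
grep -rn "db\." trace-demo            # 3. enumerate side effects
```

Each match is a file to add to the trace record, together with its role in the flow.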
### 4. Identify Patterns and Conventions

Look for recurring structural patterns across 3-5 files of the same type:

- **Naming**: How are files, functions, variables, and types named?
- **Error handling**: Exceptions, result types, error codes, or error boundaries?
- **State management**: Global store, context, dependency injection, or passed parameters?
- **Authentication/authorization**: Middleware, decorators, guards, or inline checks?
- **Testing style**: Unit-heavy, integration-heavy, or end-to-end? Mocks or real dependencies?

Document each pattern with a concrete file reference.

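One cheap way to avoid declaring a convention from a single file is to count hits for each candidate pattern across the source tree. The fixture files and the three patterns below are illustrative only; substitute whatever steps 1-2 actually surfaced.

```shell
# Fixture files so the count is deterministic; on a real repo, grep
# the actual source directory instead of style-demo.
mkdir -p style-demo
printf 'throw new Error("nope")\n' > style-demo/a.ts
printf 'throw new Error("bad")\n'  > style-demo/b.ts

# Vote: which error-handling style actually dominates?
for pat in "throw new" "Result<" "err != nil"; do
  hits=$(grep -rF "$pat" style-demo 2>/dev/null | wc -l | tr -d ' ')
  echo "$pat: $hits hits"
done
```

A pattern with a handful of hits against a dominant alternative is likely the exception, not the convention.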
### 5. Map External Dependencies

Identify every external system the codebase communicates with:

- Databases and caches (connection strings, ORM config)
- Third-party APIs (HTTP clients, SDK imports)
- Message queues or event buses
- File storage (S3, local disk)
- Authentication providers

For each, note: what module owns the integration, how errors are handled, and whether there is a fallback.

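`.env.example`, when present, is often the fastest inventory of external systems: every variable names something the code connects to. A sketch, using a made-up example file so the commands run as-is:

```shell
# Made-up .env.example standing in for the repo's real one.
cat > env-demo.example <<'EOF'
DATABASE_URL=postgres://localhost/app
REDIS_URL=redis://localhost:6379
STRIPE_SECRET_KEY=
S3_BUCKET=app-uploads
EOF

# One line per external system to chase down to its owning module.
grep -E '^[A-Z][A-Z0-9_]*=' env-demo.example | cut -d= -f1
```

Each name then becomes a grep target: find where it is read, and you have found the module that owns the integration.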
### 6. Locate the Test Suite

Find where tests live and assess coverage:

- Run the test command from the manifest (e.g., `npm test`, `pytest`)
- Identify which areas have dense coverage and which have none
- Check for test utilities, factories, or fixtures that reveal domain assumptions

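A rough coverage map is available before running anything: cluster test files by directory and look for source modules with no tests nearby. The layout below is a fixture so the pipeline runs as-is; the filename globs are common conventions, not a complete list.

```shell
# Fixture layout; on a real repo, run only the find pipeline at the root.
mkdir -p cov-demo/tests cov-demo/src
touch cov-demo/tests/test_auth.py cov-demo/tests/test_orders.py
touch cov-demo/src/billing.py   # no matching test -- a coverage gap

# Count test files per directory to see where coverage clusters.
find cov-demo \( -name 'test_*.py' -o -name '*_test.go' -o -name '*.spec.ts' \) \
  | xargs -n1 dirname | sort | uniq -c
```

Directories with source files but zero entries in this count go straight into the risks section of the summary.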
### 7. Produce the Exploration Summary

Deliver a structured summary:

| Section | Content |
|---------|---------|
| Stack | Language, framework, runtime, key libraries |
| Architecture | Layer diagram or description |
| Entry points | Main files that start the application |
| Primary data flow | Step-by-step trace of the core user action |
| Patterns | Naming, error handling, state, auth conventions |
| External deps | Every external system and its integration module |
| Test coverage | Where tests exist, where they are missing |
| Risks/concerns | Dead code, circular deps, missing docs, unclear ownership |

## Quality checklist

Before delivering the exploration summary, verify:

- [ ] The project manifest was read and the stack is identified correctly
- [ ] Directory structure is classified by responsibility, not just listed
- [ ] At least one end-to-end data flow is traced with specific file references
- [ ] Patterns are documented with concrete examples, not guessed
- [ ] External dependencies are enumerated with owning modules
- [ ] Test coverage gaps are identified
- [ ] The summary is structured and scannable, not a wall of text

## Common mistakes

- **Jumping straight to code without reading the manifest.** The README, package manager config, and CI files answer half your questions in 5 minutes.
- **Listing files instead of classifying them.** A directory listing is not understanding. Every folder should have a role label.
- **Stopping at the surface layer.** Reading route definitions without tracing into handlers and data access misses the actual architecture.
- **Assuming conventions from one file.** Check at least 3 files of the same type before declaring a pattern. One file might be an exception.
- **Ignoring the test suite.** Tests are executable documentation. They reveal intended behavior, edge cases, and which parts the team considers important.
- **Producing an unstructured brain dump.** The output should be a reference someone can scan in 2 minutes, not a narrative essay.


©2026 ai-directory.company
