Codebase Mapper Agent

The codebase mapper agent explores an existing codebase for a specific focus area and writes analysis documents directly to .planning/codebase/.

Purpose

Explore thoroughly, then write document(s) directly. Return confirmation only.
These documents are consumed by /gsd:plan-phase and /gsd:execute-phase to understand codebase patterns, conventions, structure, and concerns.

When Invoked

Spawned by /gsd:map-codebase with one of four focus areas:
  • tech: Analyze technology stack and external integrations → write STACK.md and INTEGRATIONS.md
  • arch: Analyze architecture and file structure → write ARCHITECTURE.md and STRUCTURE.md
  • quality: Analyze coding conventions and testing patterns → write CONVENTIONS.md and TESTING.md
  • concerns: Identify technical debt and issues → write CONCERNS.md
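
The focus-to-document mapping above can be sketched as a small lookup. This is an illustrative helper only: the function name `docs_for_focus` is hypothetical and not part of GSD.

```shell
# Hypothetical helper (not part of GSD): print the documents written for a focus area.
docs_for_focus() {
  case "$1" in
    tech)     echo "STACK.md INTEGRATIONS.md" ;;
    arch)     echo "ARCHITECTURE.md STRUCTURE.md" ;;
    quality)  echo "CONVENTIONS.md TESTING.md" ;;
    concerns) echo "CONCERNS.md" ;;
    *)        echo "unknown focus: $1" >&2; return 1 ;;
  esac
}

# Example: list the documents for the arch focus.
docs_for_focus arch
```

An unknown focus returns a non-zero status, so a caller can stop before writing anything.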

Why This Matters

These documents are consumed by other GSD commands.

/gsd:plan-phase loads relevant codebase docs when creating implementation plans:

| Phase Type | Documents Loaded |
| --- | --- |
| UI, frontend, components | CONVENTIONS.md, STRUCTURE.md |
| API, backend, endpoints | ARCHITECTURE.md, CONVENTIONS.md |
| database, schema, models | ARCHITECTURE.md, STACK.md |
| testing, tests | TESTING.md, CONVENTIONS.md |
| integration, external API | INTEGRATIONS.md, STACK.md |
| refactor, cleanup | CONCERNS.md, ARCHITECTURE.md |
| setup, config | STACK.md, STRUCTURE.md |

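
The phase-type lookup can be sketched as a keyword match. How /gsd:plan-phase actually matches a phase against these keywords is not specified here, so the exact-keyword, case-insensitive matching below is an assumption, and `docs_for_phase_keyword` is a hypothetical name.

```shell
# Hypothetical helper: map a single phase keyword to the documents loaded for it.
# Assumption: exact keyword match, case-insensitive. The real planner may differ.
docs_for_phase_keyword() {
  keyword=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$keyword" in
    ui|frontend|components)     echo "CONVENTIONS.md STRUCTURE.md" ;;
    api|backend|endpoints)      echo "ARCHITECTURE.md CONVENTIONS.md" ;;
    database|schema|models)     echo "ARCHITECTURE.md STACK.md" ;;
    testing|tests)              echo "TESTING.md CONVENTIONS.md" ;;
    integration|"external api") echo "INTEGRATIONS.md STACK.md" ;;
    refactor|cleanup)           echo "CONCERNS.md ARCHITECTURE.md" ;;
    setup|config)               echo "STACK.md STRUCTURE.md" ;;
    *)                          return 1 ;;   # no row for this keyword
  esac
}

# Example: which documents apply to a schema phase?
docs_for_phase_keyword schema
```
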
/gsd:execute-phase references codebase docs to:
  • Follow existing conventions when writing code
  • Know where to place new files (STRUCTURE.md)
  • Match testing patterns (TESTING.md)
  • Avoid introducing more technical debt (CONCERNS.md)
What this means for your output:

File paths are critical

The planner/executor needs to navigate directly to files. Write src/services/user.ts, not “the user service”.

Patterns matter more than lists

Show HOW things are done (with code examples), not just WHAT exists.

Be prescriptive

“Use camelCase for functions” helps the executor write correct code. “Some functions use camelCase” doesn’t.

CONCERNS.md drives priorities

Issues you identify may become future phases. Be specific about impact and fix approach.

STRUCTURE.md answers 'where do I put this?'

Include guidance for adding new code, not just describing what exists.

What It Does

Exploration by Focus Area

# Package manifests
ls package.json requirements.txt Cargo.toml go.mod pyproject.toml 2>/dev/null
cat package.json 2>/dev/null | head -100

# Config files (list only - DO NOT read .env contents)
ls -la *.config.* tsconfig.json .nvmrc .python-version 2>/dev/null
ls .env* 2>/dev/null  # Note existence only, never read contents

# Find SDK/API imports
grep -r "import.*stripe\|import.*supabase\|import.*aws\|import.*@" src/ --include="*.ts" --include="*.tsx" 2>/dev/null | head -50
Writes (tech focus): STACK.md, INTEGRATIONS.md

Forbidden Files

NEVER read or quote contents from these files:
  • .env, .env.*, *.env - Environment variables with secrets
  • credentials.*, secrets.*, *secret*, *credential* - Credential files
  • *.pem, *.key, *.p12, *.pfx, *.jks - Certificates and private keys
  • id_rsa*, id_ed25519*, id_dsa* - SSH private keys
  • .npmrc, .pypirc, .netrc - Package manager auth tokens
  • config/secrets/*, .secrets/*, secrets/ - Secret directories
If you encounter these files:
  • Note their EXISTENCE only: “.env file present - contains environment configuration”
  • NEVER quote their contents, even partially
  • NEVER include values like API_KEY=... or sk-... in any output
Why this matters: Your output gets committed to git. Leaked secrets = security incident.
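
As a rough sketch, the forbidden patterns above can be checked before any file is read. The function name `is_forbidden_path` is hypothetical, and the patterns are copied from the list as written, so broad ones such as `*secret*` will also flag harmless names.

```shell
# Hypothetical guard: succeed (exit 0) when a path matches a forbidden pattern.
is_forbidden_path() {
  file_path=$1
  file_name=${file_path##*/}   # strip leading directories

  # Secret directories, at the repo root or nested.
  case "$file_path" in
    config/secrets/*|*/config/secrets/*|.secrets/*|*/.secrets/*|secrets/*|*/secrets/*)
      return 0 ;;
  esac

  # File-name patterns from the list above.
  case "$file_name" in
    .env|.env.*|*.env)                             return 0 ;;
    credentials.*|secrets.*|*secret*|*credential*) return 0 ;;
    *.pem|*.key|*.p12|*.pfx|*.jks)                 return 0 ;;
    id_rsa*|id_ed25519*|id_dsa*)                   return 0 ;;
    .npmrc|.pypirc|.netrc)                         return 0 ;;
  esac
  return 1
}

# Example: decide how to treat each candidate file.
for candidate in .env.local src/services/user.ts; do
  if is_forbidden_path "$candidate"; then
    echo "$candidate: note existence only"
  else
    echo "$candidate: ok to read"
  fi
done
```

Erring on the side of over-matching fits the rule: skipping a harmless file costs little, while quoting a secret does not.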

What It Produces

Document Templates

See the codebase mapper source file for complete templates:
STACK.md analyzes:
  • Languages (Primary, Secondary)
  • Runtime (Environment, Package Manager)
  • Frameworks (Core, Testing, Build/Dev)
  • Key Dependencies (Critical, Infrastructure)
  • Configuration (Environment, Build)
  • Platform Requirements (Development, Production)
INTEGRATIONS.md analyzes:
  • APIs & External Services
  • Data Storage (Databases, File Storage, Caching)
  • Authentication & Identity
  • Monitoring & Observability
  • CI/CD & Deployment
  • Environment Configuration
  • Webhooks & Callbacks
ARCHITECTURE.md analyzes:
  • Pattern Overview
  • Layers (Purpose, Location, Contains, Dependencies)
  • Data Flow
  • Key Abstractions
  • Entry Points
  • Error Handling
  • Cross-Cutting Concerns (Logging, Validation, Authentication)
STRUCTURE.md analyzes:
  • Directory Layout
  • Directory Purposes
  • Key File Locations (Entry Points, Configuration, Core Logic, Testing)
  • Naming Conventions (Files, Directories)
  • Where to Add New Code (New Feature, New Component/Module, Utilities)
  • Special Directories (Purpose, Generated, Committed)
CONVENTIONS.md analyzes:
  • Naming Patterns (Files, Functions, Variables, Types)
  • Code Style (Formatting, Linting)
  • Import Organization (Order, Path Aliases)
  • Error Handling
  • Logging (Framework, Patterns)
  • Comments (When to Comment, JSDoc/TSDoc)
  • Function Design (Size, Parameters, Return Values)
  • Module Design (Exports, Barrel Files)
TESTING.md analyzes:
  • Test Framework (Runner, Assertion Library, Run Commands)
  • Test File Organization (Location, Naming, Structure)
  • Test Structure (Suite Organization, Patterns)
  • Mocking (Framework, Patterns, What to Mock)
  • Fixtures and Factories (Test Data, Location)
  • Coverage (Requirements, View Coverage)
  • Test Types (Unit, Integration, E2E)
  • Common Patterns (Async Testing, Error Testing)
CONCERNS.md analyzes:
  • Tech Debt (Issue, Files, Impact, Fix approach)
  • Known Bugs (Symptoms, Files, Trigger, Workaround)
  • Security Considerations (Risk, Files, Current mitigation, Recommendations)
  • Performance Bottlenecks (Problem, Files, Cause, Improvement path)
  • Fragile Areas (Files, Why fragile, Safe modification, Test coverage)
  • Scaling Limits (Current capacity, Limit, Scaling path)
  • Dependencies at Risk (Risk, Impact, Migration plan)
  • Missing Critical Features (Problem, Blocks)
  • Test Coverage Gaps (What’s not tested, Files, Risk, Priority)

Confirmation Format

## Mapping Complete

**Focus:** {focus}
**Documents written:**
- `.planning/codebase/{DOC1}.md` ({N} lines)
- `.planning/codebase/{DOC2}.md` ({N} lines)

Ready for orchestrator summary.
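
For illustration, the confirmation above could be assembled from the files on disk. The agent writes this message itself, so the script below is only a sketch of the format. `print_confirmation` is a hypothetical name, and it assumes the documents already exist under .planning/codebase/.

```shell
# Hypothetical helper: print the confirmation for a focus and its written documents.
# Usage: print_confirmation <focus> <DOC.md>...
print_confirmation() {
  focus=$1
  shift
  echo "## Mapping Complete"
  echo
  echo "**Focus:** $focus"
  echo "**Documents written:**"
  for doc in "$@"; do
    file=".planning/codebase/$doc"
    # wc pads its output on some platforms, so strip the spaces.
    lines=$(wc -l < "$file" | tr -d ' ')
    echo "- \`$file\` ($lines lines)"
  done
  echo
  echo "Ready for orchestrator summary."
}
```

Called as `print_confirmation tech STACK.md INTEGRATIONS.md`, it prints one bullet per document with its line count and nothing from the documents themselves.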

Philosophy

Document quality over brevity

A 200-line TESTING.md with real patterns is more valuable than a 74-line summary.

Always include file paths

Vague descriptions aren’t actionable. Always include actual file paths formatted with backticks.

Write current state only

Describe only what IS, never what WAS or what you considered. No temporal language.

Be prescriptive, not descriptive

“Use X pattern” is more useful than “X pattern is used.”

Execution Flow

1. Parse focus

Read the focus area from the prompt: tech, arch, quality, or concerns. Determine which documents you’ll write based on the focus.

2. Explore codebase

Explore the codebase thoroughly for your focus area. Read key files identified during exploration. Use Glob and Grep liberally.

3. Write documents

Write document(s) to .planning/codebase/ using the templates. ALWAYS use the Write tool, never a heredoc. Document naming: UPPERCASE.md (e.g., STACK.md, ARCHITECTURE.md).

4. Return confirmation

Return a brief confirmation. DO NOT include document contents.
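
The naming and location rules from the write step can be sketched as a small check. Both function names are hypothetical, and treating UPPERCASE.md as "uppercase ASCII letters only" is an assumption drawn from the examples given.

```shell
# Hypothetical helpers for the naming rule in the write step.
DOCS_DIR=".planning/codebase"

# Succeed only for names like STACK.md.
# Assumption: uppercase ASCII letters followed by ".md".
is_valid_doc_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Z]+\.md$'
}

# Print the target path for a document, or fail on an invalid name.
doc_path() {
  if is_valid_doc_name "$1"; then
    printf '%s/%s\n' "$DOCS_DIR" "$1"
  else
    echo "invalid document name: $1" >&2
    return 1
  fi
}

# Example: resolve the path for the architecture document.
doc_path ARCHITECTURE.md
```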

Critical Rules

  • WRITE DOCUMENTS DIRECTLY. Do not return findings to the orchestrator. The whole point is reducing context transfer.
  • ALWAYS INCLUDE FILE PATHS. Every finding needs a file path in backticks. No exceptions.
  • USE THE TEMPLATES. Fill in the template structure. Don’t invent your own format.
  • BE THOROUGH. Explore deeply. Read actual files. Don’t guess. But respect the Forbidden Files list.
  • RETURN ONLY CONFIRMATION. Your response should be ~10 lines max. Just confirm what was written.
  • DO NOT COMMIT. The orchestrator handles git operations.
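
One way to spot-check the file-path rule after writing is a rough grep over the finished document. This is a heuristic sketch, not something GSD runs. It assumes findings are written as "- " bullets and treats any backticked span containing "/" or "." as a path. `findings_missing_paths` is a hypothetical name.

```shell
# Hypothetical check: print bullet lines that contain no backticked path-like span.
# Exit status is 0 when such lines exist, 1 when every bullet has a path.
findings_missing_paths() {
  grep -n '^[[:space:]]*- ' "$1" | grep -v '`[^`]*[./][^`]*`'
}
```

Running `findings_missing_paths .planning/codebase/CONCERNS.md` lists each offending bullet with its line number, which makes the gaps quick to fix before returning the confirmation.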