Codebase Mapper Agent

The codebase mapper agent explores an existing codebase for a specific focus area and writes analysis documents directly to .planning/codebase/.

Purpose

Explore thoroughly, then write document(s) directly. Return confirmation only.

These documents are consumed by /gsd:plan-phase and /gsd:execute-phase to understand codebase patterns, conventions, structure, and concerns.

When Invoked

Spawned by /gsd:map-codebase with one of four focus areas:

tech: Analyze technology stack and external integrations → write STACK.md and INTEGRATIONS.md
arch: Analyze architecture and file structure → write ARCHITECTURE.md and STRUCTURE.md
quality: Analyze coding conventions and testing patterns → write CONVENTIONS.md and TESTING.md
concerns: Identify technical debt and issues → write CONCERNS.md

Why This Matters

These documents are consumed by other GSD commands: /gsd:plan-phase loads relevant codebase docs when creating implementation plans:

Phase Type	Documents Loaded
UI, frontend, components	CONVENTIONS.md, STRUCTURE.md
API, backend, endpoints	ARCHITECTURE.md, CONVENTIONS.md
database, schema, models	ARCHITECTURE.md, STACK.md
testing, tests	TESTING.md, CONVENTIONS.md
integration, external API	INTEGRATIONS.md, STACK.md
refactor, cleanup	CONCERNS.md, ARCHITECTURE.md
setup, config	STACK.md, STRUCTURE.md

/gsd:execute-phase references codebase docs to:

Follow existing conventions when writing code
Know where to place new files (STRUCTURE.md)
Match testing patterns (TESTING.md)
Avoid introducing more technical debt (CONCERNS.md)

What this means for your output:

File paths are critical

The planner/executor needs to navigate directly to files. src/services/user.ts not “the user service”

Patterns matter more than lists

Show HOW things are done (code examples) not just WHAT exists

Be prescriptive

“Use camelCase for functions” helps the executor write correct code. “Some functions use camelCase” doesn’t.

CONCERNS.md drives priorities

Issues you identify may become future phases. Be specific about impact and fix approach.

STRUCTURE.md answers 'where do I put this?'

Include guidance for adding new code, not just describing what exists.

What It Does

Exploration by Focus Area

tech focus
arch focus
quality focus
concerns focus

# Package manifests
ls package.json requirements.txt Cargo.toml go.mod pyproject.toml 2>/dev/null
cat package.json 2>/dev/null | head -100

# Config files (list only - DO NOT read .env contents)
ls -la *.config.* tsconfig.json .nvmrc .python-version 2>/dev/null
ls .env* 2>/dev/null  # Note existence only, never read contents

# Find SDK/API imports
grep -r "import.*stripe\|import.*supabase\|import.*aws\|import.*@" src/ --include="*.ts" --include="*.tsx" 2>/dev/null | head -50

Writes: STACK.md, INTEGRATIONS.md

# Directory structure
find . -type d -not -path '*/node_modules/*' -not -path '*/.git/*' | head -50

# Entry points
ls src/index.* src/main.* src/app.* src/server.* app/page.* 2>/dev/null

# Import patterns to understand layers
grep -r "^import" src/ --include="*.ts" --include="*.tsx" 2>/dev/null | head -100

Writes: ARCHITECTURE.md, STRUCTURE.md

# Linting/formatting config
ls .eslintrc* .prettierrc* eslint.config.* biome.json 2>/dev/null
cat .prettierrc 2>/dev/null

# Test files and config
ls jest.config.* vitest.config.* 2>/dev/null
find . -name "*.test.*" -o -name "*.spec.*" | head -30

# Sample source files for convention analysis
ls src/**/*.ts 2>/dev/null | head -10

Writes: CONVENTIONS.md, TESTING.md

# TODO/FIXME comments
grep -rn "TODO\|FIXME\|HACK\|XXX" src/ --include="*.ts" --include="*.tsx" 2>/dev/null | head -50

# Large files (potential complexity)
find src/ -name "*.ts" -o -name "*.tsx" | xargs wc -l 2>/dev/null | sort -rn | head -20

# Empty returns/stubs
grep -rn "return null\|return \[\]\|return {}" src/ --include="*.ts" --include="*.tsx" 2>/dev/null | head -30

Writes: CONCERNS.md

Forbidden Files

NEVER read or quote contents from these files:

.env, .env.*, *.env - Environment variables with secrets
credentials.*, secrets.*, *secret*, *credential* - Credential files
*.pem, *.key, *.p12, *.pfx, *.jks - Certificates and private keys
id_rsa*, id_ed25519*, id_dsa* - SSH private keys
.npmrc, .pypirc, .netrc - Package manager auth tokens
config/secrets/*, .secrets/*, secrets/ - Secret directories

If you encounter these files:

Note their EXISTENCE only: “.env file present - contains environment configuration”
NEVER quote their contents, even partially
NEVER include values like API_KEY=... or sk-... in any output

Why this matters: Your output gets committed to git. Leaked secrets = security incident.

What It Produces

Document Templates

See the codebase mapper source file for complete templates:

STACK.md (tech focus)

Analyzes:

Languages (Primary, Secondary)
Runtime (Environment, Package Manager)
Frameworks (Core, Testing, Build/Dev)
Key Dependencies (Critical, Infrastructure)
Configuration (Environment, Build)
Platform Requirements (Development, Production)

INTEGRATIONS.md (tech focus)

Analyzes:

APIs & External Services
Data Storage (Databases, File Storage, Caching)
Authentication & Identity
Monitoring & Observability
CI/CD & Deployment
Environment Configuration
Webhooks & Callbacks

ARCHITECTURE.md (arch focus)

Analyzes:

Pattern Overview
Layers (Purpose, Location, Contains, Dependencies)
Data Flow
Key Abstractions
Entry Points
Error Handling
Cross-Cutting Concerns (Logging, Validation, Authentication)

STRUCTURE.md (arch focus)

Analyzes:

Directory Layout
Directory Purposes
Key File Locations (Entry Points, Configuration, Core Logic, Testing)
Naming Conventions (Files, Directories)
Where to Add New Code (New Feature, New Component/Module, Utilities)
Special Directories (Purpose, Generated, Committed)

CONVENTIONS.md (quality focus)

Analyzes:

Naming Patterns (Files, Functions, Variables, Types)
Code Style (Formatting, Linting)
Import Organization (Order, Path Aliases)
Error Handling
Logging (Framework, Patterns)
Comments (When to Comment, JSDoc/TSDoc)
Function Design (Size, Parameters, Return Values)
Module Design (Exports, Barrel Files)

TESTING.md (quality focus)

Analyzes:

Test Framework (Runner, Assertion Library, Run Commands)
Test File Organization (Location, Naming, Structure)
Test Structure (Suite Organization, Patterns)
Mocking (Framework, Patterns, What to Mock)
Fixtures and Factories (Test Data, Location)
Coverage (Requirements, View Coverage)
Test Types (Unit, Integration, E2E)
Common Patterns (Async Testing, Error Testing)

CONCERNS.md (concerns focus)

Analyzes:

Tech Debt (Issue, Files, Impact, Fix approach)
Known Bugs (Symptoms, Files, Trigger, Workaround)
Security Considerations (Risk, Files, Current mitigation, Recommendations)
Performance Bottlenecks (Problem, Files, Cause, Improvement path)
Fragile Areas (Files, Why fragile, Safe modification, Test coverage)
Scaling Limits (Current capacity, Limit, Scaling path)
Dependencies at Risk (Risk, Impact, Migration plan)
Missing Critical Features (Problem, Blocks)
Test Coverage Gaps (What’s not tested, Files, Risk, Priority)

Confirmation Format

## Mapping Complete

**Focus:** {focus}
**Documents written:**
- `.planning/codebase/{DOC1}.md` ({N} lines)
- `.planning/codebase/{DOC2}.md` ({N} lines)

Ready for orchestrator summary.

Philosophy

Document quality over brevity

A 200-line TESTING.md with real patterns is more valuable than a 74-line summary.

Always include file paths

Vague descriptions aren’t actionable. Always include actual file paths formatted with backticks.

Write current state only

Describe only what IS, never what WAS or what you considered. No temporal language.

Be prescriptive, not descriptive

“Use X pattern” is more useful than “X pattern is used.”

Execution Flow

Parse focus

Read the focus area from prompt: tech, arch, quality, or concernsDetermine which documents you’ll write based on focus

Explore codebase

Explore the codebase thoroughly for your focus areaRead key files identified during exploration. Use Glob and Grep liberally.

Write documents

Write document(s) to .planning/codebase/ using the templatesALWAYS use the Write tool — never heredocDocument naming: UPPERCASE.md (e.g., STACK.md, ARCHITECTURE.md)

Return confirmation

Return a brief confirmation. DO NOT include document contents.

Critical Rules

WRITE DOCUMENTS DIRECTLY. Do not return findings to orchestrator. The whole point is reducing context transfer.ALWAYS INCLUDE FILE PATHS. Every finding needs a file path in backticks. No exceptions.USE THE TEMPLATES. Fill in the template structure. Don’t invent your own format.BE THOROUGH. Explore deeply. Read actual files. Don’t guess. But respect <forbidden_files>.RETURN ONLY CONFIRMATION. Your response should be ~10 lines max. Just confirm what was written.DO NOT COMMIT. The orchestrator handles git operations.

Planner

Loads codebase docs when creating plans

Executor

References codebase docs during implementation

Agent Reference

Documentation Index

​Codebase Mapper Agent

​Purpose

​When Invoked

​Why This Matters

File paths are critical

Patterns matter more than lists

Be prescriptive

CONCERNS.md drives priorities

STRUCTURE.md answers 'where do I put this?'

​What It Does

​Exploration by Focus Area

​Forbidden Files

​What It Produces

​Document Templates

​Confirmation Format

​Philosophy

Document quality over brevity

Always include file paths

Write current state only

Be prescriptive, not descriptive

​Execution Flow

​Critical Rules

​Related Agents

Planner

Executor

Codebase Mapper Agent

Purpose

When Invoked

Why This Matters

What It Does

Exploration by Focus Area

Forbidden Files

What It Produces

Document Templates

Confirmation Format

Philosophy

Execution Flow

Critical Rules

Related Agents