Canopy is in pre-release. These docs describe the product at its public launch: commands, tool names, and integration examples reflect what you'll see once binaries ship.

Debug Slow Indexing

A full canopy index on a typical project (1K–50K files) takes 5–60 seconds. If indexing is taking several minutes or hanging, one of a few common causes is usually responsible.

Run the index with default verbosity. Canopy prints a single timing line per phase:

Terminal window
canopy index . --with-search

Expected output:

canopy: indexing /home/you/repos/myapp (incremental)
canopy: AST index done — 8432 files indexed, 12 skipped, 0 errors (42184 ms)
canopy: building full-text search index...
canopy: search index done — 8432 files, 18923 chunks, 18923 indexed (3470 ms)

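The (N ms) value at the end of each of the two "done" lines tells you which phase ran long. AST indexing dominates total time on most repos (about 42 of the roughly 46 seconds in the example above); if its figure runs into minutes, the next steps narrow the cause.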
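To compare phases across runs, the timing lines can be pulled out of saved output. The following is a minimal sketch, assuming the output was saved to a file and that the lines keep the `canopy: <phase> done ... (N ms)` shape shown in the example; `phase_timings` is a hypothetical helper name, not part of Canopy.

```shell
# phase_timings FILE
# Print "<ms> ms  <phase>" for every "canopy: <phase> done ... (N ms)" line
# in FILE. Assumes the line shape from the example output; adjust the
# pattern if your build prints something different.
phase_timings() {
  sed -n 's/^canopy: \(.*\) done .*(\([0-9][0-9]*\) ms)$/\2 ms  \1/p' "$1"
}
```

For example, after saving a run with `canopy index . --with-search 2>&1 | tee index.log`, `phase_timings index.log | sort -rn` lists the slowest phase first.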

For deeper per-phase tracing, enable RUST_LOG=debug to surface the internal tracing::debug! events. The exact field names and module paths depend on the build (canopy-indexer, canopy-ast, canopy-search) and on the tracing subscriber’s format — there’s no stable schema to grep against, so use the debug stream to scan for the file path that’s stalling rather than for fixed labels:

Terminal window
RUST_LOG=debug canopy index . --with-search 2>&1 | tail -60

Then check how many files the index contains:

Terminal window
canopy status

If the files line is much higher than you expect, files are being indexed that shouldn’t be:

canopy status for /home/you/repos/myapp
...
files : 94832 ← suspiciously high for a mid-size project
...

Re-run the index with debug logging and look for the file discovery count near the top of the output:

Terminal window
CANOPY_LOG=debug canopy index . 2>&1 | head -5

If the discovery count is high, node_modules, dist, or another large directory is being traversed.
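To see which directory is inflating the count without relying on log output, you can count files on disk per top-level directory. This is a rough sketch using only standard tools; `count_files_by_dir` is a hypothetical helper, and it does not apply .gitignore or Canopy's own ignore rules, so its numbers are an upper bound on what Canopy would discover.

```shell
# count_files_by_dir [ROOT]
# Print the number of regular files under each top-level directory of ROOT,
# largest first. Counts everything on disk (including .git); it does not
# apply .gitignore or ignored_paths.
count_files_by_dir() {
  root="${1:-.}"
  for d in "$root"/*/ "$root"/.[!.]*/; do
    [ -d "$d" ] || continue          # skip glob patterns that matched nothing
    n=$(find "$d" -type f | wc -l | tr -d ' ')
    printf '%8d  %s\n' "$n" "${d%/}"
  done | sort -rn
}
```

Run `count_files_by_dir .` from the project root; a directory with tens of thousands of files that is not source code is the likely culprit.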

The most common cause of slow indexing is large directories that should be excluded.

Add to .canopy/config.toml:

[index]
ignored_paths = [
"node_modules",
"dist",
"build",
".next",
".turbo",
"coverage",
".cache",
"vendor",
"**/*.min.js",
"**/*.bundle.js",
"**/*.map",
]

Canopy respects .gitignore by default — any file ignored by git is also ignored by Canopy. If node_modules is in .gitignore, you don’t need to add it to ignored_paths explicitly. Check:

Terminal window
git check-ignore node_modules

If this prints node_modules, git is ignoring it and Canopy will too.

If the directory is NOT in .gitignore but should be excluded from indexing, add it to ignored_paths.
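Checking several candidate directories at once can be scripted around `git check-ignore`. The sketch below assumes it is run from inside the repository; `report_git_ignored` is a hypothetical helper name, and the list of directories you pass it is up to you.

```shell
# report_git_ignored PATH...
# For each PATH that exists, say whether git ignores it. Paths reported as
# "not ignored by git" are candidates for ignored_paths.
report_git_ignored() {
  for p in "$@"; do
    [ -e "$p" ] || continue          # silently skip paths that do not exist
    if git check-ignore -q "$p"; then
      echo "ignored by git: $p"
    else
      echo "not ignored by git: $p"
    fi
  done
}
```

For example: `report_git_ignored node_modules dist build .next coverage vendor`.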

Generated files — API type stubs, bundled JS, minified assets — are expensive to parse and rarely useful to index. Exclude them by pattern:

[index]
ignored_paths = [
"**/*.generated.ts",
"**/*.generated.js",
"src/generated/**",
"openapi/generated/**",
"graphql/generated/**",
]

For very large files that slip through, set a file size limit:

[index]
max_file_size_bytes = 524288 # 512 KB — skips large lock files and bundles

Canopy’s default is 1 MB. Setting it to 512 KB filters most problem files while keeping all normal source files.
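Before lowering the limit, it can help to see which files would be affected. This is a sketch with standard `find`, not a Canopy command; `large_files` is a hypothetical helper, and it looks only at size on disk, not at whether Canopy would otherwise index the file.

```shell
# large_files [ROOT] [LIMIT_BYTES]
# List regular files under ROOT larger than LIMIT_BYTES (default 524288,
# i.e. 512 KB), skipping anything inside a .git directory.
large_files() {
  root="${1:-.}"
  limit="${2:-524288}"
  find "$root" -type f -size +"${limit}c" ! -path '*/.git/*'
}
```

If `large_files .` lists hand-written source files, the limit is too low for this repo; if it lists only lock files, bundles, and fixtures, 512 KB is a safe setting.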

canopy index is incremental by default — it only re-parses files that have changed since the last index. After the first full index, subsequent runs are usually under 5 seconds for repos with normal commit sizes.

When indexing seems slow on every run (not just the first), you’re hitting a full re-index each time. Causes:

  • The .canopy/index/ directory is deleted between runs (common in CI without caching)
  • A config change (ignored_paths, entry_points) triggers a rebuild
  • The --full flag is passed explicitly

To confirm the index is truly incremental, look at the first line canopy index prints — it explicitly names the mode:

Terminal window
canopy index .

Output for incremental (re-running on an existing index):

canopy: indexing /home/you/repos/myapp (incremental)

Output for full (first run, or after --full, or when no index is found):

canopy: indexing /home/you/repos/myapp (full)

If you see “full” on every run in CI, add index caching. See CI Cached Indexes.
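In a CI script, the mode can be checked mechanically rather than by eye. The sketch below assumes the first output line ends in "(incremental)" or "(full)" as in the examples above; `index_mode` is a hypothetical helper that reads the output on stdin.

```shell
# index_mode
# Read `canopy index` output on stdin and print "incremental", "full", or
# "unknown" based on the suffix of the first line.
index_mode() {
  first=""
  IFS= read -r first || true
  cat >/dev/null                     # drain the rest so the producer is not cut off
  case "$first" in
    *"(incremental)") echo incremental ;;
    *"(full)")        echo full ;;
    *)                echo unknown ;;
  esac
}
```

For example, `canopy index . 2>&1 | index_mode` prints `full` when the cache was not restored, which a CI step can turn into a warning.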

canopy serve . --watch runs a file watcher in addition to serving MCP. On repos with frequent file changes (active development with hot reload, generated files changing on save), the watcher can trigger many small incremental re-indexes.

If the watcher is adding noticeable overhead:

Terminal window
# Remove --watch to disable
canopy serve .

Without --watch, the index is not updated while the server runs — it reflects the state when canopy serve started. Restart the server to pick up new changes.

Indexing completes but feels slow in practice

The index itself may be fine but searches are slow. Run:

Terminal window
CANOPY_LOG=debug canopy search "test query" 2>&1 | tail -3

If search latency is high, the search index may be large. Add entries to ignored_paths to shrink it, or lower max_results to check whether the time is going into returning a large result set rather than into the search itself.

canopy index hangs indefinitely

A file with unusual content (binary file misidentified as text, circular symlink in the directory tree) can cause the parser to stall. Kill the process, then run:

Terminal window
CANOPY_LOG=trace canopy index . 2>&1 | tail -20

The last few lines show which file it was processing when it stalled. Add that file or directory to ignored_paths.
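If the trace output is inconclusive, circular symlinks can also be looked for directly on disk. This is a sketch with standard `find`, independent of Canopy; `symlinked_dirs` is a hypothetical helper that lists every symlink resolving to a directory, which you then inspect for links pointing back at an ancestor.

```shell
# symlinked_dirs [ROOT]
# List symbolic links under ROOT that resolve to a directory. A link that
# points at one of its own ancestors is a circular symlink.
symlinked_dirs() {
  find "${1:-.}" -type l -exec test -d {} \; -print
}
```

Run `symlinked_dirs .` and check each result with `ls -l`; any link whose target contains the link itself is worth adding to ignored_paths.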

Slow indexing only on CI (not local)

CI runners typically have slower disk I/O than developer machines, and they start from a clean state with no index cache. See CI Cached Indexes to restore a cached index instead of re-indexing from scratch.