
Index Codebase

The index_codebase tool scans and indexes an entire directory. It supports incremental re-indexing, parallel processing, and automatic file watching.

Basic Usage

index_codebase(path: "/path/to/project", project_name: "my-app")

Response

The response includes:

  • Ready: yes or Ready: no — whether indexing completed
  • qdrant_vectors: <count> — number of vectors stored (when Qdrant is available)
  • Summary of files processed, skipped, and errors
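A client that needs to act on the result could extract the two documented fields from the response text. This is a minimal sketch, assuming the response arrives as plain text containing the `Ready:` and `qdrant_vectors:` fields shown above. The function name `parse_index_response` and the sample string are hypothetical, not part of vibe-hnindex.

```python
import re


def parse_index_response(text: str) -> dict:
    """Extract the Ready flag and vector count from a response string.

    Each field is None when it does not appear in the text, e.g.
    qdrant_vectors is absent when Qdrant is unavailable.
    """
    ready = re.search(r"Ready:\s*(yes|no)", text)
    vectors = re.search(r"qdrant_vectors:\s*(\d+)", text)
    return {
        "ready": (ready.group(1) == "yes") if ready else None,
        "qdrant_vectors": int(vectors.group(1)) if vectors else None,
    }


# Hypothetical response text, for illustration only.
sample = "Ready: yes\nqdrant_vectors: 128\n"
result = parse_index_response(sample)
```

A missing `qdrant_vectors` field is reported as `None` rather than `0`, so callers can tell "Qdrant unavailable" apart from "no vectors stored".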

Incremental Indexing

vibe-hnindex uses SHA-1 hashing to detect file changes. Only files whose hash has changed are re-processed, so subsequent runs are fast.

Re-run the same command to update an existing index:

index_codebase(path: "/path/to/project", project_name: "my-app")
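The hash comparison behind incremental indexing can be illustrated as follows. This is a minimal sketch, not the tool's actual implementation: `stored_hashes` is a hypothetical in-memory map standing in for whatever vibe-hnindex persists between runs, and the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path: Path) -> str:
    """Return the hex SHA-1 digest of a file, read in 64 KiB blocks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def changed_files(paths, stored_hashes):
    """Return the paths whose current hash differs from the stored one.

    stored_hashes maps str(path) -> sha1 and is updated in place, so a
    second call with unchanged files returns an empty list.
    """
    changed = []
    for path in paths:
        current = sha1_of_file(path)
        if stored_hashes.get(str(path)) != current:
            changed.append(path)
            stored_hashes[str(path)] = current
    return changed
```

On the first run every file is new and therefore "changed"; on later runs only files with different content are returned, which is why re-running the same command is cheap.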

Watch Mode

By default (watch: true), vibe-hnindex starts a file watcher after indexing. The watcher automatically re-indexes changed files in real time.

index_codebase(path: "/path/to/project", project_name: "my-app", watch: true)

Disable watching:

index_codebase(path: "/path/to/project", project_name: "my-app", watch: false)

Indexing Pipeline

Scan directory → filter (40+ extensions; skip node_modules, .git, dist…)
  → SHA-1 hash → skip unchanged files
  → chunk (≈60 lines, boundary-aware, overlap)
  → embed (Ollama bge-m3, batch 32, 1024-dim)
  → SQLite (text + FTS5) + Qdrant (vectors)
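The chunking step can be approximated with a fixed-size line window that overlaps its predecessor. The sketch below assumes a 60-line window and a 10-line overlap. The overlap size is an assumption (the pipeline only says "overlap"), and the boundary-aware splitting mentioned above is not modeled here.

```python
def chunk_lines(text, chunk_size=60, overlap=10):
    """Split text into (start_line, end_line, chunk_text) tuples.

    Line numbers are 1-based and inclusive. Consecutive chunks share
    `overlap` lines. The overlap default is an illustrative assumption.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    lines = text.splitlines()
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(lines):
        end = min(start + chunk_size, len(lines))
        chunks.append((start + 1, end, "\n".join(lines[start:end])))
        if end == len(lines):
            break  # last window reached the end of the file
        start += step
    return chunks
```

Overlap keeps a definition that straddles a chunk edge visible in both neighbouring chunks, at the cost of embedding some lines twice.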

Parallel Indexing (v0.8.0+)

index_codebase uses worker threads for parallel chunking and embedding. By default, it uses CPU count - 1 workers, which is roughly 3-4× faster than single-threaded indexing on multi-core machines.

Configure via environment variables:

# MCP env
INDEX_WORKERS=auto     # Automatic: CPU count - 1 workers (default)
INDEX_WORKERS=4         # Manual: use exactly 4 workers
INDEX_WORKERS=1         # Single-threaded
INDEX_PARALLEL_BATCH=16 # Files per batch
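How a setting like INDEX_WORKERS might map to a concrete worker count is sketched below. It assumes that `auto` (or an unset variable) resolves to CPU count - 1, as the default described above, and that values below 1 are clamped to 1. The clamping rule and the function name `resolve_workers` are assumptions, not documented behavior.

```python
import os
from typing import Optional


def resolve_workers(value: Optional[str], cpu_count: Optional[int] = None) -> int:
    """Map an INDEX_WORKERS-style setting to a worker count of at least 1.

    cpu_count can be passed explicitly for testing; otherwise the
    machine's core count is used.
    """
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    if value is None or value.strip().lower() in ("", "auto"):
        return max(1, cores - 1)  # leave one core free for the main thread
    try:
        requested = int(value)
    except ValueError:
        raise ValueError(
            "INDEX_WORKERS must be 'auto' or an integer, got %r" % value
        ) from None
    return max(1, requested)  # assumption: never fewer than one worker


workers = resolve_workers(os.environ.get("INDEX_WORKERS"))
```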

Supported Languages

TypeScript, JavaScript, Python, Java, Go, Rust, C, C++, C#, Ruby, PHP, Swift, Kotlin, Scala, Lua, Bash, SQL, Vue, Svelte, HTML, CSS, SCSS, YAML, TOML, JSON, XML, Markdown, Protobuf, GraphQL, Terraform, Zig, Elixir, Erlang, Clojure, Haskell, OCaml, F#, Dart, Solidity, CMake, Gradle, Dockerfile, Makefile, and more.

Excluded Files

By default, these are skipped:

  • node_modules, .git, dist, build
  • __pycache__, vendor
  • Lockfiles, binaries
  • Files larger than MAX_FILE_SIZE (default 1 MB)

Customize with .hnindexignore — see Configuration → .hnindexignore.
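The default exclusions can be pictured as a filter applied while walking the directory tree. This is a minimal sketch under stated assumptions: it covers only the directory names and the size limit listed above, treats 1 MB as 1,048,576 bytes, and does not handle lockfiles, binary detection, or .hnindexignore rules. The name `iter_indexable_files` is hypothetical.

```python
import os
from pathlib import Path

# Directory names taken from the default exclusion list above.
EXCLUDED_DIRS = {"node_modules", ".git", "dist", "build", "__pycache__", "vendor"}
MAX_FILE_SIZE = 1024 * 1024  # assumption: the 1 MB default means 1 MiB


def iter_indexable_files(root, max_file_size=MAX_FILE_SIZE):
    """Yield files under root, skipping excluded directories and large files."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            try:
                size = path.stat().st_size
            except OSError:
                continue  # unreadable entry, e.g. a broken symlink
            if size <= max_file_size:
                yield path
```

Pruning `dirnames` in place avoids walking large trees such as node_modules at all, rather than visiting and then discarding their files.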

Indexing a Single File

To re-index a single file in an existing project:

index_file(file_path: "/path/to/file.ts", project_name: "my-app")

Related Tools

  • list_projects — Lists all indexed projects with metadata
  • delete_project(project_name: "my-app") — Removes a project from SQLite and Qdrant
  • get_file_info(file_path: "...", project_name: "...") — Metadata for a specific indexed file