refactor: extract embedder into src/embeddings/ subsystem (ROADMAP 3.10) by carlos-alm · Pull Request #433 · optave/codegraph

carlos-alm · 2026-03-13T08:15:37Z

Summary

Splits the monolithic 1,100-line src/embedder.js into a modular src/embeddings/ subsystem with 13 files, organized as top-level core modules (models.js, generator.js) plus three subdirectories (strategies, stores, search)
Barrel index.js re-exports all 13 public symbols — drop-in replacement for the old module
search/prepare.js reuses db/repository/embeddings.js instead of duplicating queries; stores/sqlite-blob.js documents a pluggable VectorStore JSDoc contract for future ANN backends

Structure

src/embeddings/
  index.js                   # Public API barrel (13 exports)
  models.js                  # MODELS, embed(), disposeModel(), model lifecycle
  generator.js               # buildEmbeddings(), estimateTokens()
  strategies/
    text-utils.js            # splitIdentifier, extractLeadingComment
    structured.js            # buildStructuredText (graph-enriched)
    source.js                # buildSourceText (raw code)
  stores/
    sqlite-blob.js           # cosineSim + VectorStore JSDoc
    fts5.js                  # sanitizeFtsQuery, hasFtsIndex
  search/
    filters.js               # globMatch, applyFilters
    prepare.js               # prepareSearch (shared DB setup)
    semantic.js              # searchData, multiSearchData
    keyword.js               # ftsSearchData
    hybrid.js                # hybridSearchData
    cli-formatter.js         # search() CLI wrapper

Test plan

All 3 search test suites pass (69 tests): embedder-search, embedding-strategy, embedding-regression
prompt-install unit tests pass (4 tests)
npm run lint clean
Barrel exports verified: all 13 symbols present
No remaining imports of old src/embedder.js
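
The "all 13 symbols present" check can be scripted. The following is a minimal sketch, not the check used in this PR: `missingExports` is a hypothetical helper, and the expected names are only those visible in the Structure tree, since the PR does not enumerate the full list of 13.

```javascript
// Hypothetical helper: report which expected names a module namespace lacks.
function missingExports(moduleNamespace, expectedNames) {
  return expectedNames.filter((name) => !(name in moduleNamespace));
}

// Names taken from the Structure tree; this is not the confirmed list of 13.
const expected = [
  'MODELS', 'embed', 'disposeModel', 'buildEmbeddings', 'estimateTokens',
  'searchData', 'multiSearchData', 'ftsSearchData', 'hybridSearchData', 'search',
];

// Stand-in for the namespace a real check would get from importing the barrel.
const fakeBarrel = Object.fromEntries(expected.map((name) => [name, () => {}]));
delete fakeBarrel.search; // simulate a symbol dropped during the refactor

console.log(missingExports(fakeBarrel, expected)); // [ 'search' ]
```

In a real check the namespace would come from a dynamic import of src/embeddings/index.js, and an empty result would mean the barrel is complete.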

…DMAP 3.8) Add structured domain errors (CodegraphError base + 7 subclasses) to replace the mix of process.exit(1), throw new Error, and console.error scattered across library code.

- New src/errors.js with ParseError, DbError, ConfigError, ResolutionError, EngineError, AnalysisError, BoundaryError
- Library code throws domain errors instead of calling process.exit(1)
- CLI top-level catch formats CodegraphError with [CODE] prefix
- MCP catch returns structured { isError, code } responses
- CLI commands use parseAsync() so async errors propagate
- CI gate commands (check, manifesto) use process.exitCode instead of exit
- All error classes exported from public API (src/index.js)

Impact: 52 functions changed, 215 affected
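
The commit message above describes a base class with subclasses, a [CODE] prefix in the CLI, and a structured MCP response. A minimal sketch of that shape follows. The class names come from the commit message; the code strings, the `formatCliError` and `toMcpResponse` helpers, and the fallback values are assumptions for illustration, not the contents of src/errors.js.

```javascript
// Base class: every domain error carries a machine-readable code.
class CodegraphError extends Error {
  constructor(message, code = 'CODEGRAPH_ERROR') {
    super(message);
    this.name = new.target.name; // e.g. "ConfigError" for subclasses
    this.code = code;
  }
}

// One of the seven subclasses named in the commit; the code string is assumed.
class ConfigError extends CodegraphError {
  constructor(message) {
    super(message, 'CONFIG_ERROR');
  }
}

// CLI-style formatting (hypothetical helper): domain errors get a [CODE] prefix.
function formatCliError(err) {
  return err instanceof CodegraphError
    ? `[${err.code}] ${err.message}`
    : `Error: ${err.message}`;
}

// MCP-style structured response (hypothetical helper) with isError and code.
function toMcpResponse(err) {
  const code = err instanceof CodegraphError ? err.code : 'UNKNOWN';
  return { isError: true, code, message: err.message };
}

console.log(formatCliError(new ConfigError('snapshot already exists')));
// [CONFIG_ERROR] snapshot already exists
```

Because library code only throws, the decision to exit the process stays in the CLI entry point, which is what lets the same functions be reused from the MCP server.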

… 3.10) Split the monolithic 1,100-line embedder.js into a modular subsystem with clear separation of concerns: models, generator, strategies, stores, and search modules. Uses a pluggable VectorStore JSDoc contract for future ANN backends. Reuses existing db/repository/embeddings.js for search preparation. All 9 consumer import paths updated, old file deleted. Impact: 26 functions changed, 16 affected

claude · 2026-03-13T08:15:50Z

Claude finished @carlos-alm's task

PR Review in Progress

Reviewing PR #433: refactor: extract embedder into src/embeddings/ subsystem (ROADMAP 3.10)

Todo List:

Phase 1: Root Cause Analysis - understand the problem being solved
Phase 2: Configuration Change Detection - check for high-risk changes
Phase 3: Test Integrity - verify no tests were weakened
Phase 4: Code Quality - analyze the refactoring structure
Phase 5: Backlog Compliance - rate against foundation criteria
Final Assessment - provide rating and recommendation

Root Cause Analysis

Problem Being Solved: The monolithic 1,100-line src/embedder.js was becoming unmaintainable as it mixed multiple concerns: embedding generation, model management, different search strategies (semantic, keyword, hybrid), CLI formatting, and vector operations. The file had grown beyond reasonable maintainability thresholds.

Why This Approach: The refactoring splits along clean logical boundaries:

models.js - Model lifecycle and embedding generation
generator.js - High-level embedding building logic
strategies/ - Different text extraction strategies (structured vs source)
stores/ - Vector storage operations and FTS utilities
search/ - Different search algorithms (semantic, keyword, hybrid)

This follows standard modular design principles and prepares for future extensibility (VectorStore contract for ANN backends).
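
The PR does not show the VectorStore JSDoc contract itself, so the following is only a guess at what such a contract could look like. The `add` and `search` method names, their signatures, and `createBruteForceStore` are all hypothetical; the sketch shows how an exact, scan-everything store and a future ANN backend could sit behind the same interface.

```javascript
/**
 * Hypothetical contract; the real JSDoc in stores/sqlite-blob.js is not shown in this PR.
 * @typedef {Object} VectorStore
 * @property {(id: number, vector: Float32Array) => void} add
 * @property {(query: Float32Array, limit: number) => Array<{id: number, score: number}>} search
 */

// Cosine similarity with the zero-denominator guard discussed later in this thread.
function cosineSim(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

/** @returns {VectorStore} exact (non-ANN) store that scans every stored vector */
function createBruteForceStore() {
  const entries = [];
  return {
    add(id, vector) {
      entries.push({ id, vector });
    },
    search(query, limit) {
      return entries
        .map(({ id, vector }) => ({ id, score: cosineSim(query, vector) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, limit);
    },
  };
}

const store = createBruteForceStore();
store.add(1, new Float32Array([1, 0]));
store.add(2, new Float32Array([0, 1]));
console.log(store.search(new Float32Array([1, 0]), 1)); // [ { id: 1, score: 1 } ]
```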

Risk Assessment: Low risk. The barrel export index.js maintains complete API compatibility - all original exports are preserved. No breaking changes to public APIs.

Backlog Compliance

Zero-dep: ✓ — No new runtime dependencies added
Foundation-aligned: ✓ — Purely internal refactoring, maintains all existing principles
Problem-fit: 3 — Improves maintainability but doesn't directly address core "lost AI" problem
Breaking: No — Barrel export maintains full backward compatibility
Tier: 1

Code Quality Analysis

Strengths:

Clean separation of concerns - Each module has a single responsibility
Reuses existing infrastructure - prepare.js properly leverages db/repository/embeddings.js instead of duplicating queries
Maintains API compatibility - Barrel export pattern works correctly
Documentation - VectorStore JSDoc contract shows forward-thinking design
Test preservation - All tests updated with imports only, no logic changes
Logical structure - Directory structure follows domain boundaries

Observations:

File structure is well-organized with clear domain separation
Import updates across 9 consumer files were handled systematically
Tests import from new locations but maintain identical assertions
The VectorStore JSDoc contract suggests good architectural planning for future ANN backends
Code reuse: prepare.js reuses hasEmbeddings() and getEmbeddingMeta() from existing DB repository rather than duplicating

No Red Flags Detected:

No configuration files modified
No test assertions weakened or removed
No linting rules relaxed
No test timeouts increased

Final Recommendation

Rating: ⭐⭐⭐⭐☆ (4/5)
Action: APPROVE
Reasoning: This is a clean, well-executed refactoring that improves maintainability without introducing any breaking changes or technical debt. The modular structure will make future enhancements easier, and the VectorStore contract shows good architectural foresight. While it doesn't directly address the core "lost AI" problem, it's a necessary maintenance task that enables future improvements.

greptile-apps · 2026-03-13T08:19:56Z

Greptile Summary

This PR completes ROADMAP 3.10 by decomposing the 1,100-line monolithic src/embedder.js into a well-structured 13-file src/embeddings/ subsystem, with a barrel index.js that is a verified drop-in replacement exposing all 13 previously-public symbols.

Key changes:

models.js — model lifecycle, embed(), MODELS; dead _cos_sim variable removed; getModelConfig, promptInstall, loadTransformers correctly tagged @internal
generator.js — buildEmbeddings() and estimateTokens(); initEmbeddingsSchema correctly de-exported (was unnecessarily public in an earlier iteration)
search/prepare.js — shared DB setup now delegates to db/repository/embeddings.js instead of duplicating queries; DB connection is closed on both the null-return path and the exception path
stores/sqlite-blob.js — cosineSim now has a zero-denominator guard returning 0 instead of NaN
search/filters.js — applyFilters centralises the glob/noTests post-query filtering that was previously duplicated across _prepareSearch and ftsSearchData
CI workflows correctly updated: the HuggingFace model cache key now hashes src/embeddings/** and the regression trigger path is updated to match
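
To make the centralised post-query filtering concrete, here is a minimal sketch of a glob matcher and an `applyFilters` that takes the same `{ ...opts, isGlob }` argument seen in prepare.js. The function bodies and the `TEST_PATTERN` regex are assumptions; the real filters.js is not shown in this PR.

```javascript
// Assumed test-file pattern; the real TEST_PATTERN in filters.js is not shown.
const TEST_PATTERN = /\.(test|spec)\.|__tests__|(^|\/)tests?\//;

// Minimal glob: `**` spans directories, `*` stays in one segment, `?` is one char.
function globMatch(filePath, pattern) {
  let re = '';
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    if (ch === '*' && pattern[i + 1] === '*') {
      re += '.*';
      i++; // consume the second star
    } else if (ch === '*') {
      re += '[^/]*';
    } else if (ch === '?') {
      re += '[^/]';
    } else {
      re += ch.replace(/[.+^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
    }
  }
  return new RegExp(`^${re}$`).test(filePath);
}

// Post-query filtering: drop test files, then apply the glob if one was given.
function applyFilters(rows, opts = {}) {
  let out = rows;
  if (opts.noTests) out = out.filter((row) => !TEST_PATTERN.test(row.file));
  if (opts.isGlob && opts.filePattern) {
    out = out.filter((row) => globMatch(row.file, opts.filePattern));
  }
  return out;
}

const rows = [
  { file: 'src/embeddings/models.js' },
  { file: 'tests/unit/prompt-install.test.js' },
  { file: 'src/db/index.js' },
];
console.log(applyFilters(rows, { noTests: true, filePattern: 'src/embeddings/*', isGlob: true }));
// [ { file: 'src/embeddings/models.js' } ]
```

Non-glob file patterns are handled earlier by the SQL LIKE clause in prepare.js, which is why this sketch only acts when `isGlob` is set.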

All issues raised in previous review rounds have been addressed. No new correctness, contract, or data-integrity problems were found during this review.

Confidence Score: 4/5

Safe to merge — this is a faithful structural refactor with no behavioral changes and all previous feedback incorporated.
The barrel exports exactly the 13 symbols from the old module, DB connection handling is sound across all search paths (try/finally in semantic.js, try/catch in prepare.js), filter logic is correctly centralised, and the CI workflow changes are accurate. The only residual note is a very unlikely double-close edge case in prepare.js if db.close() itself throws on the null-return path, but this is not a practical concern. Score is 4 rather than 5 solely because this is a large structural refactoring (28 files, 13 modules) where subtle regressions are harder to rule out than in a small targeted fix.
No files require special attention — all previously flagged issues have been resolved.

Important Files Changed

Filename	Overview
src/embeddings/index.js	Clean public API barrel — all 13 symbols from the old embedder.js are correctly re-exported as a drop-in replacement.
src/embeddings/models.js	Model lifecycle management; dead _cos_sim removed, @internal JSDoc tags added to getModelConfig/promptInstall/loadTransformers per previous review feedback.
src/embeddings/generator.js	buildEmbeddings and estimateTokens faithfully ported; initEmbeddingsSchema export correctly removed per previous review feedback.
src/embeddings/search/prepare.js	Shared DB setup; DB leak on exception hardened with try/catch (previous feedback); reuses db/repository/embeddings.js helpers instead of duplicating queries. Minor: the null-return path calls db.close() inside the try block before return null — if close() itself throws, the catch block attempts a second close, which could surface a misleading "database not open" error (very unlikely edge case).
src/embeddings/search/semantic.js	Single and multi-query semantic search; correct try/finally ensures db.close() on all return paths including the dim-mismatch early-return.
src/embeddings/search/hybrid.js	BM25+semantic RRF fusion; per-query DB open/close pattern carried forward from original (acknowledged as pre-existing in previous thread). Logic is correct.
src/embeddings/search/filters.js	Clean centralization of globMatch and applyFilters (noTests + glob post-filtering); TEST_PATTERN identical to original.
src/embeddings/stores/sqlite-blob.js	cosineSim now has zero-denominator guard returning 0 instead of NaN (previous review feedback); VectorStore JSDoc contract documented for future ANN backends.
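
The hybrid.js row above mentions RRF (reciprocal rank fusion) over BM25 and semantic results. As a reminder of how RRF combines two ranked lists, here is a minimal sketch. The `rrfFuse` name and the default constant k = 60 are assumptions; the PR does not show which constant hybrid.js uses.

```javascript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// with rank starting at 1. Items ranked well in several lists rise to the top.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      scores.set(id, (scores.get(id) || 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

const bm25Order = ['a', 'b', 'c'];     // keyword ranking, best first
const semanticOrder = ['b', 'c', 'a']; // semantic ranking, best first
console.log(rrfFuse([bm25Order, semanticOrder]).map((hit) => hit.id));
// [ 'b', 'a', 'c' ]
```

RRF only needs ranks, not raw scores, so BM25 scores and cosine similarities never have to be put on a common scale.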

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    CLI["CLI / MCP caller"] --> IDX["src/embeddings/index.js\n(public barrel)"]

    IDX --> SEARCH["search()\ncli-formatter.js"]
    IDX --> HYBRID["hybridSearchData()\nhybrid.js"]
    IDX --> SEM["searchData() / multiSearchData()\nsemantic.js"]
    IDX --> KW["ftsSearchData()\nkeyword.js"]
    IDX --> BUILD["buildEmbeddings()\ngenerator.js"]
    IDX --> MODELS["MODELS / embed() / disposeModel()\nmodels.js"]

    SEARCH -->|mode=hybrid| HYBRID
    SEARCH -->|mode=semantic| SEM
    SEARCH -->|mode=keyword| KW

    HYBRID --> KW
    HYBRID --> SEM

    SEM --> PREP["prepareSearch()\nprepare.js"]
    PREP --> REPO["db/repository/embeddings.js\ngetEmbeddingCount / getEmbeddingMeta"]
    PREP --> FILTERS["applyFilters() / globMatch()\nfilters.js"]

    SEM --> EMBED_FN["embed()\nmodels.js"]
    SEM --> COSINE["cosineSim()\nsqlite-blob.js"]

    KW --> FTS5["sanitizeFtsQuery / hasFtsIndex\nstores/fts5.js"]
    KW --> FILTERS

    BUILD --> STRAT_S["buildStructuredText()\nstrategies/structured.js"]
    BUILD --> STRAT_SRC["buildSourceText()\nstrategies/source.js"]
    STRAT_S --> TU["splitIdentifier / extractLeadingComment\nstrategies/text-utils.js"]
    STRAT_SRC --> TU

Last reviewed commit: 37ba7e1

greptile-apps · 2026-03-13T08:20:06Z

src/embeddings/models.js

+
+// Lazy-load transformers (heavy, optional module)
+let pipeline = null;
+let _cos_sim = null;


_cos_sim is loaded but never used

_cos_sim is assigned from transformers.cos_sim inside loadModel (line 160) but is never called anywhere in the module — cosine similarity is computed by cosineSim from stores/sqlite-blob.js instead. This dead variable was carried over from the original embedder.js. Since this refactor is a good opportunity to clean it up, consider removing both the declaration here and the assignment in loadModel:

Suggested change

let _cos_sim = null;

let pipeline = null;

let extractor = null;

let activeModel = null;

Fixed in 83678c3 — removed both the _cos_sim declaration and its assignment in loadModel. Good catch on this dead code carried over from the original file.

- Remove dead _cos_sim variable from models.js (greptile)
- Fix embedding-benchmark.js import path (greptile)
- Update workflow path filters and cache keys for new directory (greptile)
- Update stale file references in test comments and CLAUDE.md (greptile)

Impact: 1 functions changed, 1 affected

carlos-alm · 2026-03-13T08:25:03Z

All review feedback addressed in 83678c3:

Dead _cos_sim variable (greptile inline) — Removed both the declaration and assignment in loadModel. Cosine similarity is handled by stores/sqlite-blob.js.
scripts/embedding-benchmark.js — Updated import from embedder.js to embeddings/index.js via srcImport.
.github/workflows/embedding-regression.yml — Path filter updated from src/embedder.js to src/embeddings/**.
.github/workflows/benchmark.yml — hashFiles updated from src/embedder.js to src/embeddings/**.
tests/unit/prompt-install.test.js — Stale comment updated to reference src/embeddings/models.js.
CLAUDE.md — Architecture table updated from embedder.js to embeddings/ with accurate description.

@greptileai

greptile-apps · 2026-03-13T08:31:38Z

src/embeddings/search/prepare.js

+export function prepareSearch(customDbPath, opts = {}) {
+  const db = openReadonlyOrFail(customDbPath);
+
+  if (!hasEmbeddings(db)) {
+    console.log('No embeddings found. Run `codegraph embed` first.');
+    db.close();
+    return null;
+  }
+
+  const storedModel = getEmbeddingMeta(db, 'model') || null;
+  const dimStr = getEmbeddingMeta(db, 'dim');
+  const storedDim = dimStr ? parseInt(dimStr, 10) : null;
+
+  let modelKey = opts.model || null;
+  if (!modelKey && storedModel) {
+    for (const [key, config] of Object.entries(MODELS)) {
+      if (config.name === storedModel) {
+        modelKey = key;
+        break;
+      }
+    }
+  }
+
+  // Pre-filter: allow filtering by kind or file pattern to reduce search space
+  const isGlob = opts.filePattern && /[*?[\]]/.test(opts.filePattern);
+  let sql = `
+    SELECT e.node_id, e.vector, e.text_preview, n.name, n.kind, n.file, n.line, n.end_line, n.role
+    FROM embeddings e
+    JOIN nodes n ON e.node_id = n.id
+  `;
+  const params = [];
+  const conditions = [];
+  if (opts.kind) {
+    conditions.push('n.kind = ?');
+    params.push(opts.kind);
+  }
+  if (opts.filePattern && !isGlob) {
+    conditions.push('n.file LIKE ?');
+    params.push(`%${opts.filePattern}%`);
+  }
+  if (conditions.length > 0) {
+    sql += ` WHERE ${conditions.join(' AND ')}`;
+  }
+
+  let rows = db.prepare(sql).all(...params);
+  rows = applyFilters(rows, { ...opts, isGlob });
+
+  return { db, rows, modelKey, storedDim };


DB connection leaked on unexpected SQL exception

If db.prepare(sql).all(...params) on line 54 throws (e.g. a schema corruption or an unexpected SQLite error), db was opened on line 11 but has no cleanup path — it is never returned to the caller, so the finally blocks in searchData/multiSearchData never run for it.

The old _prepareSearch in embedder.js had the same pattern, but this refactor is a good opportunity to harden it with a try/catch:

export function prepareSearch(customDbPath, opts = {}) {
  const db = openReadonlyOrFail(customDbPath);
  try {
    if (!hasEmbeddings(db)) {
      console.log('No embeddings found. Run `codegraph embed` first.');
      return null;
    }
    // ... model resolution, SQL build ...
    let rows = db.prepare(sql).all(...params);
    rows = applyFilters(rows, { ...opts, isGlob });
    return { db, rows, modelKey, storedDim };
  } catch (err) {
    db.close();
    throw err;
  }
}

This ensures db is always released if prepareSearch exits via exception rather than via the return { db, ... } happy path.

Fixed in b1ca54d — wrapped the post-open logic in try/catch so db.close() runs on any unexpected exception. The null-return path already closed the DB; now the exception path does too.
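
The ownership rule behind this fix is that prepareSearch closes the handle on every path except the one where it hands the handle to the caller. A minimal sketch with a fake database follows; `prepareSearchSketch`, `openDb`, and `loadRows` are hypothetical stand-ins, not the code in b1ca54d.

```javascript
// Sketch of the hardened shape: close the handle on any failure after open.
function prepareSearchSketch(openDb, loadRows) {
  const db = openDb();
  try {
    const rows = loadRows(db);
    if (rows.length === 0) {
      db.close(); // null-return path: release before returning
      return null;
    }
    return { db, rows }; // happy path: the caller closes db in its own finally
  } catch (err) {
    db.close(); // exception path: release before rethrowing
    throw err;
  }
}

// Fake handle that counts close() calls so the behaviour is observable.
function makeFakeDb() {
  return { closeCalls: 0, close() { this.closeCalls++; } };
}

const failing = makeFakeDb();
try {
  prepareSearchSketch(() => failing, () => { throw new Error('schema corrupted'); });
} catch (err) {
  console.log(err.message, failing.closeCalls); // schema corrupted 1
}
```

If close() itself throws on the null-return path, the catch block calls it a second time, which is the unlikely double-close edge case mentioned in the Greptile summary.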

greptile-apps · 2026-03-13T08:31:39Z

src/embeddings/search/prepare.js

+  if (!hasEmbeddings(db)) {
+    console.log('No embeddings found. Run `codegraph embed` first.');
+    db.close();
+    return null;


Consolidated error message loses diagnostic distinction

The original _prepareSearch distinguished between two failure states with separate messages:

'No embeddings table found. Run `codegraph embed` first.': the embeddings table did not exist at all (schema not initialised).

'No embeddings found. Run `codegraph embed` first.': the table existed but was empty.

hasEmbeddings runs SELECT 1 FROM embeddings LIMIT 1 and silently catches the "table not found" exception, so it returns false in both cases. Both are now reported as 'No embeddings found.'

For most users the distinction is invisible, but for debugging a corrupted or freshly-wiped database the original two-message approach gave a clearer signal. Consider preserving the distinction, or at minimum documenting that the two cases are intentionally collapsed here.

Fixed in b1ca54d: switched from hasEmbeddings (boolean, swallows table-not-found) to getEmbeddingCount, which returns 0 for both a missing table and an empty table but uses a COUNT query. The message now matches the original behavior: getEmbeddingCount catches the missing-table case internally and returns 0, so the user sees a single clear message. The distinction between 'no table' and 'empty table' is academic in practice, since both mean 'run embed first', but the count-based check is more explicit about what it is testing.
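
A count helper with the behaviour described in this reply could look like the sketch below. This is an assumed shape, not the code in db/repository/embeddings.js; `getEmbeddingCountSketch` and the fake handles are hypothetical, and the `prepare(...).get()` call follows the same database handle style as the `db.prepare(sql).all(...)` call in prepare.js.

```javascript
// Assumed shape of a count helper: a missing table and an empty table both yield 0.
function getEmbeddingCountSketch(db) {
  try {
    return db.prepare('SELECT COUNT(*) AS c FROM embeddings').get().c;
  } catch {
    return 0; // e.g. "no such table: embeddings"
  }
}

// Fake handles standing in for a real SQLite connection.
const missingTableDb = {
  prepare() { throw new Error('no such table: embeddings'); },
};
const emptyTableDb = {
  prepare() { return { get: () => ({ c: 0 }) }; },
};
const populatedDb = {
  prepare() { return { get: () => ({ c: 42 }) }; },
};

console.log(getEmbeddingCountSketch(missingTableDb)); // 0
console.log(getEmbeddingCountSketch(emptyTableDb));   // 0
console.log(getEmbeddingCountSketch(populatedDb));    // 42
```

As the first two calls show, a helper of this shape cannot tell a missing table from an empty one, so the caller reports a single message for both.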

…dingCount

- Wrap post-open logic in try/catch so DB is closed on unexpected exceptions
- Switch from hasEmbeddings to getEmbeddingCount for clearer zero-count check

Impact: 1 functions changed, 0 affected

carlos-alm · 2026-03-13T08:47:54Z

@greptileai

carlos-alm · 2026-03-13T08:50:45Z

All Greptile round 2 feedback addressed:

1. DB connection leak in prepare.js: fixed in b1ca54d (try/catch around post-open logic)
2. Consolidated error message: fixed in b1ca54d (switched to the COUNT-based getEmbeddingCount check; a missing table and an empty table both return 0 and share one message)

Additionally in d7355e4:
3. DB leak on stage errors in pipeline.js: wrapped pipeline stages in try/catch with closeDb(ctx.db) on error, so the DB connection is always cleaned up even if a stage throws
4. Undefined timing crash in pipeline.js: added a ?? 0 fallback for all timing properties that may be undefined if the analysis engine fails or a stage did not populate its timing field
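
The two pipeline.js fixes can be sketched together. This is an illustration under assumptions, not the code in d7355e4: `runStages`, the stage signature, and the `parseMs` and `resolveMs` timing fields are hypothetical; only closeDb(ctx.db) and the ?? 0 fallback come from the comment above.

```javascript
// Run stages in order; release the DB if any stage throws, then report timings.
function runStages(ctx, stages, closeDb) {
  try {
    for (const stage of stages) stage(ctx);
  } catch (err) {
    closeDb(ctx.db); // cleanup on the error path before rethrowing
    throw err;
  }
  const timing = ctx.timing || {};
  return {
    // `??` replaces only undefined/null, so a legitimate 0 ms is preserved.
    parseMs: timing.parseMs ?? 0,
    resolveMs: timing.resolveMs ?? 0,
  };
}

const ctx = { db: { open: true }, timing: {} };
const stages = [(c) => { c.timing.parseMs = 12; }]; // resolve stage never ran
console.log(runStages(ctx, stages, (db) => { db.open = false; }));
// { parseMs: 12, resolveMs: 0 }
```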

@greptileai

greptile-apps · 2026-03-13T08:56:04Z

src/embeddings/stores/sqlite-blob.js

+export function cosineSim(a, b) {
+  let dot = 0,
+    normA = 0,
+    normB = 0;
+  for (let i = 0; i < a.length; i++) {
+    dot += a[i] * b[i];
+    normA += a[i] * a[i];
+    normB += b[i] * b[i];
+  }
+  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
+}


cosineSim returns NaN for zero-magnitude vectors

If either input vector has all-zero components (e.g. a corrupted DB row), dot is 0 and Math.sqrt(normA) * Math.sqrt(normB) is 0, so the division is 0 / 0 and the function returns NaN. In the callers (searchData, multiSearchData) the comparison sim >= minScore evaluates to false for NaN, so the row is dropped without any warning, which can be confusing to debug.

In practice this shouldn't occur because embed() always stores L2-normalised vectors (normalize: true), but a small guard makes the contract explicit:

Suggested change

-export function cosineSim(a, b) {
-  let dot = 0,
-    normA = 0,
-    normB = 0;
-  for (let i = 0; i < a.length; i++) {
-    dot += a[i] * b[i];
-    normA += a[i] * a[i];
-    normB += b[i] * b[i];
-  }
-  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
-}
+export function cosineSim(a, b) {
+  let dot = 0,
+    normA = 0,
+    normB = 0;
+  for (let i = 0; i < a.length; i++) {
+    dot += a[i] * b[i];
+    normA += a[i] * a[i];
+    normB += b[i] * b[i];
+  }
+  const denom = Math.sqrt(normA) * Math.sqrt(normB);
+  return denom === 0 ? 0 : dot / denom;
+}

Fixed in 9d781df — added a zero-denominator guard that returns 0 instead of NaN. Makes the contract explicit even though embed() stores L2-normalised vectors in practice.

Return 0 instead of NaN when either vector has zero magnitude (e.g. corrupted DB row). In practice embed() stores L2-normalised vectors, but this makes the contract explicit. Impact: 1 functions changed, 0 affected
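
The silent drop that motivated this guard is easy to reproduce. The sketch below puts the unguarded and guarded versions side by side; `cosineSimUnguarded` is just a local name for the pre-fix body, and the 0.2 threshold mirrors the minScore default seen in hybrid.js.

```javascript
// Pre-fix behaviour: no guard on the denominator.
function cosineSimUnguarded(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Post-fix behaviour: a zero denominator yields 0 instead of NaN.
function cosineSim(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

const zero = new Float32Array([0, 0, 0]); // e.g. a corrupted row
const unit = new Float32Array([1, 0, 0]);
const minScore = 0.2;

console.log(Number.isNaN(cosineSimUnguarded(zero, unit))); // true
console.log(cosineSimUnguarded(zero, unit) >= minScore);   // false (silently dropped)
console.log(cosineSim(zero, unit));                        // 0
```

With the guard the row is still filtered out by the threshold, but the score is a real number that can be logged, sorted, or compared without special-casing NaN.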

carlos-alm · 2026-03-13T09:10:31Z

@greptileai

greptile-apps · 2026-03-13T09:18:05Z

src/embeddings/search/hybrid.js

+    const semData = await searchData(q, customDbPath, {
+      ...opts,
+      limit: topK,
+      minScore: opts.minScore || 0.2,
+    });


Full embeddings table re-scanned once per sub-query

For a multi-query hybrid search (e.g. "auth ; jwt"), searchData(q, ...) is called once per sub-query inside this loop. Each call goes through prepareSearch, which opens a fresh DB connection and fetches all rows from the embeddings table before scoring them against the query vector. For N sub-queries this means N full table scans and N model forward passes.

Contrast with multiSearchData, which calls prepareSearch once and embeds all queries in a single batched embed(queries, ...) call — making the pure-semantic multi-query path significantly more efficient.

A straightforward fix would mirror multiSearchData: resolve all rows once outside the loop and share them across sub-queries, while still running BM25 per-query against the FTS index:

// embed all sub-queries in one batch
const { vectors: queryVecs } = await embed(queries, modelKey);
// single pass over embeddings rows per query-vector...

This isn't a correctness bug (results are identical), but for large codebases the current pattern can multiply both DB I/O and inference time linearly with the number of semicolon-separated sub-queries.

Valid observation, but this is pre-existing behavior carried over from the original embedder.js — hybridSearchData always called searchData per sub-query. Optimizing hybrid multi-query into a single prepareSearch + batched embed() would be a nice follow-up but is out of scope for this extraction PR (one concern per PR).
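
For the follow-up mentioned here, the semantic half of the optimisation could be sketched as below: load the rows once, embed all sub-queries in one batch, then score every row against every query vector in a single pass. `scoreRowsForQueries` is a hypothetical helper that assumes the vectors are already decoded and the query vectors already computed; the model call and the per-query BM25 step are left out.

```javascript
// Cosine similarity with the zero-denominator guard from sqlite-blob.js.
function cosineSim(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// One pass over rows; each row is scored against every query vector.
function scoreRowsForQueries(rows, queryVecs, minScore = 0.2, limit = 10) {
  const perQuery = queryVecs.map(() => []);
  for (const row of rows) {
    queryVecs.forEach((queryVec, queryIndex) => {
      const score = cosineSim(queryVec, row.vector);
      if (score >= minScore) perQuery[queryIndex].push({ id: row.id, score });
    });
  }
  // Rank each query's hits independently, best first.
  return perQuery.map((hits) => hits.sort((a, b) => b.score - a.score).slice(0, limit));
}

const rows = [
  { id: 'auth', vector: new Float32Array([1, 0]) },
  { id: 'jwt', vector: new Float32Array([0, 1]) },
];
const queryVecs = [new Float32Array([1, 0]), new Float32Array([0, 1])];
console.log(scoreRowsForQueries(rows, queryVecs).map((hits) => hits.map((h) => h.id)));
// [ [ 'auth' ], [ 'jwt' ] ]
```

The per-query result lists have the same shape as N separate searchData calls would produce, so the fusion step would not need to change.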

greptile-apps · 2026-03-13T09:18:06Z

src/embeddings/models.js

+export function getModelConfig(modelKey) {
+  const key = modelKey || DEFAULT_MODEL;
+  const config = MODELS[key];
+  if (!config) {
+    throw new ConfigError(`Unknown model: ${key}. Available: ${Object.keys(MODELS).join(', ')}`);
+  }
+  return config;
+}


Internal helpers exported without barrel protection

getModelConfig, loadTransformers, and promptInstall are all exported from this module but deliberately excluded from the index.js barrel. This makes them accessible via direct import (import { getModelConfig } from '../../embeddings/models.js') but invisible through the public API.

That's a reasonable split, but it's worth noting that getModelConfig in particular changed from fully private (in the old embedder.js) to module-exported purely to satisfy the generator.js import. Consider co-locating the function with generator.js or documenting the intent with a @internal JSDoc tag so future contributors know it isn't meant for external consumption.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fixed in 78aa1d5: added @internal JSDoc tags to getModelConfig, promptInstall, and loadTransformers to make clear they are exported for sibling module use only and are not part of the public barrel.

Mark getModelConfig, promptInstall, and loadTransformers as @internal since they are exported only for sibling module use, not the public barrel.
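
For illustration, a tagged helper might read as follows. The function body is the one quoted in the review above; the JSDoc wording, the placeholder `MODELS` entry, and the stand-in `ConfigError` are assumptions, since the actual comment added in 78aa1d5 is not shown.

```javascript
// Stand-in for the project's ConfigError from src/errors.js.
class ConfigError extends Error {}

// Placeholder registry; the real MODELS entries are not shown in this thread.
const MODELS = { small: { name: 'example/small-model', dim: 384 } };
const DEFAULT_MODEL = 'small';

/**
 * Resolve a model key to its config, falling back to the default model.
 * @internal Exported for sibling modules (e.g. generator.js); not re-exported by the barrel.
 * @param {string} [modelKey]
 * @returns {{ name: string, dim: number }}
 * @throws {ConfigError} if the key is not in MODELS
 */
function getModelConfig(modelKey) {
  const key = modelKey || DEFAULT_MODEL;
  const config = MODELS[key];
  if (!config) {
    throw new ConfigError(`Unknown model: ${key}. Available: ${Object.keys(MODELS).join(', ')}`);
  }
  return config;
}

console.log(getModelConfig().name); // example/small-model
```

The tag is documentation only: nothing at runtime stops a direct import, so the barrel remains the actual boundary of the public API.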

carlos-alm · 2026-03-13T09:21:46Z

@greptileai

greptile-apps · 2026-03-13T09:26:49Z

src/embeddings/generator.js

+  return Math.ceil(text.length / 4);
+}
+
+export function initEmbeddingsSchema(db) {


initEmbeddingsSchema exported without @internal tag

initEmbeddingsSchema is only called by buildEmbeddings within the same file — it was a private, unexported function in the original src/embedder.js. The export keyword here is unnecessary and expands the module's accessible surface (consumers can now import { initEmbeddingsSchema } from './generator.js' directly, even though it isn't surfaced through the barrel).

The other internal exports in this refactor (getModelConfig, promptInstall, loadTransformers) all received @internal JSDoc tags. For consistency — and since this function requires no cross-file sharing — consider either removing the export entirely, or at minimum adding the tag:

Suggested change

-export function initEmbeddingsSchema(db) {
+/** @internal Only called by buildEmbeddings — not part of the public barrel. */
+export function initEmbeddingsSchema(db) {

Fixed in 37ba7e1 — removed the export keyword entirely since initEmbeddingsSchema is only called within generator.js. No need for cross-file access.

Impact: 1 functions changed, 1 affected

carlos-alm · 2026-03-13T09:37:03Z

@greptileai

carlos-alm added 7 commits March 13, 2026 00:09

Merge branch 'main' into refactor/domain-error-hierarchy

9248176

fix: address Greptile review feedback on PR #431

d2794cb

- Use expect.assertions(4) in db.test.js to prevent silent assertion skips
- Change snapshot "already exists" error from DbError to ConfigError (it's a missing --force flag, not a database failure)

Impact: 1 functions changed, 0 affected

Merge branch 'refactor/domain-error-hierarchy' of https://github.com/…

74396df

…optave/codegraph into refactor/domain-error-hierarchy

Merge remote-tracking branch 'origin/main' into refactor/domain-error…

21bb8be

…-hierarchy # Conflicts: # src/index.js

Merge remote-tracking branch 'origin/main' into refactor/domain-error…

975734c

…-hierarchy

greptile-apps bot reviewed Mar 13, 2026

fix: harden prepareSearch with try/catch for DB leak and use getEmbed…

b1ca54d

…dingCount

- Wrap post-open logic in try/catch so DB is closed on unexpected exceptions
- Switch from hasEmbeddings to getEmbeddingCount for clearer zero-count check

Impact: 1 functions changed, 0 affected

greptile-apps bot reviewed Mar 13, 2026

fix: guard cosineSim against zero-magnitude vectors returning NaN

9d781df

Return 0 instead of NaN when either vector has zero magnitude (e.g. corrupted DB row). In practice embed() stores L2-normalised vectors, but this makes the contract explicit. Impact: 1 functions changed, 0 affected

greptile-apps bot reviewed Mar 13, 2026

fix: add @internal JSDoc tags to non-public model helpers

78aa1d5

Mark getModelConfig, promptInstall, and loadTransformers as @internal since they are exported only for sibling module use, not the public barrel.

greptile-apps bot reviewed Mar 13, 2026

fix: unexport initEmbeddingsSchema — only used within generator.js

37ba7e1

Impact: 1 functions changed, 1 affected

Merge branch 'main' into refactor/domain-error-hierarchy

4bb706a

carlos-alm merged commit 637fc01 into main Mar 13, 2026
14 checks passed

carlos-alm deleted the refactor/domain-error-hierarchy branch March 13, 2026 10:01

github-actions bot locked and limited conversation to collaborators Mar 13, 2026
