feat: memory embedding router (local + API) (by Lumen) #198

Open
conoremclaughlin wants to merge 5 commits into main from lumen/feat/memory-embedding-router

@conoremclaughlin
Owner

Summary

Adds a first implementation pass for semantic memory embeddings with a provider router that supports both local and API backends.

What’s included

  • New embedding router service with vetted model catalog:
    • packages/api/src/services/embeddings/router.ts
    • packages/api/src/services/embeddings/vetted-models.ts
  • Environment/config support for embedding provider selection and recall tuning
  • MemoryRepository.remember() now attempts to embed each new memory and persist the vector metadata
  • MemoryRepository.recall() now attempts semantic recall via pgvector RPC and falls back to text search
  • New migration:
    • supabase/migrations/20260308092624_memory_embedding_recall.sql
    • adds idx_memories_embedding (HNSW)
    • adds match_memories(...) RPC for filtered cosine similarity search
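For reviewers less familiar with pgvector, here is a rough in-memory analogue of what a filtered cosine similarity search does. It is a sketch only: the real logic lives in the SQL migration, and every name below (MemoryRow, matchMemories, minSimilarity, the userId filter) is a hypothetical stand-in, not the RPC's actual signature.

```typescript
// Illustrative in-memory analogue of a filtered cosine similarity search.
// All types and parameter names are assumptions made for this sketch.
interface MemoryRow {
  id: string;
  userId: string;
  embedding: number[] | null; // null when embedding was not available
}

interface Match {
  id: string;
  similarity: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as not similar to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function matchMemories(
  rows: MemoryRow[],
  query: number[],
  userId: string,
  minSimilarity: number,
  limit: number,
): Match[] {
  return rows
    // Apply the filter first, and skip rows that were never embedded.
    .filter((r) => r.userId === userId && r.embedding !== null)
    .map((r) => ({
      id: r.id,
      similarity: cosineSimilarity(query, r.embedding as number[]),
    }))
    .filter((m) => m.similarity >= minSimilarity)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

The HNSW index makes the database version approximate and much faster than this linear scan; the ranking idea is the same.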

Design notes

  • Router supports both local (Ollama) and API (OpenAI) with fallback behavior.
  • Current schema is still memories.embedding vector(1024), so runtime dimensions are enforced to 1024 for now.
  • This keeps implementation compatible today while leaving a path to model-aware indexing later.
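A minimal sketch of the fallback-plus-dimension-check behavior described in these notes, for discussion purposes. It is not the code in router.ts: the EmbeddingProvider interface, embedWithFallback, and the choice to return null when every provider fails are assumptions. Only the 1024 figure comes from the current schema.

```typescript
// Hypothetical sketch of provider fallback with a fixed dimension guard.
// 1024 mirrors memories.embedding vector(1024); everything else is assumed.
const EXPECTED_DIMENSIONS = 1024;

interface EmbeddingProvider {
  name: string;
  embed(text: string): Promise<number[]>;
}

interface EmbeddingResult {
  provider: string;
  vector: number[];
}

async function embedWithFallback(
  providers: EmbeddingProvider[],
  text: string,
): Promise<EmbeddingResult | null> {
  // Try providers in priority order, e.g. local first, then API.
  for (const provider of providers) {
    try {
      const vector = await provider.embed(text);
      if (vector.length !== EXPECTED_DIMENSIONS) {
        // A vector of the wrong size cannot be stored in the column,
        // so treat it like a provider failure and move on.
        throw new Error(
          `${provider.name} returned ${vector.length} dimensions, expected ${EXPECTED_DIMENSIONS}`,
        );
      }
      return { provider: provider.name, vector };
    } catch {
      // Swallow the error and fall through to the next provider.
    }
  }
  // No provider succeeded; the caller can store the memory without a vector
  // or fall back to text search.
  return null;
}
```

Returning null rather than throwing fits the "attempts to embed" wording above: a failed embedding should not block remember() or recall().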

Verification

  • yarn workspace @personal-context/api test src/data/repositories/memory-repository.test.ts
  • yarn workspace @personal-context/api test src/mcp/tools/memory-handlers.test.ts

Follow-ups

  • Add async janitor/backfill re-embedding job
  • Add model-aware dimension/index strategy when we commit to Option B
  • Expose vetted model list + active embedding profile via MCP/admin endpoint

- add recallMode (text/semantic/hybrid/auto) and hybrid reranking in memory recall
- add internal non-PII gold set dataset with memory-derived provenance notes
- add memory recall benchmark runner with JSON export + DB persistence tables
- wire recallMode through MCP schemas/tool docs and add scripts for benchmark execution
- add threadKey/focusText-aware scoring for high-salience bootstrap memories
- extend bootstrap MCP schema with threadKey/focusText and honor memoryLimit
- add bootstrap-relevance benchmark dataset + runner with DB persistence
- add benchmark script entries and score unit tests
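
To make the recallMode commits easier to review, here is one way mode resolution and hybrid reranking could be sketched. This is an illustration, not the PR's implementation: the meaning of "auto" (hybrid when a query embedding exists, text otherwise), the 0.7 semantic weight, and the Candidate shape are all assumptions.

```typescript
// Hypothetical sketch of recallMode resolution and hybrid reranking.
// The "auto" behavior and the default weight are assumptions, not PR facts.
type RecallMode = "text" | "semantic" | "hybrid" | "auto";
type ResolvedMode = Exclude<RecallMode, "auto">;

interface Candidate {
  id: string;
  textScore?: number; // assumed normalized to 0..1
  semanticScore?: number; // assumed cosine similarity in 0..1
}

interface Ranked {
  id: string;
  score: number;
}

function resolveMode(mode: RecallMode, hasQueryEmbedding: boolean): ResolvedMode {
  // Without a query embedding, only text search is possible.
  if (!hasQueryEmbedding) return "text";
  return mode === "auto" ? "hybrid" : mode;
}

function rerank(
  candidates: Candidate[],
  mode: ResolvedMode,
  semanticWeight = 0.7,
): Ranked[] {
  const scoreOf = (c: Candidate): number => {
    const t = c.textScore ?? 0;
    const s = c.semanticScore ?? 0;
    if (mode === "text") return t;
    if (mode === "semantic") return s;
    // Hybrid: weighted blend of the two signals.
    return semanticWeight * s + (1 - semanticWeight) * t;
  };
  return candidates
    .map((c) => ({ id: c.id, score: scoreOf(c) }))
    .sort((a, b) => b.score - a.score);
}
```

A linear blend is the simplest option; rank-based fusion would be an alternative if the text and semantic scores turn out not to be comparable in scale.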