feat: drop memory index #43

Merged: mudler merged 3 commits into main from feat/drop-memory-index on Mar 20, 2026

@mudler mudler commented Mar 19, 2026

This PR drops the memory index and uses the filesystem instead.

Copilot AI left a comment

Pull request overview

This PR removes the persisted in-memory “index” from PersistentKB state and shifts chunk lookup/deletion to be driven by engine metadata (source) plus filesystem scanning of assetDir. It also introduces a lightweight in-memory MockEngine and a Ginkgo/Gomega test suite to exercise persistence behavior without embeddings.

Changes:

  • Drop CollectionState.Index / PersistentKB.index and derive document listing from the on-disk UUID layout.
  • Extend the Engine interface with GetBySource and implement it for Postgres/Chromem/Mock (LocalAI stubbed).
  • Add mock-based persistency tests covering store/list/search/reset/remove and external sources.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| rag/persistency.go | Removes state/index map and switches document/chunk operations to filesystem keys + Engine.GetBySource; updates reset/repopulate/migration logic. |
| rag/engine.go | Extends Engine interface with GetBySource. |
| rag/engine/postgres.go | Adds GetBySource query implementation for Postgres-backed engine. |
| rag/engine/chromem.go | Adds GetBySource implementation using a metadata-filtered query. |
| rag/engine/localai.go | Adds GetBySource stub returning “not implemented”. |
| rag/engine/mock.go | Adds new in-memory MockEngine for tests (no embeddings/external deps). |
| rag/persistency_mock_test.go | Adds a comprehensive mock-engine test suite for PersistentKB. |
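
To illustrate how a source-based lookup can replace an index when removing a document's chunks, here is a minimal in-memory sketch. The `SourceEngine` interface, the `Result` type, the `removeSource` helper and all method signatures are assumptions made for this example; the actual Engine interface and MockEngine in the repository may look different.

```go
package main

import "fmt"

// Result is a simplified stand-in for an engine result (hypothetical).
type Result struct {
	ID       string
	Content  string
	Metadata map[string]string
}

// SourceEngine is a reduced, hypothetical view of an engine that can look
// up stored chunks by their "source" metadata.
type SourceEngine interface {
	Store(content string, metadata map[string]string) (Result, error)
	GetBySource(source string) ([]Result, error)
	Delete(ids ...string) error
	Count() int
}

// mockEngine keeps everything in a map; no embeddings, no external services.
type mockEngine struct {
	docs   map[string]Result
	nextID int
}

func newMockEngine() *mockEngine {
	return &mockEngine{docs: map[string]Result{}}
}

func (m *mockEngine) Store(content string, metadata map[string]string) (Result, error) {
	m.nextID++
	r := Result{ID: fmt.Sprintf("%d", m.nextID), Content: content, Metadata: metadata}
	m.docs[r.ID] = r
	return r, nil
}

// GetBySource returns every stored chunk whose "source" metadata matches.
// The order of the results is not deterministic (map iteration).
func (m *mockEngine) GetBySource(source string) ([]Result, error) {
	var out []Result
	for _, r := range m.docs {
		if r.Metadata["source"] == source {
			out = append(out, r)
		}
	}
	return out, nil
}

func (m *mockEngine) Delete(ids ...string) error {
	for _, id := range ids {
		delete(m.docs, id)
	}
	return nil
}

func (m *mockEngine) Count() int { return len(m.docs) }

// removeSource deletes all chunks of one document without consulting an index.
func removeSource(e SourceEngine, source string) error {
	chunks, err := e.GetBySource(source)
	if err != nil {
		return fmt.Errorf("looking up chunks for %q: %w", source, err)
	}
	ids := make([]string, 0, len(chunks))
	for _, c := range chunks {
		ids = append(ids, c.ID)
	}
	return e.Delete(ids...)
}

func main() {
	e := newMockEngine()
	e.Store("chunk one", map[string]string{"source": "a.txt"})
	e.Store("chunk two", map[string]string{"source": "a.txt"})
	e.Store("other doc", map[string]string{"source": "b.txt"})

	if err := removeSource(e, "a.txt"); err != nil {
		panic(err)
	}
	fmt.Println("chunks left:", e.Count())
}
```

Because the chunk IDs are recovered from metadata at delete time, nothing has to be persisted alongside the engine to know which chunks belong to which file.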
Comments suppressed due to low confidence (1)

rag/persistency.go:200

  • PersistentKB.Reset ignores errors from os.RemoveAll/os.MkdirAll/db.save and os.RemoveAll(db.path), so Reset can return nil even when the on-disk state wasn’t actually cleared. Please handle and propagate these errors (and consider ordering so Engine.Reset failures don’t leave disk/state partially reset).
func (db *PersistentKB) Reset() error {
	db.Lock()
	os.RemoveAll(db.assetDir)
	os.MkdirAll(db.assetDir, 0755)
	db.sources = []*ExternalSource{}
	db.save()
	db.Unlock()
	if err := db.Engine.Reset(); err != nil {
		return err
	}
	os.RemoveAll(db.path)
	return nil


os.Remove(oldPath)
xlog.Info("Migrated entry", "old_key", fileName, "new_key", filepath.Join(fileUUID, fileName))
}

Comment on lines +172 to +182
	count := c.collection.Count()
	if count == 0 {
		return nil, nil
	}

	// Use Query with a where filter to find documents by source metadata.
	// We use a dummy query and request all documents, relying on the where
	// filter to narrow results.
	res, err := c.collection.Query(ctx, ".", count, map[string]string{"source": source}, nil)
	if err != nil {
		return nil, fmt.Errorf("error querying by source: %v", err)

	Expect(filepath.Base(docs[0])).To(Equal("replace.txt"))

	// Count should be roughly the same (old chunks removed, new added)
	Expect(kb.Count()).To(BeNumerically("~", countAfterFirst, countAfterFirst))

)

// newMockKB creates a PersistentKB backed by a MockEngine.
// It writes a minimal state file so the constructor skips the embedding check.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler force-pushed the feat/drop-memory-index branch from e622b4f to 36b3173 on March 19, 2026 18:36
mudler and others added 2 commits March 19, 2026 19:45
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler force-pushed the feat/drop-memory-index branch from c27f9de to 2c4903b on March 19, 2026 23:20
mudler merged commit 8190f97 into main on Mar 20, 2026
6 checks passed