Add SQLite storage layer with FTS5 for memory-efficient data persistence #16913

Closed
BowTiedSwan wants to merge 1 commit into anomalyco:dev from BowTiedSwan:claude/optimize-memory-sqlite-Sho8X

Conversation

@BowTiedSwan
Issue for this PR

Closes #

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

This PR introduces a new SQLite-based storage layer (SQLiteStorage) that provides memory-efficient persistence with full-text search capabilities, while maintaining backward compatibility with the existing file-based storage system.

Key improvements:

  1. Memory Efficiency: SQLite with a configurable page cache (5000 pages, about 20MB at SQLite's default 4 KiB page size) replaces unbounded in-memory operations, so large datasets no longer have to be loaded into memory in full.

  2. Optimized Queries: Dedicated tables and indexes for sessions, messages, and parts enable fast lookups and pagination without scanning the entire file system.

  3. Full-Text Search: FTS5 virtual table with trigram tokenization for session title search.

  4. Streaming APIs: New async generator functions (listSessionIds, listMessageIds, listPartIds) yield results one at a time instead of loading all into memory.

  5. Automatic Migration: Migration 2 incrementally indexes existing file-based storage into SQLite on startup.

  6. Backward Compatible: File-based storage remains the primary write target; SQLite acts as an indexed cache with fallback support.
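
To make items 1 to 3 concrete, here is a minimal sketch of the kind of setup statements such a layer might run. The table, index, and function names are illustrative assumptions and are not taken from this PR's `sqlite.ts`; only `PRAGMA cache_size` and the FTS5 `trigram` tokenizer are standard SQLite features.

```typescript
// Illustrative sketch only: names below are assumptions, not the PR's actual schema.

/** Approximate upper bound on page-cache memory, in bytes. */
function pageCacheBytes(pages: number, pageSizeBytes = 4096): number {
  return pages * pageSizeBytes
}

/** Statements a storage layer might run once when opening the database. */
function setupStatements(cachePages = 5000): string[] {
  return [
    // A positive cache_size is a page count; a negative value would mean KiB.
    `PRAGMA cache_size = ${cachePages};`,
    `CREATE TABLE IF NOT EXISTS session (
       id TEXT PRIMARY KEY,
       parent_id TEXT,
       title TEXT NOT NULL,
       updated_at INTEGER NOT NULL
     );`,
    // Index so that child-session lookups do not scan the whole table.
    `CREATE INDEX IF NOT EXISTS session_parent_idx ON session (parent_id);`,
    // Trigram tokenization allows substring matching on titles (SQLite 3.34+).
    `CREATE VIRTUAL TABLE IF NOT EXISTS session_fts USING fts5(title, tokenize = 'trigram');`,
  ]
}

/** Title search against the hypothetical FTS table; bind the query text and a row limit. */
const SEARCH_TITLES_SQL =
  `SELECT rowid FROM session_fts WHERE session_fts MATCH ? LIMIT ?;`
```

With the defaults, `pageCacheBytes(5000)` is 20,480,000 bytes, which is where the "~20MB" figure comes from. The bound applies to the page cache only, not to whatever the application holds on the JavaScript heap.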

Changes made:

  • Added packages/opencode/src/storage/sqlite.ts with complete SQLite schema, prepared statements, and optimized CRUD operations
  • Extended Storage namespace with new methods: writeSession, writeMessage, writePart, removeSession, removeMessage, removePart, listStream, count
  • Updated Session.list() and MessageV2.stream() to use SQLite's streaming APIs
  • Updated Session.children() to stream sessions instead of loading all at once
  • Updated Session.remove() to use streaming for message/part cleanup
  • Updated MessageV2.parts() to use SQLite's optimized parts list
  • Updated server session listing endpoint to stream sessions instead of loading all into memory
  • Added automatic SQLite indexing when writing sessions, messages, and parts
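
As a rough illustration of the streaming approach, the sketch below shows an async generator that pages through IDs with a keyset cursor and yields them one at a time. `listIds`, `FetchPage`, and `fakeIndex` are hypothetical names; the PR's real `listSessionIds`, `listMessageIds`, and `listPartIds` may be implemented differently.

```typescript
// Hypothetical page fetcher: returns up to `limit` IDs strictly greater than `after`.
type FetchPage = (after: string | null, limit: number) => Promise<string[]>

/** Yields IDs one at a time while holding at most one page in memory. */
async function* listIds(fetchPage: FetchPage, pageSize = 100): AsyncGenerator<string> {
  let cursor: string | null = null
  while (true) {
    const page = await fetchPage(cursor, pageSize)
    for (const id of page) yield id
    // A short page means the source is exhausted.
    if (page.length < pageSize) return
    const last = page[page.length - 1]
    if (last === undefined) return
    cursor = last
  }
}

// In-memory stand-in for an indexed query such as
//   SELECT id FROM session WHERE id > ? ORDER BY id LIMIT ?
function fakeIndex(ids: string[]): { fetchPage: FetchPage; calls: () => number } {
  const sorted = [...ids].sort()
  let calls = 0
  return {
    fetchPage: async (after, limit) => {
      calls++
      const start = after === null ? 0 : sorted.findIndex((id) => id > after)
      return start === -1 ? [] : sorted.slice(start, start + limit)
    },
    calls: () => calls,
  }
}
```

A caller consumes it with `for await (const id of listIds(fetchPage)) { ... }`. Breaking out of the loop early stops further page fetches, which is the main difference from collecting everything into an array first.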

How did you verify your code works?

The changes maintain backward compatibility by:

  • Keeping file-based storage as the primary write target
  • Providing fallback to file storage if SQLite queries fail
  • Using lazy initialization so SQLite is only created when needed
  • Preserving all existing Storage API contracts

The migration system automatically indexes existing data on startup, and all new writes are indexed in both storage systems.
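
The dual-write and fallback behaviour described above could look roughly like the following. This is a sketch under stated assumptions: `Store`, `withIndex`, and `memoryStore` are hypothetical names, and the PR's actual `Storage` namespace has a different API surface.

```typescript
// Hypothetical minimal store contract, used for both the file store and the index.
interface Store<T> {
  write(key: string, value: T): Promise<void>
  read(key: string): Promise<T | undefined>
}

/** Wraps a primary store with a best-effort index that is consulted first on reads. */
function withIndex<T>(files: Store<T>, index: Store<T>): Store<T> {
  return {
    async write(key, value) {
      // The file store is the source of truth, so it is written first.
      await files.write(key, value)
      try {
        await index.write(key, value)
      } catch {
        // The index is only a cache; a failed index write must not fail the call.
      }
    },
    async read(key) {
      try {
        const hit = await index.read(key)
        if (hit !== undefined) return hit
      } catch {
        // Fall through to the file store when the index query fails.
      }
      return files.read(key)
    },
  }
}

/** Simple in-memory store, included only so the sketch is runnable. */
function memoryStore<T>(): Store<T> {
  const data = new Map<string, T>()
  return {
    async write(key, value) {
      data.set(key, value)
    },
    async read(key) {
      return data.get(key)
    },
  }
}
```

Silently dropping index errors keeps the sketch short; a real implementation would probably log them and mark the affected entry for re-indexing.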

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

https://claude.ai/code/session_01MJwfUnQNfEDqAz8Ld3We9J

This change addresses the RAM usage issues by implementing a SQLite-based
storage layer with FTS5 full-text search support. Key improvements:

- Add SQLiteStorage module with memory-efficient indexing
- Replace Array.fromAsync patterns with streaming generators
- Add automatic migration from file-based to SQLite storage
- Fix duplicate Session.list() calls in server.ts
- Use SQLite indexes for session/message/part lookups

The SQLite implementation uses a fixed-size page cache (~20MB) regardless
of data size, replacing the previous linear memory scaling where the
entire dataset was loaded into the V8 heap.
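
On the consuming side, replacing an `Array.fromAsync` call with a streaming filter might look like the sketch below. `SessionInfo`, its `parentID` field, and `childrenOf` are assumptions made for illustration; they are not copied from the PR's `Session.children()`.

```typescript
// Hypothetical session shape; only the fields the filter needs.
interface SessionInfo {
  id: string
  parentID?: string
}

/**
 * Streams the children of `parentID` without first collecting every session
 * into an array, stopping once `limit` matches have been yielded.
 */
async function* childrenOf(
  sessions: AsyncIterable<SessionInfo>,
  parentID: string,
  limit = Infinity,
): AsyncGenerator<SessionInfo> {
  if (limit <= 0) return
  let emitted = 0
  for await (const session of sessions) {
    if (session.parentID !== parentID) continue
    yield session
    emitted++
    // Returning here also closes the upstream iterator, so no further rows are read.
    if (emitted >= limit) return
  }
}
```

Memory use in this shape depends on the number of sessions in flight rather than on the total number stored, which is the property the commit message is describing.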

https://claude.ai/code/session_01MJwfUnQNfEDqAz8Ld3We9J
@github-actions
Contributor

Hey! Your PR title "Add SQLite storage layer with FTS5 for memory-efficient data persistence" doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@github-actions bot added the needs:compliance label (this means the issue will auto-close after 2 hours) on Mar 10, 2026
@github-actions
Contributor

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • No issue referenced. Please add Closes #<number> linking to the relevant issue.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

@github-actions
Contributor

The following comment was made by an LLM and may be inaccurate:

Based on my search, I found several related PRs that are worth noting:

Potentially Related PRs:

  1. fix(core): make JSON-to-SQLite migration truly one-time #16884

  2. feat(session): add storage interface for pluggable session backends #15922

    • Related to the storage abstraction layer being extended in this PR

  3. fix(core): ensure sessions persist on exit by closing database and checkpointing WAL #15031

    • Related to SQLite database management and persistence

  4. fix(core): use atomic writes in storage to prevent data corruption on crash #13745

    • Related to storage reliability improvements

  5. fix(opencode): Throttle storage writes during streaming to reduce I/O overhead #11328

    • Related to storage write optimization and streaming

  6. feat(session): bi-directional cursor-based pagination (#6548) #8535

    • Related to the pagination and streaming APIs being introduced

These appear to be complementary improvements rather than true duplicates, but PR #16884 (the migration fix) and #15922 (storage interface) are most closely related to the changes in PR #16913.

@github-actions
Contributor

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.

@github-actions bot removed the needs:compliance label on Mar 10, 2026
@github-actions bot closed this on Mar 10, 2026