@a-klos a-klos commented Oct 15, 2025

This pull request introduces several improvements and refactors to the retriever and reranker configuration, document retrieval logic, and related documentation. The main focus is on making settings more explicit and robust, optimizing retrieval performance, and improving summary/document handling.

Configuration and Settings Refactor:

  • Standardizes the retriever's global cap setting to RETRIEVER_TOTAL_K_DOCUMENTS, replacing legacy environment variable names and ensuring backward compatibility. This change is reflected in both the code (RetrieverSettings) and configuration files (values.yaml, README.md). [1] [2] [3] [4] [5]
  • Expands reranker settings to include a minimum relevance score and an enabled flag, making reranker behavior more configurable. [1] [2] [3]
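As a rough illustration of the backward-compatible lookup described above, the canonical variable can be checked first, with older names consulted only as a fallback. This is a minimal stdlib sketch, not the PR's `RetrieverSettings` implementation: the legacy names, the default value, and the `resolve_total_k` helper are all hypothetical.

```python
import os
from typing import Mapping, Optional

CANONICAL_NAME = "RETRIEVER_TOTAL_K_DOCUMENTS"
# Hypothetical legacy names, for illustration only; the real ones are not listed here.
LEGACY_NAMES = ("RETRIEVER_TOTAL_K", "RETRIEVER_K_DOCUMENTS")
DEFAULT_TOTAL_K = 10  # assumed default, not taken from the PR


def resolve_total_k(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the global document cap, preferring the canonical variable."""
    env = os.environ if env is None else env
    for name in (CANONICAL_NAME, *LEGACY_NAMES):
        raw = env.get(name)
        if raw is not None and raw.strip():
            value = int(raw)
            if value < 1:
                raise ValueError(f"{name} must be a positive integer, got {value}")
            return value
    return DEFAULT_TOTAL_K
```

Checking the canonical name first means a deployment that sets both old and new variables gets the new value, which is one plausible way to keep existing configurations working during a migration.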

Retriever and Reranker Logic Improvements:

  • Refactors the CompositeRetriever to:
    • Run all retrievers concurrently with asyncio.gather to reduce retrieval latency.
    • Improve duplicate filtering and add early global pruning based on the new cap.
    • Add summary document expansion logic, so related documents are fetched and summaries are handled more robustly.
    • Make reranker invocation conditional and safer, with error handling and respect for new settings. [1] [2] [3] [4] [5]
  • Updates dependency injection to pass new settings to retriever and reranker components. [1] [2]
  • Adds a batch document fetch method to the Qdrant vector database implementation to support summary expansion.
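The concurrent retrieval, duplicate filtering, and global cap described above could look roughly like the following. This is a simplified sketch under stated assumptions, not the actual `CompositeRetriever`: the `Doc` type, the `Retriever` callable signature, and the `retrieve_all` function are hypothetical, and duplicates are assumed to be identified by document id.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Sequence, Set


@dataclass(frozen=True)
class Doc:
    id: str
    content: str


# Hypothetical retriever interface: an async callable from query to documents.
Retriever = Callable[[str], Awaitable[List[Doc]]]


async def retrieve_all(query: str, retrievers: Sequence[Retriever], total_k: int) -> List[Doc]:
    """Run every retriever concurrently, drop duplicates, and stop at total_k."""
    if total_k <= 0:
        return []
    # asyncio.gather schedules all retrievers at once and preserves their order.
    results = await asyncio.gather(*(retriever(query) for retriever in retrievers))
    seen: Set[str] = set()
    merged: List[Doc] = []
    for docs in results:
        for doc in docs:
            if doc.id in seen:
                continue
            seen.add(doc.id)
            merged.append(doc)
            if len(merged) >= total_k:
                return merged  # early pruning once the global cap is reached
    return merged
```

With concurrent execution the total wait is bounded by the slowest retriever rather than the sum of all of them, which is where the latency gain would come from.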

User Experience and Documentation:

  • Improves error handling in chat retrieval: if only summary documents are found (no underlying content), a user-friendly "no documents found" message is returned.
  • Clarifies documentation regarding configuration variables and setup instructions, especially for Windows users and certificate issuer setup. [1] [2]

Summary of Most Important Changes:

Configuration and Settings:

  • Standardizes and documents the canonical retriever document cap as RETRIEVER_TOTAL_K_DOCUMENTS, deprecating legacy names and updating all relevant configs and documentation. [1] [2] [3] [4] [5]
  • Adds min_relevance_score and enabled fields to reranker settings, with corresponding config support. [1] [2] [3]
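To make the two new reranker fields concrete, here is a small sketch of how `enabled` and `min_relevance_score` might gate reranking. The `RerankerSettings` dataclass, its defaults, the `apply_reranker` helper, and the fall-back-to-input behavior on errors are assumptions for illustration; only the field names come from the PR description.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class RerankerSettings:
    enabled: bool = True                # assumed default
    min_relevance_score: float = 0.0    # assumed default


# Hypothetical reranker interface: returns (document, score) pairs.
RerankFn = Callable[[str, Sequence[str]], List[Tuple[str, float]]]


def apply_reranker(query: str, docs: Sequence[str], rerank: RerankFn,
                   settings: RerankerSettings) -> List[str]:
    """Rerank docs only when enabled; drop results below the minimum score."""
    if not settings.enabled or not docs:
        return list(docs)
    try:
        scored = rerank(query, docs)
    except Exception:
        # Assumed policy: a failing reranker should not fail the whole request.
        return list(docs)
    kept = [(doc, score) for doc, score in scored
            if score >= settings.min_relevance_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in kept]
```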

Retriever and Reranker Logic:

  • Refactors CompositeRetriever to use concurrent retrieval, more efficient duplicate filtering, summary document expansion, and robust reranker invocation with new settings. [1] [2] [3] [4] [5]
  • Updates dependency injection to propagate new retriever and reranker settings throughout the application. [1] [2]
  • Adds a batch document retrieval method to the Qdrant vector database for summary expansion support.
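Summary expansion backed by a batch fetch might be sketched as follows. An in-memory store stands in for the Qdrant implementation so that no real client API is implied; `InMemoryStore`, `fetch_many`, the `is_summary` and `related_ids` fields, and the choice to replace summaries with their underlying documents are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Doc:
    id: str
    content: str
    is_summary: bool = False
    related_ids: List[str] = field(default_factory=list)


class InMemoryStore:
    """Stand-in for a vector database that supports fetching many ids at once."""

    def __init__(self, docs: Iterable[Doc]) -> None:
        self._docs: Dict[str, Doc] = {d.id: d for d in docs}

    def fetch_many(self, ids: Iterable[str]) -> List[Doc]:
        # One batch lookup instead of one round trip per id; unknown ids are skipped.
        return [self._docs[i] for i in ids if i in self._docs]


def expand_summaries(hits: List[Doc], store: InMemoryStore) -> List[Doc]:
    """Replace summary hits with the underlying documents they point to."""
    present: Set[str] = {d.id for d in hits}
    wanted: List[str] = []
    for doc in hits:
        if not doc.is_summary:
            continue
        for related_id in doc.related_ids:
            if related_id not in present:
                present.add(related_id)
                wanted.append(related_id)
    fetched = store.fetch_many(wanted) if wanted else []
    return [d for d in hits if not d.is_summary] + fetched
```

Collecting all missing ids first and fetching them in a single call is the reason a batch method is useful here: the number of database requests stays constant regardless of how many summaries were retrieved.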

User Experience:

  • Improves chat retrieval error handling to return a clear message when only summaries (no underlying documents) are found.
  • Clarifies the configuration and setup documentation, especially for Windows users and certificate issuer setup. [1] [2]
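The summaries-only case in chat retrieval can be illustrated with a short sketch. The dictionary shape, the `is_summary` key, the `build_chat_context` helper, and the message wording are assumptions; the PR description only states that a "no documents found" message is returned.

```python
from typing import Any, Dict, List, Mapping

# Assumed wording; the actual message text is not quoted in the PR description.
NO_DOCUMENTS_MESSAGE = "No documents found for your question."


def build_chat_context(docs: List[Mapping[str, Any]]) -> Dict[str, Any]:
    """Return content documents, or a friendly message if only summaries remain."""
    content_docs = [d for d in docs if not d.get("is_summary", False)]
    if not content_docs:
        # Covers both an empty result and a result made up solely of summaries.
        return {"documents": [], "message": NO_DOCUMENTS_MESSAGE}
    return {"documents": content_docs, "message": None}
```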

a-klos and others added 15 commits September 4, 2025 08:51
…and libraries; enhance tracing in TracedRunnable to include input and output in span updates.
…uster setup script with ingress-nginx installation
@robodev-r2d2 robodev-r2d2 marked this pull request as ready for review December 16, 2025 09:51
@a-klos a-klos merged commit fdd8570 into main Dec 16, 2025
12 checks passed
@a-klos a-klos deleted the chore/retriever-performance-improvement branch December 16, 2025 10:55
3 participants