Add documentation: Explain the Embedding Generation Process #5
Embedding Generation Process
This document outlines the process of generating embeddings for documents, which is a crucial component for enabling semantic search within the application. It details the model used, the parameters involved, and the role of the Supabase function responsible for embedding generation.
Target Audience: Backend Developers, Data Scientists
Semantic Search
Semantic search aims to understand the meaning behind search queries and documents, rather than simply matching keywords. This allows users to find relevant information even if the exact words they use don't appear in the documents themselves. Our application leverages embeddings to achieve this.
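The core idea is to represent both search queries and documents as vectors in a high-dimensional space. The closer two vectors are in this space, the more semantically similar the corresponding query and document are. We use cosine similarity to measure how close two vectors are: it compares their directions rather than their magnitudes, so a score near 1 indicates strong semantic similarity and a score near 0 indicates little or none.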
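To make the comparison concrete, here is a minimal sketch of cosine similarity in TypeScript, used to rank documents against a query. The function name `cosineSimilarity`, the document IDs, and the three-dimensional toy vectors are all illustrative assumptions, not taken from the application's code; real embeddings have far more dimensions, and the application may well compute similarity inside the database instead.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Illustrative helper, not the application's actual implementation.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical toy embeddings for one query and two documents.
const queryEmbedding: number[] = [1, 0, 1];
const documentEmbeddings: Record<string, number[]> = {
  "doc-a": [2, 0, 2], // same direction as the query, different magnitude
  "doc-b": [0, 1, 0], // orthogonal to the query
};

// Score every document against the query and sort best-first.
const ranked = Object.entries(documentEmbeddings)
  .map(([id, embedding]) => ({
    id,
    score: cosineSimilarity(queryEmbedding, embedding),
  }))
  .sort((x, y) => y.score - x.score);

console.log(ranked.map((r) => r.id));
```

Because the score depends only on direction, `doc-a` ranks first even though its vector is twice as long as the query's, while the orthogonal `doc-b` scores zero.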
Embed