UN-2105 [FEAT] Include adapter name in error messages by pk-zipstack · Pull Request #1825 · Zipstack/unstract

pk-zipstack · 2026-03-05T04:27:46Z

What

Preserve the user-facing adapter instance name (e.g., "Unstract Trial LLM") from the platform service response and include it in SDK1 error messages across LLM, Embedding, VectorDB, and X2Text adapter consumers.

Why

When an adapter error occurs during workflow execution, the error message only showed the provider name (e.g., "azureopenai") or a raw instance UUID, making it difficult for users to identify which configured adapter caused the failure.
This was reported in UN-2105.

How

In PlatformHelper._get_adapter_configuration(), the adapter name was already fetched from the platform service but immediately discarded via pop(). Now it is stored back in the config dict under a private key _adapter_name.
Each adapter consumer (LLM, Embedding, VectorDB, X2Text) extracts this name during init and includes it in error messages.
- Before: Error from LLM provider 'azureopenai': ...
- After: Error from LLM adapter 'Unstract Trial LLM' (azureopenai): ...
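A minimal sketch of the flow described above, not the SDK1 implementation. Only the `_adapter_name` key and the before/after message format come from this PR. `fetch_adapter_config`, `DemoLLM`, and the `adapter_name` response field are hypothetical names used for illustration.

```python
from typing import Any

ADAPTER_NAME = "_adapter_name"  # private key named in this PR


def fetch_adapter_config(response: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the platform config lookup."""
    config = dict(response)
    # Assumption: the platform response carries the name as "adapter_name".
    # It used to be popped and dropped; here it is stored back privately.
    name = config.pop("adapter_name", "")
    config[ADAPTER_NAME] = name
    return config


class DemoLLM:
    """Hypothetical consumer; the real classes live in SDK1."""

    def __init__(self, config: dict[str, Any], provider: str) -> None:
        # Extract the display name once, during init.
        self._adapter_name = config.get(ADAPTER_NAME, "")
        self._provider = provider

    def format_error(self, detail: str) -> str:
        # Fall back to the provider-only form when no name is available.
        adapter_info = (
            f"'{self._adapter_name}' ({self._provider})"
            if self._adapter_name
            else f"'{self._provider}'"
        )
        return f"Error from LLM adapter {adapter_info}: {detail}"
```

With a named adapter the message reads `Error from LLM adapter 'Unstract Trial LLM' (azureopenai): ...`; without a name it degrades to the provider in quotes.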
Index key generation sites (index.py, utils/indexing.py, prompt-service index_v2.py) strip the _adapter_name key before hashing to maintain backward compatibility with existing document index keys.
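Why stripping matters can be sketched as follows. The hashing scheme in the real index key code is not shown in this PR, so SHA-256 over sorted JSON is an assumption here, and `strip_adapter_name` and `index_key` are hypothetical helper names.

```python
import hashlib
import json
from typing import Any

ADAPTER_NAME = "_adapter_name"  # private key named in this PR


def strip_adapter_name(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the config without the display-name key."""
    return {k: v for k, v in config.items() if k != ADAPTER_NAME}


def index_key(*configs: dict[str, Any]) -> str:
    """Hypothetical key function: hash the sanitized configs."""
    payload = json.dumps(
        [strip_adapter_name(c) for c in configs], sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because the name is removed before hashing, a config fetched before this change and the same config fetched after it produce the same key, and renaming an adapter does not invalidate existing index entries.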

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. The _adapter_name key is stripped before index key hashing, so existing indexed documents are unaffected. The only user-visible change is richer error messages that now include the adapter instance name alongside the provider name.

Database Migrations

None

Env Config

None

Relevant Docs

None

Related Issues or PRs

Fixes UN-2105

Dependencies Versions

None

Notes on Testing

Configure an adapter with a known name (e.g., "My OpenAI Adapter")
Trigger an adapter error (e.g., invalid credentials)
Verify the error message includes the adapter instance name

Screenshots

N/A

Checklist

I have read and understood the Contribution Guidelines.

Preserve the user-facing adapter name (e.g., "Unstract Trial LLM") from the platform service response and include it in error messages across SDK1 adapter consumers (LLM, Embedding, VectorDB, X2Text). Previously, adapter names were discarded during config retrieval and errors only showed provider names or instance IDs, making it hard to identify which adapter caused the issue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-05T04:28:02Z

Summary by CodeRabbit

Bug Fixes
- Improved error messages across embedding, LLM, vector database, and text extraction operations to include adapter context for better troubleshooting and debugging.
- Enhanced error reporting consistency with proper adapter metadata handling in index key generation.

Walkthrough

The changes add adapter name tracking throughout the SDK to provide contextual information in error messages. A new ADAPTER_NAME constant enables storing adapter metadata in configurations. Multiple adapter classes now capture and propagate adapter names from configs for improved error reporting. Index key generation retrieves and sanitizes adapter configs to maintain backward compatibility by removing adapter name metadata before hashing.

Changes

Cohort / File(s)	Summary
Adapter Metadata Constants `unstract/sdk1/src/unstract/sdk1/constants.py`	Added new `ADAPTER_NAME = "_adapter_name"` constant to the `Common` class for adapter metadata storage.
Adapter Configuration Storage `unstract/sdk1/src/unstract/sdk1/platform.py`	Updated `_get_adapter_configuration` to store adapter name in adapter_data dictionary using the new `ADAPTER_NAME` key after extraction.
Adapter Error Reporting `unstract/sdk1/src/unstract/sdk1/llm.py`, `unstract/sdk1/src/unstract/sdk1/embedding.py`, `unstract/sdk1/src/unstract/sdk1/vector_db.py`, `unstract/sdk1/src/unstract/sdk1/x2txt.py`	Added adapter name capture from configs and enhanced error messages to include adapter context. Error messages now display adapter name when available, falling back to adapter instance ID or provider name for better troubleshooting.
Index Key Generation `unstract/sdk1/src/unstract/sdk1/index.py`, `unstract/sdk1/src/unstract/sdk1/utils/indexing.py`, `prompt-service/src/unstract/prompt_service/core/index_v2.py`	Refactored to explicitly retrieve and sanitize adapter configurations by stripping `ADAPTER_NAME` metadata before hashing. Maintains backward compatibility by using cleaned configs in index key generation instead of inline adapter_config calls.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly identifies the main change: adding adapter names to error messages across the SDK, which aligns with the comprehensive modifications throughout the codebase.
Description check	✅ Passed	The PR description fully addresses all template sections including What, Why, How, breakage analysis, testing notes, and related issues, providing complete context for the feature.
Docstring Coverage	✅ Passed	Docstring coverage is 81.25% which is sufficient. The required threshold is 80.00%.


github-actions · 2026-03-05T04:29:16Z

Test Results

Summary

✅ Runner Tests: 11 passed, 0 failed (11 total)
✅ SDK1 Tests: 63 passed, 0 failed (63 total)

Runner Tests - Full Report

filepath	function	passed	SUBTOTAL
runner/src/unstract/runner/clients/test_docker.py	test_logs	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup_skip	1	1
runner/src/unstract/runner/clients/test_docker.py	test_client_init	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_exists	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config_without_mount	1	1
runner/src/unstract/runner/clients/test_docker.py	test_run_container	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_for_sidecar	1	1
runner/src/unstract/runner/clients/test_docker.py	test_sidecar_container	1	1
TOTAL		11	11

SDK1 Tests - Full Report

sonarqubecloud · 2026-03-05T04:29:57Z

Quality Gate failed

Failed conditions
11.2% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

prompt-service/src/unstract/prompt_service/core/index_v2.py (1)
100-109: ⚠️ Potential issue | 🟠 Major

Pre-existing issue: file_hash may be undefined.

Note: This is not introduced by this PR, but file_hash is only assigned on line 81 when file_info.file_hash is falsy. When file_info.file_hash is truthy, the variable file_hash used on line 101 will be undefined, causing a NameError.

This should likely be:
file_hash = file_info.file_hash
if not file_hash:
    file_hash = fs.get_hash_from_file(path=file_info.file_path)
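The failure mode and the suggested fix can be reproduced in isolation. This is a sketch under stated assumptions: `FileInfo` and `compute_hash` are hypothetical stand-ins for the real `file_info` object and `fs.get_hash_from_file`. In Python the unassigned local surfaces as `UnboundLocalError`, which is a subclass of `NameError`.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical stand-in for the real file_info object."""

    file_path: str
    file_hash: str = ""


def compute_hash(path: str) -> str:
    """Hypothetical stand-in for fs.get_hash_from_file."""
    return f"hash-of-{path}"


def buggy(file_info: FileInfo) -> str:
    # file_hash is only bound when the stored hash is missing.
    if not file_info.file_hash:
        file_hash = compute_hash(file_info.file_path)
    return file_hash  # unbound when file_info.file_hash is truthy


def fixed(file_info: FileInfo) -> str:
    # Always bind first, then compute only when the stored hash is empty.
    file_hash = file_info.file_hash
    if not file_hash:
        file_hash = compute_hash(file_info.file_path)
    return file_hash
```

Calling `buggy(FileInfo("a.pdf", "abc"))` raises, while `fixed` returns the stored hash and falls back to computing one only when none is stored.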
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prompt-service/src/unstract/prompt_service/core/index_v2.py` around lines 100
- 109, The variable file_hash can be undefined when file_info.file_hash is
truthy; update the assignment logic in the code that populates index_key so that
file_hash is always set from file_info.file_hash first (e.g., file_hash =
file_info.file_hash) and only call
fs.get_hash_from_file(path=file_info.file_path) when that value is falsy, then
use that ensured file_hash when building index_key (references: file_hash,
file_info, fs.get_hash_from_file, index_key, self.chunking_config.chunk_size/
chunk_overlap).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/embedding.py`:
- Around line 100-105: The async embedding error paths still call
parse_litellm_err with only provider values; update those calls to include the
adapter display string like the sync path does. Compute adapter_info using
self._adapter_name and self.adapter.get_name() (same logic as the shown
adapter_info variable) and pass adapter_info into parse_litellm_err instead of
the provider-only argument in all async error raises (search for
parse_litellm_err in this module, e.g., the async embedding functions around the
current raise and the other occurrences referenced). Ensure the raise statements
use "raise parse_litellm_err(e, adapter_info) from e" so adapter context is
included.

In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 266-273: The acomplete error path is building messages that only
include the provider name and drops the adapter instance name; update the error
construction in the acomplete handler to use the same adapter_info logic as
elsewhere (use self._adapter_name when present, formatted as
"'{self._adapter_name}' ({self.adapter.get_provider()})", else fall back to
"'{self.adapter.get_provider()}'") and use that adapter_info when composing
error_msg (same change should be applied to the other occurrence around lines
346-353) so async callers receive the adapter instance name in the error text.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ff0f61a0-7b62-4b96-ae66-eecf6097d69f

📥 Commits

Reviewing files that changed from the base of the PR and between e0c2af0 and 6b9dd9f.

📒 Files selected for processing (9)

prompt-service/src/unstract/prompt_service/core/index_v2.py
unstract/sdk1/src/unstract/sdk1/constants.py
unstract/sdk1/src/unstract/sdk1/embedding.py
unstract/sdk1/src/unstract/sdk1/index.py
unstract/sdk1/src/unstract/sdk1/llm.py
unstract/sdk1/src/unstract/sdk1/platform.py
unstract/sdk1/src/unstract/sdk1/utils/indexing.py
unstract/sdk1/src/unstract/sdk1/vector_db.py
unstract/sdk1/src/unstract/sdk1/x2txt.py

coderabbitai · 2026-03-05T04:36:30Z

unstract/sdk1/src/unstract/sdk1/embedding.py

+            adapter_info = (
+                f"{self._adapter_name} ({self.adapter.get_name()})"
+                if self._adapter_name
+                else self.adapter.get_name()
+            )
+            raise parse_litellm_err(e, adapter_info) from e


⚠️ Potential issue | 🟠 Major

Async embedding errors still miss adapter-name context.

You updated sync paths, but Line 153 and Line 166 still pass provider-only values to parse_litellm_err. That leaves async consumers outside the UN-2105 behavior.

Proposed fix

 class Embedding:
+    def _adapter_info(self) -> str:
+        return (
+            f"{self._adapter_name} ({self.adapter.get_name()})"
+            if self._adapter_name
+            else self.adapter.get_name()
+        )
+
@@
         except Exception as e:
-            adapter_info = (
-                f"{self._adapter_name} ({self.adapter.get_name()})"
-                if self._adapter_name
-                else self.adapter.get_name()
-            )
-            raise parse_litellm_err(e, adapter_info) from e
+            raise parse_litellm_err(e, self._adapter_info()) from e
@@
         except Exception as e:
-            adapter_info = (
-                f"{self._adapter_name} ({self.adapter.get_name()})"
-                if self._adapter_name
-                else self.adapter.get_name()
-            )
-            raise parse_litellm_err(e, adapter_info) from e
+            raise parse_litellm_err(e, self._adapter_info()) from e
@@
         except Exception as e:
-            adapter_info = (
-                f"{self._adapter_name} ({self.adapter.get_name()})"
-                if self._adapter_name
-                else self.adapter.get_name()
-            )
-            raise parse_litellm_err(e, adapter_info) from e
+            raise parse_litellm_err(e, self._adapter_info()) from e
@@
         except Exception as e:
-            provider_name = f"{self.adapter.get_name()}"
-            raise parse_litellm_err(e, provider_name) from e
+            raise parse_litellm_err(e, self._adapter_info()) from e
@@
         except Exception as e:
-            provider_name = f"{self.adapter.get_name()}"
-            raise parse_litellm_err(e, provider_name) from e
+            raise parse_litellm_err(e, self._adapter_info()) from e

Also applies to: 119-124, 136-141
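The point of routing every error path through one helper can be shown with a small runnable sketch. Everything here is hypothetical except the `name (provider)` display format from the review: `DemoEmbedding`, its method names, and `ProviderError` (standing in for whatever `parse_litellm_err` returns) are illustrative only.

```python
import asyncio


class ProviderError(Exception):
    """Hypothetical error type standing in for parse_litellm_err's result."""


class DemoEmbedding:
    def __init__(self, adapter_name: str, provider: str) -> None:
        self._adapter_name = adapter_name
        self._provider = provider

    def _adapter_info(self) -> str:
        # Single source of truth for the display string.
        return (
            f"{self._adapter_name} ({self._provider})"
            if self._adapter_name
            else self._provider
        )

    def _call_provider(self, text: str) -> list[float]:
        raise RuntimeError("invalid credentials")  # simulated failure

    async def _acall_provider(self, text: str) -> list[float]:
        raise RuntimeError("invalid credentials")  # simulated failure

    def get_embedding(self, text: str) -> list[float]:
        try:
            return self._call_provider(text)
        except Exception as e:
            raise ProviderError(f"{self._adapter_info()}: {e}") from e

    async def get_aembedding(self, text: str) -> list[float]:
        try:
            return await self._acall_provider(text)
        except Exception as e:
            # Same helper as the sync path, so the messages cannot drift.
            raise ProviderError(f"{self._adapter_info()}: {e}") from e
```

Because both paths build the string through `_adapter_info()`, a caller using `asyncio.run(emb.get_aembedding(...))` sees the same adapter context as a caller using the sync method.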


coderabbitai · 2026-03-05T04:36:30Z

unstract/sdk1/src/unstract/sdk1/llm.py

+            adapter_info = (
+                f"'{self._adapter_name}' ({self.adapter.get_provider()})"
+                if self._adapter_name
+                else f"'{self.adapter.get_provider()}'"
+            )
            error_msg = (
-                f"Error from LLM provider '{self.adapter.get_provider()}': "
+                f"Error from LLM adapter {adapter_info}: "
                f"{strip_litellm_prefix(str(e))}"


⚠️ Potential issue | 🟠 Major

acomplete still emits provider-only errors.

Line 406 still builds "Error from LLM provider ..." and drops adapter instance name context, so async callers don’t get the same UN-2105 behavior.

Proposed fix

@@
+    def _adapter_info(self) -> str:
+        return (
+            f"'{self._adapter_name}' ({self.adapter.get_provider()})"
+            if self._adapter_name
+            else f"'{self.adapter.get_provider()}'"
+        )
+
@@
-            adapter_info = (
-                f"'{self._adapter_name}' ({self.adapter.get_provider()})"
-                if self._adapter_name
-                else f"'{self.adapter.get_provider()}'"
-            )
+            adapter_info = self._adapter_info()
@@
-            adapter_info = (
-                f"'{self._adapter_name}' ({self.adapter.get_provider()})"
-                if self._adapter_name
-                else f"'{self.adapter.get_provider()}'"
-            )
+            adapter_info = self._adapter_info()
@@
-            error_msg = (
-                f"Error from LLM provider '{self.adapter.get_provider()}': "
-                f"{strip_litellm_prefix(str(e))}"
-            )
+            error_msg = (
+                f"Error from LLM adapter {self._adapter_info()}: "
+                f"{strip_litellm_prefix(str(e))}"
+            )

Also applies to: 346-353


[pre-commit.ci] auto fixes from pre-commit.com hooks

6b9dd9f

for more information, see https://pre-commit.ci

pk-zipstack wants to merge 2 commits into main from feature/include-adapter-name-in-errors
