UN-2105 [FEAT] Include adapter name in error messages#1825
UN-2105 [FEAT] Include adapter name in error messages#1825pk-zipstack wants to merge 18 commits intomainfrom
Conversation
Preserve the user-facing adapter name (e.g., "Unstract Trial LLM") from the platform service response and include it in error messages across SDK1 adapter consumers (LLM, Embedding, VectorDB, X2Text). Previously, adapter names were discarded during config retrieval and errors only showed provider names or instance IDs, making it hard to identify which adapter caused the issue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughIntroduce Common.ADAPTER_NAME and Utils.strip_adapter_name; platform stores adapter-name metadata; adapters capture adapter names for richer error context; indexing and key-generation fetch and sanitize adapter configs to exclude adapter-name metadata from generated index keys. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
prompt-service/src/unstract/prompt_service/core/index_v2.py (1)
100-109:⚠️ Potential issue | 🟠 MajorPre-existing issue:
file_hashmay be undefined.Note: This is not introduced by this PR, but
file_hashis only assigned on line 81 whenfile_info.file_hashis falsy. Whenfile_info.file_hashis truthy, the variablefile_hashused on line 101 will be undefined, causing aNameError.This should likely be:
file_hash = file_info.file_hash if not file_hash: file_hash = fs.get_hash_from_file(path=file_info.file_path)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prompt-service/src/unstract/prompt_service/core/index_v2.py` around lines 100 - 109, The variable file_hash can be undefined when file_info.file_hash is truthy; update the assignment logic in the code that populates index_key so that file_hash is always set from file_info.file_hash first (e.g., file_hash = file_info.file_hash) and only call fs.get_hash_from_file(path=file_info.file_path) when that value is falsy, then use that ensured file_hash when building index_key (references: file_hash, file_info, fs.get_hash_from_file, index_key, self.chunking_config.chunk_size/ chunk_overlap).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/embedding.py`:
- Around line 100-105: The async embedding error paths still call
parse_litellm_err with only provider values; update those calls to include the
adapter display string like the sync path does. Compute adapter_info using
self._adapter_name and self.adapter.get_name() (same logic as the shown
adapter_info variable) and pass adapter_info into parse_litellm_err instead of
the provider-only argument in all async error raises (search for
parse_litellm_err in this module, e.g., the async embedding functions around the
current raise and the other occurrences referenced). Ensure the raise statements
use "raise parse_litellm_err(e, adapter_info) from e" so adapter context is
included.
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 266-273: The acomplete error path is building messages that only
include the provider name and drops the adapter instance name; update the error
construction in the acomplete handler to use the same adapter_info logic as
elsewhere (use self._adapter_name when present, formatted as
"'{self._adapter_name}' ({self.adapter.get_provider()})", else fall back to
"'{self.adapter.get_provider()}'") and use that adapter_info when composing
error_msg (same change should be applied to the other occurrence around lines
346-353) so async callers receive the adapter instance name in the error text.
---
Outside diff comments:
In `@prompt-service/src/unstract/prompt_service/core/index_v2.py`:
- Around line 100-109: The variable file_hash can be undefined when
file_info.file_hash is truthy; update the assignment logic in the code that
populates index_key so that file_hash is always set from file_info.file_hash
first (e.g., file_hash = file_info.file_hash) and only call
fs.get_hash_from_file(path=file_info.file_path) when that value is falsy, then
use that ensured file_hash when building index_key (references: file_hash,
file_info, fs.get_hash_from_file, index_key, self.chunking_config.chunk_size/
chunk_overlap).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ff0f61a0-7b62-4b96-ae66-eecf6097d69f
📒 Files selected for processing (9)
prompt-service/src/unstract/prompt_service/core/index_v2.pyunstract/sdk1/src/unstract/sdk1/constants.pyunstract/sdk1/src/unstract/sdk1/embedding.pyunstract/sdk1/src/unstract/sdk1/index.pyunstract/sdk1/src/unstract/sdk1/llm.pyunstract/sdk1/src/unstract/sdk1/platform.pyunstract/sdk1/src/unstract/sdk1/utils/indexing.pyunstract/sdk1/src/unstract/sdk1/vector_db.pyunstract/sdk1/src/unstract/sdk1/x2txt.py
…er-name-in-errors
Simplify adapter_info construction using or-chaining instead of ternary conditionals to bring cognitive complexity back within the SonarCloud limit of 15. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/embedding.py (1)
141-156:⚠️ Potential issue | 🟠 MajorAsync embedding errors still drop the adapter instance name.
Line 142 and Line 155 still pass provider-only values to
parse_litellm_err(), so async callers miss the adapter-name context that the sync paths now include.Proposed fix
except Exception as e: - provider_name = f"{self.adapter.get_name()}" - raise parse_litellm_err(e, provider_name) from e + adapter_info = self._adapter_name or self.adapter.get_name() + raise parse_litellm_err(e, adapter_info) from e @@ except Exception as e: - provider_name = f"{self.adapter.get_name()}" - raise parse_litellm_err(e, provider_name) from e + adapter_info = self._adapter_name or self.adapter.get_name() + raise parse_litellm_err(e, adapter_info) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/embedding.py` around lines 141 - 156, The async method get_aembeddings constructs and passes only a provider string to parse_litellm_err, losing the adapter-instance context; change its exception handler to mirror the sync path by building the same adapter-aware identifier used elsewhere (use the same expression the sync code uses to include the adapter instance name) and pass that identifier into parse_litellm_err(e, <adapter-aware-identifier>) instead of the current provider_name so async callers receive the adapter name too.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@unstract/sdk1/src/unstract/sdk1/embedding.py`:
- Around line 141-156: The async method get_aembeddings constructs and passes
only a provider string to parse_litellm_err, losing the adapter-instance
context; change its exception handler to mirror the sync path by building the
same adapter-aware identifier used elsewhere (use the same expression the sync
code uses to include the adapter instance name) and pass that identifier into
parse_litellm_err(e, <adapter-aware-identifier>) instead of the current
provider_name so async callers receive the adapter name too.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6d1355c0-09a2-4691-9f8b-1307219df5a3
📒 Files selected for processing (3)
unstract/sdk1/src/unstract/sdk1/embedding.pyunstract/sdk1/src/unstract/sdk1/index.pyunstract/sdk1/src/unstract/sdk1/llm.py
🚧 Files skipped from review as they are similar to previous changes (1)
- unstract/sdk1/src/unstract/sdk1/llm.py
Apply same or-chaining simplification to LLM complete() and stream_complete() error handlers to stay within SonarCloud's cognitive complexity limit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/llm.py (1)
271-275: Consider extracting_adapter_info()helper and including provider context.The error format shows either adapter name or provider, but the PR objectives suggest showing both:
"Error from LLM adapter 'Unstract Trial LLM' (azureopenai): ...". This gives users the friendly name while retaining technical context for debugging.Additionally, this logic is duplicated at lines 348 and 408. A helper method would improve consistency and maintainability.
♻️ Proposed refactor
Add a helper method to the class:
def _adapter_info(self) -> str: """Return formatted adapter identifier for error messages.""" if self._adapter_name: return f"'{self._adapter_name}' ({self.adapter.get_provider()})" return f"'{self.adapter.get_provider()}'"Then replace all three occurrences (lines 271-274, 348-351, 408-411):
- adapter_info = self._adapter_name or self.adapter.get_provider() - error_msg = ( - f"Error from LLM adapter '{adapter_info}': " - f"{strip_litellm_prefix(str(e))}" - ) + error_msg = ( + f"Error from LLM adapter {self._adapter_info()}: " + f"{strip_litellm_prefix(str(e))}" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 271 - 275, Extract a new instance helper method named _adapter_info(self) that returns a formatted identifier combining the friendly adapter name and provider, e.g. when self._adapter_name exists return f"'{self._adapter_name}' ({self.adapter.get_provider()})" else return f"'{self.adapter.get_provider()}'"; then replace the duplicated logic that builds adapter_info (the blocks using self._adapter_name and self.adapter.get_provider()) with calls to this._adapter_info() in the LLM class so all error messages use the same formatted string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 271-275: Extract a new instance helper method named
_adapter_info(self) that returns a formatted identifier combining the friendly
adapter name and provider, e.g. when self._adapter_name exists return
f"'{self._adapter_name}' ({self.adapter.get_provider()})" else return
f"'{self.adapter.get_provider()}'"; then replace the duplicated logic that
builds adapter_info (the blocks using self._adapter_name and
self.adapter.get_provider()) with calls to this._adapter_info() in the LLM class
so all error messages use the same formatted string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c4f01a13-8eee-448d-92e6-286ea4f3abf8
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/llm.py
chandrasekharan-zipstack
left a comment
There was a problem hiding this comment.
LGTM for the most part, address the comment to reduce code duplication
Move the repeated adapter-name stripping logic into Utils.strip_adapter_name() in sdk1/utils/common.py, replacing inline for-loops in index key generation across sdk1/index.py, sdk1/utils/indexing.py, and prompt-service/core/index_v2.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Greptile SummaryThis PR improves error message quality by including user-facing adapter instance names (e.g., Key observations:
Confidence Score: 4/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[PlatformHelper.get_adapter_config] --> B{Public adapter?}
B -- Yes --> C[Load config from env vars\nNo display name added]
B -- No --> D[Fetch from platform service\nStore display name in config dict]
D --> E[Adapter Constructor\nLLM / Embedding / VectorDB / X2Text]
C --> E
E --> F[Pop display name from config\nStore as self._adapter_name]
F --> G[Initialize adapter with\nremaining config]
G --> H{Error occurs?}
H -- Yes --> I[Build adapter_info string\nName + provider or UUID fallback]
I --> J[Raise error with\nhuman-readable adapter identity]
G --> K[generate_index_key called]
K --> L[Fetch fresh configs\nfrom platform service]
L --> M[Utils.strip_adapter_name\nRemoves display name key in-place]
M --> N[Hash configs for\nbackward-compatible index key]
|
- LLM: Extract _get_adapter_info() helper that preserves both adapter name and provider in errors (e.g. "Unstract Trial LLM (azureopenai)") instead of dropping the provider via or-chaining. Also keeps the conditional out of nested except blocks to avoid SonarCloud cognitive complexity issues. - VectorDB/X2Text: Fix inconsistent quoting — move quotes into the template string so both named adapters and UUID fallbacks are quoted consistently. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/vector_db.py (1)
90-90: Inconsistent constant import forADAPTER_NAME.This file uses
Common.ADAPTER_NAME(imported fromunstract.sdk1.constantsat line 17), whilex2txt.pyandllm.pyuseSdkCommon.ADAPTER_NAME(an alias for the same import). Both resolve to the same constant, so functionality is correct, but consider using consistent aliasing across the codebase for maintainability.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/vector_db.py` at line 90, The file uses Common.ADAPTER_NAME while other modules (x2txt.py and llm.py) use the SdkCommon alias; update unstract/sdk1/vector_db.py to import the constants module with the same alias used elsewhere (e.g., "from unstract.sdk1 import constants as SdkCommon") and replace the usage self._adapter_name = vector_db_config.pop(Common.ADAPTER_NAME, "") with the consistent SdkCommon.ADAPTER_NAME symbol so imports and references match across the codebase.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/vector_db.py`:
- Line 90: The file uses Common.ADAPTER_NAME while other modules (x2txt.py and
llm.py) use the SdkCommon alias; update unstract/sdk1/vector_db.py to import the
constants module with the same alias used elsewhere (e.g., "from unstract.sdk1
import constants as SdkCommon") and replace the usage self._adapter_name =
vector_db_config.pop(Common.ADAPTER_NAME, "") with the consistent
SdkCommon.ADAPTER_NAME symbol so imports and references match across the
codebase.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fcb93c3b-5664-4c5e-9a2b-5aa81725df64
📒 Files selected for processing (3)
unstract/sdk1/src/unstract/sdk1/llm.pyunstract/sdk1/src/unstract/sdk1/vector_db.pyunstract/sdk1/src/unstract/sdk1/x2txt.py
Add _get_adapter_info() helper to Embedding (mirroring LLM) so errors show both adapter name and provider when available, e.g. "My Embedding (openai)" instead of just one or the other. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Fall back to x2text_instance_id (UUID) instead of the generic adapter type name, matching the pattern used by VectorDB and X2Text init error handlers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
|



What
Why
How
PlatformHelper._get_adapter_configuration(), the adapter name was already fetched from the platform service but immediately discarded viapop(). Now it is stored back in the config dict under a private key_adapter_name.Error from LLM provider 'azureopenai': ...Error from LLM adapter 'Unstract Trial LLM' (azureopenai): ...index.py,utils/indexing.py, prompt-serviceindex_v2.py) strip the_adapter_namekey before hashing to maintain backward compatibility with existing document index keys.Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
_adapter_namekey is stripped before index key hashing, so existing indexed documents are unaffected. The only user-visible change is richer error messages that now include the adapter instance name alongside the provider name.Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.