LCORE-2311: Llama Stack Provider Fix#1936
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
🧰 Additional context used📓 Path-based instructions (1)src/**/*.py📄 CodeRabbit inference engine (AGENTS.md)
Files:
🔇 Additional comments (8)
WalkthroughAdds ChangesLlama Stack Streaming Correctness
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
5e2de0e to
98a3c0e
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pydantic_ai_lightspeed/llamastack/_model.py`:
- Around line 98-104: The McpCall branch starting at line 98 is missing a None
guard for event.item.id, unlike the ResponseFunctionToolCall branch which
properly guards against falsy id values. Add a conditional check to ensure
event.item.id is not None before assigning it to item_id and proceeding with the
yield and _replay_mcp_buffered_deltas call. This prevents malformed delta events
with item_id set to "None-call" from being synthesized when the id is None.
- Line 228: The assert statement used to validate that first_chunk is an
instance of responses.ResponseCreatedEvent will be stripped away when Python is
run with the -O optimization flag, causing the code to proceed with an invalid
value and raise an unclear AttributeError later. Replace the assert isinstance
check with an explicit type validation that raises an appropriate exception
(such as TypeError or ValueError) immediately if first_chunk is not a
ResponseCreatedEvent, ensuring the error is clear and cannot be optimized away.
- Around line 27-39: The module depends on private/internal pydantic-ai APIs
(RunContext, PeekableAsyncStream, Unset, number_to_datetime, and
_map_api_errors) which are implementation details subject to change. The
pydantic-ai version constraint currently has no upper bound, making it
vulnerable to silent breakage from future version updates. Locate the project's
dependency declaration file (typically pyproject.toml or setup.py) and update
the pydantic-ai constraint from "pydantic-ai>=1.99.0" to include an upper bound
such as "pydantic-ai>=1.99.0,<2.0.0" (adjusting the version number as
appropriate for stability), or alternatively add a comment documenting this
private API dependency as a known technical debt if keeping the unconstrained
version is intentional.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 366c0e5a-d8a1-4108-8bc4-3782ccd9ae4e
📒 Files selected for processing (2)
src/pydantic_ai_lightspeed/llamastack/_model.pysrc/pydantic_ai_lightspeed/llamastack/_transport.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 3
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/llamastack/_transport.pysrc/pydantic_ai_lightspeed/llamastack/_model.py
🪛 ast-grep (0.43.0)
src/pydantic_ai_lightspeed/llamastack/_transport.py
[info] 194-194: use jsonify instead of json.dumps for JSON output
Context: json.dumps(convert_pydantic_to_json_value(chunk))
Note: Security best practice.
(use-jsonify)
🔇 Additional comments (5)
src/pydantic_ai_lightspeed/llamastack/_transport.py (2)
20-20: LGTM!
186-196: LGTM!src/pydantic_ai_lightspeed/llamastack/_model.py (3)
117-173: LGTM!
213-218: LGTM!
230-242: LGTM!
| "Streamed response ended without content or tool calls" | ||
| ) | ||
|
|
||
| assert isinstance(first_chunk, responses.ResponseCreatedEvent) |
There was a problem hiding this comment.
@jrobertboos please remove assert from runtime validation
|
/retest |
Description
Fixed llama stack provider by buffering and replaying streaming events in expected order.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
StreamingResponse, chunks are streamed correctly as bytes (and otherwise preserved as prior SSE-formatted output).