LCORE-1831: Implement Redaction Safety Capability in Pydantic AI#1915
LCORE-1831: Implement Redaction Safety Capability in Pydantic AI#1915arin-deloatch wants to merge 7 commits into
Conversation
|
Note Currently processing new changes in this PR. This may take a few minutes, please wait... ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (11)
WalkthroughThis pull request introduces a configurable PII redaction capability for Pydantic AI Lightspeed Core Stack. The implementation provides regex-based pattern matching and replacement logic, configuration models with compiled pattern caching, and integration with Pydantic AI's request/response lifecycle hooks. ChangesPII Redaction Capability Implementation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can disable sequence diagrams in the walkthrough.Disable the |
anik120
left a comment
There was a problem hiding this comment.
I know this is not part of the scope of this PR, but is src/pydantic_ai... leaking implementation detail again @jrobertboos?
|
|
||
|
|
||
| def _redact_message_parts( | ||
| parts: Sequence[Any], compiled_patterns: CompiledPatterns |
There was a problem hiding this comment.
Try to avoid Any in the whole module where possible.
|
|
||
| def _redact_string_content( | ||
| text: str, compiled_patterns: CompiledPatterns | ||
| ) -> str | None: |
There was a problem hiding this comment.
Prefer using Optional in the whole module.
f5586a7 to
d5a97ab
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pydantic_ai_lightspeed/capabilities/redaction/capability.py`:
- Line 1: Add module-level logging support by importing get_logger from log.py
and creating a logger instance at the top of the file after the module docstring
using logger = get_logger(__name__). This logger should then be used throughout
the capability module to audit PII redaction events, such as logging at debug
level when redaction rules match specific patterns and at info level for
redaction statistics and summaries. This approach aligns with coding guidelines
and provides valuable audit trails for the security-sensitive PII redaction
functionality.
- Line 5: Update all type annotations in the module to use modern pipe syntax
instead of Optional. Replace Optional[Type] with Type | None for all return type
annotations in the functions including _redact_text_content,
_redact_content_list, _redact_message_parts, and _redact_model_request. After
updating all function return types throughout the module, remove Optional from
the import statement at the top of the file since it will no longer be needed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7e09e4f4-9b9b-4dcf-a8b9-cf61a5e8e1e4
📒 Files selected for processing (10)
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.pysrc/pydantic_ai_lightspeed/capabilities/redaction/config.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 3
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_core.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/core.pysrc/pydantic_ai_lightspeed/capabilities/redaction/config.pysrc/pydantic_ai_lightspeed/capabilities/redaction/capability.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/redaction/__init__.py
🔇 Additional comments (6)
src/pydantic_ai_lightspeed/capabilities/redaction/config.py (1)
14-107: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_config.py (1)
1-157: LGTM!src/pydantic_ai_lightspeed/capabilities/redaction/capability.py (2)
31-258: LGTM!
261-325: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/redaction/test_capability.py (1)
1-316: LGTM!src/pydantic_ai_lightspeed/capabilities/redaction/__init__.py (1)
1-21: LGTM!
|
/ok-to-test |
asimurka
left a comment
There was a problem hiding this comment.
I personally like this decomposition to capability, core and config modules.
|
|
||
|
|
||
| @dataclass | ||
| class PiiRedactionCapability(AbstractCapability[Any]): |
There was a problem hiding this comment.
Use None for type argument as this corresponds to current implementation of lightspeed agent (with no dependencies). Use None as type annotation where possible.
|
/ok-to-test |
| @@ -0,0 +1,107 @@ | |||
| """Configuration models for PII redaction rules.""" | |||
There was a problem hiding this comment.
IMHO this config (or rather config classes/models) should be added into src/models/config.py. At least to avoid circular dependencies, to allow us to generate documentation etc.
There was a problem hiding this comment.
Understood, it was my initial understanding that we were to keep everything pydantic_ai related separate to avoid any sorts of conflicts with the core source code.
There was a problem hiding this comment.
However, I am more than happy to make the changes necessary! Just to be thorough, it is expected that RedactionRule,RedactionConfig and RedactionResult are being moved to src/models/config.py, correct?
There was a problem hiding this comment.
Correct. This will be later part of LCORE config so it will be technically implementation agnostic.
…capability. Use Optional for nullable returns and substitute Any with UserContent, ModelRequestPart, and ModelResponsePart for type-safe message handling.
d5a97ab to
271c1f2
Compare
Description
Add a regex-based PII redaction capability for pydantic-ai agents. This introduces:
core.py):redact_text()function and immutableRedactionResultmodel forsequential regex-based text substitution
config.py):RedactionRuleandRedactionConfigPydantic models withcompile-time pattern validation and per-rule/global case sensitivity controls
capability.py):PiiRedactionCapabilityintegrating with pydantic-ai'sAbstractCapabilityto redact user prompts before model requests and model response text beforereturning to the caller
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
uv run make format— passes, no reformatsuv run make verify— all linters pass (black, pylint 10/10, pyright 0 errors, ruff, pydocstyle,mypy, lint-openapi)
uv run pytest tests/unit/pydantic_ai_lightspeed/capabilities/ -v— 51/51 tests passcapability.py)Summary by CodeRabbit