Realtime known tool failures (exception/timeout) do not send model-visible output #3356

@bbiiggjjuu

Please read this first

  • Have you read the docs? Yes.
  • Have you searched for related issues? Yes. #3286 fixed the same class of problem for the unknown-tool (else) branch. This issue reports the same problem for the known-tool exception (if) branch, which remains unfixed.

Describe the bug

PR #3287 correctly added RealtimeModelSendToolOutput to the unknown-tool (else) branch in _handle_tool_call(). However, the known-tool (if) branch at line 697 calls await invoke_function_tool(...) without a try/except. When the tool raises an exception (e.g. ToolTimeoutError with timeout_behavior="raise_exception", or a user-raised ValueError), the exception propagates out of _handle_tool_call, is caught by _on_tool_call_task_done, logged, and a local RealtimeError is emitted — but no RealtimeModelSendToolOutput is sent back to the model.

As a result, the model waits for a function call result that never arrives, and the session appears to hang until the server-side timeout fires (~30 seconds).
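To make the hang concrete, here is a minimal sketch of the model side waiting for a tool output. It does not use the SDK: `wait_for_tool_output` and `simulate` are hypothetical stand-ins, and the server-side timeout is scaled down from ~30 seconds to 0.05 seconds so the example finishes quickly.

```python
import asyncio


async def wait_for_tool_output(pending: asyncio.Future, server_timeout: float) -> str:
    """Hypothetical model-side stand-in: wait for a tool output or time out."""
    try:
        return await asyncio.wait_for(pending, timeout=server_timeout)
    except asyncio.TimeoutError:
        return "timed out waiting for tool output"


async def simulate(send_output_on_failure: bool) -> str:
    # The future represents the function call result the model is waiting on.
    pending = asyncio.get_running_loop().create_future()
    if send_output_on_failure:
        # Behavior requested in this issue: the failure is reported to the model.
        pending.set_result("Tool call failed: tool failed")
    # Otherwise nothing ever resolves the future, so only the timeout ends the wait.
    return await wait_for_tool_output(pending, server_timeout=0.05)


print(asyncio.run(simulate(send_output_on_failure=False)))
print(asyncio.run(simulate(send_output_on_failure=True)))
```

With no output sent, the wait ends only when the timeout fires. With an error output sent, the waiter resumes immediately.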

Related paths checked:

  • Known tool call success: sends RealtimeModelSendToolOutput(start_response=True) ✓
  • Tool call rejection: sends _send_tool_rejection() with model-visible output ✓
  • Unknown tool call: sends RealtimeModelSendToolOutput with error message (fixed by #3287) ✓
  • Handoff tool call: sends tool output after session update ✓
  • Known tool call failure (exception/timeout): no RealtimeModelSendToolOutput sent ✗ (this issue)

The existing test at test_session.py:1199 (assert len(mock_model.sent_tool_outputs) == 0) inadvertently codifies this missing-output behavior as expected.

Debug information

  • Agents SDK version: v0.17.1
  • Python version: Python 3.11

Repro steps

Run this script:

import asyncio
from typing import Any
from unittest.mock import AsyncMock, Mock, PropertyMock

from agents.realtime.agent import RealtimeAgent
from agents.realtime.model import RealtimeModel, RealtimeModelConfig
from agents.realtime.model_events import RealtimeModelToolCallEvent
from agents.realtime.model_inputs import RealtimeModelSendEvent, RealtimeModelSendToolOutput
from agents.realtime.session import RealtimeSession
from agents.tool import FunctionTool

class SpyModel(RealtimeModel):
    def __init__(self) -> None:
        self.sent_events: list[RealtimeModelSendEvent] = []

    async def connect(self, options: RealtimeModelConfig) -> None: pass
    def add_listener(self, listener: Any) -> None: pass
    def remove_listener(self, listener: Any) -> None: pass
    async def close(self) -> None: pass

    async def send_event(self, event: RealtimeModelSendEvent) -> None:
        self.sent_events.append(event)


async def main() -> None:
    # A tool that always raises
    async def failing_tool_fn(_ctx, _args: str) -> str:
        raise ValueError("tool failed")

    failing_tool = FunctionTool(
        name="failing_tool",
        description="always fails",
        params_json_schema={"type": "object", "properties": {}},
        on_invoke_tool=failing_tool_fn,
    )

    mock_agent = Mock(spec=RealtimeAgent)
    mock_agent.get_all_tools = AsyncMock(return_value=[failing_tool])
    type(mock_agent).handoffs = PropertyMock(return_value=[])

    model = SpyModel()
    session = RealtimeSession(model, mock_agent, None)

    # Simulate a function_call event from the model
    event = RealtimeModelToolCallEvent(name="failing_tool", call_id="call_1", arguments="{}")
    await session.on_event(event)

    tool_tasks = list(session._tool_call_tasks)
    await asyncio.gather(*tool_tasks, return_exceptions=True)

    # Check whether the model received a tool output
    has_output = any(
        isinstance(e, RealtimeModelSendToolOutput) for e in model.sent_events
    )
    print(f"Model received tool output: {has_output}")

    from agents.realtime.events import RealtimeError
    while not session._event_queue.empty():
        ev = session._event_queue.get_nowait()
        if isinstance(ev, RealtimeError):
            print(f"Local error emitted: {ev.error['message'][:80]}...")


asyncio.run(main())

Actual output:

ValueError: tool failed
Model received tool output: False
Local error emitted: Tool call task failed: tool failed...

Expected behavior

When a known function tool invocation raises an exception (including ToolTimeoutError), the session should send RealtimeModelSendToolOutput with an error message and start_response=True back to the model — just as it already does for unknown tools (#3287). The local RealtimeError event should continue to be emitted for application observers.
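One possible shape for this behavior is sketched below. It is not the SDK's implementation: `ToolOutput`, `FakeModel`, and `handle_known_tool_call` are hypothetical stand-ins for RealtimeModelSendToolOutput, the realtime model, and the known-tool branch of _handle_tool_call(), and the error message wording is an assumption.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToolOutput:
    """Simplified, hypothetical stand-in for RealtimeModelSendToolOutput."""
    call_id: str
    output: str
    start_response: bool


@dataclass
class FakeModel:
    """Records every event the session would send to the model."""
    sent: list = field(default_factory=list)

    async def send_event(self, event: ToolOutput) -> None:
        self.sent.append(event)


async def handle_known_tool_call(model: FakeModel, tool_fn, call_id: str, arguments: str) -> None:
    """Invoke the tool; on failure, send a model-visible error output, then re-raise."""
    try:
        result = await tool_fn(arguments)
    except Exception as exc:
        # Tell the model the call failed so it does not keep waiting.
        await model.send_event(
            ToolOutput(call_id=call_id, output=f"Tool call failed: {exc}", start_response=True)
        )
        # Re-raise so the existing task-done handling can still emit the local error.
        raise
    await model.send_event(ToolOutput(call_id=call_id, output=str(result), start_response=True))


async def failing_tool(_args: str) -> str:
    raise ValueError("tool failed")


async def demo() -> FakeModel:
    model = FakeModel()
    try:
        await handle_known_tool_call(model, failing_tool, "call_1", "{}")
    except ValueError:
        pass  # in the real session, the task-done callback would handle this
    return model


model = asyncio.run(demo())
print(model.sent[0].output)
```

Re-raising after sending keeps the local error path unchanged for application observers. Catching `Exception` rather than `BaseException` leaves task cancellation untouched. A similar wrapper could plausibly be applied around the handoff invocation.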

The fix should also cover the handoff branch (elif event.name in handoff_map, line 720) where handoff.on_invoke_handoff may similarly raise.
