Overview
Middleware in the BeeAI Framework is code that runs “in the middle” of an execution lifecycle, intercepting the flow between when a component (like an Agent, Tool, or Model) starts and when it finishes. As these components execute, they emit events at key moments, such as starting a task, calling a tool, or completing a response. Middleware hooks into these events to inject behaviors like logging, filtering, or safety checks, all without modifying the component’s core logic. This modular approach allows you to apply consistent policies across your entire system. You can use built-in tools like GlobalTrajectoryMiddleware for immediate debugging, or write custom middleware to handle complex needs like blocking unsafe content, enforcing rate limits, or managing authentication.
Note on Terminology: In this framework, Middleware refers to the classic software design pattern (pipeline interceptors) that runs between execution steps. This is distinct from the industry term “Agentic Middleware,” which typically refers to entire orchestration platforms.
Built-in Middleware
The following section showcases built-in middleware that you can start using right away.
Global Trajectory
The fastest way to understand your agent’s execution flow is by using the GlobalTrajectoryMiddleware. It captures all events, including deeply nested ones, and prints them to the console, using indentation to visualize the call stack.
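To illustrate what “using indentation to visualize the call stack” means, here is a framework-independent sketch. The Span class and render function are hypothetical and are not part of BeeAI; they only show how nesting depth can be turned into indentation.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One node in a hypothetical call tree: a component run and its nested runs."""
    name: str
    children: list["Span"] = field(default_factory=list)


def render(span: Span, depth: int = 0) -> list[str]:
    """Return one line per start/finish pair, indented by nesting depth."""
    pad = "  " * depth
    lines = [f"{pad}--> {span.name}"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    lines.append(f"{pad}<-- {span.name}")
    return lines


# An agent that calls a chat model, then a tool that itself calls a chat model.
tree = Span("Agent", [Span("ChatModel"), Span("Tool", [Span("ChatModel")])])
trajectory = render(tree)
print("\n".join(trajectory))
```

The real middleware listens to live events rather than walking a prebuilt tree, so treat this purely as a picture of the output shape.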
Example
Python
| Parameter | Description |
|---|---|
| target | Specify a file or stream to which to write the trajectory (pass False to disable). |
| included | List of classes to include in the trajectory. |
| excluded | List of classes to exclude from the trajectory. |
| pretty | Use pretty formatting for the trajectory. |
| prefix_by_type | Customize how instances of individual classes should be printed. |
| exclude_none | Exclude None values from the printing. |
| enabled | Enable/Disable the logging. |
| match_nested | Whether to observe trajectories of nested run contexts. |
| emitter_priority | Setting higher priority may result in capturing events without any modifications from other middlewares. |
Python
Tool Call Streaming
This middleware handles streaming tool calls in a ChatModel. It observes stream updates from the Chat Model and parses tool calls on demand so that they can be consumed immediately.
It works even without streaming enabled, in which case it emits the update event at the end of the execution.
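As a rough picture of “parsing on demand”, the sketch below reads one string field out of tool-call arguments that arrive as incomplete JSON chunks and yields only the newly available text. The helper names partial_string_field and stream_field are hypothetical, not BeeAI APIs, and escape sequences are deliberately left undecoded.

```python
import re
from typing import Iterable, Iterator


def partial_string_field(buffer: str, key: str) -> str:
    """Best-effort read of a (possibly unfinished) JSON string value for `key`."""
    pattern = r'"%s"\s*:\s*"((?:[^"\\]|\\.)*)' % re.escape(key)
    match = re.search(pattern, buffer)
    return match.group(1) if match else ""


def stream_field(chunks: Iterable[str], key: str) -> Iterator[str]:
    """Yield the delta of the field's value each time a chunk extends it."""
    buffer, emitted = "", ""
    for chunk in chunks:
        buffer += chunk
        current = partial_string_field(buffer, key)
        if len(current) > len(emitted):
            yield current[len(emitted):]
            emitted = current


chunks = ['{"resp', 'onse": "Hel', 'lo wor', 'ld"}']
deltas = list(stream_field(chunks, "response"))
print(deltas)
```

Joining the deltas gives the full value, which is what lets a consumer display the argument while the model is still generating it.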
Example
Python
| Parameter | Description |
|---|---|
| target | The tool that we are waiting for to be called. |
| key | Refers to the name of the attribute in the tool’s schema that we want to stream. |
| match_nested | Whether the middleware should be applied only to the top level. |
| force_streaming | Sets the stream flag on the ChatModel. |
Core Primitives
The BeeAI Framework middleware is built on an underlying system of primitives, which are described in this section. Understanding these primitives is helpful for building complex middleware.
Events
An event refers to an action initiated by a component. It carries the details of what just happened within the system. Every event has three key properties:
- Name: A string identifier (e.g., start, success, error, or custom names like fetch_data).
- Data payload: The content of the event, typically structured as a Pydantic model.
- Metadata: Information about the context where the event was fired.
Python
Emitter
The Emitter is the core component that lets you send and watch for events. While it is typically attached to a specific class, you can also use it on its own. An emitter instance is typically the child of a root emitter to which all events are propagated. Emitters can be nested (one can be a child of another), hence they internally create a tree hierarchy. Every emitter instance has the following properties:
- namespace in which the emitter operates (e.g., agents.requirement, tool.open_meteo, …).
- creator class which the given emitter belongs to.
- context (dictionary which is attached to all events emitted via the given emitter).
- trace metadata (such as current id, run_id and parent_id).
Its methods include:
- on for registering a new event listener
  - The method takes matcher (event name, callback, regex), callback (sync/async function), and options (priority, etc.)
  - The method can be used as a decorator or as a standalone function
- off for deregistering an event listener
- pipe for propagating all captured events to another emitter
- child for creating a child emitter
The event’s path attribute is created by concatenating the namespace with the event name (e.g., backend.chat.ollama.start).
The accompanying example:
- defines a data object for the fetch_data event,
- creates an emitter from the root one,
- registers a callback listening to the fetch_data event, which modifies its content,
- fires the fetch_data event,
- logs the modified event’s data.
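The steps above can be sketched without the framework. MiniEmitter below is a hypothetical toy, not the BeeAI Emitter: its emit is synchronous, and it only models the emitter tree, the path built from namespace plus event name, and propagation toward the root.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class FetchData:
    """Payload for the fetch_data event (a plain dataclass, not a Pydantic model)."""
    url: str


class MiniEmitter:
    """Toy emitter: a tree node that forwards every event to its ancestors."""

    def __init__(self, namespace: list[str], parent: Optional["MiniEmitter"] = None):
        self.namespace = namespace
        self.parent = parent
        self.listeners: list[tuple[str, Callable[[Any, str], None]]] = []

    def child(self, namespace: list[str]) -> "MiniEmitter":
        return MiniEmitter(self.namespace + namespace, parent=self)

    def on(self, matcher: str, callback: Callable[[Any, str], None]) -> None:
        self.listeners.append((matcher, callback))

    def emit(self, name: str, data: Any) -> None:
        path = ".".join(self.namespace + [name])  # namespace + event name
        node: Optional[MiniEmitter] = self
        while node is not None:  # walk up to the root
            for matcher, callback in node.listeners:
                # A bare name matches only on the emitting node; a full path matches anywhere.
                if (node is self and matcher == name) or matcher == path:
                    callback(data, path)
            node = node.parent


root = MiniEmitter([])
emitter = root.child(["app"])
seen_paths: list[str] = []
root.on("app.fetch_data", lambda data, path: seen_paths.append(path))


def add_scheme(data: FetchData, path: str) -> None:
    data.url = "https://" + data.url  # listeners may modify the payload


emitter.on("fetch_data", add_scheme)
event = FetchData(url="example.com")
emitter.emit("fetch_data", event)
print(event.url, seen_paths)
```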
Python
The emitter.on method can be used directly and not just as a decorator. Example: emitter.on("fetch_data", callback).
Run (Context)
The Run class acts as a wrapper of the target implementation with its own lifecycle (an emitter with a set of events) and context (data that gets propagated to all events).
The RunContext class is a container that stores information about the current execution context.
These abstractions allow you to:
- modify input to the given target (listen to a start event and modify the content of the input property),
- modify output from the given target (listen to a success event and modify the content of the output property),
- stop the run early (listen to a start event and set the output property to a non-None value),
- propagate context (dictionary) to any component of your system,
- cancel the execution in an arbitrary place,
- gain observability into runs via structured events (for logging, tracing, and debugging).
A Run object is created when the run method gets called on a framework class that can be executed (e.g., ChatModel, Agent, …).
The run object has the following methods:
- The on method allows registering a callback to its emitter.
- The middleware method registers middleware (a function that takes RunContext as a first parameter or a class with a bind method that takes the RunContext as a first parameter).
- The context method allows data to be set for a given execution. That data will then be propagated as metadata in every event that gets emitted.
Every run is backed by a run context (RunContext), which internally forms a hierarchical tree structure that shares the same context.
In simpler terms, when you call one runnable (e.g., ChatModel) from within another runnable (e.g., Agent), the inner call (ChatModel) is attached to the context of the outer one (Agent).
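The parent/child relationship can be pictured with a small toy model. MiniRunContext is hypothetical and much simpler than the real RunContext; it assumes that a child sees whatever its ancestors stored and can add its own keys on top.

```python
from typing import Any, Optional


class MiniRunContext:
    """Toy run context: child contexts see the data set by their ancestors."""

    def __init__(self, owner: str, parent: Optional["MiniRunContext"] = None):
        self.owner = owner
        self.parent = parent
        self.own: dict[str, Any] = {}

    @property
    def context(self) -> dict[str, Any]:
        inherited = self.parent.context if self.parent else {}
        return {**inherited, **self.own}

    def child(self, owner: str) -> "MiniRunContext":
        return MiniRunContext(owner, parent=self)

    def lineage(self) -> list[str]:
        """Owners from the outermost run down to this one."""
        names: list[str] = []
        node: Optional[MiniRunContext] = self
        while node is not None:
            names.append(node.owner)
            node = node.parent
        return names[::-1]


agent_run = MiniRunContext("Agent")
agent_run.own["user_id"] = "u-123"
model_run = agent_run.child("ChatModel")  # inner call attached to the outer one
print(model_run.lineage(), model_run.context)
```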
Runnable
The Runnable[R] class unifies common objects that can be executed and observed. It is an abstract class with the following traits:
- It has an abstract run method that executes the class and returns a Run[R] (R is bound to the RunnableOutput).
- It has an abstract emitter getter.
- It has a middlewares getter that lists the existing middlewares.
The run method accepts the following options (RunnableOptions):
- signal (an instance of AbortSignal): allows aborting the execution.
- context (a dictionary): used to propagate additional data.
RunnableOutput has the following properties:
- output: a list of messages (can be empty)
- context: a dictionary that can store additional data
- last_message (getter): returns the last message if it exists, or creates an empty AssistantMessage otherwise
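The last_message fallback is easy to picture with plain dataclasses. Message and MiniOutput below are hypothetical stand-ins, not the framework’s message or output classes; the empty assistant message mirrors the behavior described above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Message:
    role: str
    text: str


@dataclass
class MiniOutput:
    output: list[Message] = field(default_factory=list)
    context: dict[str, Any] = field(default_factory=dict)

    @property
    def last_message(self) -> Message:
        # Fall back to an empty assistant message when there is no output.
        return self.output[-1] if self.output else Message("assistant", "")


empty = MiniOutput()
filled = MiniOutput(output=[Message("user", "hi"), Message("assistant", "hello")])
print(empty.last_message, filled.last_message)
```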
Python
Event Handling
Building robust agents requires precise control over the execution lifecycle. You need the ability to not only observe your agent’s behavior but also intercept and modify it at specific points. The following sections cover the mechanics of the BeeAI Framework event system and will enable you to manage:
- Scopes: Deciding whether to listen globally, per instance, or for a single run.
- Config: Controlling listener priority, persistence, and blocking behavior.
- Lifecycle: Understanding the exact sequence of events that occur during execution.
- Debugging: Inspecting raw event streams to see exactly what your agent is doing.
- Piping: Linking emitters together to create unified event streams.
Scopes
Events can be observed at three different levels.
1. Global Level
Every emitter provided by the out-of-the-box modules is a child of the root emitter. This means you can listen to all events directly from the root emitter.
Python
Listeners that are bound “closer” to the source are executed earlier. For those that reside at the same level, the order can be altered by setting a priority value, which is part of the EmitterOptions class. A higher priority value means the listener will be executed earlier. The default priority is 0.
2. Instance Level
Python
3. Run Level
Python
Listeners registered at this level apply to a single run (the object returned by the run method).
The run’s emitter is a child of the class emitter, allowing you to modify behavior for a single invocation without affecting others.
Config
When working with multiple callbacks, you may need to control execution order or ensure that some run exclusively. You can do this using the optional options argument of type EmitterOptions.
Example
Python
Depending on the matcher parameter (the one that is used to match the event), the framework decides whether to include or exclude nested events (events created from child emitters or from piping).
The default value of match_nested depends on the matcher value. Note that the value can be set directly as shown in the example above.
| Matcher Type | Default match_nested |
|---|---|
| String without . (event name) | False |
| String with . (event path) | True |
| "*" (match all top-level events) | False |
| "*.*" (match all events) | True |
| Regex | True |
| Function | False |
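The table can be read as a small decision function. The sketch below only mirrors the documented defaults; default_match_nested is a hypothetical name and this is not the framework’s own code.

```python
import re
from typing import Callable, Union

Matcher = Union[str, re.Pattern, Callable[..., bool]]


def default_match_nested(matcher: Matcher) -> bool:
    """Return the documented default of match_nested for a given matcher."""
    if isinstance(matcher, re.Pattern):
        return True  # regex matchers include nested events
    if callable(matcher):
        return False  # function matchers do not
    # Strings: a dot means an event path ("*.*" included), no dot means a bare name ("*" included).
    return "." in matcher


for example in ["start", "backend.chat.ollama.start", "*", "*.*"]:
    print(repr(example), default_match_nested(example))
```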
If two listeners have the same priority, they are executed in the order they were added.
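The ordering rule (higher priority first, ties broken by registration order) corresponds to a stable sort. The Listener class and execution_order function are hypothetical illustrations rather than framework code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Listener:
    callback: Callable[[list[str]], None]
    priority: int = 0  # same default as described above


def execution_order(listeners: list[Listener]) -> list[Listener]:
    # sorted() is stable, so listeners with equal priority keep registration order.
    return sorted(listeners, key=lambda item: -item.priority)


log: list[str] = []
registered = [
    Listener(lambda out: out.append("first"), priority=0),
    Listener(lambda out: out.append("urgent"), priority=10),
    Listener(lambda out: out.append("second"), priority=0),
]
for listener in execution_order(registered):
    listener.callback(log)
print(log)
```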
Lifecycle
When a framework component is executed, it creates a run context, which wraps the target handler and allows you to modify its input and output (learn more in the Run (Context) section). Once a Run instance is executed (i.e., awaited), its lifecycle proceeds through the following steps:
- The start event is emitted.
- The target implementation is executed.
- Depending on the outcome, either a success or error event is emitted.
- Finally, the finish event is emitted.
| Event | Data Type | Description |
|---|---|---|
| start | RunContextStartEvent | Triggered when the run starts. |
| success | RunContextSuccessEvent | Triggered when the run succeeds. |
| error | FrameworkError | Triggered when an error occurs. |
| finish | RunContextFinishEvent | Triggered when the run finishes. |
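The lifecycle can be sketched as a small async wrapper. Everything here (StartEvent, run_with_lifecycle, the on_start hook) is a hypothetical simplification: it records event names instead of emitting real events, and it swallows the error for demonstration where a real run would propagate it.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Optional


@dataclass
class StartEvent:
    input: Any
    output: Optional[Any] = None  # a listener may set this to skip the target


async def run_with_lifecycle(
    target: Callable[[Any], Awaitable[Any]],
    value: Any,
    on_start: Optional[Callable[[StartEvent], None]] = None,
) -> tuple[Any, list[str]]:
    """Toy lifecycle: start, target, success or error, then finish."""
    emitted = ["start"]
    start = StartEvent(input=value)
    if on_start:
        on_start(start)
    result = None
    try:
        if start.output is not None:
            result = start.output  # stop early, the target never runs
        else:
            result = await target(start.input)
        emitted.append("success")
    except Exception:
        emitted.append("error")  # demo only: a real run would re-raise
    finally:
        emitted.append("finish")
    return result, emitted


async def shout(text: str) -> str:
    return text.upper()


async def fail(text: str) -> str:
    raise ValueError("boom")


def mock(event: StartEvent) -> None:
    event.output = "mocked"


normal = asyncio.run(run_with_lifecycle(shout, "hi"))
mocked = asyncio.run(run_with_lifecycle(shout, "hi", on_start=mock))
failed = asyncio.run(run_with_lifecycle(fail, "hi"))
print(normal, mocked, failed)
```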
Python
In this example, create_internal_event_matcher ensures we correctly match the event.
Debugging
While the Global Trajectory middleware is excellent for visualizing the structural hierarchy of a run, sometimes you need to inspect the raw stream of events as they happen. To do this quickly without setting up a full middleware class, you can register a wildcard listener (*.*) directly on your run. This captures every single event emitted during that specific execution.
Python
Piping
In some cases, one might want to propagate all events from one emitter to another (for instance when creating a child emitter).
Creating Custom Middleware
While you can register individual callbacks to handle specific events, this approach can become cluttered if you have complex logic. To make your event handling reusable and modular, the BeeAI framework allows you to group listeners into a class called Middleware.
When to use Middleware vs. Callbacks
- Use Callbacks (.on/.match): For simple, one-off logic, such as logging a specific event or debugging a single run.
- Use Middleware: When the logic is complex, multi-step, or needs to be reused across different parts of your application.
The Middleware Protocol
A middleware component is defined by how it interacts with the RunContext. It can be structured in two ways:
- A Function: A simple function that accepts RunContext as its first parameter.
- A Class: A class that implements a bind method, which accepts RunContext as its first parameter.
The RunContext provides access to the emitter, the instance being run, and the shared memory for that specific execution.
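Both shapes can be handled by one small dispatcher. The sketch below uses a hypothetical MiniRunContext and attach helper to show the idea; it is an assumption about how such a protocol could be honored, not the framework’s implementation.

```python
from typing import Any, Callable, Protocol, Union


class MiniRunContext:
    """Stand-in for RunContext: only records listeners by event name."""

    def __init__(self) -> None:
        self.listeners: dict[str, list[Callable[[Any], None]]] = {}

    def on(self, event: str, callback: Callable[[Any], None]) -> None:
        self.listeners.setdefault(event, []).append(callback)


class BindableMiddleware(Protocol):
    def bind(self, ctx: MiniRunContext) -> None: ...


Middleware = Union[Callable[[MiniRunContext], None], BindableMiddleware]


def attach(ctx: MiniRunContext, middleware: Middleware) -> None:
    """Accept either form: an object with bind(ctx) or a plain function."""
    bind = getattr(middleware, "bind", None)
    if callable(bind):
        bind(ctx)
    else:
        middleware(ctx)


def log_start(ctx: MiniRunContext) -> None:  # function form
    ctx.on("start", lambda data: print("start:", data))


class AuditMiddleware:  # class form, groups several listeners
    def __init__(self) -> None:
        self.seen: list[str] = []

    def bind(self, ctx: MiniRunContext) -> None:
        ctx.on("start", lambda data: self.seen.append("start"))
        ctx.on("finish", lambda data: self.seen.append("finish"))


ctx = MiniRunContext()
attach(ctx, log_start)
audit = AuditMiddleware()
attach(ctx, audit)
for callback in ctx.listeners["finish"]:
    callback(None)
```

The class form is the one to prefer when listeners share state, as AuditMiddleware does with its seen list.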
Example: Intercepting and Overriding
A common use case for middleware is intercepting a request before the target component executes to modify the input or provide a mock response. The following example demonstrates a middleware that intercepts the start event. By setting the output property on the event data, the middleware effectively “mocks” the result, preventing the actual ChatModel from running.
Python
- create_internal_event_matcher: A helper used to ensure you are matching the specific internal event (like start/success/error/finish) for the correct component instance.
- EmitterOptions: Used here to set priority=1 and is_blocking=True, ensuring this middleware executes early and takes precedence over other callbacks.
- data.output: Setting this property during a start event signals the framework to skip the underlying execution (e.g., the LLM call) and return this value immediately.
Ensure that your mock response matches the expected output type of the component you are intercepting. For example, if you override a ChatModel, the return type must be ChatModelOutput.
Registering Middleware
Once defined, you can attach middleware to a component using the .middleware() method just before execution.
Python
Middleware is registered on the run (the object returned by .run()), not on the standalone emitter class itself. However, in some cases, middleware can be passed via the component’s constructor if supported.
Events glossary
The following sections list all events that can be observed for built-in components. Note that your tools/agents/etc. can emit additional events.
Tools
The following events can be observed when calling Tool.run(...).
| Event | Data Type | Description |
|---|---|---|
| start | ToolStartEvent | Triggered when a tool starts executing. |
| success | ToolSuccessEvent | Triggered when a tool completes execution successfully. |
| error | ToolErrorEvent | Triggered when a tool encounters an error. |
| retry | ToolRetryEvent | Triggered when a tool operation is being retried. |
| finish | None | Triggered when tool execution finishes (regardless of success or error). |
Chat Models
The following events can be observed when calling ChatModel.run(...).
| Event | Data Type | Description |
|---|---|---|
| start | ChatModelStartEvent | Triggered when model generation begins. |
| new_token | ChatModelNewTokenEvent | Triggered when a new token is generated during streaming. Streaming must be enabled. |
| success | ChatModelSuccessEvent | Triggered when the model generation completes successfully. |
| error | ChatModelErrorEvent | Triggered when model generation encounters an error. |
| finish | None | Triggered when model generation finishes (regardless of success or error). |
Requirement Agent
| Event | Data Type | Description |
|---|---|---|
| start | RequirementAgentStartEvent | Triggered when the agent begins execution. |
| success | RequirementAgentSuccessEvent | Triggered when the agent successfully completes execution. |
| final_answer | RequirementAgentFinalAnswerEvent | Triggered with intermediate chunks of the final answer. |
ToolCalling Agent
The following events can be observed by calling ToolCallingAgent.run(...).
| Event | Data Type | Description |
|---|---|---|
| start | ToolCallingAgentStartEvent | Triggered when the agent begins execution. |
| success | ToolCallingAgentSuccessEvent | Triggered when the agent successfully completes execution. |
ReAct Agent
The following events can be observed by calling ReActAgent.run(...).
| Event | Data Type | Description |
|---|---|---|
| start | ReActAgentStartEvent | Triggered when the agent begins execution. |
| error | ReActAgentErrorEvent | Triggered when the agent encounters an error. |
| retry | ReActAgentRetryEvent | Triggered when the agent is retrying an operation. |
| success | ReActAgentSuccessEvent | Triggered when the agent successfully completes execution. |
| update and partial_update | ReActAgentUpdateEvent | Triggered when the agent updates its state. |
Workflow
The following events can be observed when calling Workflow.run(...).
| Event | Data Type | Description |
|---|---|---|
| start | WorkflowStartEvent | Triggered when a workflow step begins execution. |
| success | WorkflowSuccessEvent | Triggered when a workflow step completes successfully. |
| error | WorkflowErrorEvent | Triggered when a workflow step encounters an error. |
LinePrefixParser
The following events are caught internally by the LinePrefixParser.
| Event | Data Type | Description |
|---|---|---|
| update | LinePrefixParserUpdate | Triggered when an update occurs. |
| partial_update | LinePrefixParserUpdate | Triggered when a partial update occurs. |
StreamToolCallMiddleware
The following events are caught internally by the StreamToolCallMiddleware.
| Event | Data Type | Description |
|---|---|---|
| update | StreamToolCallMiddlewareUpdateEvent | Triggered when an update occurs. |
GlobalTrajectoryMiddleware
The following events are handled internally by the GlobalTrajectoryMiddleware:
| Event | Data Type | Description |
|---|---|---|
| start | GlobalTrajectoryMiddlewareStartEvent | Triggered when a target begins execution. |
| success | GlobalTrajectoryMiddlewareSuccessEvent | Triggered when a target completes successfully. |
| error | GlobalTrajectoryMiddlewareErrorEvent | Triggered when an error occurs during target execution. |
| finish | GlobalTrajectoryMiddlewareFinishEvent | Triggered after a target has finished execution, regardless of success or failure. |
These event types are based on the GlobalTrajectoryMiddlewareEvent class.
Python
The origin attribute is the original event (e.g., start → RunContextStartEvent) that comes from the RunContext.
RunContext
Special events that are emitted before the target’s handler gets executed. A run event contains .run. in its event’s path and has internal set to true in the event’s context object.
| Event | Data Type | Description |
|---|---|---|
| start | RunContextStartEvent | Triggered when the run starts. Has input (the positional/keyword arguments with which the function was run) and output properties. Set the output property to prevent the execution of the target handler. |
| success | RunContextSuccessEvent | Triggered when the run succeeds. |
| error | FrameworkError | Triggered when an error occurs. |
| finish | RunContextFinishEvent | Triggered when the run finishes. |