ISessionTrimmingService
Overview
The ISessionTrimmingService is invoked by the SmartSession when the total number of tokens in the messages being submitted to the AI exceeds the limit set by the SmartEndpoint.ContextWindow size. This service is responsible for trimming the history by removing or consolidating messages, so that the session can continue interacting with the AI without losing significant context.
Wisej.AI proactively invokes this service before the AI provider returns an error. It does so by comparing the total number of tokens in the last response, read from the Usage property of the last message, with the SmartEndpoint.ContextWindow property of the current endpoint.
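The comparison described above can be pictured with a small, self-contained sketch. The UsageInfo record and the OverflowCheck class are hypothetical stand-ins written for illustration only; they are not the Wisej.AI types, and the actual check inside SmartSession may differ.

```csharp
using System;

// Hypothetical stand-in for the usage data attached to a message.
// This is NOT the Wisej.AI Usage type.
public sealed record UsageInfo(int InputTokens, int OutputTokens)
{
    public int TotalTokens => InputTokens + OutputTokens;
}

public static class OverflowCheck
{
    // Returns true when the token count reported for the last exchange
    // already exceeds the context window, so trimming should run before
    // the next request is sent.
    public static bool NeedsTrimming(UsageInfo lastUsage, int contextWindow)
    {
        if (lastUsage is null) return false; // no usage reported yet
        return lastUsage.TotalTokens > contextWindow;
    }
}
```

Under these assumptions, a last exchange that reported 7,000 input tokens and 1,500 output tokens against an 8,192-token context window would trigger trimming, while one that reported 1,200 tokens in total would not.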
Default Implementation
The default implementation of the ISessionTrimmingService interface is provided by the DefaultSessionTrimmingService class. This class employs two distinct strategies to manage session data: the RollingWindow strategy and the Summarization strategy.
When trimming is initiated, Wisej.AI calls the TrimAsync() method of the ISessionTrimmingService. This method receives two parameters: the SmartSession that requested the trimming and the collection of messages to be trimmed. By default, the service uses the RollingWindow trimming strategy and aims to reduce the message collection by 50%.
When using the RollingWindow trimming strategy, the DefaultSessionTrimmingService removes tool call and tool result messages first.
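To make the rolling-window idea concrete, here is a minimal sketch of one way such a trim could work. It is not the DefaultSessionTrimmingService source: the Role enum, the ChatMessage record, and the RollingWindowSketch class are hypothetical, tool calls and tool results are both modeled as Role.Tool for brevity, and the percentage is assumed to apply to the message count rather than to tokens.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; these are NOT the Wisej.AI message types.
public enum Role { User, Assistant, Tool }

public sealed record ChatMessage(Role Role, string Text);

public static class RollingWindowSketch
{
    // Removes roughly `percentage` of the messages in place:
    // tool messages go first (oldest first), then the oldest
    // remaining messages until the target is reached.
    public static void Trim(List<ChatMessage> messages, float percentage)
    {
        if (messages is null) throw new ArgumentNullException(nameof(messages));

        // Assumption: the target is a fraction of the message count.
        int toRemove = (int)(messages.Count * Math.Clamp(percentage, 0f, 1f));

        // Pass 1: drop tool messages, scanning from the oldest.
        for (int i = 0; i < messages.Count && toRemove > 0; )
        {
            if (messages[i].Role == Role.Tool)
            {
                messages.RemoveAt(i); // do not advance; the next item shifts into i
                toRemove--;
            }
            else
            {
                i++;
            }
        }

        // Pass 2: drop the oldest remaining messages.
        if (toRemove > 0) messages.RemoveRange(0, toRemove);
    }
}
```

With six messages (a user question, two tool messages, an assistant answer, a second question, and a second answer) and a percentage of 0.5f, the sketch removes the two tool messages and then the oldest user question, leaving the last three messages in order.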
Below is an example of how to modify the default parameters for the DefaultSessionTrimmingService. Since the default service is registered with a Shared lifetime, you can adjust the properties on the service singleton:
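var service = Application.Services.GetService<ISessionTrimmingService>();
// Trim 30% of the message collection instead of the default 50%.
service.TrimmingPercentage = 0.3f;
// Use the Summarization strategy instead of the default RollingWindow.
service.TrimmingStrategy = SmartSession.TrimmingStrategy.Summarization;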
If you wish to implement a different trimming system, you can register your own service implementation at any time. This allows you to use other AI libraries and adopt any strategy that suits your needs. It is crucial to trim the message collection passed to TrimAsync() rather than the messages associated with the session, because Wisej.AI internally clones the messages before sending them to the AI model, which allows the session to be used in parallel.
See Context Overflow for a description of the two trimming strategies.