Using SmartSession
Overview
The SmartSession class serves as a representation of a session, essentially a "conversation" with the AI. Each interaction with the AI model is orchestrated by an instance of SmartSession. When a SmartPrompt is used directly, a temporary session is created internally to manage tool calls and agent loops effectively.
By using the SmartSession directly, you maintain full control over the "conversation," which acts as the AI's short-term memory. This allows you to better influence and manage the model's behavior according to your specific requirements.
How to Use
Creating and utilizing a SmartSession is as straightforward as using any other Wisej.AI component. The primary requirement is establishing an endpoint. Once you've created the session, it functions similarly to a SmartPrompt, with the distinct advantage that the session object retains the complete history of the conversation.
For instance, if you use a SmartPrompt to send a request to the AI two or three times, each request is treated as a brand-new, independent exchange. However, when employing a SmartSession, the entire conversation history is transmitted to the AI alongside the latest request. This ensures the model has access to the full context, effectively serving as its short-term memory and allowing for more informed interactions.
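To make that difference concrete, here is a minimal sketch of what a session does internally. Note that IChatModel, SessionSketch, and AskAsync are hypothetical stand-ins used for illustration only; they are not the actual Wisej.AI types or signatures.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a configured AI endpoint; not the Wisej.AI API.
public interface IChatModel
{
    Task<string> CompleteAsync(IReadOnlyList<(string Role, string Content)> messages);
}

// Minimal sketch of what a session does: every request resends the full history.
public class SessionSketch
{
    private readonly IChatModel _model;
    private readonly List<(string Role, string Content)> _history = new();

    public SessionSketch(IChatModel model) => _model = model;

    public async Task<string> AskAsync(string userMessage)
    {
        _history.Add(("user", userMessage));

        // The entire history, not just the latest message, goes to the model.
        // This is the session's "short-term memory".
        var answer = await _model.CompleteAsync(_history);

        _history.Add(("assistant", answer));
        return answer;
    }
}
```

In this sketch, using a SmartPrompt by itself would correspond to sending a single user message each time, while the session variant grows the message list, so the request (and its token count) gets larger with every exchange.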
Resubmitting the full context with each request has its disadvantages as well. For instance, if one of the responses within the conversation history was incorrect, the model may mistakenly continue to refer to it as a fact in subsequent interactions. This can potentially lead to the propagation of errors, as the model's retention of the entire session history allows it to inadvertently reinforce inaccurate information.
Memory Trimming
When interacting with a Large Language Model (LLM), it's crucial to manage the length and content of the conversation history. This management is essential for optimizing performance and ensuring relevant responses. Overly long histories can exceed the model's token limit, result in slower response times, and introduce irrelevant or outdated information into the conversation.
To address this, two primary approaches are used: the window approach and summarization.
Window Approach: This technique involves maintaining a "sliding window" of the most recent interactions with the LLM. Essentially, the system only keeps a predefined number of the latest utterances or tokens, effectively trimming older parts of the conversation. This method ensures that the most relevant and recent context is always used, improving the efficiency and focus of the LLM's responses.
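As a sketch, trimming by message count might look like the following. A real implementation would more likely trim by token count, and this helper, along with its tuple message shape, is purely illustrative:

```csharp
using System.Collections.Generic;

public static class WindowTrimmer
{
    // Keep the system message(s) plus only the most recent 'maxMessages'
    // entries; everything older is dropped from the outgoing request.
    public static List<(string Role, string Content)> Trim(
        List<(string Role, string Content)> history, int maxMessages)
    {
        var kept = history.FindAll(m => m.Role == "system");
        var rest = history.FindAll(m => m.Role != "system");

        if (rest.Count > maxMessages)
            rest = rest.GetRange(rest.Count - maxMessages, maxMessages);

        kept.AddRange(rest);
        return kept;
    }
}
```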
Summarization: This strategy involves creating a concise summary of the entire conversation history or the portions beyond a certain length. Summarization distills the key points and important context, which the LLM can then use as a reference. This approach can be particularly useful when past interactions contain critical information that should not be lost, but must be compressed to fit within the model's processing constraints.
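One common pattern, sketched below against the same hypothetical IChatModel interface as above, is to let the model produce the summary itself and splice it back into the history as a synthetic system message. This is an illustration of the technique, not a built-in Wisej.AI API:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class HistorySummarizer
{
    // Replace everything except the last 'keepRecent' messages with a single
    // model-generated summary message.
    public static async Task<List<(string Role, string Content)>> Compress(
        IChatModel model,
        List<(string Role, string Content)> history,
        int keepRecent)
    {
        if (history.Count <= keepRecent)
            return history;

        var older  = history.GetRange(0, history.Count - keepRecent);
        var recent = history.GetRange(history.Count - keepRecent, keepRecent);

        // Ask the model itself to distill the older portion of the conversation.
        var summary = await model.CompleteAsync(new List<(string Role, string Content)>
        {
            ("system", "Summarize this conversation concisely, preserving all facts."),
            ("user", string.Join("\n", older.ConvertAll(m => $"{m.Role}: {m.Content}")))
        });

        var compressed = new List<(string Role, string Content)>
        {
            ("system", "Summary of the earlier conversation: " + summary)
        };
        compressed.AddRange(recent);
        return compressed;
    }
}
```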
Both methods serve to enhance the LLM’s efficiency, enabling it to generate more accurate and contextually relevant responses by maintaining a manageable and focused conversation history.
Examples
Using the SmartSession to interact with an AI model is similar to using a SmartPrompt; however, the key difference is that SmartSession retains the history of all messages exchanged with the AI. This session history serves as the AI model's "short-term memory," allowing for more context-aware interactions. In the example below, a session is established to provide persistent memory for the AI, which is then reflected in its responses. This enables the AI to deliver answers that take previous exchanges into account, resulting in a more coherent and contextually informed conversation.
Having an active session provides you with the additional capability to pre-process or post-process any message sent to the AI model. This means you can customize or adjust the inputs before they reach the AI model, or modify the outputs afterward. For instance, consider an example from Google's Prompt Engineering whitepaper on Chain of Thought (CoT) prompting: unless you pre-process the prompt so that the AI outputs all of its intermediate reasoning steps, it can be difficult to obtain the correct answer.
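A sketch of that pre-processing step is shown below, again using the hypothetical SessionSketch from earlier; the riddle in the usage comment is adapted from the whitepaper's CoT example.

```csharp
using System.Threading.Tasks;

public static class ChainOfThoughtDemo
{
    // Pre-process the prompt: appending a CoT instruction makes the model
    // emit its intermediate steps instead of guessing the final number.
    public static Task<string> AskWithCoT(SessionSketch session, string question)
        => session.AskAsync(question + "\nLet's think step by step.");
}

// Usage, with the riddle adapted from the whitepaper:
//   "When I was 3 years old, my partner was 3 times my age.
//    Now I am 20 years old. How old is my partner?"
// Asked directly, models frequently answer 63; reasoning step by step
// (partner was 9, the age gap is 6, so 26) yields the correct answer.
```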
As you can see, without proper preprocessing or enabling the AI to show all the reasoning steps, the answer may indeed turn out to be incorrect.