RecursiveCharacterTextSplitterService
Wisej.AI.Services.RecursiveCharacterTextSplitterService
Namespace: Wisej.AI.Services
Assembly: Wisej.AI (3.5.0.0)
A service for recursively splitting text into chunks based on specified separators and chunk size constraints. This service attempts to split text by different characters to find an optimal separation strategy.
Initializes a new instance of the RecursiveCharacterTextSplitterService class with the specified separators, chunk size, overlap, and length function.
chunkSize
The maximum size of each text chunk. Defaults to 1000 if not specified.
chunkOverlap
The allowed overlap size between chunks. Defaults to 200 if not specified.
separators
The array of string separators to be used for splitting the text. Default is an array containing "\n\n", "\n", " ", and "".
lengthFunction
A function to determine the length of the text, which will be used to comply with the chunk size constraint.
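How the chunk size, overlap, and length function interact can be illustrated with a small sketch. It is written in Python purely as a language-neutral illustration and is not the Wisej.AI implementation; the function name `merge_pieces` and its `joiner` parameter are hypothetical. It assumes the text has already been cut into pieces and packs them greedily into chunks, carrying a tail of at most `chunk_overlap` into the next chunk.

```python
from typing import Callable, List


def merge_pieces(pieces: List[str], chunk_size: int, chunk_overlap: int,
                 length: Callable[[str], int] = len,
                 joiner: str = " ") -> List[str]:
    """Pack pieces into chunks no longer than chunk_size (hypothetical helper).

    Sizes are measured with `length`, so the same code can count
    characters, words, or tokens.
    """
    chunks: List[str] = []
    window: List[str] = []  # pieces that make up the chunk being built
    for piece in pieces:
        if window and length(joiner.join(window + [piece])) > chunk_size:
            chunks.append(joiner.join(window))
            # Keep only a tail that fits the overlap budget and still
            # leaves room for the incoming piece.
            while window and (
                length(joiner.join(window)) > chunk_overlap
                or length(joiner.join(window + [piece])) > chunk_size
            ):
                window.pop(0)
        window.append(piece)
    if window:
        chunks.append(joiner.join(window))
    return chunks


words = ["aaa", "bbb", "ccc", "ddd", "eee"]

# Character-based length: at most 11 characters, 3 characters of overlap.
print(merge_pieces(words, 11, 3))
# ['aaa bbb ccc', 'ccc ddd eee']

# Word-based length: at most 2 words per chunk, 1 word of overlap.
print(merge_pieces(words, 2, 1, length=lambda s: len(s.split())))
# ['aaa bbb', 'bbb ccc', 'ccc ddd', 'ddd eee']
```

Swapping the length function changes what "size" means without touching the packing logic, which is the reason for exposing it as a parameter in a design like this.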
Splits the given text into chunks using the defined separators and chunk size constraints.
text
The text to be split into chunks.
This method tries each separator in order, starting with the most significant and moving to less significant ones. If none of the separators are found, the text is treated as a sequence of individual characters. Every chunk is kept within the specified chunk size: any segment larger than the chunk size is recursively split further.
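The recursive strategy described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the service's actual code: the name `recursive_split` is hypothetical, chunk overlap is left out to keep the recursion easy to follow, and a `ValueError` stands in for whatever exception the .NET method throws on null input.

```python
from typing import Callable, List, Sequence

DEFAULT_SEPARATORS = ("\n\n", "\n", " ", "")


def recursive_split(text: str, chunk_size: int = 1000,
                    separators: Sequence[str] = DEFAULT_SEPARATORS,
                    length: Callable[[str], int] = len) -> List[str]:
    """Split text by the first matching separator, recursing on oversized pieces."""
    if text is None:
        raise ValueError("text must not be None")
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    if not text:
        return []
    if length(text) <= chunk_size or len(text) == 1:
        return [text]

    # Use the most significant separator present in the text. The empty
    # separator (or running out of separators) means "split into characters".
    sep, rest = "", ()
    for i, candidate in enumerate(separators):
        if candidate == "" or candidate in text:
            sep, rest = candidate, separators[i + 1:]
            break
    pieces = list(text) if sep == "" else [p for p in text.split(sep) if p]

    chunks: List[str] = []
    current = ""  # chunk being assembled from adjacent small pieces
    for piece in pieces:
        if length(piece) > chunk_size:
            # Too large even on its own: flush, then retry with the
            # remaining, less significant separators.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(recursive_split(piece, chunk_size, rest, length))
            continue
        joined = piece if not current else current + sep + piece
        if length(joined) <= chunk_size:
            current = joined
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


print(recursive_split("alpha beta\n\ngamma delta epsilon", chunk_size=12))
# ['alpha beta', 'gamma delta', 'epsilon']
print(recursive_split("abcdefgh", chunk_size=3))
# ['abc', 'def', 'gh']
```

In the first call the paragraph break is tried first; the second paragraph is still too long, so it is split again on spaces. In the second call no separator matches, so the text falls back to character-level splitting.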
Returns: An array of strings, where each string represents a chunk of the original text.
Throws:
Thrown when the text is null.
Represents a service for splitting text into an array of substrings.