TextSplitterServiceBase

Wisej.AI.Services.TextSplitterServiceBase

Namespace: Wisej.AI.Services

Assembly: Wisej.AI (3.5.0.0)

Provides the base functionality for text splitting services, allowing subclasses to define custom splitting logic.

public class TextSplitterServiceBase : ITextSplitterService

Public Class TextSplitterServiceBase
    Implements ITextSplitterService

This abstract class is designed to facilitate the splitting of text into chunks with a defined size and overlap. It is a foundational component for text processing tasks where segmentation of text is required, such as in natural language processing or data analysis. The class is initialized with parameters that determine how text is chunked and how overlaps between chunks are handled. The functionality allows for flexible customization by providing a function to calculate the length of the text, which can be overridden as needed.
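The role of the length function can be illustrated with a short, self-contained sketch. The two delegates below are hypothetical and not part of Wisej.AI. They only show that the same text measures differently depending on the unit, which changes how much text fits within a given chunk size.

```csharp
using System;

// Hypothetical length functions (assumptions for illustration, not Wisej.AI APIs).
Func<string, int> byCharacters = s => s.Length;
Func<string, int> byWords = s =>
    s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;

string sample = "the quick brown fox";
int chunkSize = 10;

// 19 characters exceed a chunk size of 10, but 4 words fit comfortably.
Console.WriteLine(byCharacters(sample) <= chunkSize); // False
Console.WriteLine(byWords(sample) <= chunkSize);      // True
```

A character-based function such as `str => str.Length` is the simplest choice. A word-based or token-based function may be a better fit when the downstream consumer has limits expressed in those units.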

Properties

ChunkOverlap

Int32: Gets the overlap size between chunks.

ChunkSize

Int32: Gets the defined chunk size used for splitting text.

Methods

JoinDocs(docs, separator)

Joins a list of document strings using the specified separator and returns null if the resulting string is empty.

| Parameter | Type | Description |
| --- | --- | --- |
| docs | IReadOnlyList<String> | A list of document strings to be joined. |
| separator | String | The separator to be used between documents when joining them. |

Returns: String. The joined string or null if the result is empty.

This method is useful for concatenating text fragments into a single string. If the result is an empty string, the method returns null instead, which helps callers avoid processing empty outputs.

MergeSplits(splits, separator)

Merges a list of text splits into larger chunks of the specified chunk size, ensuring overlap is maintained as needed.

| Parameter | Type | Description |
| --- | --- | --- |
| splits | IEnumerable<String> | The text segments to be merged into larger chunks. |
| separator | String | The separator to use when joining segments. |

Returns: String[]. An array of merged text chunks.

This method processes a list of text splits and combines them into larger chunks while respecting the defined chunk size and overlap. It also replaces certain characters to standardize the text format. Logging for chunks that exceed the defined size is still to be implemented. Usage example:

// YourTextSplitterService stands in for any concrete subclass; the arguments are
// presumably the chunk size, the chunk overlap, and a length function.
TextSplitterServiceBase splitter = new YourTextSplitterService(100, 10, str => str.Length);
string[] mergedChunks = splitter.MergeSplits(new List<string> { "text1", "text2" }, " ");

Throws:

  • ArgumentNullException Thrown when splits is null.

Split(text)

Splits the given text into an array of strings based on the logic implemented in derived classes.

| Parameter | Type | Description |
| --- | --- | --- |
| text | String | The text to be split into chunks. |

Returns: String[]. An array of strings representing the split text chunks.

Inherited By

| Name | Description |
| --- | --- |
| RecursiveCharacterTextSplitterService | A service for recursively splitting text into chunks based on specified separators and chunk size constraints. This service attempts to split text by different characters to find an optimal separation strategy. |

Implements

| Name | Description |
| --- | --- |
| ITextSplitterService | Represents a service for splitting text into an array of substrings. |
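
The behavior described for JoinDocs and MergeSplits can be approximated with a small standalone sketch. It uses a greedy merge of the kind found in many text-splitting libraries and is not taken from the Wisej.AI source, so the real implementation may differ (for example, the character replacements mentioned above are omitted). The names `JoinDocsSketch`, `MergeSplitsSketch`, and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the ChunkSize and ChunkOverlap properties
// and for the length function supplied to the splitter.
int chunkSize = 12;
int chunkOverlap = 4;
Func<string, int> length = s => s.Length;

// Joins the fragments and returns null when the result is empty.
string JoinDocsSketch(IReadOnlyList<string> docs, string separator)
{
    string text = string.Join(separator, docs);
    return text.Length == 0 ? null : text;
}

// Greedily packs splits into chunks no longer than chunkSize, carrying
// over trailing splits (up to chunkOverlap in length) into the next chunk.
string[] MergeSplitsSketch(IEnumerable<string> splits, string separator)
{
    if (splits == null) throw new ArgumentNullException(nameof(splits));

    int separatorLength = length(separator);
    var chunks = new List<string>();
    var current = new List<string>(); // splits in the chunk being built
    int total = 0;                    // measured length of current, separators included

    foreach (string split in splits)
    {
        int splitLength = length(split);

        if (current.Count > 0 && total + separatorLength + splitLength > chunkSize)
        {
            // Emit the chunk built so far.
            string chunk = JoinDocsSketch(current, separator);
            if (chunk != null) chunks.Add(chunk);

            // Drop leading splits until the remainder fits within the overlap
            // and leaves room for the incoming split.
            while (current.Count > 0 &&
                   (total > chunkOverlap ||
                    total + separatorLength + splitLength > chunkSize))
            {
                total -= length(current[0]) + (current.Count > 1 ? separatorLength : 0);
                current.RemoveAt(0);
            }
        }

        total += splitLength + (current.Count > 0 ? separatorLength : 0);
        current.Add(split);
    }

    string last = JoinDocsSketch(current, separator);
    if (last != null) chunks.Add(last);
    return chunks.ToArray();
}

string[] merged = MergeSplitsSketch(new[] { "alpha", "beta", "gamma", "delta" }, " ");
Console.WriteLine(string.Join(" | ", merged)); // alpha beta | beta gamma | delta
```

In this run "beta" is repeated in the second chunk because its length (4) fits within the overlap, while "gamma" (5) does not and is therefore not carried into the last chunk. A split that is longer than the chunk size on its own is still emitted as a single oversized chunk in this sketch.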