ITokenizerService

Wisej.AI.Services.ITokenizerService

Last updated 5 days ago

Namespace: Wisej.AI.Services

Assembly: Wisej.AI (3.5.0.0)

Represents a service for tokenizing text, counting tokens, and truncating content based on token limits.

public interface ITokenizerService
Public Interface ITokenizerService

The interface provides methods to handle text tokenization operations such as:

  • Counting the number of tokens in a given text.

  • Converting text into tokens.

  • Truncating text to ensure it does not exceed a specified number of tokens.

These operations can be customized using a different encoder if specified.
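To make the three operations concrete, here is a minimal sketch of a custom implementation. It is not how Wisej.AI tokenizes text: `ITokenizerServiceLike` is a local stand-in that mirrors the documented members so the snippet compiles without the Wisej.AI assembly, the `encoder = null` default is an assumption based on the parameter being described as optional, and `WhitespaceTokenizer` is a hypothetical class that treats each whitespace-separated word as one token.

```csharp
using System;

// Local stand-in mirroring the three documented members (an assumption,
// not the actual Wisej.AI declaration).
public interface ITokenizerServiceLike
{
    int CountTokens(string text, string encoder = null);
    string[] Tokenize(string text, string encoder = null);
    string TruncateContent(string text, int maxTokens, string encoder = null);
}

// Hypothetical implementation: one token per whitespace-separated word.
// Real encoders split text into subword units, so counts will differ.
public sealed class WhitespaceTokenizer : ITokenizerServiceLike
{
    private static readonly char[] Separators = { ' ', '\t', '\r', '\n' };

    public string[] Tokenize(string text, string encoder = null)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        // The encoder name is ignored in this sketch.
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }

    public int CountTokens(string text, string encoder = null)
        => Tokenize(text, encoder).Length;

    public string TruncateContent(string text, int maxTokens, string encoder = null)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (maxTokens < 0) throw new ArgumentOutOfRangeException(nameof(maxTokens));

        string[] tokens = Tokenize(text, encoder);
        if (tokens.Length <= maxTokens) return text;

        // Keep whole tokens only, so the cut never lands inside a word.
        return string.Join(" ", tokens, 0, maxTokens);
    }
}
```

The exceptions thrown here follow the ones documented for each method on this page. A word-based tokenizer like this is only useful for tests or rough estimates, since model token limits are measured in the model's own tokens.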

Methods

CountTokens(text, encoder)

Counts the number of tokens in the specified text, optionally using a specified encoder.

Parameter   Type     Description
text        String   The text to be tokenized and counted.
encoder     String   The optional name of the encoder used for tokenization. If not specified, a default encoder is used.

Returns: Int32. The total count of tokens in the specified text.

This method provides a way to determine the length of a text in terms of tokens, which can be useful for operations that have token limits. The method can utilize different models to tokenize the text, which may affect the token count. Example usage:

  ITokenizerService tokenizerService = GetTokenizerService();
  int tokenCount = tokenizerService.CountTokens("This is a sample text.");

Throws:

  • ArgumentNullException: Thrown when text is null.

Tokenize(text, encoder)

Tokenizes the specified text into an array of tokens, optionally using a specified encoder.

Parameter   Type     Description
text        String   The text to be tokenized.
encoder     String   The optional name of the encoder used for tokenization. If not specified, a default encoder is used.

Returns: String[]. An array of tokens derived from the specified text.

This method splits the text into discrete tokens, which can be useful for various text processing tasks. Different tokenization models can produce different token arrays from the same text. Example usage:

  ITokenizerService tokenizerService = GetTokenizerService();
  string[] tokens = tokenizerService.Tokenize("This is a sample text.");

Throws:

  • ArgumentNullException: Thrown when text is null.

TruncateContent(text, maxTokens, encoder)

Truncates the specified text to ensure it does not exceed a given number of tokens, optionally using a specified encoder.

Parameter   Type     Description
text        String   The text to be truncated based on token count.
maxTokens   Int32    The maximum number of tokens allowed for the text.
encoder     String   The optional name of the encoder used for tokenization. If not specified, a default encoder is used.

Returns: String. The truncated text that is within the specified token limit.

This method is useful for ensuring that the text does not exceed a certain token limit, which can be important in contexts where token usage is limited or costly. The truncation respects token boundaries. Example usage:

  ITokenizerService tokenizerService = GetTokenizerService();
  string truncatedText = tokenizerService.TruncateContent("This is a sample text that may need truncation.", 5);

Throws:

  • ArgumentNullException: Thrown when text is null.

  • ArgumentOutOfRangeException: Thrown when maxTokens is less than zero.

Implemented By

Name                      Description
DefaultTokenizerService   Provides services for tokenizing text, including counting tokens, tokenizing, and truncating content based on a token limit.
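As an illustration of the token-limit use case mentioned for CountTokens and TruncateContent, the sketch below trims a document so that it fits in whatever budget remains after a set of instructions has been counted. `PromptBudget` and `FitDocument` are hypothetical names, and the two delegates stand in for `ITokenizerService.CountTokens` and `ITokenizerService.TruncateContent` so the snippet compiles on its own.

```csharp
using System;

public static class PromptBudget
{
    // Hypothetical helper: trims `document` so that instructions plus
    // document stay within `contextLimit` tokens. The delegates stand in
    // for ITokenizerService.CountTokens and ITokenizerService.TruncateContent.
    public static string FitDocument(
        string instructions,
        string document,
        int contextLimit,
        Func<string, int> countTokens,
        Func<string, int, string> truncateContent)
    {
        // Tokens left for the document once the instructions are counted.
        // Clamped at zero because a negative maxTokens is documented to throw.
        int remaining = Math.Max(0, contextLimit - countTokens(instructions));
        return truncateContent(document, remaining);
    }
}
```

With a service instance, the delegates could be supplied as `t => tokenizerService.CountTokens(t)` and `(t, n) => tokenizerService.TruncateContent(t, n)`. Counting the two parts separately is an approximation: a real encoder may produce a slightly different total for the concatenated text, so leaving a small safety margin in the limit is a reasonable precaution.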