DocumentTools
Wisej.AI.Tools.DocumentTools
Namespace: Wisej.AI.Tools
Assembly: Wisej.AI (3.5.0.0)
Provides tools for processing, embedding, and querying documents using AI services.
public class DocumentTools : ToolsContainer
Public Class DocumentTools
Inherits ToolsContainer
The DocumentTools class enables document processing tasks such as:
Embedding documents for similarity and summarization tasks.
Querying documents based on semantic similarity.
Summarizing document content using clustering techniques.
It integrates with various AI services, including tokenization, text splitting, document conversion, and embedding generation.
Constructors
DocumentTools()
Initializes a new instance of the DocumentTools class.
This constructor injects the required services into the DocumentTools instance using the application's service provider.
Properties
ConversionService
IDocumentConversionService: Gets or sets the document conversion service used for converting documents to text.
EmbeddingGenerationService
IEmbeddingGenerationService: Gets or sets the embedding generation service used for generating embeddings from text.
FilePath
String: Gets or sets the file path associated with the document. (Default: null)
Changing this value sets the Stream and the internal document to null.
FileType
String: Gets or sets the file type of the document. (Default: null)
Setting this property will reset the internal document.
MaxClusters
Int32: Gets or sets the maximum number of vector clusters to generate when performing summarization tasks. (Default: 5)
MaxContextTokens
Int32: Gets or sets the maximum number of context tokens. (Default: 4096)
MinSimilarity
Single: Gets or sets the minimum similarity threshold for document retrieval. (Default: 0.25)
RerankingEnabled
Boolean: Gets or sets a value indicating whether reranking is enabled. (Default: False)
RerankingService
IRerankingService: Gets or sets the reranking service.
Stream
Stream: Gets or sets the stream associated with the document. (Default: null)
Changing this value sets the FilePath and the internal document to null.
TextSplitterService
ITextSplitterService: Gets or sets the text splitter service used for splitting text into smaller chunks.
TokenizerService
ITokenizerService: Gets or sets the tokenizer service used for truncating context tokens.
TopN
Int32: Gets or sets the number of top chunks to retrieve. (Default: 10)
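The properties above can be set with an object initializer. The following is a minimal sketch, not an official sample: the file path is hypothetical, the values are arbitrary, and it assumes the code runs inside a Wisej.AI application so that the constructor can resolve its services from the application's service provider.

```csharp
using Wisej.AI.Tools;

// Sketch only: the path below is hypothetical and the values are arbitrary.
var tools = new DocumentTools
{
    FilePath = @"C:\Docs\handbook.pdf", // setting FilePath clears Stream (see above)
    TopN = 5,                           // retrieve fewer chunks than the default of 10
    MinSimilarity = 0.4f,               // stricter than the default of 0.25
    MaxContextTokens = 2048             // half of the default 4096
};
```

Because FilePath and Stream reset each other, set only one of them per document.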
Methods
EmbedQuestionAsync(question)
Asynchronously generates an embedding for the specified question.
Returns: Task<Embedding>. A task that represents the asynchronous operation. The task result contains the generated Embedding for the question, or null if the input is invalid.
This method checks if the provided question is null or empty and returns null if so. Otherwise, it delegates the embedding generation to the EmbeddingGenerationService.
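The guard described above can be sketched as follows. This is not the library's source: the `embed` delegate is a hypothetical stand-in for the EmbeddingGenerationService, and `float[]` is a hypothetical stand-in for the Embedding type.

```csharp
using System;
using System.Threading.Tasks;

public static class EmbedGuard
{
    // Returns null for a null or empty question; otherwise delegates to the
    // supplied embedding function (a stand-in for the real service).
    public static async Task<float[]> EmbedQuestionAsync(
        Func<string, Task<float[]>> embed, string question)
    {
        if (string.IsNullOrEmpty(question))
            return null;

        return await embed(question);
    }
}
```

Callers of a method with this contract should check the result for null before using it.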
query_document(question)
Queries a single document based on the provided document name and question.
Returns: Task<String>. A task that represents the asynchronous operation. The task result contains the query result as a string.
RerankAsync(question, chunks)
Asynchronously reranks the provided text chunks based on their relevance to the given question.
Returns: Task<String[]>. A task that represents the asynchronous operation. The task result contains an array of reranked text chunks.
This method is intended to be overridden in derived classes to implement custom reranking logic. The method should return the chunks array reordered by relevance to the question.
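As an illustration of the kind of logic an override might contain, the sketch below reorders chunks by a crude keyword-overlap score. It is a standalone helper, not an override and not the library's algorithm: the class name `KeywordReranker` and the scoring rule are assumptions, and a real implementation would more likely call a reranking model.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class KeywordReranker
{
    // Illustrative scoring only: counts how many distinct question words
    // occur (as case-insensitive substrings) in each chunk.
    public static Task<string[]> RerankAsync(string question, string[] chunks)
    {
        if (string.IsNullOrWhiteSpace(question) || chunks == null || chunks.Length == 0)
            return Task.FromResult(chunks ?? Array.Empty<string>());

        var words = question
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Trim('?', '.', ',', '!', ';', ':'))
            .Where(w => w.Length > 0)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToArray();

        // OrderByDescending is a stable sort, so chunks with equal scores
        // keep their original relative order.
        var ranked = chunks
            .OrderByDescending(chunk => words.Count(w =>
                chunk != null &&
                chunk.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToArray();

        return Task.FromResult(ranked);
    }
}
```

The helper has the same shape as the documented method (a question and an array of chunks in, a Task<String[]> out), so its body could be moved into an override once the base method's exact signature is confirmed.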
summarize_document()
Summarizes the content of a specified document.
Returns: Task<String>. A task that represents the asynchronous operation. The task result contains the summary as a string.
Implements
Represents a container for tools, providing access to a hub, adapter, and a collection of parameters.
