# DocumentTools

## Overview

The DocumentTools functionality is closely related to the [DocumentSearchTools](/ai/components/api/tools/wisej.ai.tools.documentsearchtools.md), with the key difference being that DocumentTools processes a single document at a time. Unlike DocumentSearchTools, which relies on a vector database to manage document embeddings, DocumentTools embeds the document dynamically during processing and retains it in memory.

The preconfigured prompt is this:

```ini
#
# DocumentTools
#
[DocumentTools]
Provides tools to query or summarize a document.

[DocumentTools.query_document]
Extracts the most relevant content from the document.

[DocumentTools.query_document.question]
Rewrite the user’s question to enhance the search using RAG and embeddings.

[DocumentTools.summarize_document]
Summarizes the document.

```

The DocumentTools class provides the methods described in the prompt above, which the AI can use to "read" a specific document: one to query it for relevant content and one to summarize it.

## Using DocumentTools

To enable [DocumentTools](/ai/components/built-in-smarttools/documenttools.md), simply add it to a SmartHub, SmartAdapter, SmartSession, or SmartPrompt.

```csharp
// Register DocumentTools with the prompt and point it at the document to "read".
this.smartPrompt1.UseTools(new DocumentTools {
    FilePath = Application.MapPath("App_Data\\CustomerServiceManual.pdf")
});
```

When instantiating DocumentTools, you can specify either the file path or the stream of the document you wish the AI to "read". You can also configure several other properties that influence the behavior of the tools.
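As an illustration of the stream-based alternative, the following sketch passes an open stream instead of a file path and sets a few of the properties listed in the table below. It uses only property names documented on this page; the chosen values are arbitrary, and how the tool manages the lifetime of the stream is an assumption you should verify for your version.

```csharp
// Assumption: System.IO is imported and this code runs where smartPrompt1 is available.
// The stream is left open here because the tool may read it lazily (not confirmed by this page).
var stream = File.OpenRead(Application.MapPath("App_Data\\CustomerServiceManual.pdf"));

this.smartPrompt1.UseTools(new DocumentTools {
    Stream = stream,        // alternative to FilePath
    FileType = ".pdf",      // a stream has no file name to deduce the type from
    TopN = 5,               // illustrative value; the default is 10
    MinSimilarity = 0.3f    // illustrative value; the default is 0.25f
});
```

Setting `FileType` explicitly is the safer choice with a stream, since there is no file name available.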

## Properties

<table><thead><tr><th width="208">Name</th><th>Description</th></tr></thead><tbody><tr><td>FilePath</td><td>Read-write and overridable. Default is null.<br>Full path of the document to "read".</td></tr><tr><td>Stream</td><td>Read-write and overridable. Default is null.<br>Stream of the document to "read". Alternative to FilePath.</td></tr><tr><td>FileType</td><td>Read-write and overridable. Default is null.<br>File type of the document in the form of a file extension string, e.g. .pdf, .docx, .txt, etc. If not set, the file type is deduced from the file name.</td></tr><tr><td>TopN</td><td>Read-write and overridable. Default is 10.<br>Maximum number of RAG content items (document chunks) to retrieve.</td></tr><tr><td>MaxClusters</td><td>Read-write and overridable. Default is 5.<br><code>MaxClusters</code> determines the upper limit on the number of clusters that can be generated by the summarization function, which uses the <a href="https://en.wikipedia.org/wiki/K-means_clustering">K-means clustering</a> algorithm.</td></tr><tr><td>MinSimilarity</td><td>Read-write and overridable. Default is 0.25f.<br>Minimum similarity threshold used to filter document chunks. Chunks whose similarity to the query falls below this value are excluded from the results.</td></tr><tr><td>MaxContextTokens</td><td>Read-write and overridable. Default is 4096.<br>Maximum number of tokens that can be returned to the AI within the Retrieval-Augmented Generation (RAG) context string. Limiting the token count keeps the context provided to the AI concise and manageable.</td></tr><tr><td>RerankingEnabled</td><td>Read-write and overridable. Default is false.<br>Enables reranking of the vector search results through the <a href="/pages/JunwAWvfkr2kWmWTnwSF">IRerankingService</a>.</td></tr></tbody></table>
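
To make the interaction between `TopN`, `MinSimilarity`, and `MaxContextTokens` more concrete, here is a minimal, self-contained sketch of one way such limits could be applied. It is not the Wisej.AI implementation: the `Chunk` class, the `RagSelectionSketch` helper, the use of cosine similarity, and the rule of stopping at the first chunk that exceeds the token budget are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical chunk type for illustration; not a Wisej.AI type.
public sealed class Chunk
{
    public string Text { get; }
    public float[] Embedding { get; }
    public int TokenCount { get; }

    public Chunk(string text, float[] embedding, int tokenCount)
    {
        Text = text;
        Embedding = embedding;
        TokenCount = tokenCount;
    }
}

public static class RagSelectionSketch
{
    // Cosine similarity between two vectors of equal length.
    public static float Cosine(float[] a, float[] b)
    {
        float dot = 0f, normA = 0f, normB = 0f;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0f || normB == 0f) return 0f;
        return dot / (float)(Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Filters by minimum similarity, keeps the best topN chunks,
    // then adds chunks in rank order until the token budget is reached.
    public static List<Chunk> Select(float[] query, IEnumerable<Chunk> chunks,
        int topN = 10, float minSimilarity = 0.25f, int maxContextTokens = 4096)
    {
        var ranked = chunks
            .Select(c => new { Chunk = c, Score = Cosine(query, c.Embedding) })
            .Where(x => x.Score >= minSimilarity)
            .OrderByDescending(x => x.Score)
            .Take(topN);

        var selected = new List<Chunk>();
        int usedTokens = 0;
        foreach (var item in ranked)
        {
            if (usedTokens + item.Chunk.TokenCount > maxContextTokens) break;
            selected.Add(item.Chunk);
            usedTokens += item.Chunk.TokenCount;
        }
        return selected;
    }
}
```

In this sketch the three settings act as successive filters: the similarity threshold removes weak matches, `topN` caps how many remain, and the token budget caps how much text is finally returned.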

## Services

This tool relies on several services, many of which are pre-configured by default:

* ITokenizerService
* ITextSplitterService
* IDocumentConversionService
* IEmbeddingGenerationService

We recommend registering your own `IDocumentConversionService` and utilizing a professional library such as Aspose for document-to-text conversion. The built-in converter currently utilizes PdfPig and OpenXML, which are the same tools employed by Semantic Kernel. However, these tools have limitations and may not be capable of accurately processing complex documents containing tables and images.


