DefaultDocumentConversionService

Wisej.AI.Services.DefaultDocumentConversionService

Namespace: Wisej.AI.Services

Assembly: Wisej.AI (3.5.0.0)

Provides functionality to convert documents from various formats into text representations.

public class DefaultDocumentConversionService : IDocumentConversionService

This class implements the IDocumentConversionService interface and converts documents read from streams into text. Supported formats include PDF, DOCX, and HTML. When a metadata object is supplied, the conversion also extracts document metadata into it.

Constructors

DefaultDocumentConversionService()

Initializes a new instance of DefaultDocumentConversionService.

Methods

Convert(stream, fileType, metadata, iterator)

Converts a document of the specified file type, read from the given stream, into text.

Parameter
Type
Description

stream

The stream containing the document to be converted. It must be readable and positioned at the start of the document.

fileType

The format of the document in the input stream, given as a string such as "pdf", "docx", or "html".

metadata

Optional metadata related to the document conversion. Defaults to null.

iterator

Optional function to process each page, paragraph, or section of the document being converted.

Returns: String. A string representing the converted document.

This method reads a document of type fileType from the specified stream and converts it to an array of strings representing lines, paragraphs, or pages, depending on the conversion implementation. If the metadata parameter is provided, this method also outputs additional information about the document, such as the title, author, subject, and page count. Example usage:


using System;
using System.IO;
using Wisej.AI.Services;

using (var fileStream = File.OpenRead("example.docx"))
{
    var conversionService = new DefaultDocumentConversionService();
    // fileType describes the input format, so it matches the .docx source file.
    string convertedText = conversionService.Convert(fileStream, "docx");
    Console.WriteLine(convertedText);
}
  • Ensure the stream is positioned at the beginning or the correct position for reading.

  • Supported file types should be verified with the service documentation.

  • Handle potential exceptions that may arise from invalid streams or unsupported file types.

Throws:

  • ArgumentNullException Thrown when stream or fileType is null.
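The notes above can be combined into one guarded call. The following is a minimal sketch rather than a reference implementation: it assumes the Wisej.AI assembly is referenced, uses a hypothetical input file named example.docx, and derives fileType from the file extension, a convention chosen here only for illustration. The result is held in a var because this page describes it both as a string and as an array of strings. The page does not say what happens for unsupported file types, so only the documented ArgumentNullException and ordinary file I/O failures are handled.

```csharp
using System;
using System.IO;
using Wisej.AI.Services;

// Hypothetical input file; the name is an assumption for this sketch.
var path = "example.docx";

// Assumption: the extension without the dot ("docx") is an acceptable fileType.
var fileType = Path.GetExtension(path).TrimStart('.').ToLowerInvariant();

try
{
    using (var stream = File.OpenRead(path))
    {
        // A freshly opened file is already at position 0. The rewind matters
        // for streams that were partially read before being passed in.
        if (stream.CanSeek && stream.Position != 0)
            stream.Seek(0, SeekOrigin.Begin);

        var service = new DefaultDocumentConversionService();

        // metadata and iterator are optional and omitted here.
        var converted = service.Convert(stream, fileType);

        // Hand 'converted' to your own processing at this point.
    }
}
catch (ArgumentNullException ex)
{
    // Documented: thrown when stream or fileType is null.
    Console.Error.WriteLine($"Missing argument: {ex.ParamName}");
}
catch (IOException ex)
{
    // Raised by File.OpenRead or by reading the stream, not specific to the service.
    Console.Error.WriteLine($"Could not read '{path}': {ex.Message}");
}
```

Keeping the stream in a using block ensures it is disposed even when the conversion throws.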

Implements

Name
Description

IDocumentConversionService

Represents a service interface for converting documents from one format to another.
