DatabaseTools

Overview

DatabaseTools set of tools can generate a SQL SELECT statement for a specific database utilizing the provided schema. It can construct fairly complex statements and apply several joins to accomplish its task.

This is the prompt:

#
# DatabaseTools
#
[DatabaseTools]
Provides tools to access the database "{{database_name}}".
Unless instructed otherwise, use it before the web search tools.

Instructions:
- Use only the tables defined in the Database Schema
- The generated SQL statement MUST be valid for "{{server_type}}".
- Define column aliases within single quotes
- Output only the SQL statement without any text before or after
- Enclose column names in []

```Database Schema
{{database-schema}}
```

[DatabaseTools.select]
Executes a SQL SELECT command on the database to retrieve the specified data.

[DatabaseTools.select.sql]
A well-formed SQL SELECT statement using the tables and columns defined in the Database Schema.

When invoked by the AI, the DatabaseTools object executes the SQL SELECT statement and returns the retrieved rows as a CSV string. The data is truncated as needed to remain within the specified maximum number of tokens.

To function, DatabaseTools require a DbConnection, a schema, and a database type. You can provide the connection directly or override the Connection property. The tools can automatically retrieve the "Columns" schema from the connection and build the schema definition, or you can initialize or override the Schema property (a simple string) if customization is needed.

If you don't specify the database type, it will infer it from the DbConnection type. It recognizes the following: "SQLServer", "Oracle", "MySql", "PostgreSQL", "DB2". If the database type cannot be inferred, it defaults to using the name of the DbConnection type with the trailing "Connection" string removed.

The schema is constructed by listing the table names, followed by the column names and their respective data types:

Chinook schema
- [Album] ([AlbumId] int,[Title] nvarchar,[ArtistId] int)
- [Artist] ([ArtistId] int,[Name] nvarchar)
- [ArtistAlbum] ([Name] nvarchar,[Albums] int)
- [Customer] ([CustomerId] int,[FirstName] nvarchar,[LastName] nvarchar,[Email] nvarchar,[SupportRepId] int,[Company] nvarchar,[Address] nvarchar,[City] nvarchar,[State] nvarchar,[Country] nvarchar,[PostalCode] nvarchar,[Phone] nvarchar,[Fax] nvarchar)
- [Employee] ([EmployeeId] int,[LastName] nvarchar,[FirstName] nvarchar,[Title] nvarchar,[ReportsTo] int,[BirthDate] datetime,[HireDate] datetime,[Address] nvarchar,[City] nvarchar,[State] nvarchar,[Country] nvarchar,[PostalCode] nvarchar,[Phone] nvarchar,[Fax] nvarchar,[Email] nvarchar)
- [Genre] ([GenreId] int,[Name] nvarchar)
- [Invoice] ([InvoiceId] int,[CustomerId] int,[InvoiceDate] datetime,[Total] numeric,[BillingAddress] nvarchar,[BillingCity] nvarchar,[BillingState] nvarchar,[BillingCountry] nvarchar,[BillingPostalCode] nvarchar)
- [InvoiceLine] ([InvoiceLineId] int,[InvoiceId] int, [TrackId] int,[UnitPrice] numeric,[Quantity] int)
- [MediaType] ([MediaTypeId] int,[Name] nvarchar)
- [Playlist] ([PlaylistId] int,[Name] nvarchar)
- [PlaylistTrack] ([PlaylistId] int,[TrackId] int)
- [Track] ([TrackId] int,[Name] nvarchar,[UnitPrice] numeric,[MediaTypeId] int,[Milliseconds] int,[Bytes] int,[GenreId] int,[Composer] nvarchar,[AlbumId] int)

However, you can construct the schema in any preferred manner and then either initialize or override the Schema property.

Using the schema mentioned above and the Chinook test database, the prompt "Top 5 countries by sales" results in the following SQL statements, which produce the data below for RAG insertion.

Generated SQL
SELECT TOP 5 [BillingCountry] AS 'Country', SUM([Total]) AS 'Sales' FROM [Invoice] GROUP BY [BillingCountry] ORDER BY SUM([Total]) DESC

Column names and data are concatenated in rows and separate by a pipe.

RAG data
Country|Sales
"USA"|523.06
"Canada"|303.96
"France"|195.10
"Brazil"|190.10
"Germany"|156.48

Depending on the task being executed by the AI, this data will be utilized to compose the final assistant message and support the adapter using the DatabaseTools.

Using DatabaseTools

To enable the use of DatabaseTools, simply add it to a SmartHub, SmartAdapter, or SmartPrompt.

this.hub.AddTools(new DatabaseTools(new SqlConnection(connectionString));
this.smartDataEntryAdapter
    .UsetTools(
        new DatabaseTools(
            new SqlConnection(connectionString),
            serverType: "SQLite"));

When creating an instance of the DatabaseTools class, you have the option to specify a database connection using a DbConnection, define a column schema using a DataTable with the Columns schema, provide a schema string as an alternative to using the DataTable columns, and include the serverType string to indicate the type of database server being used. These configurations offer versatility in connecting to and interacting with various database setups.

Alternatively, you can extend the DatabaseTools class and override the parameters responsible for returning the schema and the database connection. This approach allows for customization and flexibility in adapting the initial setup to meet specific requirements or preferences.

Properties

Name
Description

Connection

Read-only and overridable. Return a DbConnection instance. The tool will call Open() and Close() as neeed.

Columns

Read-only and overridable. Return a DataTable with the Columns schema with at least these fields: DATA_TYPE, TABLE_NAME, COLUMN_NAME. Read the common schema definition here.

Schema

Read-only and overridable. Use in alternative to the Columns property. You can override it and return the schema as a string using any formatting you prefer.

ServerType

Read-only and overridable. Return the server type to help the AI generate the correct SQL. If not set, the tool will try to detect the servet type using the type of the connection. Currently it detects: SQLServer, Oracle, MySql, PostgreSQL, DB2, or the class name without the "Connectin" postfix.

MaxContextTokens

Read-write and overridable. Returns or sets the maximum number of tokens to use when creating the RAG context string. When the total number of tokens exceeds the limit, the data returned by the query is truncated at the previous line. This ensures that the output remains within the allowable token threshold, preventing any incomplete or malformed data from being processed.

Last updated