Agent Knowledge

Profound AI's Agent Knowledge feature allows you to upload enterprise-specific documentation, or any other unstructured data, to enhance the capabilities of your AI agents. The documentation can range from technical manuals and product information to internal policies and procedural guides. Once uploaded, these documents are indexed and made searchable by AI agents, enabling them to provide comprehensive and contextually relevant responses to user queries.

Knowledge vs Instructions

It's important to distinguish between Knowledge Documents and Agent Instructions in Profound AI, as they serve different yet complementary roles. The main difference is in how the information is processed and used. Knowledge Documents function like an encyclopedia: documents are semantically searched, and only the relevant sections are read and processed as needed. This allows these documents to contain vast amounts of information, accessible on demand. In contrast, Agent Instructions are preloaded and read all at once, so they are limited by the large language model's system prompt token limit. The semantic search capability of Knowledge Documents keeps token usage efficient, because the AI agents only reference and process the specific parts of documents pertinent to a query. This distinction makes Knowledge Documents ideal for handling complex, information-rich queries, while Agent Instructions are better suited to concise, predetermined information that must fit within the token limit.
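The token trade-off can be illustrated with a rough sketch. It assumes one token per whitespace-separated word, which real tokenizers do not follow exactly. The instructions, the knowledge sections, and the helper names are hypothetical and are not part of Profound AI.

```python
def approx_tokens(text):
    # Rough proxy (assumption): one token per whitespace-separated word.
    return len(text.split())

# Hypothetical Agent Instructions: always sent in full.
INSTRUCTIONS = "You are a support agent. Answer politely and cite the manual."

# Hypothetical knowledge document, already split into sections.
KNOWLEDGE_SECTIONS = [
    "Returns are accepted within 30 days of purchase with a receipt.",
    "Shipping takes 3 to 5 business days within the continental US.",
    "Warranty claims require the product serial number and proof of purchase.",
]

def prompt_size(instructions, sections):
    """Approximate size of a prompt holding the instructions plus the given sections."""
    return approx_tokens(instructions) + sum(approx_tokens(s) for s in sections)

# Sending every section with every query grows with the size of the document.
everything = prompt_size(INSTRUCTIONS, KNOWLEDGE_SECTIONS)

# Sending only the section retrieved for a returns question stays small,
# no matter how many other sections the document contains.
retrieved_only = prompt_size(INSTRUCTIONS, KNOWLEDGE_SECTIONS[:1])

print(everything, retrieved_only)
```

In this sketch the instructions cost the same in both cases, while the knowledge portion only costs as much as the sections actually retrieved.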

Semantic Search

Semantic search refers to a search process that goes beyond the literal or explicit matching of query words or phrases. Unlike traditional keyword-based search, which focuses on finding exact matches or synonyms in the text, semantic search aims to understand the intent and contextual meaning behind a search query.

In semantic search, the system interprets the nuances of language, such as the user's intent, the relationship between words, and the context in which they are used. This involves using advanced natural language processing (NLP) techniques and understanding the semantics – the meanings and relationships of words and phrases – within the content.

For instance, if someone searches for "tips for remote work," a semantic search system wouldn't just look for document sections containing those exact words. Instead, it would also find relevant content discussing topics like "home office setup," "productivity in virtual environments," or "telecommuting best practices," understanding that these are contextually related to the original query.

In the context of Profound AI's Knowledge Documents, semantic search enables the AI agents to sift through extensive documentation and find the sections most relevant to the user's query. The retrieved sections are then supplied to the language model when it generates a response, an approach commonly referred to as Retrieval-Augmented Generation (RAG). Compared to traditional keyword-based search, this tends to yield more accurate, context-aware responses.
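The difference between keyword matching and semantic matching can be sketched as follows. This is a toy illustration, not Profound AI's implementation: the three-dimensional vectors are hand-made stand-ins for real embeddings, which would come from an embedding model and have far more dimensions. The section titles and function names are hypothetical.

```python
import math

# Hand-made toy vectors (assumption). Dimensions, loosely:
# [remote work, food, finance]
SECTIONS = {
    "Telecommuting best practices": [0.9, 0.0, 0.1],
    "Home office setup": [0.8, 0.1, 0.1],
    "Quarterly budget review": [0.1, 0.0, 0.9],
    "Cafeteria menu": [0.0, 0.9, 0.1],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; higher means closer in meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def keyword_search(query, sections):
    """Return titles that literally contain every word of the query."""
    words = query.lower().split()
    return [title for title in sections if all(w in title.lower() for w in words)]

def semantic_search(query_vector, sections, top_k=2):
    """Return the top_k titles whose vectors are closest to the query vector."""
    ranked = sorted(
        sections,
        key=lambda title: cosine_similarity(query_vector, sections[title]),
        reverse=True,
    )
    return ranked[:top_k]

QUERY = "tips for remote work"
QUERY_VECTOR = [1.0, 0.0, 0.0]  # toy embedding of the query (assumption)

print(keyword_search(QUERY, SECTIONS))          # no title contains the literal words
print(semantic_search(QUERY_VECTOR, SECTIONS))  # the remote-work sections rank first
```

The keyword search finds nothing because no title contains the literal query words, while the vector comparison ranks the two sections about remote work first.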

Uploading Documents

To add new documents, click the Add icon in the Knowledge Documents section of the IDE.

A new dialog will appear, where you can upload, remove, or select documents.

Profound AI supports various file formats, including PDF, plain text, and Word documents.

This list of uploaded documents can be shared among different agents.

Selecting and Indexing Documents

To assign a document to the agent, double-click the desired document, or single-click it and then click the Select button.

Before the document is usable within your agent, you must click the Sync button in the Knowledge Documents section.

This indexes the selected document(s) in a vector database, which enables semantic search.

Finally, test searching the knowledge documents by typing a query in the Agent Preview section of the IDE.
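Conceptually, syncing and then querying resembles the sketch below: documents are split into chunks, each chunk is assigned a vector, and a query is matched against those vectors. This is a simplified illustration under stated assumptions, not what the Sync button actually runs. The bag-of-words toy_embed function is a stand-in that only captures shared words, not meaning. The in-memory list stands in for a vector database, and the file names and contents are hypothetical.

```python
import math
from collections import Counter

def chunk_text(text, max_words=8):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text):
    """Stand-in for an embedding model (assumption): a word-count vector."""
    return Counter(word.strip(".,?!").lower() for word in text.split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Index every chunk as a (document name, chunk text, vector) entry."""
    index = []
    for name, text in documents.items():
        for chunk in chunk_text(text):
            index.append((name, chunk, toy_embed(chunk)))
    return index

def search(index, query, top_k=1):
    """Return the top_k (document name, chunk text) pairs closest to the query."""
    query_vector = toy_embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vector, entry[2]), reverse=True)
    return [(name, chunk) for name, chunk, _ in ranked[:top_k]]

# Hypothetical uploaded documents.
DOCUMENTS = {
    "policy.txt": "Employees may work remotely two days per week. Expense reports are due monthly.",
    "manual.txt": "Press the reset button for five seconds to restart the device.",
}
INDEX = build_index(DOCUMENTS)

print(search(INDEX, "How do I restart the device?"))
```

Only the best-matching chunk is returned, rather than the whole document, which is what keeps the amount of text passed to the model small.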

RAG Options

A model can optionally be configured to use specific RAG, or Retrieval-Augmented Generation, options. See Model Configuration for additional details. When no RAG configuration is provided, default settings are used.

A RAG configuration consists of the following:

  • Provider - this identifies the service or code library that provides RAG capabilities. Current options include “openai” and “llamaindex”.

  • Embedding Model - this specifies the model that translates text into a representation of meaning in the form of embeddings, which are also known as vectors. If not specified, a default embedding model from OpenAI is used.

  • Vector Database - this specifies the database that holds indexed documents. The documents are broken up into chunks, assigned embeddings/vectors using the Embedding Model, and then placed into a database.

The default vector database for the “openai” provider is the built-in cloud database used by OpenAI.

For “llamaindex”, the default is to use local .json files, which are always loaded in memory. To scale beyond this, an external vector database should be provisioned.
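The way these defaults combine can be sketched as follows. The class, field names, and placeholder strings are hypothetical and are not Profound AI's actual configuration keys; see Model Configuration for the real format. Only the provider names and the default behavior are taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional

SUPPORTED_PROVIDERS = ("openai", "llamaindex")

# Descriptive placeholders for the defaults described above (not real identifiers).
DEFAULT_EMBEDDING_MODEL = "default OpenAI embedding model"
DEFAULT_VECTOR_DATABASE = {
    "openai": "built-in OpenAI cloud database",
    "llamaindex": "local .json files (loaded in memory)",
}

@dataclass
class RagConfig:
    """Hypothetical container for the three RAG options."""
    provider: str
    embedding_model: Optional[str] = None
    vector_database: Optional[str] = None

    def resolved(self) -> "RagConfig":
        """Return a copy with unspecified options filled in from the defaults."""
        if self.provider not in SUPPORTED_PROVIDERS:
            raise ValueError(f"Unsupported provider: {self.provider!r}")
        return RagConfig(
            provider=self.provider,
            embedding_model=self.embedding_model or DEFAULT_EMBEDDING_MODEL,
            vector_database=self.vector_database or DEFAULT_VECTOR_DATABASE[self.provider],
        )

# Only the provider is given, so the other two options fall back to defaults.
config = RagConfig(provider="llamaindex").resolved()
print(config.embedding_model)
print(config.vector_database)
```

Explicitly set values take precedence over the defaults, so specifying an external vector database in this sketch would replace the local .json fallback.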