
AI Core Plugin

This package is a CDS Plugin that provides easy access to SAP AI Core Generative AI Hub functionalities. It aims to enable a configuration-based access to Completions and Embeddings in a CAP project, with minimal implementation overhead.

It takes inspiration from the original CAP LLM Plugin and extends from there.

It is not intended to replace that plugin, nor to be a better solution. It was born from a specific use case and quickly grew into a project of its own, but it remains what it is: a personal project. However, if it is of interest to others, contributions, collaborations and even constructive critiques are warmly welcome!

> [!NOTE]
> Once again, please note that THIS IS NOT official SAP software.

Introduction

This plugin offers a simplified way to set up an application that includes AI-based conversations. It completely handles system prompts, completions and chat context, so that the calling application only needs to provide new user messages and act on responses.

Similarly, it offers a simplified way to generate and store embeddings in a HANA database, assuming its Vector Engine is enabled.

Please read the documentation carefully, especially the section that describes the managed and un-managed modes.

Installing

This package is not yet published to NPM, so to use it please clone the repo and use it as a local NPM module in your project.

Setup

The plugin uses CAP configuration to set itself up, so it requires a cds configuration entry either in package.json or .cdsrc.json. It comes with a preconfigured schema that helps with value input.

You can define a configuration for both the completion capability and the embeddings capability.

At minimum, you must provide the completions configuration section of the plugin:

{
  "cds": {
    "requires": {
      "ai-core": {
        "kind": "ai-core":,
        "completions": {
          "destination": "<NAME_OF_BTP_DESTINATION>",
          "resourceGroup": "<AI_CORE_RESOURCE_GROUP>",
          "deploymentId": "<AI_CORE_DEPLOYMENT_ID_OF_COMPLETION_MODEL>",
          "apiVersion": "<AI_CORE_COMPLETION_MODEL_API_VERSION>",
          "temperature": "<COMPLETION_MODEL_TEMPERATURE>"
        }
      }
    }
  }
}

Similarly, you can configure the embeddings section:

{
  "cds": {
    "requires": {
      "ai-core": {
        ...
        "embeddings": {
          "destination": "<NAME_OF_BTP_DESTINATION>",
          "resourceGroup": "<AI_CORE_RESOURCE_GROUP>",
          "deploymentId": "<AI_CORE_DEPLOYMENT_ID_OF_EMBEDDING_MODEL>",
          "apiVersion": "<AI_CORE_EMBEDDING_MODEL_API_VERSION>"
        }
      }
    }
  }
}

Here is a breakdown of each property:

  • destination: the name of the BTP destination that points to the AI Core Service Instance. The plugin uses SAP Cloud SDK Connectivity to look this up.
  • resourceGroup: the name of AI Core resource group under which Configurations and Deployments are created - see Resource Groups.
  • deploymentId: The ID of the model deployment.
  • apiVersion: The API version of the model. Find the available values here.
  • temperature: (Completions only) Influences the predictability of the generated text. Accepts values from 0 to 1, where 0 is the most deterministic (more predictable, but prone to repetition) and 1 the least deterministic (less predictable, but more prone to hallucination).

> [!WARNING]
> Embeddings can only be used when running on a HANA database. SQLite does not support vectors and therefore cannot process similarity searches.

The plugin can be configured in a managed or an un-managed way. This tells the plugin runtime whether you only want to use Core API functionalities such as completions and embeddings, or whether you want a completely managed solution that includes database operations and context handling.

Un-Managed Configuration

To just use API functions, set up the corresponding configuration object as follows:

"ai-core": {
    "completions": {
        "managed": false,
        // Other properties of the completions object
    }
}

The same applies to embeddings. With this configuration, the plugin checks neither your database model nor the service actions: it acts as a simple proxy between your code and AI Core, using the provided configuration.

This is the default configuration.

Managed Configuration

The managed configuration uses all the embedded functionalities of the plugin, which are described in the following paragraphs.

Use AI Artifacts

The plugin offers a set of aspects to simplify database modeling when using AI capabilities. You can define entities at database level, include the corresponding aspect, and you are good to go. Each aspect comes with specific custom annotations that allow the plugin to determine which entity is used for what. You can flexibly decide whether to use aspects or to model your database yourself and add the AI annotations to your entities.

There are currently 4 available aspects and 8 available annotations.

Annotations

Annotations allow you to "mark" specific entities, properties and functions so that the plugin knows how to behave:

| Annotation | Applies To | Description |
| --- | --- | --- |
| @AIConversations | Entity | Sets the annotated entity as the "Conversations" entity |
| @AIMessages | Entity | Sets the annotated entity as the "Messages" entity |
| @AISystemPrompts | Entity | Sets the annotated entity as the source for static system prompts |
| @AIEmbeddingsStorage | Entity | Sets the annotated entity as the repository for vectorized texts |
| @AISummarize | Property | Marks the property as summarized: by default, the title of the @AIConversations entity. The value of this property is generated by a completion |
| @AIEmbedding | Property | Marks the property as the container for vector values in the @AIEmbeddingsStorage entity. Must be assigned to a field of type Vector |
| @AITextChunk | Property | Marks the property as the container for text values in the @AIEmbeddingsStorage entity |
| @AICompletion | Action | Marks an action as a completion endpoint |
| @AIEmbeddingGenerator | Action | Marks an action as an embedding vector generator |

Entity and property annotations are used at runtime to determine how to properly handle the persistence of Messages, Conversations and Embeddings. Action annotations are used at runtime, specifically at the cds.once('served') event, to attach custom handlers to actions and automatically handle the processing of completions and embedding generation.

Aspects

The above annotations are automatically assigned if you decide to use the pre-defined aspects in index.cds.

  • AIConversations: Represents the base entity that contains the conversations between a user and the AI. It comes with a predefined @AIConversations annotation and includes a single title property annotated with @AISummarize.
  • AIMessages: Represents the base entity that contains the messages exchanged between a user and the AI for a given conversation. Includes the following properties:

    | Property | Type | Description |
    | --- | --- | --- |
    | content | LargeString | Content of the message sent by either the user or the AI |
    | role | String enum: 'user'/'system'/'assistant'/'tool' | The role of the message sender |

  • AISystemPrompts: Allows you to define static texts to be used as context during a conversation. They represent the value of the system role in a conversation. There are currently two available types: SIMPLE and CONTEXT_AWARE. As the names imply, the former is used during simple conversations, whereas the latter is considered during RAG-aware chats.
  • AIDocumentChunks: The base entity that contains vector embeddings. Comes with 3 properties:

    | Property | Type | Description |
    | --- | --- | --- |
    | embedding | Vector(1536) | The vector representation of a text chunk. Comes annotated with @AIEmbedding |
    | text | LargeString | The original text chunk from which vectors are generated. Comes annotated with @AITextChunk |
    | source | LargeString | The reference to the original text or document from which vectors and text chunks are derived |

Entity Modeling

As described, the above artifacts allow you to design a simple database model that satisfies the minimal requirements for conversations and embeddings.

> [!IMPORTANT]
> Since aspects cannot manage compositions or associations, the developer must add the corresponding properties to the entities annotated with @AIConversations and @AIMessages (regardless of whether those entities include the provided aspects). Specifically:

entity Chats: AIConversations {
  ...
  Messages: Composition of many Messages on Messages.Chat = $self;
}
...
entity Messages: AIMessages {
  ...
  key Chat: Association to one Chat;
}

This way, the plugin knows how to deal with the relationship between the two entities. In a future enhancement, the plugin will automatically add the missing relationship between these base entities.

Actions

Any service action annotated with one of the two available action annotations (see annotations) will have the corresponding plugin handler prepended to it.

The plugin service prepends its handler via service.prepend(() => service.on(...)) and calls the next() function, providing the generated data. This means that if you want to work with the AI response, you need to implement an action handler in your application code; the plugin's handler will run before it.
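To illustrate the ordering, here is a minimal, self-contained sketch of the "prepend" mechanism (this is not plugin code; the handler registry is a stand-in for CAP's dispatching):

```javascript
// Illustration only: how a prepended handler runs first, enriches the
// request, then hands off to the application's own handler via next().
const handlers = [];
const on = (fn) => handlers.push(fn);        // application registers last
const prepend = (fn) => handlers.unshift(fn); // plugin registers first

on(async (req) => `app saw: ${req.data.aicore}`); // application handler
prepend(async (req, next) => {                    // plugin handler
  req.data.aicore = 'AI result';                  // enrich request data
  return next(req);
});

async function dispatch(req) {
  const [pluginHandler, appHandler] = handlers;
  return pluginHandler(req, (r) => appHandler(r));
}

dispatch({ data: {} }).then(console.log); // → "app saw: AI result"
```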

For example, for Chat Completions:

class MyService extends cds.ApplicationService {
    async init() {
        // * Any action name annotated with @AICompletion
        this.on('sendMessage', async (req) => {
            const { aicore } = req.data;
            // AICore contains the result of plugin function:
            const { conversationId, lastMessage, ragContext } = aicore;
            // Do your thing
        })
    }
}

Or, for embeddings:

class MyService extends cds.ApplicationService {
    async init() {
        // * Any action name annotated with @AIEmbeddingGenerator
        this.on('generateEmbeddings', async (req) => {
            const { aicore } = req.data;
            // AICore contains the result of plugin function:
            const { embeddings } = aicore;
            // Do your thing
        })
    }
}

Completions

Completions are the most basic functionality of an AI chat: they allow message exchange between a user and the AI. Performing a "completion" with AI Core is very easy: given the deployment ID of a completion model, one POST call to the completion endpoint returns an AI response.

The plugin simplifies the consumption of the completion model by attaching to an arbitrary OData action that is annotated with @AICompletion.

> [!IMPORTANT]
> Please note that the annotated action must satisfy the following parameter signature:
>
> `{ conversationID: uuid | null, content: string, useRag: boolean }`

Model Support

SAP AI Core supports a wide range of LLMs, which are listed in the official documentation.

Each LLM requires a specific payload structure for completion generation and, although many LLMs share some common properties, there is no one-size-fits-all format. This plugin uses the widely adopted OpenAI format for completion generation, which is structured as follows (besides other properties that influence the behavior):

{
    messages: [{
        role: 'system' | 'user' | 'assistant',
        content: 'An arbitrary message or context'
    }]
}

The plugin will accept and return messages only in this format.
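As a sketch, a guard for this message format could look like the following (this helper is an illustrative assumption, not part of the plugin API):

```javascript
// Minimal check that a payload follows the OpenAI-style message format
// the plugin accepts: an array of { role, content } objects.
const VALID_ROLES = new Set(['system', 'user', 'assistant']);

function isValidChatPayload(payload) {
  return Array.isArray(payload?.messages) && payload.messages.every(
    (m) => VALID_ROLES.has(m.role) && typeof m.content === 'string'
  );
}

console.log(isValidChatPayload({
  messages: [{ role: 'user', content: 'Hello!' }]
})); // → true
```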

It will, however, take care of converting the input to the specific requirements of the LLM targeted by the given deployment ID. Within the transformers path, a class is defined for each data structure. As of now, these are the models supported for completions:

BaseTransformer
    -- OpenAITransformer
    -- GeminiTransformer
    -- MistralAITransformer
    -- AnthropicTransformer
    -- LlamaTransformer

Each transformer class inherits from the BaseTransformer, which offers 4 functions:

| Function | Purpose |
| --- | --- |
| transformCompletionIn | Takes the inbound context in OpenAI format and converts it to the specific LLM format |
| transformCompletionOut | Takes the response from the LLM and converts it to the OpenAI format |
| transformEmbeddingIn | Takes the inbound embedding request payload and converts it to the specific LLM format |
| transformEmbeddingOut | Takes the response from the LLM and converts it to a plain object like `{ embeddings: <result> }` |

As an initial implementation, each model is matched to the corresponding transformer class in the mapping file supported-models.js.

By creating additional transformer classes, additional LLMs could be supported in the future.
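A new transformer would follow the pattern described above. The sketch below is a hypothetical example, not the plugin's actual classes: the method names mirror the table, while the payload shapes of the imaginary target LLM are invented for illustration.

```javascript
// Base class: identity transforms in the plugin's OpenAI-style format.
class BaseTransformer {
  transformCompletionIn(messages) { return { messages }; }
  transformCompletionOut(response) { return response; }
}

// Hypothetical transformer for an LLM that expects one flattened prompt
// string and answers with a plain { text } object.
class FlatPromptTransformer extends BaseTransformer {
  transformCompletionIn(messages) {
    // Collapse the OpenAI-style message list into a single prompt string.
    return { prompt: messages.map((m) => `${m.role}: ${m.content}`).join('\n') };
  }
  transformCompletionOut(response) {
    // Wrap the raw completion text back into the OpenAI format.
    return { messages: [{ role: 'assistant', content: response.text }] };
  }
}
```

A class like this would then be matched to its model in the supported-models.js mapping file.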

Completion Behavior

System Prompt

To handle system prompts, you can use the @AISystemPrompts annotation or include the AISystemPrompts aspect in your modeled entity. Either way, the presence of such an annotated entity makes the plugin use it as the source for the system element of the payload sent to the AI when generating a completion. It works this way:

  • Upon a new message, the plugin checks for the existence of such an entity by searching for the annotation.
  • If it is found, the plugin checks whether the current conversation requires RAG processing and takes the value from the entity's prompt property where the type property equals SIMPLE (for non-RAG chats) or CONTEXT_AWARE (for RAG chats). Check the logic in ai-service-handler.js.
  • If the entity is NOT found, the plugin falls back to generic constants in llm-prompts.js. This is likely to change in favor of a function parameter.

Chat Context

Chat context represents the entire history of messages exchanged during a conversation: an effective AI chat "remembers" previous messages, in order to not repeat itself and to keep a true sense of conversing.

The AI Core plugin automatically manages the chat context, by storing Messages of a Conversation in the entities annotated with @AIConversations and @AIMessages: on each message, the context is rebuilt and sent to the completion endpoint.
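Conceptually, rebuilding the context on each message could look like this sketch (an illustrative assumption about the shape of the stored messages, not plugin code):

```javascript
// Rebuild the full chat context sent to the completion endpoint:
// system prompt first, then the stored history, then the new message.
function buildChatContext(systemPrompt, storedMessages, newUserMessage) {
  return [
    { role: 'system', content: systemPrompt },
    ...storedMessages.map((m) => ({ role: m.role, content: m.content })),
    { role: 'user', content: newUserMessage }
  ];
}
```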

Embeddings

As an LLM would say:

Embedding is a way to represent data, like words or images, as numerical vectors to capture relationships and meaning. Embeddings allow machines to understand, compare, and process data more effectively by transforming complex information into numerical forms that highlight patterns, similarities, and differences.

The plugin provides an easy way to produce embeddings from an arbitrary text or piece of data. There are currently two ways in which you can get embeddings:

  • Using an action annotated with @AIEmbeddingGenerator: whatever text is sent to the action is returned as a numerical vector. Remember that the plugin prepends its handler, so if you need to manipulate the result you must implement and register an action handler yourself. See actions.
  • By connecting via cds.connect.to('ai-core') and calling the getEmbeddings() API function.

> [!WARNING]
> To use embeddings, the corresponding cds configuration MUST be set. See setup.

RAG-Aware Completions

RAG-aware completions combine Retrieval-Augmented Generation (RAG) with conversational AI, enhancing responses by retrieving relevant external information, leading to more accurate, informed, and contextually appropriate dialogue in real-time.

The plugin uses the HANA Vector Engine to perform similarity searches and provide additional, specific context to the LLM.

During RAG-aware conversations, the system prompt is re-calculated on every new message: using the user query, a similarity search is performed on the entity annotated with @AIEmbeddingsStorage and the resulting context is used as the system prompt. This allows AI answers to be more tailored to application needs, avoiding a broader context and limiting answers to a specific topic.
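For intuition, the COSINE_SIMILARITY metric and the minScore filtering used by such a search can be computed in plain JavaScript (this is an illustration of the math, not the HANA implementation, which runs in the database):

```javascript
// Cosine similarity between two equal-length vectors: 1 means identical
// direction, 0 means orthogonal (unrelated) embeddings.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Drop candidates whose similarity score falls below minScore.
function filterByScore(results, minScore) {
  return results.filter((r) => r.score >= minScore);
}
```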

RAG-aware conversations are activated by providing a truthy value to the parameter useRag of the completion action annotated with @AICompletion.

API

> [!NOTE]
> API calls still require the minimal configuration for embeddings and completions. See Setup.

You can always call the Core API functions, regardless of the managed aspects and actions. There are 3 main functions:

| Function | Parameters | Description |
| --- | --- | --- |
| genericCompletion(messages) | messages: Array<{ role: string, content: string }> | Performs a completion call to the LLM deployed under the configured deploymentId. It expects a full chat context, including the system role. Returns the AI response in the same format |
| createEmbeddings(text) | text: string | Generates a vector of embeddings using the LLM deployed under the configured deploymentId. Returns an array of numbers |
| vectorSearch(params) | See below | Executes a generic similarity search on HANA |

The third function, vectorSearch, performs vector-based searches on a user-specified table. It accepts the following parameters:

| Name | Type | Description |
| --- | --- | --- |
| query | string | The text to search for |
| tableName | string | The table name, in HANA format, that contains embeddings and texts, e.g. SAP_DEMO_EMBEDDINGS |
| embeddingColumnName | string | The name of the table field that contains the vectorized representation of the data. Must be of type REAL_VECTOR (cds.Vector(1536) in CDS) |
| textColumnName | string | The name of the table field that contains the textual representation of the data |
| searchAlgorithm | string | The name of the similarity algorithm. HANA currently supports COSINE_SIMILARITY and L2DISTANCE |
| minScore | number | A value between 0 and 1, used to filter out elements with a score lower than the specified value |
| candidates | number | Number of candidates to read from HANA |

Returns an object with the found content and a metrics array that includes, for each result, the similarity score and the table entry that produced it:

{
    content: ['I am one result in textual representation', 'I am number two'],
    metrics: [{
        score: 0.945424895818,
        textContent: 'I am one result in textual representation',
        tableEntry: {
            foo: 'bar'
        }
    }, {...}]
}

Local Testing

The plugin can only be tested as part of a CAP application: you can quickly spin up a basic bookshop application and add the plugin. Testing with SQLite only allows simple completions: vectors are not supported in SQLite, so embeddings and RAG-aware completions will not work.

The plugin currently performs no checks on this: if you try to deploy a model that uses vectors to SQLite, the database driver will throw an error.

You can, however, perform hybrid testing: bind your application to an SAP HANA service and still run locally. To simplify development, bind to a destination service instance as well, so you can easily consume the destination that points to the AI Core deployments.

Testing with Jest

Under development

Contributing

If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcome.

Please have a look at CONTRIBUTING.md for additional info.

Code of Conduct

See CODE_OF_CONDUCT.md

Licensing

See LICENSE
