This package is a CDS Plugin that provides easy access to SAP AI Core Generative AI Hub functionalities. It aims to enable a configuration-based access to Completions and Embeddings in a CAP project, with minimal implementation overhead.
It takes free inspiration from the original CAP LLM Plugin and extends from there:
it does not aim to replace it, nor to be a better solution. It was born from a specific use case and quickly became a project of its own, but it remains what it is: a personal project. However, if it is of interest to others, contributions, collaborations and even constructive critiques are warmly welcomed!
> [!NOTE]
> Once again, please note that THIS IS NOT official SAP software
This plugin offers a simplified way to set up an application to include AI-based conversations. It completely handles system prompts, completions and chat context, so that the calling application only needs to provide new user messages and act on responses.
Similarly, it provides a simplified way to generate and store embeddings in a HANA database - assuming that its Vector Engine is enabled.
Please read the documentation carefully, especially the section that describes the `managed` and `un-managed` modes.
This package is not yet published to NPM, so to use it please clone the repo and use it as a local NPM module in your project.
- AI Core Plugin
The plugin uses CAP configuration to set itself up, so it requires a `cds` configuration entry either in `package.json` or `.cdsrc.json`.
It comes with a preconfigured schema that helps with value input.
You can define a configuration for both the `completions` capability and the `embeddings` capability.
At a minimum, you must provide the `completions` configuration section of the plugin:
{
"cds": {
"requires": {
"ai-core": {
"kind": "ai-core":,
"completions": {
"destination": "<NAME_OF_BTP_DESTINATION>",
"resourceGroup": "<AI_CORE_RESOURCE_GROUP>",
"deploymentId": "<AI_CORE_DEPLOYMENT_ID_OF_COMPLETION_MODEL>",
"apiVersion": "<AI_CORE_COMPLETION_MODEL_API_VERSION>",
"temperature": "<COMPLETION_MODEL_TEMPERATURE>"
}
}
}
}
}
Similarly, you can configure the `embeddings` section:
{
"cds": {
"requires": {
"ai-core": {
...
"embeddings": {
"destination": "<NAME_OF_BTP_DESTINATION>",
"resourceGroup": "<AI_CORE_RESOURCE_GROUP>",
"deploymentId": "<AI_CORE_DEPLOYMENT_ID_OF_EMBEDDING_MODEL>",
"apiVersion": "<AI_CORE_EMBEDDING_MODEL_API_VERSION>"
}
}
}
}
}
Here is a breakdown of each property:

- `destination`: the name of the BTP destination that points to the AI Core service instance. The plugin uses SAP Cloud SDK Connectivity to look this up.
- `resourceGroup`: the name of the AI Core resource group under which Configurations and Deployments are created - see Resource Groups.
- `deploymentId`: the ID of the model deployment.
- `apiVersion`: the API version of the model. Find the available values here.
- `temperature`: (only valid for Completions) influences the predictability of the generated text. Accepts values from 0 to 1, where 0 is the most deterministic (more predictable and prone to repetition) and 1 is the least deterministic (less predictable but more prone to hallucination).
> [!WARNING]
> Embeddings can only be used when running on a HANA database. SQLite does not support vectors and therefore cannot process similarity searches.
The plugin can be configured in a `managed` and an `un-managed` way.
This setting tells the plugin runtime whether you only want to use Core API functionalities such as `completions` and `embeddings`, or whether you want a completely managed solution that includes database operations and context handling.
To just use API functions, set up the corresponding configuration object as follows:
"ai-core": {
"completions": {
"managed": false,
// Other properties of the completions object
}
}
The same applies to `embeddings`. With this configuration, the plugin checks neither your database model nor the service actions. It acts as a simple proxy between your code and AI Core, using the configuration provided.
This is the default configuration.
The managed configuration uses all the embedded functionalities of the plugin, which are described in the following paragraphs.
The plugin offers a set of `aspects` to simplify database modeling when using AI capabilities.
You can define entities at database level, include the relative `aspect`, and you are good to go.
Each aspect comes with specific custom `annotations` that allow the plugin to determine which entity is used for what.
You can flexibly decide whether to use aspects or to model your database yourself and add AI annotations to your entities.
There are currently 4 available `aspects` and 9 available `annotations`.
Annotations allow you to "mark" specific entities, properties and functions so that the plugin knows how to behave:
| Annotation Name | For | Description |
|---|---|---|
| `@AIConversations` | Entity | Sets the annotated entity as the "Conversations" entity |
| `@AIMessages` | Entity | Sets the annotated entity as the "Messages" entity |
| `@AISystemPrompts` | Entity | Sets the annotated entity as the source for static System prompts |
| `@AIEmbeddingsStorage` | Entity | Sets the annotated entity as the repository for vectorized texts |
| `@AISummarize` | Property | Marks the property as summarized: by default, this is the title of the `@AIConversations` entity. The value of this property will be generated by a completion |
| `@AIEmbedding` | Property | Marks the property as the container for vector values in the `@AIEmbeddingsStorage` entity. Must be assigned to a field of type `Vector` |
| `@AITextChunk` | Property | Marks the property as the container for text values in the `@AIEmbeddingsStorage` entity |
| `@AICompletion` | Action | Annotates an action to act as a Completion endpoint |
| `@AIEmbeddingGenerator` | Action | Annotates an action to act as an embedding vector generator |
Entity and property annotations are used at runtime to determine how to properly handle the persistence of Messages, Conversations and Embeddings.
Action annotations are used at runtime - specifically at the `cds.once('served')` event - to attach custom handlers to actions and automatically handle the processing of Completions and Embedding generation.
Above annotations are automatically assigned if you decide to use the pre-defined aspects defined in index.cds.
- `AIConversations`: represents the base entity that contains a list of conversations between a User and the AI. It comes with a predefined `@AIConversations` annotation and includes a single `title` property annotated with `@AISummarize`.
- `AIMessages`: represents the base entity that contains messages exchanged between a User and the AI for a given Conversation. Includes the following properties:
| Property | Type | Description |
|---|---|---|
| `content` | LargeString | Content of the message sent by either the user or the AI |
| `role` | String enum `'user'/'system'/'assistant'/'tool'` | The role of the message sender |
- `AISystemPrompts`: allows defining static texts to be used as context during a conversation. They represent the value of the `system` message role in a conversation. There are currently two available types: SIMPLE and CONTEXT_AWARE. As the names imply, the first is used during simple conversations, whereas the latter is considered during RAG-aware chats.
- `AIDocumentChunks`: this is the base entity that contains vector embeddings. Comes with 3 properties:
| Property | Type | Description |
|---|---|---|
| `embedding` | Vector(1536) | The vector representation of a text chunk. Comes annotated with `@AIEmbedding` |
| `text` | LargeString | The original text chunk from which vectors are generated. Comes annotated with `@AITextChunk` |
| `source` | LargeString | The reference to the original text or document from which vectors and text chunks are determined |
As described, the above artifacts allow you to design a simple database model that satisfies the minimal configuration to perform conversations and embeddings.
> [!IMPORTANT]
> Since aspects do not allow managing compositions or associations, developers must add corresponding properties to entities annotated with `@AIConversations` and `@AIMessages` (regardless of whether the entities include the provided aspects or not). Specifically:
entity Chats: AIConversations {
...
Messages: Composition of many Messages on Messages.Chat = $self;
}
...
entity Messages: AIMessages {
...
key Chat: Association to one Chat;
}
This way, the plugin knows how to deal with the relationships between the two entities. In future enhancements, the plugin will automatically add missing relationships between these base entities.
Any service action annotated with one of the two available action annotations (see annotations) will have the corresponding plugin handler prepended to it.
The plugin service prepends its handler via `service.prepend(() => service.on(...))` and calls the `next()` function, providing the generated data.
This means that, if you want to work on the AI response, you need to implement an action handler in your application code, to which the plugin will prepend its own.
For example, for Chat Completions:
class MyService extends cds.ApplicationService {
async init() {
// * Any action name annotated with @AICompletion
this.on('sendMessage', async (req) => {
const { aicore } = req.data;
// AICore contains the result of plugin function:
const { conversationId, lastMessage, ragContext } = aicore;
// Do your thing
})
}
}
Or, for embeddings:
class MyService extends cds.ApplicationService {
async init() {
// * Any action name annotated with @AIEmbeddingGenerator
this.on('generateEmbeddings', async (req) => {
const { aicore } = req.data;
// AICore contains the result of plugin function:
const { embeddings } = aicore;
// Do your thing
})
}
}
Completions are the most basic functionality of an AI chat. They allow message exchange between a User and the AI. It's very easy using AI Core to perform a "completion": given a deployment ID for a completion model, one POST call to the completion endpoint will provide an AI response.
The plugin simplifies the consumption of the completion model by attaching to an arbitrary OData action annotated with `@AICompletion`.
> [!IMPORTANT]
> Please note that the annotated action must satisfy the following parameter signature:
> `{ conversationID: uuid | null, content: string, useRag: boolean }`
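As an illustration, a service action satisfying this signature could be modeled like the following (the service name, action name and return type are arbitrary examples, not prescribed by the plugin):

```cds
service ChatService {
  // Any action name works: the plugin attaches to the @AICompletion annotation,
  // not to a specific name. The return type is up to your application.
  @AICompletion
  action sendMessage(conversationID : UUID, content : String, useRag : Boolean) returns String;
}
```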
SAP AI Core supports a wide range of LLMs, which are listed in the official documentation.
Each LLM requires a defined payload structure to perform the completion generation and, although many LLMs share some common properties, there is no one-size-fits-all format that can be used. This plugin uses the widely adopted OpenAI format for completion generation, which is structured as follows (besides other properties that influence the behavior):
{
  messages: [{
    role: 'system' | 'user' | 'assistant',
    content: 'An arbitrary message or context'
  }]
}
The plugin accepts and returns messages only in this format.
It will, however, take care of converting the input to the specific requirements of the LLM targeted by the given deployment ID.
Within the transformers path, a class has been defined for each different data structure. As of now, these are the models supported for `completions`:
BaseTransformer
-- OpenAITransformer
-- GeminiTransformer
-- MistralAITransformer
-- AnthropicTransformer
-- LlamaTransformer
Each transformer class inherits from the BaseTransformer, which offers 4 functions:
| Function | Purpose |
|---|---|
| transformCompletionIn | Takes the inbound context in OpenAI format and converts it to the specific LLM format |
| transformCompletionOut | Takes the response from the LLM and converts it to the OpenAI format |
| transformEmbeddingIn | Takes the inbound embedding request payload and converts it to the specific LLM format |
| transformEmbeddingOut | Takes the response from the LLM and converts it to a plain object like `{ embeddings: <result> }` |
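To illustrate the idea (this is a hypothetical sketch, not the plugin's actual code), a transformer for an imaginary LLM whose API expects a single `prompt` string instead of an OpenAI-style `messages` array could look like this:

```javascript
// Illustrative sketch only: a transformer for a hypothetical LLM whose API
// expects { prompt: "..." } and answers with { generated_text: "..." }.
class HypotheticalPromptTransformer {
  // OpenAI-format input -> hypothetical LLM-specific payload
  transformCompletionIn(payload) {
    const prompt = payload.messages
      .map((m) => `${m.role}: ${m.content}`)
      .join('\n');
    return { prompt };
  }

  // Hypothetical LLM-specific response -> OpenAI-format output
  transformCompletionOut(response) {
    return {
      messages: [{ role: 'assistant', content: response.generated_text }]
    };
  }
}
```

The actual transformers additionally implement the embedding counterparts and are selected per model via the mapping file mentioned below.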
As an initial implementation, each model is matched to the corresponding transformer class in the mapping file supported-models.js.
By creating additional transformer classes, additional LLMs could be supported in the future.
To handle System Prompts, you can use the `@AISystemPrompts` annotation or include the `AISystemPrompts` aspect in your modeled entity.
Either way, the presence of such an annotated entity will make the plugin use it as the source to create the `system` element in the payload sent to the AI to generate a completion.
It works this way:

- Upon a new message, the plugin checks for the existence of such an entity, searching for the annotation
- If it is found, the plugin then checks whether the current message requires RAG processing or not and takes the value from the entity property `prompt` where the property `type` is equal to SIMPLE (for non-RAG chats) or CONTEXT_AWARE (for RAG chats). Check the logic in ai-service-handler.js
- If the entity is NOT found, the plugin falls back to generic constants in llm-prompts.js. This is likely to change in favor of a function parameter
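The selection logic above can be sketched as follows (a simplified illustration, not the plugin's actual code; `prompts` stands for the rows of the entity annotated with `@AISystemPrompts`, and the fallback constant is a stand-in for the values in llm-prompts.js):

```javascript
// Stand-in for the generic constants defined in llm-prompts.js.
const FALLBACK_PROMPT = 'You are a helpful assistant.';

// Pick the system prompt matching the conversation mode:
// CONTEXT_AWARE for RAG chats, SIMPLE otherwise.
function pickSystemPrompt(prompts, useRag) {
  const wantedType = useRag ? 'CONTEXT_AWARE' : 'SIMPLE';
  const match = prompts.find((p) => p.type === wantedType);
  return match ? match.prompt : FALLBACK_PROMPT;
}
```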
Chat context represents the entire history of messages exchanged during a conversation: an effective AI chat "remembers" previous messages, in order to not repeat itself and to keep a true sense of conversing.
The AI Core plugin automatically manages the chat context by storing Messages of a Conversation in the entities annotated with `@AIConversations` and `@AIMessages`: on each message, the context is rebuilt and sent to the completion endpoint.
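Conceptually, rebuilding the context on each turn amounts to something like this (a simplified sketch under assumed names, not the plugin's internal implementation):

```javascript
// Sketch: rebuild the full chat context for a new user message.
// `history` stands for the stored rows of the @AIMessages entity,
// already ordered chronologically.
function buildChatContext(systemPrompt, history, newUserMessage) {
  return [
    { role: 'system', content: systemPrompt },
    ...history.map(({ role, content }) => ({ role, content })),
    { role: 'user', content: newUserMessage }
  ];
}
```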
As an LLM would say:
Embedding is a way to represent data, like words or images, as numerical vectors to capture relationships and meaning. Embeddings allow machines to understand, compare, and process data more effectively by transforming complex information into numerical forms that highlight patterns, similarities, and differences.
The plugin provides an easy way to produce embeddings from an arbitrary text or piece of data. There are currently two ways in which you can get embeddings:

- Using an action annotated with `@AIEmbeddingGenerator`: whatever text is sent to the action will be returned as a numerical vector. Remember that the plugin prepends its handler, so if you need to manipulate the result you need to implement and register an action handler yourself. See actions.
- Connecting via `cds.connect.to('ai-core')` and calling the getEmbeddings() API function.
> [!WARNING]
> To use embeddings, the corresponding cds configuration MUST be set. See setup.
RAG-aware completions combine Retrieval-Augmented Generation (RAG) with conversational AI, enhancing responses by retrieving relevant external information, leading to more accurate, informed, and contextually appropriate dialogue in real-time.
The plugin uses the HANA Vector Engine to perform similarity searches and provide additional, specific context to the LLM.
During RAG-aware conversations, the system prompt is recalculated on every new message: using the user query, a similarity search is performed on the entity annotated with `@AIEmbeddingsStorage`, and the resulting context is used as the system prompt.
This allows AI answers to be more tailored to application needs, avoiding a broader context and limiting answers to a specific topic.
RAG-aware conversations are activated by providing a truthy value for the `useRag` parameter of the completion action annotated with `@AICompletion`.
> [!NOTE]
> API calls still require the minimal configuration for embeddings and completions. See Setup.
You can always call the Core API functions, regardless of the managed aspects and actions. There are 3 main functions:
| Function | Parameters | Description |
|---|---|---|
| genericCompletion(messages) | `Array<{ role: string, content: string }>` | Performs a completion call to the LLM deployed under the configured deploymentId. It expects a full chat context, including the system role. Returns the AI response in the same format. |
| createEmbeddings(text) | `text: string` | Generates a Vector of embeddings, using the LLM deployed under the configured deploymentId. Returns an array of numbers. |
| vectorSearch(params) | See below | Allows the execution of a generic similarity search on HANA |
The third function, `vectorSearch`, allows performing vector-based searches on a user-specified table. It accepts the following parameters:
| Name | Type | Description |
|---|---|---|
| query | string | The text to search for |
| tableName | string | The table name in HANA format that contains embeddings and texts, e.g. SAP_DEMO_EMBEDDINGS |
| embeddingColumnName | string | The name of the table field that contains the vectorized representation of data. Must be of type REAL_VECTOR (`cds.Vector(1536)` in CDS) |
| textColumnName | string | The name of the table field that contains the textual representation of data |
| searchAlgorithm | string | The name of the similarity algorithm. HANA currently supports COSINE_SIMILARITY and L2DISTANCE |
| minScore | number | A value between 0 and 1, used to filter out elements with a score lower than the specified value |
| candidates | number | Number of candidates to read from HANA |
Returns an object with the found content and a `metrics` object that includes similarity scores and the table entry that generated each result:
{
content: ['I am one result in textual representation', 'I am number two'],
metrics: [{
score: 0.945424895818,
textContent: 'I am one result in textual representation',
tableEntry: {
foo: 'bar'
}
}, {...}]
}
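For intuition, the kind of scoring and filtering that COSINE_SIMILARITY combined with `minScore` implies can be sketched in plain JavaScript (HANA computes this natively on the database side; this is purely an illustration, not what the plugin executes):

```javascript
// Illustration only: cosine similarity plus minScore filtering,
// mimicking what the HANA Vector Engine computes natively.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score each row's embedding against the query vector, drop rows
// below minScore, and return the rest sorted best-first.
function filterByScore(queryVector, rows, minScore) {
  return rows
    .map((row) => ({ ...row, score: cosineSimilarity(queryVector, row.embedding) }))
    .filter((row) => row.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
```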
The plugin can only be tested when associated with a CAP application: you can quickly spin up a basic `bookshop` application and add the plugin usage.
Testing with SQLite only allows the usage of simple Completions: vectors are not supported in SQLite, so Embeddings and RAG-aware Completions do not work.
The plugin currently performs no checks on this: if you try to deploy a model that uses Vectors to SQLite, the database driver will throw an error.
You can, however, perform `hybrid` testing: bind your application to an SAP HANA service and still run locally.
To simplify development, bind to a `destination` service instance as well, in order to easily consume the required destination that points to the AI Core deployments.
Under development
If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcome.
Please have a look at CONTRIBUTING.md
for additional info.
See LICENSE