API Config
Overview
This area of the product lets users configure connection information and specify the Large Language Models (LLMs) they are using for both Chat Completion endpoints and Embedding endpoints.
Purpose
The purpose of the Language Model API Configuration feature is to provide AnswerRocket with the third-party model connections it needs to run a chat experience.
Prerequisites
- Obtain API keys from the provider you are using, for example OpenAI, Microsoft Azure, Google, or Anthropic.
- See the Max Release Notes for the release you are on to find out which model versions were tested with that release.
- You will need both a Chat Completion model and an Embedding model.
- For chat and completion APIs, make sure you provision enough tokens for your environment. We recommend starting with 150K TPM (tokens per minute) for the language model, then monitoring usage and adjusting from there. Actual requirements will vary by use case and number of active users.
- For OpenAI, go to OpenAI API Keys to manage your keys. See the OpenAI documentation to learn more.
- For Azure, create an Azure OpenAI resource and then create deployments for each of the models you plan to use. See the Azure documentation to learn more.
- For Gemini, enable the Generative Language API within Google Console.
- For Anthropic, use the Anthropic Console.
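To reason about the 150K TPM starting point in the prerequisites, a rough capacity estimate can help. The sketch below is only illustrative: the function name `estimate_tpm` and every input figure (active users, requests per user per minute, tokens per request, headroom factor) are assumptions made for the arithmetic, not AnswerRocket guidance or measured values.

```python
import math


def estimate_tpm(active_users: int,
                 requests_per_user_per_minute: float,
                 tokens_per_request: int,
                 headroom: float = 1.5) -> int:
    """Return a rough tokens-per-minute (TPM) budget.

    All inputs are hypothetical planning figures. Replace them with
    numbers observed while monitoring your own environment.
    """
    if active_users < 0 or requests_per_user_per_minute < 0 or tokens_per_request < 0:
        raise ValueError("inputs must be non-negative")
    if headroom < 1.0:
        raise ValueError("headroom must be at least 1.0")
    # Expected steady-state token consumption per minute.
    base = active_users * requests_per_user_per_minute * tokens_per_request
    # Pad the estimate so short bursts do not immediately hit the limit.
    return math.ceil(base * headroom)


# Example with made-up figures: 20 users, 1 request per minute each,
# 5,000 tokens per request (prompt plus response), 50% headroom.
print(estimate_tpm(20, 1.0, 5000))  # 150000
```

An estimate like this is only a starting point. Actual usage should still be monitored and the quota adjusted from there.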
Navigation Path
- Navigate to "Skill Studio".
- Select "API Configurations" from the left panel.
- Within this area, you can add a "Chat Completion" configuration and an "Embeddings" configuration.
Configuring
Add Chat Completion
- Click on the "Add" button next to "Chat Completion".
- Enter the required connection information and specify the model details.
  - Name: A name of your choosing for this entry. A common practice is to use the model name.
  - Choose your API provider, for example Azure, OpenAI, or Google Gemini.
  - Azure Specific Configuration
    - Endpoint: Provide the endpoint for the resource you have provisioned in Azure. This is a URL.
    - Key: Provide the key for the resource you have provisioned in Azure. Keep this secret.
    - Deployment Name: Provide the name of the deployment that the model was deployed under.
    - Model Name: The name of the model used for the deployment. The name must match
  - See the advanced configuration notes below for the remaining fields. Generally these should not be modified.
- At the system level, you can set the default API configuration to use for each of the following three purposes. Each of these choices can be set at the environment level or overridden within a Copilot's settings.
  - Chat - This is the model used while Max is chatting with a user. It is used to select skills and parameters and to generate responses to the user.
  - Narrative - This is the model used when Max generates a narrative artifact for the skill.
  - Evaluations - This is the model used to run evaluations on questions.
- Once all the details are entered, click on "Add" to apply the configurations.
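Before filling in the Azure fields, it can help to collect and sanity-check the values in one place. The following is a minimal sketch, not part of the product: the class `AzureChatCompletionEntry`, its `problems` method, and the sample values are hypothetical, and the check that the endpoint uses `https` is an assumption based on the usual form of Azure resource URLs.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlparse


@dataclass(frozen=True)
class AzureChatCompletionEntry:
    """Hypothetical container mirroring the Azure fields on the form."""
    name: str                     # label you choose, commonly the model name
    endpoint: str                 # URL of the resource provisioned in Azure
    key: str = field(repr=False)  # secret, so it is excluded from repr()
    deployment_name: str = ""     # deployment the model was deployed under
    model_name: str = ""          # model used for that deployment

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means nothing was flagged."""
        issues = []
        for attr in ("name", "endpoint", "key", "deployment_name", "model_name"):
            if not getattr(self, attr).strip():
                issues.append(f"{attr} is empty")
        parsed = urlparse(self.endpoint)
        # Assumption: the endpoint is expected to be an https URL.
        if self.endpoint.strip() and (parsed.scheme != "https" or not parsed.netloc):
            issues.append("endpoint is not an https URL")
        return issues


# Placeholder values for illustration only.
entry = AzureChatCompletionEntry(
    name="gpt-4o",
    endpoint="https://example-resource.openai.azure.com/",
    key="placeholder-secret",
    deployment_name="chat-prod",
    model_name="gpt-4o",
)
print(entry.problems())
```

Excluding the key from `repr()` is a small precaution so the secret does not end up in logs or tracebacks.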
Add Embeddings
- Click on the "Add" button next to "Embeddings".
- Enter the required connection information and specify the model details.
- Once all the details are entered, click on "Add" to apply the configurations.
Ensure that a default model is set for each of Chat, Completion, and Embedding. Each is required for the system to operate properly.
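The relationship between environment-level defaults and Copilot-level overrides, described under Add Chat Completion, can be pictured as a simple lookup. This is a sketch of that fallback logic only. The function names, the dictionary layout, and the configuration names are hypothetical and do not reflect how AnswerRocket stores these settings, and embedding defaults are left out for brevity.

```python
from typing import Dict, List, Optional

# The three purposes named in this document; the key spellings are illustrative.
PURPOSES = ("chat", "narrative", "evaluations")


def resolve_configuration(purpose: str,
                          environment_defaults: Dict[str, str],
                          copilot_overrides: Optional[Dict[str, str]] = None) -> str:
    """Return the name of the API configuration to use for a purpose.

    A Copilot-level override wins; otherwise the environment default applies.
    """
    if purpose not in PURPOSES:
        raise ValueError(f"unknown purpose: {purpose!r}")
    overrides = copilot_overrides or {}
    if purpose in overrides:
        return overrides[purpose]
    if purpose in environment_defaults:
        return environment_defaults[purpose]
    raise LookupError(f"no default configured for {purpose!r}")


def missing_defaults(environment_defaults: Dict[str, str]) -> List[str]:
    """List the purposes that have no environment-level default yet."""
    return [p for p in PURPOSES if p not in environment_defaults]


# Hypothetical configuration names.
env = {"chat": "advanced-model", "narrative": "fast-model"}
print(resolve_configuration("chat", env, {"chat": "sql-tuned-model"}))
print(missing_defaults(env))
```

Running a check like `missing_defaults` before go-live is one way to catch an unset default early.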
Examples and Use Cases
Cost-Efficiency and Performance Flexibility
Scenario: A user wants to use different models for different purposes to balance cost and performance.
- Chat and Narrative: The user configures a fast, low-cost model for the narrative endpoint to ensure fast response times, and a slower, more advanced model for chat to ensure a quality chat experience.
- Outcome: The system runs efficiently, with high performance for interactive components and cost savings for reports.
Testing and Development
Scenario: A user wants to test the latest models and easily switch between them for various tasks.
- Testing Latest Models: The user adds new models as they become available for testing purposes.
- Switching Models: The user can easily toggle between the currently used model and the alternative model to evaluate performance and accuracy.
- Outcome: The user can experiment with and compare different models without losing the current setup, facilitating smoother development and testing processes.
Skill Development
- Skill Development: During skill development, the user's skill code can directly access the different models configured in the system. A particular model may be very good at the task the skill writer needs, which allows the skill writer to use a custom model within their code. For example, a skill may prompt a model to generate SQL, and a model that is very good at SQL generation may not be as good at other tasks.
- Outcome: The user can leverage custom models for specific tasks within their skill.
Advanced Options/Settings
For Chat Completion models, advanced options are available to adjust the model's behavior. These options are described in detail on the OpenAI API website. Here are some key options:
- Temperature: Controls the creativity of the model. Higher values (e.g., 0.8) make the output more random, while lower values (e.g., 0.2) make it more deterministic. We recommend keeping it at 0.
- Output Tokens: Sets the maximum number of tokens to generate. Our default is 1024.
- Input Tokens: Sets the maximum number of tokens that can be sent. Our default is 10000. For production environments, we generally recommend a much higher value, which ultimately depends on your use case and on the tokens generated by Skill responses and the user conversation.
- Top P: Controls diversity via nucleus sampling. Default is 1.
- Frequency Penalty: Reduces the likelihood of repeating the same line verbatim. Default is 0.
- Presence Penalty: Increases the likelihood of introducing new topics. Default is 0.
Recommendation: Generally, it is recommended to go with the default settings provided, but specific use cases may benefit from adjustments to these parameters for increased creativity or other desired outcomes.
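For readers who want to see how these options relate to an OpenAI-style request, here is a minimal sketch. The class `AdvancedOptions` and its method names are hypothetical. The default values are the ones listed above, with Temperature set to the recommended 0. The request parameter names and value ranges follow the OpenAI Chat Completions API as commonly documented; how AnswerRocket passes these values to a provider is not stated here and is an assumption. Input Tokens is treated as a client-side limit, so it is not included in the request parameters.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class AdvancedOptions:
    """Hypothetical holder for the advanced options listed above."""
    temperature: float = 0.0        # recommended value
    output_tokens: int = 1024       # default maximum tokens to generate
    input_tokens: int = 10000       # default maximum tokens that can be sent
    top_p: float = 1.0
    frequency_penalty: float = 0.0
    presence_penalty: float = 0.0

    def validate(self) -> None:
        """Raise ValueError if a value is outside the OpenAI-documented range."""
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0 and 2")
        if not 0.0 <= self.top_p <= 1.0:
            raise ValueError("top_p must be between 0 and 1")
        for label, value in (("frequency_penalty", self.frequency_penalty),
                             ("presence_penalty", self.presence_penalty)):
            if not -2.0 <= value <= 2.0:
                raise ValueError(f"{label} must be between -2 and 2")
        if self.output_tokens <= 0 or self.input_tokens <= 0:
            raise ValueError("token limits must be positive")

    def to_openai_params(self) -> Dict[str, Union[int, float]]:
        """Map the options onto OpenAI-style request parameter names."""
        self.validate()
        return {
            "temperature": self.temperature,
            "max_tokens": self.output_tokens,
            "top_p": self.top_p,
            "frequency_penalty": self.frequency_penalty,
            "presence_penalty": self.presence_penalty,
        }


print(AdvancedOptions().to_openai_params())
```

Validating before building the request keeps an out-of-range value from reaching the provider, where it would otherwise surface as an API error.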
Best Practices
- Consistent Updates: Regularly check for new models and updates to ensure you are using the most efficient and accurate models available. Check our release pages for what we are testing.
- Testing and Validation: Before switching models in production, thoroughly test them in a controlled environment.
- Set Defaults: Ensure that a default model is set for Chat, Completion, and Embedding to avoid system malfunction.