Question Collections, Monitoring, and Testing Best Practices

The end-to-end workflow from monitoring to testing:

  1. Identify issues: Use the monitoring dashboard to spot problematic classifications
  2. Investigate scope: Click "View All Questions" to see comprehensive question history
  3. Select related questions: Use filters and multi-select to identify affected questions
  4. Create test collections: Group questions into themed collections for follow-up testing
  5. Validate fixes: Use collections for systematic verification after issue resolution

Example scenario:

  • Monitoring shows 15 questions classified as "Authentication Issues"
  • Click into the classification to see all affected questions
  • Multi-select the questions that should be resolved by your upcoming fix
  • Create an "Auth Fix Validation - March 2024" collection
  • Schedule regular tests on this collection to verify the fix

Advanced Filtering for Collection Building

Groups-based collections:

  • Filter questions by user groups (admin, UAT, training groups)
  • Create collections specific to user privilege levels
  • Test how different user types experience your assistant

Time-based collections:

  • Use date filtering to focus on recent issues
  • Build collections from specific incident time periods
  • Compare question patterns before and after changes

Classification-based collections:

  • Build collections from questions with specific classifications
  • Create validation suites for particular issue types
  • Organize testing around functional areas or problem categories

Collection Organization Best Practices

Naming Conventions

Descriptive, purposeful names:

  • ✅ "Authentication Issues - March 2024"
  • ✅ "Post-Login-Fix Validation"
  • ✅ "UAT Group Regression Tests"
  • ❌ "Test Collection 1"
  • ❌ "Random Questions"

Include context that identifies:

  • Issue type or functional area
  • Time period or version relevance
  • User group or testing scope
  • Purpose (validation, regression, exploration)

Strategic Collection Types

Issue-Specific Collections:

  • Group questions by the type of problem they represent
  • Useful for focused testing after fixes
  • Easy to schedule for regular regression testing

User-Journey Collections:

  • Organize questions that represent complete user workflows
  • Test end-to-end experiences across your assistant
  • Validate that complex interactions work as expected

Validation Collections:

  • Questions specifically chosen to verify fixes or improvements
  • Pre- and post-fix comparison sets
  • Critical path testing for important functionality

Exploratory Collections:

  • Questions that represent edge cases or unusual requests
  • Help identify new potential issues
  • Support ongoing assistant improvement efforts

Context-Aware Testing

When leveraging the full context preservation feature:

  • Test both standalone questions and context-dependent follow-ups (see the sketch after this list)
  • Verify that context is properly maintained across collection executions
  • Use this feature to build comprehensive conversation test suites
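
To make that concrete, the sketch below contrasts a standalone question with a context-dependent follow-up. The structure and field names are illustrative only, not the platform's actual data model:

```json
{
  "standalone": {
    "question": "What were total sales in Q1 2024?",
    "context": []
  },
  "follow_up": {
    "question": "How does that compare to the previous quarter?",
    "context": ["What were total sales in Q1 2024?"]
  }
}
```

The follow-up is only answerable when the preceding turn is preserved, which is exactly what full context preservation provides.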

Testing Multi-Turn Conversations

With full context preservation, you can now effectively test complex conversation flows:

Steps to Implement

  1. Engage in a multi-turn conversation with the assistant
  2. Add follow-up questions from deep in the conversation to a test collection
  3. Run tests to verify context-dependent responses work correctly

Benefits

  • Ensures conversational continuity
  • Tests real-world usage patterns
  • Validates context retention

Building Test Suites from Monitoring

Leverage monitoring insights to create targeted test collections:

Steps to Implement

  1. Identify problematic questions or patterns in monitoring
  2. Use multi-select to gather related questions
  3. Create focused test collections for specific issues
  4. Run regular tests to validate fixes

Benefits

  • Proactive issue detection
  • Systematic validation of improvements
  • Organized approach to quality assurance

Using Prompts to Build Custom Evaluations

Associating Evaluation Prompts with Question Collections

When an evaluation prompt is associated with a question collection, any variables defined within the prompt are automatically added to each question in that collection.

For example, suppose you want to verify that time periods are selected correctly across a set of questions. You could create an evaluation prompt that asserts expectations about the results, with the expected value defined as a variable. Once the prompt is associated with the question collection, that variable becomes available on each question, allowing you to specify a different expected value per question.
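
Continuing that example, the prompt body might look something like this. The double-brace syntax and the expected_period variable name are illustrative assumptions, not necessarily the platform's exact templating format:

```
Evaluate the assistant's answer to the question.
Verify that the answer uses the time period {{expected_period}}.
If the selected time period matches, return pass: true; otherwise
return pass: false with an explanation of the mismatch.
```

Once the prompt is associated with a collection, the expected_period variable appears on each question, so one question can expect "Q1 2024" while another expects "last 30 days".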

Expected Output

The final output returned by the prompt must be a JSON object matching the following schema:

explanation: string, pass: boolean
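
For instance, a passing evaluation could return:

```json
{
  "explanation": "The answer used the expected time period (Q1 2024).",
  "pass": true
}
```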

Available Context

The full chat_entry object is available to assert against. This includes answers, visualizations, timing, and more. See the AnswerRocket SDK documentation for further details.
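
As a rough sketch, an evaluation prompt could assert against chat_entry along these lines. The specific checks below are assumptions for illustration; consult the SDK documentation for the actual chat_entry fields:

```
Using the provided chat_entry, verify that:
- the answer text references the expected metric
- at least one visualization was generated

Return a JSON object with "explanation" (string) and "pass" (boolean).
```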