Live Environment Monitoring
Overview
Max provides a set of monitoring capabilities for live production environments to ensure service reliability, user adoption, performance visibility, and continuous improvement.
These capabilities include telemetry capture, user interaction analytics, issue classification, and automated distribution of performance reports.
Monitoring Scope & Objectives
- Track user interactions (questions asked, chats initiated, sessions, usage trends) to understand adoption and engagement.
- Monitor assistant performance metrics (pass rate, failure rate, classification of issues, trending errors) to identify opportunities for improvement.
- Capture telemetry and usage data (environment health, response times, error counts, resource usage) to support operations and diagnostics.
- Provide dashboards and reports for stakeholders (product, operations, support) to support decision-making and proactive maintenance.
Data Access & Performance Monitoring
Max supports both a Monitoring Dashboard for viewing performance at a glance and direct access to the underlying usage data for deeper analysis.
- The built-in dashboard provides real-time insights into question volume, pass rate, issue trends, and user engagement metrics.
- Users can also access the same data directly through Max’s analytical capabilities, allowing them to ask natural language questions and perform custom analyses on adoption, user growth, and assistant performance.
- This combination of visualization and interactive analytics enables operational transparency and supports both technical and business users in understanding system behavior.
Assistant Performance Monitoring
Max continuously evaluates assistant performance using AI-driven monitoring and classification.
- AI models automatically detect and categorize issues in user interactions, such as incomplete responses, irrelevant answers, or missing data.
- These classifications can be continuously updated to adapt to new patterns, feedback, and operational changes, ensuring that issue detection remains relevant over time.
- Performance metrics such as Pass Rate, Question Volume, and User Growth are continuously updated and can be compared over various timeframes.
- A diagnostics workflow allows stakeholders to review interactions, validate classifications, and identify opportunities for skill or data improvements.
- This ongoing AI-assisted monitoring ensures issues are identified proactively and performance remains aligned with user expectations.
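The classification workflow above can be sketched as a simple tally over labeled interactions. The records and field names below are illustrative assumptions, not Max's actual schema:

```python
from collections import Counter

# Hypothetical interaction records with AI-assigned issue labels;
# an issue of None means the interaction passed.
interactions = [
    {"id": 1, "issue": None},
    {"id": 2, "issue": "missing data"},
    {"id": 3, "issue": "incomplete response"},
    {"id": 4, "issue": "missing data"},
    {"id": 5, "issue": None},
]

# Tally issue categories to surface the trending problems
# a diagnostics reviewer would look at first.
issue_counts = Counter(
    rec["issue"] for rec in interactions if rec["issue"] is not None
)

for category, count in issue_counts.most_common():
    print(f"{category}: {count}")
```

In practice the categories themselves come from the AI classifier and evolve over time, so the tally is re-run over whatever labels are current.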
Key Metrics & Dashboards
Common metrics monitored in Max include:
- Question Volume: number of user-initiated questions or chats per day, week, or month.
- Pass Rate: percentage of interactions where the assistant successfully answered.
- Failure/Unclassified Interactions: volume and percentage of chats that were not resolved or required escalation.
- User Growth / Active Users: unique users interacting with the system during a given period.
- Response Time / Latency: average time from user prompt to assistant reply.
- Resource Usage / Errors: server load, memory/CPU usage, exceptions, and disruptions.
- Issue Classification Trends: breakdown of interaction issues by AI-generated categories such as “missing data,” “incomplete response,” or “irrelevant result.”
- Adoption Indicators: usage by role, sessions per user, time to first question, and drop-off rates.
- User Feedback: in-product user feedback is monitored and can be summarized for trends and common patterns using AI.
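As an illustration, several of these metrics can be derived directly from a raw interaction log. The records and field names below are hypothetical, not Max's actual data model:

```python
# Hypothetical log of one day's interactions.
interactions = [
    {"user": "ana",  "passed": True,  "latency_s": 2.1},
    {"user": "ben",  "passed": True,  "latency_s": 3.4},
    {"user": "ana",  "passed": False, "latency_s": 5.0},
    {"user": "cara", "passed": True,  "latency_s": 1.5},
]

# Question Volume: total interactions in the period.
question_volume = len(interactions)
# Pass Rate: share of interactions answered successfully.
pass_rate = sum(r["passed"] for r in interactions) / question_volume
# Active Users: unique users seen in the period.
active_users = len({r["user"] for r in interactions})
# Response Time: mean prompt-to-reply latency.
avg_latency = sum(r["latency_s"] for r in interactions) / question_volume

print(f"Question Volume: {question_volume}")
print(f"Pass Rate: {pass_rate:.0%}")
print(f"Active Users: {active_users}")
print(f"Avg Latency: {avg_latency:.1f}s")
```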
Dashboards in Max visualize these metrics with trend lines, period-over-period comparisons, and alerts for threshold breaches.
For more advanced exploration, Max’s analytical chat interface allows direct queries against the monitoring data, combining the power of AI-driven insights with business intelligence flexibility.
Max can be set up to generate daily reports shared through email, Slack, Microsoft Teams or the channel of your choice.
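A minimal sketch of such a daily delivery, assuming a Slack incoming webhook as the channel; the webhook URL placeholder and the helper functions are hypothetical, not part of Max:

```python
import json
import urllib.request


def build_daily_report(metrics: dict) -> dict:
    """Format a daily metrics summary as a Slack message payload."""
    lines = ["*Daily Max Report*"] + [
        f"- {name}: {value}" for name, value in metrics.items()
    ]
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url: str, payload: dict) -> None:
    """POST the payload to a Slack incoming webhook."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)


payload = build_daily_report({"Question Volume": 120, "Pass Rate": "92%"})
# post_to_slack("https://hooks.slack.com/services/...", payload)
print(payload["text"])
```

The same payload-building step would apply to email or Microsoft Teams; only the delivery call changes.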