View the List of Anomalies in Applications' Behavior

You can track anomalies in data across your organization, such as low usage of applications or an increase in application crashes.

Aternity collects and analyzes application metrics and behavior, and informs you about anomalies by integrating a third-party machine learning engine for anomaly detection. The algorithm tracks only applications that are both managed and among the most commonly used (both conditions must apply).

The engine uses patented machine learning algorithms to discover anomalies in time series data and turns them into valuable business insights.

Automated alerts allow you to spot problems. For example, alerts about low application usage help you identify application outages, because an outage prevents employees from using their applications, so usage drops.

Get alerts for anomalies in application's usage
Be notified of, and troubleshoot, the following anomalies:
  • Application Usage Decrease
  • Application Crash Increase
Aternity tracks anomalies based on the following metrics, which indicate application usage and/or crashes:
  • Total usage time of the application for all users in the organization
  • Total volume of all activities of the application
  • Total number of application crashes

More metrics for detecting various data anomalies will be added in future releases.

Important
  • The algorithm learns the usage and crash patterns to establish a baseline. It cannot detect anomalies if the usage pattern is of low quality (that is, the algorithm could not establish a repeating pattern) or if usage of the application is too low for a drop to be significant.
  • The algorithm requires several weeks of historical data before it can start generating anomaly alerts.
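
The engine's actual algorithm is proprietary and is not described here. Purely as an illustration of what "establishing a baseline" from a repeating pattern can mean, the following minimal sketch builds a normal band per weekly time slot from several weeks of history and flags values outside it. The function names, the mean plus or minus `k` standard deviations band, the minimum of three samples, and the sample data are all assumptions made for this example, not Aternity's method.

```python
from statistics import mean, stdev

def build_baseline(history, k=3.0, min_samples=3):
    """Build a normal band per time slot (hypothetical helper).

    history maps a weekly time slot (for example "Mon 09:00") to a list of
    past values, one per week. Slots with too little history are skipped.
    """
    baseline = {}
    for slot, values in history.items():
        if len(values) < min_samples:
            continue  # not enough history to learn this slot
        m = mean(values)
        s = stdev(values)
        baseline[slot] = (m - k * s, m + k * s)
    return baseline

def is_anomalous(baseline, slot, value):
    """Return True if value falls outside the learned band for the slot."""
    band = baseline.get(slot)
    if band is None:
        return False  # no baseline for this slot, so nothing can be detected
    lower, upper = band
    return value < lower or value > upper

# Illustrative data only: usage minutes observed in the same slot over past weeks.
history = {"Mon 09:00": [100, 104, 98, 102], "Mon 03:00": [1, 0]}
baseline = build_baseline(history)
```

In this sketch a slot with too few samples gets no band at all, which mirrors the point above that anomalies cannot be detected until enough history has been collected.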

Procedure

  1. Step 1 Ensure machine learning incidents are enabled. Learn more.
  2. Step 2 Get an email alert as soon as any anomaly is detected.
    Get email alert about anomalies in app usage or crashes

    Alerts are based either on five-minute data aggregations, which allow near real-time detection, or on one-hour data aggregations, which detect anomalies that were missed at the five-minute scale.

    To prevent false alerts and excessive noise, you receive an alert only if the anomaly continues for at least 15 minutes (for five-minute aggregations) or at least 2 hours (for one-hour aggregations). The email notifies you about an Active incident and includes all relevant information, such as the nature of the anomaly, when it started, and the margins of the expected normal behavior.

    On the graph in the email, the blue band shows the margins of the normal trend, and the orange line above or below the normal boundaries shows the anomaly.

    The system also detects when the anomaly ends (app behavior is back to normal) and sends another email with the Closed alert state and with all relevant information (app name, alert type, state, incident start and end times, etc.).

  3. Step 3 Drill down into the data for further investigation.
    1. a Click Analyze This Alert in the email.
      Alternatively, open a browser and sign in to Aternity. Select Main Menu > Troubleshoot > Incidents.
      Access the dashboard that lists all incident alerts and explore them one by one
    2. b To view the details of a single incident, select that row from the list and drill down to the Low Usage Incident dashboard.
      This is the same dashboard that you access via the Analyze This Alert link in the email.

      The dashboard name changes depending on the incident type. It can be Low Usage Incident, Low Volume Incident or Crash Increase Incident.

      Drill down to troubleshoot a specific incident
    3. c See patterns of the application usage.
      Note

      You can change the view to see the patterns of Measures or Devices. This lets you see whether the drop in application usage or the increase in application crashes happens on a specific device only or on many devices.

      Browse to the open incident for further investigation

      Each dot on the graph indicates the usage volume at a five-minute scale. Each dot on the next graph indicates the volume of activities at a five-minute scale.

      You can select a different application from the drop-down menu next to the application name if you want to compare the current graph with the usage time and activity of other applications.

      Field Description

      Time

      Incident Open Time

      Displays the time when the incident started.

      Incident Close Time

      Displays the time when the incident ended.

      Usage Time

      The usage time of an application is the total time it is running, in the foreground, and being used. This includes the wait time, the time a user spends waiting for the application to respond. For web applications, the usage time is when both the browser window and the application's tab are in the foreground. Learn more.

      Active Time

      The active time of an application is the time when it is running, in the foreground, and the user is actively interacting with it (NOT waiting for it while it is busy trying to respond). It is calculated as the usage time minus the wait time. Learn more.

      Wait Time

      The wait time of a Windows application is defined as the time users spend waiting for the application to respond when it is actively running and in use (part of the usage time). Learn more.

      Activity Volume

      Application Owners can use this to view an application's usage patterns, where it is used, why and by whom. For upgrades or new deployments, you can show its adoption, or compare the productivity between different locations. When you understand the usage patterns and volumes, you can better plan for future releases, by investing in the most popular aspects of the application.

      Activity Response

      (For managed applications only) Displays the response time of the activity. The response times of activities are split into client time (light blue), and the combination or union of the backend time (dark blue) and the network time (blue).

      Number of Crashes

      Displays the number of crashes during the dashboard's timeframe.

      Number of Devices

      Displays the number of devices which were actively running an application in the foreground, during the dashboard's timeframe. The same user could access the application on more than one device.

    4. d Apply the required Timeframe to zoom in on the incident.
  4. Step 4 Get an email alert once the incident has been closed (the behavior is back to normal).

    If necessary, drill down into the data for further investigation by clicking the Analyze This Alert link in the email.
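
The persistence rule described in Step 2 (an alert is sent only if the anomaly lasts at least 15 minutes on five-minute aggregations, or 2 hours on one-hour aggregations) can be sketched as follows. This is a minimal illustration under stated assumptions, not Aternity's implementation: the function name, the input format (one boolean per aggregation interval, oldest first), and the requirement that the anomalous intervals be consecutive are hypothetical.

```python
# Minimum anomaly duration before an alert is sent, in minutes,
# keyed by the aggregation interval in minutes (values taken from Step 2).
MIN_DURATION_MINUTES = {5: 15, 60: 120}

def should_alert(flags, interval_minutes):
    """Decide whether the current anomaly has lasted long enough to alert.

    flags holds one boolean per aggregation interval, oldest first, where
    True means the interval was outside the expected normal band.
    """
    required = MIN_DURATION_MINUTES[interval_minutes] // interval_minutes
    run = 0
    # Count the unbroken run of anomalous intervals ending at the latest one.
    for flag in reversed(flags):
        if not flag:
            break
        run += 1
    return run >= required
```

With these values, three consecutive anomalous five-minute intervals or two consecutive anomalous one-hour intervals are needed, so a single outlying data point does not trigger an email.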