Correlate an Activity with the Device's Resources (Activity Resource Analysis)

The Activity Resource Analysis dashboard shows how an activity affects the hardware and system resources of a device running Windows. For example, you can watch the effect of opening an email on the device's CPU usage. This level of detail helps you form better theories about the possible causes of a problem, so that you can perform a deeper root cause analysis (RCA) of the issue.

The Activity Resource Analysis dashboard

You can isolate a single activity reported from this (non-mobile) device at a specific time, and view all the device-related events which occurred while the activity took place.

For example, if the CPU usage of a device spikes above 80% during three occurrences of an activity, you can investigate this correlation, to determine the reason why the activity might be causing such behavior.
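As a rough illustration of this kind of correlation, the sketch below counts how many occurrences of an activity overlapped a CPU reading above 80%. The `ActivityOccurrence` and `CpuSample` types, their fields, and the sample values are hypothetical and do not represent Aternity's data model or API. In the product you perform this check visually in the dashboard.

```python
from dataclasses import dataclass


@dataclass
class ActivityOccurrence:
    name: str
    start: float  # seconds from an arbitrary reference time (hypothetical field)
    end: float


@dataclass
class CpuSample:
    time: float     # seconds from the same reference time
    percent: float  # device CPU usage at that moment


def occurrences_with_cpu_spike(occurrences, samples, threshold=80.0):
    """Return the occurrences during which at least one CPU sample exceeded the threshold."""
    return [
        occ
        for occ in occurrences
        if any(occ.start <= s.time <= occ.end and s.percent > threshold for s in samples)
    ]


# Made-up data: four occurrences of one activity and five CPU readings.
occurrences = [
    ActivityOccurrence("Open Mail", 0, 4),
    ActivityOccurrence("Open Mail", 60, 63),
    ActivityOccurrence("Open Mail", 120, 126),
    ActivityOccurrence("Open Mail", 180, 182),
]
samples = [
    CpuSample(2, 85.0),
    CpuSample(30, 20.0),
    CpuSample(61, 92.0),
    CpuSample(123, 81.5),
    CpuSample(181, 35.0),
]

spiking = occurrences_with_cpu_spike(occurrences, samples)
print(len(spiking), "of", len(occurrences), "occurrences coincided with CPU above 80%")
```

With this sample data, three of the four occurrences coincide with a spike. That makes the activity worth investigating, but it does not by itself prove that the activity caused the spikes.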

Step-by-step in Activity Resource Analysis to view the device measurements which occurred during an activity

When you first open the dashboard, it displays only two of its sections: Applications and Device Details. Select an application name to view its activities and processes, then select an activity name to view the status of each occurrence in a graph with its SLA thresholds and baselines. Finally, select a single occurrence to view the exact device measurements recorded at the time of the activity.

Procedure

  1. Step 1 Open a browser and log in to Aternity.
  2. Step 2 You can only access the Activity Resource Analysis dashboard by drilling down on a Windows device (or user of a device) from any of the following dashboards:
    The starting point of the Activity Resource Analysis dashboard has just two sections

    The dashboard opens with only two sections:

    Field Description
    Applications

    For each monitored application, view every instance of every activity from this device during the dashboard's timeframe. It displays each occurrence as a circle with its status.

    Tip

    This could display too many circles, as it displays all the activities in a single row. Try viewing each activity separately, by selecting the application name (see below).

    Device Details

    Displays the device-related events which occurred during the performance of an activity. For example, the CPU usage or the amount of data sent to the network. For descriptions of each item in the list, see below.

    Each row displays the activity score for this set of activities, and in some cases, it also shows the maximum and average measurements for that row. Each circle represents an activity which was performed at a certain time on this device. The horizontal axis is time, where the start and end times are the dashboard's timeframe.

    Activities which occurred at a specific time
    Field Description

    Green activity

    A green activity has a normal status: its response time is as expected, meaning it is faster than the minor baseline for this activity.

    Yellow activity

    A yellow activity has a minor status when its response time is slower than expected, since it passed the minor baseline for this activity.

    Orange activity

    An orange activity has a major status when its response time is significantly slower than expected, since it passed the major baseline for this activity.

    Red activity

    A red activity has a critical status, when the activity failed to respond.
    Note

    If an event or activity does not have any baselines, such as reading the hard disk, it does not have a status, and therefore its color is a shade of purple or blue.

  3. Step 3 Select an application name from the Applications section.

    The dashboard expands to reveal two new sections: Activities and Processes (for desktop applications only).

    Select an application name to display each of its activities and processes
    Field Description
    Activities

    Displays the list of activities for the selected application, and shows each occurrence of the activity during the timeframe as a circle with its status.

    Processes (desktop applications only)

    Displays the device processes and their measurements at the time of an activity which are presumably a direct consequence of the demands of the application. For example, if a device runs a presentation program and a graphics program at the same time, the presentation application may directly use 20% of the CPU resources during an activity, but due to the other program working at the same time, the overall CPU usage may be as high as 90%.

    If an application uses resources (x% of CPU or RAM) or sends x MB of network traffic during an activity, this does not prove that the activity caused that usage. The two happen at the same time, so they are correlated (see Correlation vs. Causation). However, because these measurements belong to the application's own processes, you can be reasonably confident that they occurred because of the activity.

  4. Step 4 Select an activity name to view its graph on the right-hand side, where the same occurrences are spaced out more clearly and slower responses appear above faster ones.
    View the graph of the occurrences of an activity

    Configure the graph to display the activity's various baselines and SLA thresholds using the Reference Lines menu in the top bar, to understand why each occurrence has its status: an occurrence below a baseline has one status, while an occurrence above that baseline has a different status.

    View the activities with different baselines and thresholds by selecting from the Reference Lines menu
    Field Description
    Just Measurements

    Select this to display the activities without any baselines or SLA thresholds.

    With Thresholds

    Select this to display the activities with their SLA thresholds.

    With Baselines

    Select this to display the activities with their baselines.

    With Thresholds and Baselines

    Select this to display the activities with their baselines and SLA thresholds.

    Tip

    You can view any of the device measurements in more detail by selecting one or more names of the device measurements to view their graphs on the right hand side.

    Field Description

    Minor baseline (yellow dotted line)

    The Minor baseline threshold is a response time which is slower than expected for this activity. If the response is slower than this time, its status becomes minor, because it is a minor departure from the expected performance time. If the response is faster than this time, its status is normal. The system automatically defines this threshold in a monitor, based on the standard behavior observed for that activity.

    Major baseline (red dotted line)

    The Major baseline threshold is a response time which is significantly slower than expected for this activity. If the response is slower than this time, its status becomes major, which is a call to action, because it is a major departure from the expected performance time. The system automatically defines this threshold in a monitor, based on the standard behavior observed for that activity.

    Internal SLA threshold (yellow solid line)

    The Internal threshold is the response time (in seconds) of an activity which would be your early warning, showing you are at risk of exceeding your official service level agreement (SLA) with your customers. Any response longer than this threshold is colored yellow in the SLA dashboard, as it warns you risk breaking the SLA commitment for this activity. A Power User of Aternity can configure this threshold in the activity's monitor.

    External SLA threshold (red solid line)

    The External threshold represents the maximum response time (in seconds) of an activity as defined in your official service level agreement (SLA) with your customers. Any response longer than this threshold is colored red in the SLA dashboard, since it breaks the official SLA commitment for this activity. A Power User of Aternity can configure this threshold in the activity's monitor.

  5. Step 5 Select to highlight a single occurrence of an activity and view the highlighted device measurements which occurred during that activity's performance.
    Correlate an activity with measurements for the whole device in the Device Details section

    The Device Details section highlights the measurements observed on the device at the time of the activity. These metrics apply to the entire device, so the activity may or may not be their direct cause.

    Field Description
    Boot

    Displays the total boot time of the device.

    Device Health Event

    Displays any occurrence of a device health event during the dashboard's timeframe.

    Device CPU

    View the percentage CPU utilization at a given time, measured as a percentage of the total power available. For example, if the device has four CPU cores, where one is at 100% and the others are idle, it will display a value of 25%.

    Max CPU Core Utilization

    View the utilization of the individual CPU core with the highest percentage usage at a given time. For example, if the device has four CPU cores, where one is at 100% usage and the others are idle, it will display a value of 100%.

    Disk IO Read

    View the rate at which the device reads from the hard disk in MB per second at any given time during the activity.

    For example, if a virus scanner slows performance by issuing many disk read requests, reschedule the scan to off-peak times. Alternatively, if the read rate falls to almost zero, the hard disk may be failing, or its connection to the computer may be unreliable.

    Disk IO Write

    View the rate at which the device writes to the hard disk in MB per second at any given time during the activity.

    For example, a movie editor can perform large disk writes, slowing down the device's performance. Alternatively, if the write rate falls to almost zero, the hard disk may be failing, or its connection to the computer may be unreliable.

    Disk Queue Length

    View the number of waiting I/O requests to read or write to the hard disk or a logical disk at a given time during the activity.

    A consistent queue for the disk indicates a bottleneck in hard disk access, which significantly impacts system performance. The cause can be either excessive system demands on the disk or a hardware problem with the disk. To check whether the problem is hardware, see whether the speed (rate of reads and writes to the disk) is low.

    Network IO Read

    View the data downloads of this device in MB per second at any given time during the activity.

    For example, if its throughput or usage of bandwidth is low, and the user complains of slow network connections, consider checking the NIC hardware.

    Network IO Write

    View the data uploads from this device in MB per second at any given time during the activity.

    For example, if its throughput or usage of bandwidth is low, and the user complains of slow network connections, consider checking the NIC hardware.

    Device Physical Memory

    (Windows and mobile only) Displays the percentage usage of physical RAM memory at a given time during the activity.

    Device Virtual Memory

    (Windows only) View the percentage usage of virtual memory at a given time during the activity.

    High usage of virtual memory slows performance significantly, because accessing the hard disk is 1000 times slower than accessing physical RAM. To resolve this, increase the RAM capacity of the device.

    For a stronger correlation, view the highlighted circles in the Processes section (for desktop applications only) to see the device measurements which are associated directly with this application (process) and can therefore be tightly coupled with the activity.

    Correlate an activity with measurements for just this application in the Processes section
    Field Description
    CPU

    View the percentage CPU utilization of this Windows process while it performs an activity, measured as a percentage of the total power of all CPU cores available.

    Compare this with the Device CPU readings to understand whether this application is the cause of any spike in CPU readings.

    Physical Memory

    View the percentage utilization of memory (physical RAM) for this Windows process, while it performs an activity.

    If the activity always coincides with a spike in memory consumption, this is probably the cause of slow performance.

    Virtual Memory

    View the percentage utilization of virtual memory for this Windows process, while it performs an activity.

    If the activity always coincides with a spike in memory consumption, this is probably the cause of slow performance.

  6. Step 6 To view more details of one of the measurements, hover your mouse pointer over the circle.

    For an occurrence of an activity or device measurement, you can view the following details.

    View the details of a single occurrence of an activity
    Field Description
    Activity

    Displays the name of the monitored activity within the application.

    Application

    Displays the name of the monitored application.

    Recorded At

    Displays the time of the occurrence of this activity or device measurement.

    Activity Response

    (Managed applications only) Displays the response time of the activity. The response times of activities are split into client time (dark blue), and the combination or union of the server time (light blue) and the network time (blue).

    Activity response time

    Use the actual response times (not scores) to check performance when investigating chronic (long-term) problems. You cannot rely on measurements based on recent baselines, because if responses have been slow for some time, they skew the baselines and make those slow times look normal.

    Client Time

    Displays the client time for this activity. Client time is the time used by the device itself as part of an activity to process data before sending its first message request to the server and after the last message response arrives back from the server.

    Infra Time

    Displays the infra time for this activity. Infra time is the total time spent outside the client. It starts with the first request to the server and ends when the final response arrives at the client.

    Latency

    (Virtual sessions only) Displays the remote display latency.

    Status

    Displays the status of this activity, if it has baselines defined.

    If there are no baselines, it displays in one of the shades of blue.

    You can drill down to troubleshoot further on this activity by accessing:

  7. Step 7 To limit the display to specific periods of time, or to select which of the user's devices to view, use the drop-down menus in the top bar of the dashboard.
    Select the data to display in the dashboard
    Table 1.
    Field Description
    Timeframe

    Choose the start time of the data displayed in this dashboard.

    You can access data in this dashboard (retention) going back up to three days.

    This dashboard displays raw data in real time, refreshing every time you access it or whenever you manually refresh the browser page.

    Reference Lines

    Configures the thresholds to display in the graphs on the right side of the dashboard (see the step above).

    Username / Hostname

    Displays the information for a user who performs a certain activity on one or multiple devices (except for mobile devices), or displays the information for a device which has one or multiple users perform a specific activity during the period of time selected in the Timeframe. For example, if you have a user who reads his Outlook mail on his laptop and on his desktop, you can see the data for both devices, or you can limit the display to one device. If you have a device (hostname) which has several users running the same application, you can choose to display the data regarding all the users (usernames), or for one user only.
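The status colors described in the procedure above follow a simple rule based on the response time and the two baselines. The sketch below restates that rule in code as a reading aid. The function name, the use of `None` for a failed activity or a missing baseline, and the handling of a response time exactly equal to a baseline are all assumptions, not documented Aternity behavior.

```python
from typing import Optional


def activity_status(
    response_time: Optional[float],
    minor_baseline: Optional[float],
    major_baseline: Optional[float],
) -> str:
    """Map a response time in seconds to a status, following the rules in this topic."""
    # No baselines defined: the activity has no status (shown in purple or blue).
    if minor_baseline is None or major_baseline is None:
        return "no status"
    # Assumption: None models an activity which failed to respond (red).
    if response_time is None:
        return "critical"
    # Slower than the major baseline: major (orange).
    if response_time > major_baseline:
        return "major"
    # Slower than the minor baseline: minor (yellow).
    # Assumption: a time exactly equal to a baseline counts as not having passed it.
    if response_time > minor_baseline:
        return "minor"
    # Faster than the minor baseline: normal (green).
    return "normal"


# Hypothetical baselines of 2 and 4 seconds.
for rt in (1.2, 2.5, 5.0, None):
    print(rt, "->", activity_status(rt, minor_baseline=2.0, major_baseline=4.0))
```

With these hypothetical baselines, the four response times map to normal, minor, major and critical respectively.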

Example

To troubleshoot a user complaining of slow performance reading emails:

  1. Use the Device Inventory dashboard to view the details of the user's device.

  2. Drill down to the Activity Resource Analysis dashboard to check if the slowdown is due to Microsoft Outlook or other applications.

  3. Use the default timeframe initially (48 hours) to view all the applications which have run on the device during that time. Check for yellow or orange activities. For example, you may find that Microsoft Outlook turned to major (orange) and then to minor (yellow), and around the same time, another application, BranchPortal, also became major.

  4. Use the custom option of the Timeframe drop-down menu to focus on the problematic times. The shortest interval you can choose is one hour.

  5. Select Microsoft Outlook in the Applications section to display its Activities and Processes.

  6. Look for any orange or yellow activities. If Open Inbox has an occurrence with a major status, select Open Inbox in the Activities section and view its graph on the right side of the window to see the response times and the thresholds which give the occurrences their minor or major status.

  7. Select the orange circle on the Open Inbox activity row to highlight its elements in the Processes section, showing that the CPU was overloaded at that time (its average load is 70% and its maximum load is 90%).

  8. Select CPU in the Processes section to view the graph on the right side of the dashboard. Verify the high CPU at that time, and consider what might cause this slowdown (for example, Outlook accessing large emails while a virus scanner checks each email).

  9. Hover over the status circle to see the detailed information of that particular occurrence of the activity, including the client time, server time and total response time (see above).

  10. Look at the Device Details section to see if there was high traffic on the network, or a high usage of memory at the same time as the poor performance.

  11. Perform the same steps on any other application showing a slowdown at the same time. Check other factors like high network traffic or heavy usage of the device memory caused by other applications which could influence Outlook's performance.
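When you compare the CPU reading in the Processes section with the Device CPU reading, it helps to keep the arithmetic behind the two device-level figures in mind. The sketch below reproduces the four-core example from the Device CPU and Max CPU Core Utilization descriptions, and the presentation-program example from the Processes description. The function names are hypothetical, and treating Device CPU as the plain average across cores is an assumption drawn from that four-core example.

```python
def device_cpu(core_utilizations):
    """Overall device CPU: average utilization across all cores, in percent."""
    return sum(core_utilizations) / len(core_utilizations)


def max_core_utilization(core_utilizations):
    """Utilization of the single busiest core, in percent."""
    return max(core_utilizations)


def process_share_of_device_cpu(process_cpu, device_cpu_percent):
    """Fraction of the device's CPU load used by one process.

    Both inputs are percentages of the total capacity of all cores.
    """
    if device_cpu_percent == 0:
        return 0.0
    return process_cpu / device_cpu_percent


cores = [100.0, 0.0, 0.0, 0.0]      # one busy core, three idle cores
print(device_cpu(cores))            # 25.0
print(max_core_utilization(cores))  # 100.0

# A process using 20% while the whole device is at 90%.
print(round(process_share_of_device_cpu(20.0, 90.0), 2))  # 0.22
```

A low share, as in the last line, suggests that other applications account for most of the load, so it is worth repeating the investigation for those applications before concluding that the selected application caused the spike.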