Troubleshoot a Single Activity (Troubleshoot Activity)

The Troubleshoot Activity dashboard displays the recent trend of an activity's response time, and allows you to correlate performance with any of the multitude of attributes available.

For example, if sending emails is slow in Microsoft Outlook, use this dashboard to hunt for the common threads which correlate with the slow performance of this activity, by checking different attributes like network traffic, device type, operating system and so on. You can also check the Trends section to see recent changes over time, to find when the problem started.

The Troubleshoot Activity dashboard

The response times of activities are split into client time (light blue), and the combination or union of the backend time (dark blue) and the network time (blue).

Server, network and client time
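
As a rough illustration of how these components could add up, the following Python sketch treats the backend and network times as time intervals and counts any overlap between them only once (a union), then adds the client time. The interval representation and the function names `union_length` and `response_time` are assumptions made for this example; they are not part of Aternity.

```python
def union_length(intervals):
    """Return the total time covered by (start, end) intervals, counting overlaps once."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching: extend the current merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def response_time(client_time, backend_intervals, network_intervals):
    """Client time plus the union of backend and network intervals (hypothetical model)."""
    return client_time + union_length(backend_intervals + network_intervals)


# Hypothetical activity: 0.5s on the client, network 0-1s and 2-3s, backend 1-2s.
total = response_time(0.5, [(1.0, 2.0)], [(0.0, 1.0), (2.0, 3.0)])
```

In this model, backend and network periods that overlap are not counted twice, which is one reading of "union" in the description above.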

Procedure

  1. Open a browser and sign in to Aternity.
  2. To access this dashboard, drill down from one of the following dashboards:
  3. Check whether the problem is critical in the Activity section. View the overall activity status, the number of activities, and the average response time during the timeframe of the dashboard.

    If the activity status is major and it affects a large number of users, it needs your immediate attention.

    Check the activity status and average response time
    Field Description
    Application (only in dashboards with multiple applications)

    Displays the name of the monitored application, as it appears throughout the system. You can customize it when you add it as a managed application.

    Activity

    Displays the name of the monitored activity within the application as it appears in the dashboards.

    Activity Status

    The status of an activity compares a single response time with the recent expected (baselined) response time. The statuses, in order of increasing severity, are Normal, Minor, Major or Critical.

    Backend Time (light blue)

    Backend time is the time required by all the servers to process data on the backend, which is part of the overall response time of an activity. It starts when the last message of the client's request arrives at the target server, and ends when the server sends out the first message of its response. See backend time.

    Network Time (blue)

    Network time is the total time (union) taken for all messages to cross the network in either direction, between the client and the target server, while performing an activity. This does NOT include the time used for processing the request on the server (backend time). See network time.

    Client Time (dark blue)

    Client time is the time used by the device itself as part of an activity to process data before sending its first message request to the server and after the last message response arrives back from the server. See client time.

    Volume

    (For managed applications only) Displays the number of times that people with this combination of attributes performed activities, which adds weight to the impact of this measurement. If the same user performs the same activity twice, it counts as two.

    Score (Status)

    The activity score is a value (0-100) which summarizes the statuses of all activity response times into a single value.

  4. Find a common thread by checking if slow performance is connected to a specific attribute, like the same department, location, device type, and so on.

    Use the drop-down menu in the sections below the Activity area to display any of the available attributes.

    Select the attributes to troubleshoot the activity
    Note

    Some activities may have a slow response time even when their status is green. Use the score to measure short-term (acute) or sudden changes from regular baselined performance. For example, if a mail usually opens in 1.5s (the baseline response time), Aternity creates a minor baseline (a small departure from the baseline) and a major baseline (a significant departure). If performance is suddenly (acutely) much slower, like 5s, it would be beyond the major baseline, and therefore have a red status with a low score.

    Use the actual response times (not scores) to check the performance of chronic (long term) problems. You cannot rely on measurements based on the recent baselines, as those responses would be chronically slow for some time, thereby skewing baselines to make those times look normal. In this example, if the activity for opening mails has been 5s for several weeks, Aternity adjusts its baselines to 5s, so this now looks normal, and therefore has a green status with a good score, which is misleading.
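
    The difference between acute and chronic slowness can be sketched in a few lines of Python. This is a simplified model, not Aternity's baseline algorithm: it assumes the major baseline is a nearest-rank 97th percentile of recent samples, and the names `major_baseline` and `looks_slow` are hypothetical.

```python
def major_baseline(recent_samples, percentile=97):
    """Nearest-rank percentile of recent response times (assumed baseline model)."""
    ordered = sorted(recent_samples)
    index = min(len(ordered) - 1, (len(ordered) * percentile) // 100)
    return ordered[index]


def looks_slow(response_time, recent_samples):
    """True if the response time is beyond the major baseline of recent samples."""
    return response_time > major_baseline(recent_samples)


# Acute problem: mail usually opens in about 1.5s, then one response takes 5s.
healthy_history = [1.4, 1.5, 1.6] * 10
acute = looks_slow(5.0, healthy_history)      # beyond the baseline

# Chronic problem: responses have been 5s for weeks, so the baseline drifted to 5s.
chronic_history = [5.0] * 30
chronic = looks_slow(5.0, chronic_history)    # now within the baseline

# The actual response time still exposes the chronic problem.
average_chronic = sum(chronic_history) / len(chronic_history)
```

    In this sketch, the same 5s response is flagged against the healthy history but not against the chronically slow one, which is why the actual response times are the better measure for long-term problems.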

    Field Description
    Servers

    Check if the problem occurs for all users who connect to specific servers. For example, sending a mail might be slow for only one Microsoft Exchange server.

    Business Locations

    Check if the problem impacts users working from a specific location (also collated by Cities, States, Countries, and Regions). You can also view this information on a map in the Geographies section.

    For example, if performance is poor only for users in the office in North London (a business location), check the networking infrastructure of that specific site. But if the problem affects all the offices in London (under Cities), you can check the wider infrastructure which is common to all those locations, like a data center.

    If Aternity uses site-based location mapping, it reports the location as Off-site when the device is not connected to the Microsoft Active Directory. For legacy location mapping, if it cannot determine the location name, it reports it as Not Mapped. A mobile device with no location name reports as Off-site if it is on 3G or 4G/LTE, or Not Mapped if it is on WiFi.
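
    The reporting rules in the previous paragraph can be summarized as a small decision function. The following Python sketch is only an illustration of those rules; the function name, parameters and connection labels are assumptions, and it does not reflect how Aternity implements location mapping.

```python
CELLULAR = {"3G", "4G", "LTE"}


def reported_location(location_name=None, *, site_based=False,
                      connected_to_ad=True, is_mobile=False, connection=None):
    """Return the location label a device would report under the rules above."""
    if is_mobile and not location_name:
        # Mobile device with no location name: Off-site on cellular, Not Mapped on WiFi.
        # Assumption: any non-cellular connection is treated like WiFi.
        return "Off-site" if connection in CELLULAR else "Not Mapped"
    if site_based and not connected_to_ad:
        # Site-based mapping: not connected to Microsoft Active Directory.
        return "Off-site"
    if not location_name:
        # Legacy mapping: the location name cannot be determined.
        # Assumption: the same fallback is used for any other unnamed case.
        return "Not Mapped"
    return location_name


office = reported_location("North London")
remote = reported_location("North London", site_based=True, connected_to_ad=False)
phone_on_lte = reported_location(is_mobile=True, connection="LTE")
phone_on_wifi = reported_location(is_mobile=True, connection="WiFi")
```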

    Device Types

    Check if the problem only affects users working on specific types of devices, like only those accessing the application on a tablet.

    OS Name or Operating Systems

    Displays the generic name and version of the operating system (like MS Windows 10, MS Windows Server 2008 R2, MacOS 10.3, iOS 10 or Android 6). Use this to differentiate between different versions of an operating system.

    For example, it displays Windows 10 Pro and Windows 10 Enterprise all as MS Windows 10, or iOS 10.2 and iOS 10.3 as iOS 10.

    To view this information and the service pack version, see OS Version.

    Data Center Locations

    (Virtual deployments only) Monitor the application's performance by:

    • Data Center Locations in Aternity lists the locations of any virtual application servers (like Citrix XenApp) and VDI hypervisors (like in VMWare vSphere) which run the application. If the application is deployed both locally and virtually, one of the locations displays as Local.

    • Virtual App Servers displays the name of each virtual application server (like Citrix XenApp) running this application.

    For each item, it also displays the number of users, the usage time and wait time, the UXI, and (for managed applications only) the activity score.

    Hypervisors

    (For VDI deployments only) Displays the hypervisor name if your application is running in a virtual desktop environment, like VMWare vSphere. You can check if the drop in performance in some virtual machines (VMs) is concentrated around a specific hypervisor.

    Departments

    View the performance in the list of departments to check if the drop in performance is centered around a specific department, which can point to a configuration unique to that group of users.

    Regions

    You can optionally define a region in Aternity to group together several locations under a single label, like the geographical region of EMEA, North America or even Southern Europe, South-Western US or any other grouping you choose.

    Countries

    Displays the country of the current location of the device.

    States

    Displays the geographical state of the current location of the devices (or area, if state is not applicable).

    Cities

    Displays the city of the current location of the device.

    Versions

    Displays the version number for this application, which the Agent for End User Devices retrieves from the executable's Properties > Details.

    Geographies (Map)

    Displays the country/state/city of the current location of the device as a map.

  5. Check the Trends section for any recent changes in response times (the upper graph) or activity statuses (the middle graph) over the dashboard's timeframe.

    Try to correlate a slowdown (increase in response time) with an increase in network traffic volume (the lower graph). If an application uses resources (x% CPU or RAM) or sends x MB of network traffic during an activity, that alone does not prove the activity caused the usage. The two happen at the same time, so they are correlated (see Correlation vs. Causation). In practice, however, you can be reasonably confident that these device measurements occurred because of the activity.

    Check the evolution of the response time over the timeframe of the dashboard

    For example, if many users send emails with large attachments, this might slow performance (increase response time) of Outlook activities, as illustrated in the picture above.

    In the response time graph, check if the increase in the response time was due to a significant increase in server time (light blue), network time (blue) or client time (dark blue). For example, if you find that at certain times every day, the network time increases significantly, you can troubleshoot why there is a network slowdown at those times, and whether the problem is limited to a single location by viewing the Business Locations section of the dashboard.

    Use the volume (middle) graph to view recent changes in statuses over the timeframe, and correlate a change in statuses with a delay in the server, network or client time.

    Note

    To see the exact client, network and server times, hover over the response graph and view the values in the pop-up window.
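
    If you copy the values from the graphs into your own tooling, you can quantify how closely two trends move together. The following Python sketch computes a Pearson correlation coefficient between response times and network traffic; the sample values are invented for illustration and `pearson` is a helper written for this example, not an Aternity feature. A value near 1 means the two series rise and fall together, which indicates correlation but not causation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical hourly samples: response time in seconds, traffic in MB.
response_times = [1.5, 1.6, 3.9, 4.2, 1.7]
network_traffic = [20, 22, 95, 110, 25]
r = pearson(response_times, network_traffic)
```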

  6. For virtual deployments, check the remote display latency to verify if this delay significantly worsens the end user experience.

    In virtual environments, check whether the delay is due to the server time (light blue), network time (blue), client time (dark blue), latency time, or a combination of those parameters.

    Remote display latency adds to existing delays

    In virtual environments, the dashboard displays an additional Latency graph so you can correlate latency times with the other response times at the same time on other graphs.

    Find a pattern between the remote display delays and other trends
  7. Troubleshoot the activity response time of a single business location to determine the possible cause of a slow response time.

    Select the location you want to troubleshoot in the Business Locations section.

    Isolate the information concerning a single location

    For example, select London to view all data for that location only. By selecting different fields in the drop-down list in the other sections, you can immediately see the devices with slow response times.

    View the commonalities of the devices with long response time

    By selecting the location with a long response time (SanFran Building B), you can see in this example that the problematic devices are desktops with Microsoft Windows XP 32-bit and belong to the Sales department. You can also see at a glance that the long response time is influenced by the intense network traffic.
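
    The manual comparison in this example amounts to counting which attribute values the slow devices share. The following Python sketch shows that idea on invented data; the record layout, the attribute names and the `common_attributes` helper are assumptions for illustration only.

```python
from collections import Counter


def common_attributes(devices, threshold_seconds, attributes):
    """For each attribute, return the most frequent value among slow devices
    and the share of slow devices that have it."""
    slow = [d for d in devices if d["response_time"] > threshold_seconds]
    result = {}
    for attribute in attributes:
        counts = Counter(d[attribute] for d in slow)
        if counts:
            value, count = counts.most_common(1)[0]
            result[attribute] = (value, count / len(slow))
    return result


# Hypothetical measurements for one business location.
devices = [
    {"response_time": 6.0, "device_type": "Desktop", "os": "Windows XP", "department": "Sales"},
    {"response_time": 5.5, "device_type": "Desktop", "os": "Windows XP", "department": "Sales"},
    {"response_time": 1.2, "device_type": "Laptop", "os": "Windows 10", "department": "Finance"},
]
threads = common_attributes(devices, 3.0, ("device_type", "os", "department"))
```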

    Drill down to the Commonalities Analysis dashboard to automatically run through hundreds of options to find a common thread.

    Drill down to receive further information
    Field Description
    Time

    The time of the measurement (date, hours, minutes). For example, Nov 27, 2015 8:00 PM.

    Application (only in dashboards with multiple applications)

    Displays the name of the monitored application, as it appears throughout the system. You can customize it when you add it as a managed application.

    Activity

    Displays the name of the monitored activity within the application as it appears in the dashboards.

    Normal

    Normal refers to the status of an activity when its performance is good, since its activity response time is within the defined baseline performance of this activity. By default, a normal response time is in the fastest 90% (90th percentile) for each activity in each location. It is usually colored green. If the activity is slower than this time (in the slowest 10%), its status becomes minor, or if it is in the slowest 3% (beyond the 97th percentile), it becomes major.

    Minor

    When an activity has a status of Minor (colored yellow), it indicates that this activity response time is slightly slower than (a minor departure or deviation from) the defined baseline performance (minor activity threshold) of this activity.

    Major

    When an activity has a status of Major (colored orange), it indicates that this activity response time requires attention, as it was significantly slower than (a major departure or deviation from) the expected baseline performance (major activity threshold) of this activity.

    Critical

    An activity has a status of Critical when it is reported to Aternity as unavailable. It is colored red.
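
    The four statuses described in this table can be expressed as a simple classification. This Python sketch assumes the minor and major thresholds are already known (for example, derived from the baseline) and passes them in as plain numbers; the `classify` function and its parameters are hypothetical and only mirror the descriptions above.

```python
def classify(response_time, minor_threshold, major_threshold, available=True):
    """Map a response time to a status using the given baseline thresholds."""
    if not available:
        return "Critical"   # activity reported as unavailable
    if response_time > major_threshold:
        return "Major"      # significant departure from the baseline
    if response_time > minor_threshold:
        return "Minor"      # slight departure from the baseline
    return "Normal"         # within the baseline


# Hypothetical thresholds for an activity that normally takes about 1.5s.
statuses = [classify(t, minor_threshold=2.0, major_threshold=3.0)
            for t in (1.2, 2.5, 5.0)]
```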

  8. You can limit the display of this dashboard using the menus at the top of the window.

    Choose to see all the statuses of the activities, or focus only on the response times which need attention by choosing Exclude Normal in the Status drop-down list.

    Select the timeframe and activity statuses to display
    Field Description
    Timeframe

    You can change the start time of the data displayed in this dashboard in the Timeframe menu in the top right corner of the dashboard.

    You can access data in this dashboard (retention) going back up to seven days. This dashboard's data refreshes every five minutes.

    Status

    Select whether to display only the activities which are exceeding SLA thresholds. Select one of the following:

    • Exclude Normal displays only activities which exceed their SLA thresholds.

    • All does not exclude data.
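
    The effect of the two Status options can be pictured as a filter over the displayed activities. The following Python sketch is illustrative only; the record layout and the `filter_by_status` helper are assumptions, while the option names match the menu items above.

```python
def filter_by_status(activities, status_filter="All"):
    """Return the activities that a given Status menu option would display."""
    if status_filter == "All":
        return list(activities)
    if status_filter == "Exclude Normal":
        return [a for a in activities if a["status"] != "Normal"]
    raise ValueError(f"unknown status filter: {status_filter!r}")


# Hypothetical activity measurements.
activities = [
    {"activity": "Send Mail", "status": "Normal"},
    {"activity": "Open Mail", "status": "Major"},
    {"activity": "Search", "status": "Minor"},
]
needs_attention = filter_by_status(activities, "Exclude Normal")
```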