Troubleshoot a User's Device (Troubleshoot Device)

The Troubleshoot Device dashboard displays support data about a device's recent behavior, so service teams can troubleshoot the device with minimum effort.

For example, if a user calls the support desk with an issue about their computer, you can use this dashboard as your first entry point to troubleshoot and resolve the problem. The answers to the questions you would normally ask about the device are already laid out for you: its operating system, network connection, top processes, memory and CPU usage, recent health events, applications, and even boot times. Since this is a monitored device, you can change the timeframe to view the status of the device at the time the problem occurred, which makes the dashboard a powerful troubleshooting tool.

The Troubleshoot User's Device Dashboard
Field Description
Summary bar

View a summary of this device's key information, including whether it is currently reporting data to Aternity, location, IP, operating system and last boot.

For more information, see this step below.

CPU and Memory

Look for any high percentages of usage in the history of the device's CPU, physical memory or virtual memory during the timeframe of the dashboard.

For more information, see this step below.

Top Processes

(Windows, OS X, Android only) View the processes on this device which occupy the most resources during the dashboard's timeframe.

For more information, see this step below.

Battery Level

(Windows, OS X and mobile only) View the percentage battery charge for this device at any time.

Signal Strength

(Mobile only) For mobile network connections (3G / 4G / LTE), view the signal strength to the mobile carrier, the type of phone network (like CDMA or GSM) and the name of the carrier. For Wi-Fi network connections in mobile devices, view the signal strength to the Wi-Fi network, and the name of the network (SSID).

Disk Queue Length

View the number of waiting I/O requests to read or write to the hard disk or a logical disk at a given time during the timeframe.

You can also customize the view of this section with the drop-down menu to view:

  • Disk IO Read

  • Disk IO Write

  • Network IO Read

  • Network IO Write

For more information on all these sections, see this step below.

Applications

View the list of applications (desktop and web) running on this device, and key information about their performance for end users. For more information, see the step below.

Outgoing Connections

(Physical devices only) Displays any virtual sessions opened from this physical device (which would be the front line terminal) to a VDI virtual desktop, virtual application or RDP session during the dashboard's timeframe. It also displays some details of the virtual device, and the latency time of the connection.

You can drill down to view more about that other device, like troubleshooting that device, the experience of this user over all devices, and performance changes while performing specific activities.

Connected Users

(Virtual device only) View the list of front line users connected to this virtual desktop, and the latency times during the dashboard's timeframe.

You can drill down to view more about the other device, like troubleshooting that device, the experience of this user over all devices, and performance changes while performing specific activities.

Recent Health Events

(Windows) Displays the list of hardware health events for this device during the past seven days. For more information on each health event, see Troubleshoot System, Hardware and Device Issues (Desktop Health).

(Mobile) For each monitored mobile app, it displays the list of app crashes and app errors for the past seven days.

Recent Installed Applications

(Windows only) Displays the list of applications installed or upgraded as part of an auto-update during the past seven days.

Recent Boots

(Windows only) Displays the boot date and times, and the total boot time of each boot reported during the past seven days.

The total boot time on a Windows device runs from the time the Windows logo appears until the desktop appears and all components are loaded. The Agent queries the Windows Event Log (ID 100) for the BootTime parameter, calculated as the sum of the main path boot and post boot times, located in the Diagnostics > Performance > Windows section of the log.

For more information on the boots performed on this device, see the Boot Analysis dashboard.
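
As a rough illustration of the calculation described above, the sketch below adds a main path boot time to a post boot time to get the total. The function name, the parameter names and the millisecond units are assumptions made for this example; it is not the Agent's own code.

```python
def total_boot_time_ms(main_path_boot_ms: int, post_boot_ms: int) -> int:
    """Return the total boot time as main path boot time plus post boot time."""
    if main_path_boot_ms < 0 or post_boot_ms < 0:
        raise ValueError("boot durations must be non-negative")
    return main_path_boot_ms + post_boot_ms


# Hypothetical sample values, in milliseconds.
print(total_boot_time_ms(42_000, 18_500))
```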

Procedure

  1. Open a browser and log in to Aternity.
  2. Select Main Menu > Troubleshoot > User or Device.
    Accessing Troubleshoot User or Device
  3. Select the user's device to display in the dashboard.
    Select a user's device which needs troubleshooting
    Field Description
    Enter a username or device name

    Start typing the name of the user or the hostname of a device. The system offers choices to auto-complete your text.

    Select a device

    If you chose a username, the system offers a list of devices associated with that username.

    show me

    Select if you are not sure which device to troubleshoot. The system redirects you to the Monitor User Experience dashboard so you can view the performance of all the devices of that user, and then drill down from there back to Troubleshoot Device.

  4. View a summary of this device's key information, including whether it is currently reporting data to Aternity, its location, IP, operating system, user, department and last boot.
    View a quick summary of the key support information about a device
    Field Description
    Stability Index

    (Windows only) This index is made up of:

    • Reliability Value: The reliability value (or stability index) is a Windows score (from 1 to 10) of a device's overall stability. As the number and severity of errors increase, the reliability value drops. Aternity displays the average for the previous day, or, if unavailable, it shows the most recent daily average. If a device does not report for more than a week, it is removed from the system.

    • Reliability Grade: The reliability grade of a Windows device is the colored status of its reliability value, using Aternity's standard method to derive a status.
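
The fallback rule for the displayed reliability value (previous day's average, otherwise the most recent daily average) can be sketched as follows. This is a minimal illustration only: the function name and the dictionary of daily averages are hypothetical, and the sketch does not model the removal of devices that stop reporting.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def displayed_reliability(daily_averages: Dict[date, float], today: date) -> Optional[float]:
    """Pick yesterday's daily average, else the most recent earlier one."""
    yesterday = today - timedelta(days=1)
    if yesterday in daily_averages:
        return daily_averages[yesterday]
    # No value for yesterday: fall back to the latest day before it.
    earlier = [day for day in daily_averages if day < yesterday]
    if not earlier:
        return None
    return daily_averages[max(earlier)]


# Hypothetical daily averages on the 1-10 scale.
averages = {date(2024, 5, 1): 7.2, date(2024, 5, 3): 6.8}
print(displayed_reliability(averages, date(2024, 5, 6)))
```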

    Network

    Displays the type of network connection of the device: LAN or WiFi, or for mobile devices it can be Mobile or WiFi.

    The Agent queries Microsoft Windows via the .NET NetworkInterface class to determine the type of connection.

    OS

    Displays the full name and exact version number of the operating system (OS), but does not include the service pack number, so you can check if an issue appears only on certain operating systems.

    Last Boot

    (Windows, mobile only) Displays the date and time of the last boot of the device.

    Business Location

    Displays the current location of the device, and whether or not it is connected via VPN (by checking for known VPN adapters which are operational).

    If the location is not defined in the system, it displays either the name of the city, or Off-Site or (for mobile devices) Not Mapped.

    Manufacturer

    Displays the name of the device manufacturer. For example, Dell, Lenovo, Samsung, Apple, and so on.

    Model

    Displays the name of the model of the device. For example, iPhone6.1, Dell Latitude D620, GalaxyTab8.

    IP Address

    (Windows) Displays the device's internal IP address which is used to connect to Aternity.

    Subnet

    Displays the subnet configuration of the device which is used to connect to Aternity.

    Connected User

    Displays the Windows login username of the person working with the device.

    Department

    Displays the name of the department to which the user or the device belongs.

    For further information, you can drill down from the Connected User section to any of the following dashboards:

  5. Look for any high percentages of usage in the CPU and Memory section, which displays the history of the device's CPU, physical memory or virtual memory during the timeframe of the dashboard.
    Recent usage of CPU, physical and virtual memory

    For each line in the graph (CPU, physical memory, or virtual memory), look for:

    • High CPU usage slows down the device performance, but it is often caused by only a single application.

      If you see consistently high readings, check the Top Processes section at different points on this graph to discover whether a single program is causing the high CPU usage. Also check this application on other devices, and if it causes high CPU usage there too, consider removing it from your policy.

    • High physical memory usage (above 80%) significantly slows down the system as the device issues many more data requests to its virtual memory (hard page faults).

      Each request to virtual memory is about 1000 times slower than a request within the physical memory, hence performance is hit hard. To reduce this, check the Applications section for too many heavy applications running simultaneously. If all are necessary, consider upgrading the device's RAM.

    • High virtual memory usage (above 80%) indicates the device is at risk of running out of memory, with many applications issuing multiple memory exception errors.

      Resolve this by clearing out hard disk space and possibly upgrading RAM.
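
The checks in this step can be approximated in a few lines. The sketch below is only a triage aid under stated assumptions: the 80% memory threshold comes from the text, while the CPU threshold, the `min_share` cut-off for "consistent" readings and all function names are hypothetical choices for this example.

```python
MEMORY_THRESHOLD_PCT = 80.0  # from the text: "above 80%"
CPU_THRESHOLD_PCT = 80.0     # assumption: the text gives no CPU figure


def high_usage_share(samples, threshold):
    """Return the fraction of samples strictly above the threshold."""
    if not samples:
        return 0.0
    return sum(1 for value in samples if value > threshold) / len(samples)


def triage(cpu, physical, virtual, min_share=0.5):
    """Return hints for metrics that are high in at least min_share of samples."""
    hints = []
    if high_usage_share(cpu, CPU_THRESHOLD_PCT) >= min_share:
        hints.append("CPU: check Top Processes for a single heavy process")
    if high_usage_share(physical, MEMORY_THRESHOLD_PCT) >= min_share:
        hints.append("Physical memory: check Applications for too many heavy applications")
    if high_usage_share(virtual, MEMORY_THRESHOLD_PCT) >= min_share:
        hints.append("Virtual memory: clear hard disk space, consider more RAM")
    return hints


# Hypothetical percentage samples taken across the dashboard timeframe.
for hint in triage(cpu=[10, 20, 30], physical=[85, 90, 70], virtual=[50, 60, 40]):
    print(hint)
```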

  6. In the Top Processes section, view the top Windows processes on this device which occupy the most resources during the dashboard's timeframe, updated every two minutes.

    You can view five types of measurements which consume the device's resources, by selecting the resource type in the section's drop-down menu.

    View the top resource hogs
    Note

    This section only displays data if at least one of the five measurements exceeds its threshold.

    For example, if the CPU usage threshold is 50% (default) and the total CPU usage for all the processes on the device is at 80%, the Top Processes section displays the five processes which consume the most CPU. You can also display any of the other four measurements reported at that time by selecting that measurement in the section's drop-down menu.

    Field Description Default Threshold
    CPU Utilization

    (Windows, Android only) View the processes occupying the highest CPU percentage during the timeframe, and view the maximum usage for each process. For example, when an intensive graphics application uses a high percentage of CPU for several minutes, or an application hangs.

    By default, Aternity collects top processes data if the total CPU usage of all processes on the device rises above 50%, or if the disk queue length is more than 1.

    Disk IO Read

    (Windows only) View the processes which performed the highest rate of read requests from the hard disk during the timeframe, and view the maximum read rate for each process. To look for the exact times when peaks occurred, view the graphs of the Disk IO Read section.

    For example, if a virus scanner slows performance by issuing many disk read requests, reschedule the scan to off-peak times. Alternatively, if the read rate falls to almost zero, the hard disk may be failing, or its connection to the computer may be unreliable.

    By default, Aternity collects top processes data if the total read rate from the hard disk exceeds 1 megabyte per second (MBps), or if the disk queue length is more than 1.

    Disk IO Write

    (Windows only) View the processes which performed the highest rate of write requests to the hard disk during the timeframe, and view the maximum write rate for each process.

    For example, a movie editor can perform large disk writes, slowing down the device's performance. Alternatively, if the write rate falls to almost zero, the hard disk may be failing, or its connection to the computer may be unreliable.

    By default, Aternity collects top processes data if the total write rate to the hard disk exceeds 1 megabyte per second (MBps), or if the disk queue length is more than 1.

    Physical Memory

    (Windows, Android only) View the processes which utilize the most physical RAM memory during the timeframe, and view the maximum physical memory usage for each process.

    Use this to find processes which suffer memory leakage, causing other applications to slow down.

    By default, Aternity collects top processes data if the total physical RAM usage of all processes on the device rises above 90%, or if the disk queue length is more than 1.

    Virtual Memory

    (Windows only) View the processes which utilize the most virtual memory during the timeframe, and view the maximum virtual memory usage for each process.

    High usage of virtual memory slows performance significantly, because accessing the hard disk is about 1000 times slower than accessing physical memory. To resolve this, increase the capacity of RAM on the device.

    By default, Aternity collects top processes data if the total virtual memory usage of all processes on the device rises above 90%, or if the disk queue length is more than 1.

    Tip

    Select a single point in time in the CPU and Memory section to view the processes which were occupying the most device resources at the time.
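
The collection rule described in this step can be sketched as below, using the default thresholds listed in the table (CPU above 50%, disk read or write above 1 MBps, physical or virtual memory above 90%, or a disk queue length above 1). The dictionary keys and function names are hypothetical, chosen for this illustration; they are not Aternity configuration names.

```python
# Default thresholds as stated in the table above; key names are hypothetical.
DEFAULT_THRESHOLDS = {
    "cpu_pct": 50.0,
    "disk_read_mbps": 1.0,
    "disk_write_mbps": 1.0,
    "physical_memory_pct": 90.0,
    "virtual_memory_pct": 90.0,
}
DISK_QUEUE_LIMIT = 1


def should_collect(totals, disk_queue_length):
    """Return True if any device-wide total, or the disk queue, exceeds its threshold."""
    if disk_queue_length > DISK_QUEUE_LIMIT:
        return True
    return any(totals.get(name, 0.0) > limit for name, limit in DEFAULT_THRESHOLDS.items())


def top_processes(per_process, measurement, count=5):
    """Return the names of the processes with the highest value for one measurement."""
    ranked = sorted(
        per_process.items(),
        key=lambda item: item[1].get(measurement, 0.0),
        reverse=True,
    )
    return [name for name, _ in ranked[:count]]


# Hypothetical readings: total CPU is at 80%, above the 50% default.
totals = {"cpu_pct": 80.0, "physical_memory_pct": 60.0}
processes = {
    "render.exe": {"cpu_pct": 45.0},
    "browser.exe": {"cpu_pct": 20.0},
    "scanner.exe": {"cpu_pct": 10.0},
}
if should_collect(totals, disk_queue_length=0):
    print(top_processes(processes, "cpu_pct"))
```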

  7. Use the Disk Queue Length section to view the essential I/O measurements of this device during the dashboard's timeframe.

    Use the drop-down menu on the right hand side of this section to view different key I/O measurements.

    View any I/O bottlenecks to the disk or network
    Field Description
    Disk Queue Length

    View the number of waiting I/O requests to read or write to the hard disk or a logical disk at a given time during the timeframe.

    A consistent queue for the disk indicates a bottleneck in hard disk access, which significantly degrades system performance. The cause is either excessive system demand on the disk or a hardware problem with the disk. To check whether the problem is hardware, see if the speed (the rate of reads and writes to the disk) is low by selecting Disk IO Read or Disk IO Write from this section's drop-down menu.

    Disk IO Read

    View the rate at which the device reads from the hard disk in MB per second at any given time during the timeframe.

    For example, if a virus scanner slows performance by issuing many disk read requests, reschedule the scan to off-peak times. Alternatively, if the read rate falls to almost zero, the hard disk may be failing, or its connection to the computer may be unreliable.

    Disk IO Write

    View the rate at which the device writes to the hard disk in MB per second at any given time during the timeframe.

    For example, a movie editor can perform large disk writes, slowing down the device's performance. Alternatively, if the write rate falls to almost zero, the hard disk may be failing, or its connection to the computer may be unreliable.

    Network IO Read

    View the data downloads of this device in MB per second at any given time during the timeframe.

    For example, if its throughput or usage of bandwidth is low, and the user complains of slow network connections, consider checking the NIC hardware.

    Network IO Write

    View the data uploads from this device in MB per second at any given time during the timeframe.

    For example, if its throughput or usage of bandwidth is low, and the user complains of slow network connections, consider checking the NIC hardware.
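
The reasoning for the Disk Queue Length view (a consistent queue points to a bottleneck, and low read and write rates alongside it suggest hardware) can be sketched as a simple decision function. The cut-off values `low_rate_mbps` and `min_share`, and the function name, are assumptions for this example; the text does not define what counts as "consistent" or "low".

```python
def diagnose_disk(queue_lengths, read_mbps, write_mbps, low_rate_mbps=0.1, min_share=0.8):
    """Classify disk samples as no queue, hardware suspect, or excess demand."""
    if not queue_lengths:
        return "no data"
    # Share of samples in which at least one I/O request was waiting.
    queued_share = sum(1 for q in queue_lengths if q > 0) / len(queue_lengths)
    if queued_share < min_share:
        return "no consistent queue"
    rates = list(read_mbps) + list(write_mbps)
    average_rate = sum(rates) / len(rates) if rates else 0.0
    if average_rate < low_rate_mbps:
        return "consistent queue with low throughput: possible hardware problem"
    return "consistent queue with normal throughput: likely excess demand on the disk"


# Hypothetical samples: requests keep queuing while almost nothing is read or written.
print(diagnose_disk([3, 4, 2, 5], read_mbps=[0.01, 0.0], write_mbps=[0.02, 0.0]))
```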

  8. View all the applications (Windows and web) on this device during the dashboard timeframe in the Applications section, along with their usage time, wait time, and UXI.
    All the applications which ran on this device during the timeframe
    Field Description
    Type

    Displays the type of application: Windows desktop, web application which you run through a web browser, or monitored mobile app, which has been integrated with Aternity monitoring.

    Name

    The name of the application, as specified in the Description field of the executable file's properties.

    Note

    Web Browsing is an umbrella term for all web browsing in your organization on sites which are not white listed. To white list a site, add it to the system.

    Usage Time

    The usage time of an application is the total time it is running, in the foreground, and being used. This includes the wait time, the time a user spends waiting for the application to respond. For web applications, the usage time is when both the browser window and the application's tab are in the foreground.

    UXI

    Displays the application's user experience index across all users in all locations in your enterprise. The User Experience Index (UXI) is a value (0-5) which measures the overall performance and health of an application, based on several inputs: the number of crashes per hour out of the total usage time, the percentage of hang time out of the total usage time, and the percentage of wait time out of the total usage time. For web applications, it also uses the percentage of web page errors out of all page loads, and the average page load time. These ingredients come together to represent the overall experience of a user.

    Activity Score

    (Managed applications only) Displays the overall activity score for this application, calculated by condensing all the activity statuses into a single value. Use this for acute (recent) problems in performance.

    View more information on each application by hovering your mouse over the measurements

    This table lists the fields from the hover windows in alphabetical order:

    Field Description
    Application

    Displays the name of the monitored application.

    Average Page Load Time

    (Web applications only) Displays the average time required to load the web page in a cloud application. The response times of activities are split into client time (dark blue), and the combination or union of the server time (light blue) and the network time (blue).

    Crashes per Hour of Use

    Displays the total number of application crashes which occurred, divided by the usage time. This is one of the elements used when calculating the UXI.

    Hang Time Rate

    Displays the percentage of hang time out of the total usage time. This is one of the elements used when calculating the UXI.

    Page Error Rate

    (Web applications only) Displays the percentage of web page errors (HTTP error 40x or 50x) out of all page loads in web applications. This is one of the elements used when calculating the UXI.

    Usage Time

    The usage time of an application is the total time it is running, in the foreground, and being used. This includes the wait time, the time a user spends waiting for the application to respond. For web applications, the usage time is when both the browser window and the application's tab are in the foreground.

    User Experience Index

    The User Experience Index (UXI) is a value (0-5) which measures the overall performance and health of an application, based on several inputs: the number of crashes per hour out of the total usage time, the percentage of hang time out of the total usage time, and the percentage of wait time out of the total usage time. For web applications, it also uses the percentage of web page errors out of all page loads, and the average page load time. These ingredients come together to represent the overall experience of a user.

    Wait Time

    An application's wait time is defined as the time users spend waiting for the application to respond when it is actively running and in use (part of the usage time). The total wait time is calculated as the time covered by the following components (which may overlap): the hang time when an application is not responding, or when the mouse pointer has a busy icon. For web applications, the wait time is the web page load time when both the browser window and its tab are in the foreground.

    Wait Time Percent

    Displays the percentage of wait time out of the total usage time.
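
Because the wait time components may overlap, the total is the time covered by their union, not their sum. The sketch below illustrates that idea and derives the rate fields from the hover window. It is a simplified illustration: the function names and the `(start, end)` interval representation in seconds are assumptions, and it does not attempt to compute the UXI itself, whose formula is not given here.

```python
def union_seconds(intervals):
    """Return the total time covered by possibly overlapping (start, end) intervals."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def application_rates(usage_seconds, hang_intervals, busy_intervals, crashes):
    """Derive wait time percent, hang time rate and crashes per hour of use."""
    if usage_seconds <= 0:
        raise ValueError("usage time must be positive")
    wait = union_seconds(list(hang_intervals) + list(busy_intervals))
    hang = union_seconds(hang_intervals)
    return {
        "wait_time_percent": 100.0 * wait / usage_seconds,
        "hang_time_rate": 100.0 * hang / usage_seconds,
        "crashes_per_hour": crashes / (usage_seconds / 3600.0),
    }


# Hypothetical hour of usage with overlapping hang and busy-pointer periods.
rates = application_rates(
    usage_seconds=3600,
    hang_intervals=[(0, 60), (50, 120)],
    busy_intervals=[(100, 200)],
    crashes=2,
)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```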

    For further information, you can drill down to:

  9. Jump straight to related dashboards using the quick jump menu at the top of the screen:
    Use the quick jump bar to jump to related dashboards about this device
    Field Description
    User Experience

    Jump to the Monitor User Experience dashboard.

    Activity Resource Analysis

    Jump to the Activity Resource Analysis dashboard.

    Device Details

    Jump to the Device Details dashboard for this device.

  10. To show the data for specific periods of time, use the Timeframe menu at the top of the dashboard.
    Field Description
    Timeframe

    Choose the start time of the data displayed in this dashboard.

    You can access data in this dashboard (retention) going back up to 30 days.

    This dashboard displays raw data in real time, refreshing every time you access it or whenever you manually refresh the browser page.