Usage
Easily diagnose and fix common performance and resource issues.
Objectives
Goals
For those who don't have much technical knowledge or understanding of computer resources:
- Allow them to quickly identify and resolve resource and performance problems, without having to dig into technical details.
- Allow them to get an overall picture of the system state, at a glance. Is everything OK, or is it on fire?
For those who have some technical knowledge, but are not domain experts:
- Allow them to understand how much of the system's capacity is being used.
- Allow them to tell whether common issues are present, such as runaway processes or swapping.
Non-goals
- Not a development tool
Use cases
- "Netflix keeps buffering! Why?" or "This web page is slow to load."
- Indicate if there's another app consuming network bandwidth.
- "Why is my computer slow?" or "What just happened to my system to make it slow down?"
- Indicate if there's a CPU/memory issue
- Show a brief history of usage to spot when a problem occurred
- "Do I have a runaway process taking all my CPU or memory?"
- "I got a warning that I'm almost out of disk space."
- Allow the user to find out what's using space and allow them to free it up.
- Cover the most typical partition schemes/layouts.
- "My battery is running out really quickly!"
- Show which apps, devices and services are using the most power.
- Show the health of the battery.
- "Do I have space on my disk?"
- Show what files are consuming space and what kind of files
- "How much spare capacity does my system have?" (CPU, memory, disk I/O)
- "Why is my computer so hot?" "Why is the fan running so much?"
- Show some thermal history/energy consumption
Constraints
Usage shouldn't have a significant impact on resource usage - so it doesn't present a distorted picture, and so it doesn't strain the system in scenarios where resources are limited. In practical terms this means:
- If there are graphs:
- They shouldn't update live (at most they should update once a second).
- They should be small.
- The area that gets redrawn as they are updated should be limited.
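As a rough illustration of the graph constraints above, here is a minimal sketch of a bounded history that accepts at most one sample per second. The `UsageHistory` class and its parameters are hypothetical names made up for this example, not part of any existing Usage code.

```python
from collections import deque


class UsageHistory:
    """Fixed-size history that accepts at most one sample per interval."""

    def __init__(self, capacity=60, min_interval=1.0):
        # A bounded deque drops the oldest sample automatically,
        # so memory use stays constant however long the app runs.
        self.samples = deque(maxlen=capacity)
        self.min_interval = min_interval
        self._last_time = None

    def add(self, timestamp, value):
        """Store a sample; return True if it was kept (graph should redraw)."""
        too_soon = (
            self._last_time is not None
            and timestamp - self._last_time < self.min_interval
        )
        if too_soon:
            return False  # skipped, so no redraw is triggered
        self.samples.append((timestamp, value))
        self._last_time = timestamp
        return True


# Hypothetical (timestamp in seconds, CPU %) readings.
history = UsageHistory(capacity=3, min_interval=1.0)
readings = [(0.0, 10), (0.4, 50), (1.0, 20), (2.0, 30), (3.0, 40)]
results = [history.add(t, v) for t, v in readings]
```

Only a `True` return would trigger a redraw, which keeps both the update rate and the amount of stored data bounded.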
ListBoxes need to be avoided if there is a lot of dynamic content that's getting updated frequently, since this can be resource intensive. TreeViews are more efficient in these scenarios and might well be a better option.
Technical notes
CPU
- Activities that are parallelised could saturate all the cores. Processes are also moved between cores to make the most of the total available CPU capacity. In that sense CPU is a single combined resource. While it is not used evenly, it can "run out" in certain situations.
- Individual cores are interesting though:
- Processes encounter limits for the core on which they are running.
- Not all CPU cores are always active, which is relevant if you want to know how much spare capacity you have.
- The system saves power when it puts CPUs into a deeper sleep state.
- At the moment we don't protect certain processes, like the shell, so the system can be affected by heavy CPU usage by apps.
- CPU % used isn't straightforward on modern Intel CPUs, since they change frequency very quickly (tens of times a second). This is why Sysprof shows CPU frequency alongside CPU % used.
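To make "CPU % used" concrete, the following sketch derives it from two readings of the aggregate `cpu` line in Linux's `/proc/stat`. The function names are assumptions made for this example, and the sample lines are invented. It treats CPU as the single combined resource described above and, as noted, it ignores frequency changes, so the figure is only part of the picture.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into (busy, total) ticks."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Field order: user nice system idle iowait irq softirq steal
    values = [int(v) for v in fields[1:9]]
    not_busy = values[3] + values[4]  # idle + iowait
    total = sum(values)
    return total - not_busy, total


def cpu_percent(before, after):
    """Percentage of time the CPUs were busy between two (busy, total) readings."""
    busy_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0  # no time elapsed between readings
    return 100.0 * busy_delta / total_delta


# Invented sample lines, taken one interval apart.
first = parse_cpu_line("cpu 100 0 50 800 50 0 0 0")
second = parse_cpu_line("cpu 160 0 80 1200 60 0 0 0")
usage = cpu_percent(first, second)
```

The same calculation applied to the per-core `cpu0`, `cpu1`, ... lines would give the per-core view, at the cost of relaxing the `fields[0]` check.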
Memory
- If processes try to use more memory than is available, the kernel's out-of-memory (OOM) killer will simply kill one of them.
- If you run out of RAM your machine will start swapping, which is very slow, so the whole system starts to lag and eventually hang.
- Generally speaking if you've used 100% of your memory and your computer is slow, it's a memory issue and you need to free some up. However, it's worth noting that RAM is used for cache which will be reallocated if needed by a process - so this usage either needs to be ignored or differentiated in some way.
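One possible way to separate cache from "real" memory pressure is sketched below, assuming Linux's `/proc/meminfo` format and its `MemAvailable` field. The helper names and the sample values are made up for illustration.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def memory_percentages(info):
    """Return (naive, effective) percentages of memory in use.

    naive counts everything that is not free, including cache;
    effective counts only memory the kernel cannot readily hand back.
    """
    total = info["MemTotal"]
    naive = 100.0 * (total - info["MemFree"]) / total
    effective = 100.0 * (total - info["MemAvailable"]) / total
    return naive, effective


# Invented sample resembling a machine with a large page cache.
SAMPLE = """MemTotal:        8000000 kB
MemFree:          500000 kB
MemAvailable:    4000000 kB
Buffers:          200000 kB
Cached:          3000000 kB
"""

naive, effective = memory_percentages(parse_meminfo(SAMPLE))
```

In this invented sample the naive figure suggests the machine is nearly full, while the effective figure shows half of memory could still be made available, which is the distinction the design needs to communicate.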
Disk I/O
- Disk read/write isn't parallelizable, so processes have to compete for it. A single process can hog the disk. Examples: indexing lots of data, downloading a large file.
- Spinning disks also have the seek time issue - the time it takes to move to the correct position - which gets worse with data fragmentation. They also have to be spun up if they aren't already rotating.
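To spot a single process hogging the disk, per-process byte counters can be compared between two snapshots. The sketch below assumes each snapshot is a dict mapping a PID to `(read_bytes, write_bytes)`, as could be read from `/proc/<pid>/io` on Linux. The function names and the sample numbers are hypothetical.

```python
def io_rates(before, after, interval):
    """Bytes per second read plus written by each process between snapshots."""
    rates = {}
    for pid, (read_after, write_after) in after.items():
        if pid not in before:
            continue  # process started mid-interval; no baseline to compare
        read_before, write_before = before[pid]
        moved = (read_after - read_before) + (write_after - write_before)
        rates[pid] = moved / interval
    return rates


def busiest(rates, count=3):
    """Return the `count` processes with the highest I/O rate, busiest first."""
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)[:count]


# Invented snapshots taken two seconds apart.
before = {101: (1000, 0), 202: (5000, 5000), 303: (0, 0)}
after = {101: (1000, 2000), 202: (5000, 10_005_000), 404: (10, 10)}

rates = io_rates(before, after, interval=2.0)
top = busiest(rates, count=1)
```

Processes that appear or disappear between snapshots are ignored here for simplicity; a real implementation would need to decide how to handle them.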
Network
- Not all users will have a good understanding of network bandwidth - they might not realise that some activities/apps use a lot of bandwidth.
- Network issues are also related to the available bandwidth.
Power
- Power usage is related to CPU usage - if an app is using a lot of CPU it is also burning through your battery.
Temperature?
- While not strictly "usage", things like the thermal history of a device, especially for handheld/mobile devices, may be worth considering.
Discussion
We need to consider locales and deployments where hardware limitations are likely to be encountered (e.g. poor quality hardware in hot/humid conditions).
How are users supposed to know that Usage exists? Is it possible to notify them when issues are detected for CPU/memory/etc, in the same way that we do for disk space?
Right now resource usage is only recorded while Usage is running. More historical data would be useful for power, network, etc.
Relevant Art
Windows
Mac
Sysprof
Deepin
Tentative Design
Comments
I have created an alternative design proposal with some details on the decisions, available for comments - RobertRoth
I think the "Disk space used/available by type of data" goal should also make it easier for people to solve disk space problems. That might be emptying the thumbnail, browser, etc. caches, or finding duplicate files (fdupes is a command-line tool that I use often for that case, but which could do with a GUI equivalent) (-- BastienNocera 2013-11-18 13:41:50)