Advogato: Runtime power saving on Linux - not all CPU use is equal

Modern processors have a variety of low power modes that can be entered while idle. These "C states" are numbered, currently ranging from C0 (a running CPU) to C4 (a deep runtime sleep state). The problem with these states is that the deeper the sleep, the longer it takes the CPU to wake up. In order to avoid excessive reductions in performance, the kernel must keep track of the processor usage pattern and avoid putting the CPU into a deep sleep mode when it's likely that it'll be needed for more work in the near future.
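As a rough illustration of that trade-off, the sketch below picks the deepest state whose wakeup cost is small compared with the expected idle period. The exit latencies, the `ratio` parameter and the function name are all made up for the example; real values are hardware-specific, and the kernel's actual heuristics are more involved.

```python
# Hypothetical exit latencies in microseconds, ordered from shallow to deep.
# These numbers are illustrative only and do not describe any real CPU.
C_STATES = [("C1", 1), ("C2", 50), ("C3", 200), ("C4", 1000)]


def pick_c_state(predicted_idle_us, states=C_STATES, ratio=10):
    """Return the deepest state whose exit latency, multiplied by `ratio`,
    still fits inside the predicted idle period.

    Falls back to the shallowest state when nothing fits.
    """
    chosen = states[0][0]
    for name, exit_latency_us in states:
        if exit_latency_us * ratio <= predicted_idle_us:
            chosen = name
    return chosen
```

With these assumed numbers, a predicted idle period of 1ms only justifies C2, while 20ms is long enough for C4.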

Traditionally, the kernel has had a fixed timer tick: a fixed number of times a second, the timer generates an interrupt, wakes the kernel and allows processes to be rescheduled. This limits the maximum amount of time the CPU can remain idle, as the timer fires even when the system is otherwise idle. A tick rate of 1000Hz is desirable for reducing latency, but it also caps the sleep period at 1ms, which is too short to make good use of the deeper C states.
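The arithmetic is easy to check directly. This sketch assumes the periodic tick is the only source of wakeups; `max_idle_ms` is a name invented for the example.

```python
def max_idle_ms(tick_hz):
    """Longest possible uninterrupted idle period, in milliseconds,
    when a periodic tick fires tick_hz times per second."""
    return 1000.0 / tick_hz


for hz in (100, 250, 1000):
    print(f"{hz}Hz tick: at most {max_idle_ms(hz)}ms asleep")
```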

Linux 2.6.21 introduced dynamic tick functionality. Now, rather than having a fixed tick interval, if the system is idle the kernel looks at all outstanding timers and schedules a wakeup in time to service the next one. This allows much longer periods of sleep without compromising latency. It only helps if there are few timer wakeups, though: the longer the CPU is going to be asleep, the deeper the sleep state that can be used.
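To make the effect concrete, here is a small simulation of one idle stretch under a dynamic tick. It is a sketch, not the kernel's implementation: it assumes the CPU is idle from time 0, that timers are the only thing that wakes it, and that handling a timer takes no time. The function name is invented for the example.

```python
def idle_sleep_periods(timer_expiries_ms, idle_end_ms):
    """Return the lengths (in ms) of the uninterrupted sleeps between
    time 0 and idle_end_ms, waking only when a pending timer expires.

    Timers that expire at the same instant share a single wakeup.
    """
    wakeups = sorted({t for t in timer_expiries_ms if 0 < t < idle_end_ms})
    periods = []
    last = 0
    for t in wakeups + [idle_end_ms]:
        periods.append(t - last)
        last = t
    return periods
```

For example, `idle_sleep_periods([250, 250, 900], 1000)` gives sleeps of 250ms, 650ms and 100ms, where a fixed 1000Hz tick would have cut the same second into 1ms slices.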

Intel have recently released Powertop, an application that tracks the causes of wakeups. These wakeups fall into two categories: kernel and userspace. Pure userspace wakeups are usually due to an application using a timer to handle some sort of trivial activity, like blinking a cursor or polling for state. There are several ways to improve this:

Just remove the timer - does it actually need to exist at all?
If you're polling for state, consider whether it would be possible to move to an event driven model. For example, right now screensavers poll the X server in order to obtain information about whether the session is idle. The X server has to keep track of this information anyway, so a simple extension could be added to notify applications when the user has been idle for a certain period of time.
Make sure you're only waking up when you need to. For instance, you might want to periodically check whether any new email has appeared. If there's no route to the internet, don't bother.
If you must use timers, try to schedule them to go off simultaneously. It's better to wake the kernel up once and do twice as much work than it is to wake it up twice. If your application uses glib, g_timeout_add_seconds() works at one-second granularity, which lets glib group timers so that they fire together, so use it if possible.
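The benefit of lining timers up can be seen by counting distinct wakeup times before and after rounding. This is only a model of the idea, not how glib implements it; the helper names and the sample expiry times are made up.

```python
def round_to_second(expiry_ms):
    """Round an expiry time in milliseconds to the nearest whole second
    (exact halves round up)."""
    return ((expiry_ms + 500) // 1000) * 1000


def wakeup_count(expiries_ms):
    """Timers that expire at the same instant need only one wakeup."""
    return len(set(expiries_ms))


# Hypothetical timers from several applications, in ms from now.
expiries = [980, 1010, 1490, 2020, 2950, 3005]
rounded = [round_to_second(t) for t in expiries]
```

Here six separate wakeups collapse into three (at 1000ms, 2000ms and 3000ms), at the cost of each timer firing up to half a second early or late.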

Things in the kernel can be trickier. Kernel wakeups may be appearing because some kernel code has set a timer, but they may also be caused by a userspace application poking the kernel, for instance by polling a device. Many of the same considerations as userspace apply here:

If you don't need especially high-precision timing, use round_jiffies(). It'll result in synchronisation of many wakeups, and reduce the overall number.
If you're being woken up by userspace, figure out why. HAL polls storage devices every couple of seconds, generating several interrupts. This is necessary because most storage devices don't notify the system when media insertion occurs. Conversely, alsa sends notification events whenever the hardware state changes, and so it's not necessary for userspace to poll. If you can provide useful information to interested parties without them having to repeatedly ask, then do it.
If your hardware is idle, then do what you can to quiesce it. The Appletouch pad can be put into a mode where it doesn't send packets until touched, but once touched will continue sending packets. Watch for a stream of contentless interrupts, and put the hardware back to sleep.
If the hardware can generate interrupts when something happens, then use them. Don't poll unnecessarily.
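The quiescing advice amounts to a small state machine: count consecutive packets that carry no data, and once there have been enough of them, tell the device to go quiet. The sketch below models that logic in plain Python. The class, the threshold of 10 and the `quiesce()` method are assumptions for illustration, not the code of any real driver.

```python
class TouchpadSketch:
    """Hypothetical model of a device that streams packets once touched
    and keeps streaming until it is told to go idle."""

    IDLE_THRESHOLD = 10  # assumed number of empty packets before quiescing

    def __init__(self):
        self.empty_run = 0           # consecutive packets with no touch data
        self.quiesced = True         # device starts out silent
        self.interrupts_handled = 0

    def on_packet(self, has_touch_data):
        """Called once per interrupt from the device."""
        self.interrupts_handled += 1
        self.quiesced = False
        if has_touch_data:
            self.empty_run = 0
            return
        self.empty_run += 1
        if self.empty_run >= self.IDLE_THRESHOLD:
            self.quiesce()

    def quiesce(self):
        # A real driver would send the hardware-specific command here.
        self.quiesced = True
        self.empty_run = 0
```

After a touch ends, the model handles ten empty packets and then goes quiet, rather than fielding contentless interrupts indefinitely.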

Fixing these issues can range from the trivial (removing unnecessary timers) to the complicated (teaching gstreamer about alsa notifications, teaching the mixer applet to listen to signals from gstreamer). It's all helpful, though. Ideally you want your processor to be averaging at least 20ms in the deepest C state before it's woken up again, which on an otherwise idle system works out to no more than about 50 wakeups a second. There's a lot of low-hanging fruit out there, and every fix improves battery life.

(This article was originally posted here)