In your computer science algorithms course, you learned about
space-time tradeoffs. An algorithm that requires lots of time can
often be changed to take less time but more space. A wide range of
performance optimizations work this way, from caching to
memoization to loop unrolling.
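Memoization is the plainest example of that trade: spend memory on a table of results so that repeated work becomes a lookup. Here is a minimal sketch in C; the table size and names are illustrative, not taken from any particular library.

    /* Memoized Fibonacci: trade a table of results (space) for
     * fewer recursive calls (time). Zero-initialized statics mark
     * entries that have not been computed yet. */
    #include <stdint.h>
    #include <stdio.h>

    #define CACHE_SIZE 94   /* fib(93) is the largest value that fits in a uint64_t */
    static uint64_t cache[CACHE_SIZE];

    static uint64_t fib(unsigned n)   /* assumes n < CACHE_SIZE */
    {
        if (n < 2) return n;
        if (cache[n] == 0)            /* not computed yet */
            cache[n] = fib(n - 1) + fib(n - 2);
        return cache[n];
    }

    int main(void)
    {
        printf("%llu\n", (unsigned long long)fib(90));
        return 0;
    }

The table makes the recursion cheap, but it also sits in memory for the life of the program whether or not it is ever consulted again.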
But the "tradeoff" is a lie. Space is time.
Every use of space incurs a time cost. In
your theory class, the time cost of space is swept under the big-O
rug. On
your big-iron machine running a single computational workload or
CPU benchmark, the time cost of space is small compared to the
other time costs involved. But in the real world of consumer-grade
devices, with limited memory and power, the time cost of space is
tremendous. A performance optimization that tries to trade more
space for less time often ends up requiring more time and more
space.
The gcc compiler uses a garbage collector to manage
its memory. To save time, gcc does not even start to
collect any garbage until its memory size is quite large.
"That's fine", you might say, "my new
machine has gigabytes of memory". But the kernel needs a
big chunk of memory just to keep track of the rest of the
memory. And you're running a web browser, and an email client, and
an IDE, and music and chat and clock and search and sync and backup
and everything else you didn't have a decade or two ago.
And your build system runs multiple
gcc commands in parallel because your machine has
multiple cores. Now your memory capacity isn't so big after all,
the system starts paging to disk, your compiler's performance
falls off a cliff, and the web browser turns sluggish
too. In this memory-constrained environment, trying to use less
time (skipping GC) and more space (accumulating garbage) has
backfired badly.
Space is time. An optimization that is faster on a
well-endowed device may be much slower everywhere else. Assume
your customer's machines have less memory than yours, and design
and test accordingly.
At one modern extreme, the iPhone has only 128 MB of
memory. Ever seen iPhone Safari "forget" a web page and
re-download it after you switched tabs or apps? The system ran out
of memory and Safari had to throw the page away. On the iPhone,
your favorite space-time tradeoff in your own program may
sacrifice the user's web page, requiring a repeat download across
a slow network. Good for your program, perhaps, but bad for the
user.
Space is time. An
optimization that makes your program faster may
make the user's system slower overall. Play well with others.
Most of Mac OS X is compiled with -Os instead of
-O3, to reduce code size. Mac OS X's memory allocator
is slower than other allocators under some workloads, because it
tries to avoid hoarding unused memory where other processes can't
use it. Mac OS X uses dynamic shared libraries exclusively, then
combines multiple shared libraries into a single shared cache,
then carefully re-processes that shared cache, all to save space
across multiple processes.
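The same size/speed choice shows up inside individual functions. As a hypothetical illustration (not Apple's code), a 256-byte lookup table makes bit counting faster per call, at the price of data that stays resident in every process that uses it; -O3-class optimization makes analogous code-size-increasing trades automatically through heavier inlining and vectorization, while -Os declines most of them.

    /* Count set bits with a precomputed table: 256 bytes of space
     * spent so each call is a handful of loads and adds instead of a loop. */
    #include <stdint.h>
    #include <stdio.h>

    static uint8_t popcount_table[256];

    static void init_table(void)
    {
        for (int i = 0; i < 256; i++)
            popcount_table[i] = (uint8_t)((i & 1) + popcount_table[i >> 1]);
    }

    static unsigned popcount32(uint32_t x)
    {
        return popcount_table[x & 0xff] +
               popcount_table[(x >> 8) & 0xff] +
               popcount_table[(x >> 16) & 0xff] +
               popcount_table[(x >> 24) & 0xff];
    }

    int main(void)
    {
        init_table();
        printf("%u\n", popcount32(0xF0F0F0F0u));  /* prints 16 */
        return 0;
    }

Each copy of a table like this is tiny; a whole system full of them is not.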
Many ideas for faster cross-library calls or accelerated
Objective-C method dispatch or JIT-based optimization have been
abandoned because they need too much space and do not save enough time.
CPU-focused optimization can be just as evil as the infamous
premature optimization. Space is time.