## Algorithms to live by…

“…algorithms are not confined to mathematics alone. When you cook bread from a recipe, you’re following an algorithm. When you knit a sweater from a pattern, you’re following an algorithm. When you put a sharp edge on a piece of flint by executing a precise sequence of strikes with the end of an antler—a key step in making fine stone tools—you’re following an algorithm. Algorithms have been a part of human technology ever since the Stone Age.

* * *

In this book, we explore the idea of human algorithm design—searching for better solutions to the challenges people encounter every day. Applying the lens of computer science to everyday life has consequences at many scales. Most immediately, it offers us practical, concrete suggestions for how to solve specific problems. Optimal stopping tells us when to look and when to leap. The explore/exploit tradeoff tells us how to find the balance between trying new things and enjoying our favorites. Sorting theory tells us how (and whether) to arrange our offices. Caching theory tells us how to fill our closets. Scheduling theory tells us how to fill our time. At the next level, computer science gives us a vocabulary for understanding the deeper principles at play in each of these domains. As Carl Sagan put it, “Science is a way of thinking much more than it is a body of knowledge.” Even in cases where life is too messy for us to expect a strict numerical analysis or a ready answer, using intuitions and concepts honed on the simpler forms of these problems offers us a way to understand the key issues and make progress. […] tackling real-world tasks requires being comfortable with chance, trading off time with accuracy, and using approximations.”

…

I recall Zach Weinersmith recommending the book, and I seem to recall him mentioning when he did so that he’d put off reading it ‘because it sounded like a self-help book’ (paraphrasing). I’m not actually sure how to categorize it, but I do know that I really enjoyed it; I gave it five stars on Goodreads and added it to my list of favourite books.

The book covers a variety of decision problems and tradeoffs which people face in their everyday lives, as well as strategies for how to approach such problems and identify good solutions (if they exist). Examples include the explore/exploit tradeoff that is so often implicitly present (e.g.: ‘when to look for a new restaurant, vs. picking one you are already familiar with’, or perhaps: ‘when to spend time with friends you already know, vs. spending time trying to find new (/better?) friends’); optimal stopping rules (‘at which point do you stop looking for a romantic partner and decide that this one is the one?’ – this is perhaps a well-known problem with a well-known solution, but had you considered that you might use the same analytical framework for questions such as: ‘when to stop looking for a better parking spot and just accept that this one is probably the best one you’ll be able to find?’); sorting problems (good and bad ways of sorting, why sort, when is sorting even necessary or required, etc.); scheduling theory (how to handle task management in a good way, so that you optimize over a given constraint set – some examples from this part are included in the quotes below); satisficing vs. optimizing (heuristics, ‘when less is more’, etc.); and more. The book is mainly a computer science book, but it is also to some extent an implicitly interdisciplinary work, covering material from a variety of other areas such as statistics, game theory, behavioral economics and psychology. There is a major focus throughout on providing insights which are actionable and can actually be used by the reader, e.g. through the translation of identified solutions into heuristics which might be applied in everyday life.
The book is more pop-science-like than any book I’d have liked to read 10 years ago, and it includes more personal anecdotes than I’d prefer, but in some sense this never felt like a major issue while I was reading: a lot of interesting ideas and topics are covered, and the amount of fluff stays within acceptable limits. A related point is that the ‘fluff’ is also part of what makes the book relevant, because the authors really focus on tradeoffs and problems which are highly relevant to some potentially key aspects of most people’s lives, including their own.

Below I have added some sample quotes from the book. If you like the quotes you’ll like the book; it’s full of this kind of stuff. I definitely recommend it to anyone remotely interested in decision theory and related topics.

…

“…one of the deepest truths of machine learning is that, in fact, it’s not always better to use a more complex model, one that takes a greater number of factors into account. And the issue is not just that the extra factors might offer diminishing returns—performing better than a simpler model, but not enough to justify the added complexity. Rather, they might make our predictions dramatically worse. […] overfitting poses a danger every time we’re dealing with noise or mismeasurement – and we almost always are. […] Many prediction algorithms […] start out by searching for the single most important factor rather than jumping to a multi-factor model. Only after finding that first factor do they look for the next most important factor to add to the model, then the next, and so on. Their models can therefore be kept from becoming overly complex simply by stopping the process short, before overfitting has had a chance to creep in. […] This kind of setup — where more time means more complexity — characterizes a lot of human endeavors. Giving yourself more time to decide about something does not necessarily mean that you’ll make a better decision. But it does guarantee that you’ll end up considering more factors, more hypotheticals, more pros and cons, and thus risk overfitting. […] The effectiveness of regularization in all kinds of machine-learning tasks suggests that we can make better decisions by deliberately thinking and doing less. If the factors we come up with first are likely to be the most important ones, then beyond a certain point thinking more about a problem is not only going to be a waste of time and effort — it will lead us to worse solutions. […] sometimes it’s not a matter of choosing between being rational and going with our first instinct. Going with our first instinct can *be* the rational solution. The more complex, unstable, and uncertain the decision, the more rational an approach that is.” (*…for more on these topics I recommend Gigerenzer*)
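The “find the single most important factor first, then stop the process short” idea from the quote can be sketched in a few lines. A minimal toy illustration, assuming we score candidate factors by their correlation with the target (the scoring rule, the factor names, and the data below are my own illustrative assumptions, not from the book):

```python
def correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def forward_select(factors, target, max_factors=2):
    """Greedily pick factors by |correlation| with the target, most
    important first. Stopping after max_factors is the 'stopping the
    process short' guard against overfitting described in the quote."""
    chosen = []
    remaining = dict(factors)
    while remaining and len(chosen) < max_factors:
        best = max(remaining,
                   key=lambda f: abs(correlation(remaining[f], target)))
        chosen.append(best)
        del remaining[best]
    return chosen

# Hypothetical data: 'hours' predicts the target perfectly,
# 'shoe_size' is strongly (spuriously) related, 'noise' is weak.
factors = {
    "hours": [1, 2, 3, 4, 5],
    "noise": [3, 1, 4, 1, 5],
    "shoe_size": [9, 7, 6, 4, 3],
}
target = [2, 4, 6, 8, 10]

print(forward_select(factors, target, max_factors=2))
# → ['hours', 'shoe_size']: the weak 'noise' factor never makes it in.
```

With `max_factors` playing the role of the complexity budget, thinking longer (raising the budget) only adds progressively weaker factors, which is exactly where overfitting would creep in.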

“If you’re concerned with minimizing maximum lateness, then the best strategy is to start with the task due soonest and work your way toward the task due last. This strategy, known as Earliest Due Date, is fairly intuitive. […] Sometimes due dates aren’t our primary concern and we just want to get stuff done: as much stuff, as quickly as possible. It turns out that translating this seemingly simple desire into an explicit scheduling metric is harder than it sounds. One approach is to take an outsider’s perspective. We’ve noted that in single-machine scheduling, nothing we do can change how long it will take us to finish all of our tasks — but if each task, for instance, represents a waiting client, then there is a way to take up as little of their collective time as possible. Imagine starting on Monday morning with a four-day project and a one-day project on your agenda. If you deliver the bigger project on Thursday afternoon (4 days elapsed) and then the small one on Friday afternoon (5 days elapsed), the clients will have waited a total of 4 + 5 = 9 days. If you reverse the order, however, you can finish the small project on Monday and the big one on Friday, with the clients waiting a total of only 1 + 5 = 6 days. It’s a full workweek for you either way, but now you’ve saved your clients three days of their combined time. Scheduling theorists call this metric the “sum of completion times.” Minimizing the sum of completion times leads to a very simple optimal algorithm called Shortest Processing Time: always do the quickest task you can. Even if you don’t have impatient clients hanging on every job, Shortest Processing Time gets things done.”
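The four-day/one-day example above is easy to check mechanically. A minimal sketch of the “sum of completion times” metric and the Shortest Processing Time rule (the function names are my own):

```python
def sum_of_completion_times(durations):
    """Sum, over all tasks, of the elapsed time at which each finishes."""
    elapsed, total = 0, 0
    for d in durations:
        elapsed += d        # this task finishes 'elapsed' days in
        total += elapsed    # its client waited that long
    return total

def shortest_processing_time(durations):
    """SPT: always do the quickest task first -- the optimal order
    for minimizing the sum of completion times."""
    return sorted(durations)

# The book's example: a four-day project and a one-day project.
print(sum_of_completion_times([4, 1]))   # big project first → 4 + 5 = 9
print(sum_of_completion_times(shortest_processing_time([4, 1])))  # → 1 + 5 = 6
```

Either order takes five days of your time; SPT just rearranges who waits, saving the clients three days of combined time, exactly as in the quote.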

“Of course, not all unfinished business is created equal. […] In scheduling, this difference of importance is captured in a variable known as weight. […] The optimal strategy for [minimizing weighted completion time] is a simple modification of Shortest Processing Time: divide the weight of each task by how long it will take to finish, and then work in order from the highest resulting importance-per-unit-time […] to the lowest. […] this strategy … offers a nice rule of thumb: only prioritize a task that takes twice as long if it’s twice as important.”
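The weighted rule translates directly into a one-line sort. A small sketch (the task data and names are hypothetical):

```python
def weighted_spt(tasks):
    """Weighted Shortest Processing Time: order tasks by
    importance-per-unit-time (weight / duration), highest first."""
    return sorted(tasks,
                  key=lambda t: t["weight"] / t["duration"],
                  reverse=True)

# Hypothetical to-do list: weight = importance, duration = days needed.
tasks = [
    {"name": "email",       "weight": 1, "duration": 1},  # density 1.0
    {"name": "big project", "weight": 2, "duration": 4},  # density 0.5
    {"name": "report",      "weight": 4, "duration": 2},  # density 2.0
]

print([t["name"] for t in weighted_spt(tasks)])
# → ['report', 'email', 'big project']
```

Note how the rule of thumb falls out of the division: the big project takes four times as long as the email but is only twice as important, so it drops below it.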

“So far we have considered only factors that make scheduling harder. But there is one twist that can make it easier: being able to stop one task partway through and switch to another. This property, “preemption,” turns out to change the game dramatically. Minimizing maximum lateness … or the sum of completion times … both cross the line into intractability if some tasks can’t be started until a particular time. But they return to having efficient solutions once preemption is allowed. In both cases, the classic strategies — Earliest Due Date and Shortest Processing Time, respectively — remain the best, with a fairly straightforward modification. When a task’s starting time comes, compare that task to the one currently under way. If you’re working by Earliest Due Date and the new task is due even sooner than the current one, switch gears; otherwise stay the course. Likewise, if you’re working by Shortest Processing Time, and the new task can be finished faster than the current one, pause to take care of it first; otherwise, continue with what you were doing.”
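Preemptive Earliest Due Date is easy to simulate: re-pick the task with the earliest due date at every time step, and switching happens automatically whenever a more urgent task is released. A toy sketch assuming integer time units (the simulation structure and task data are my own assumptions):

```python
def preemptive_edd(tasks):
    """Simulate preemptive Earliest Due Date on tasks with release
    times. tasks: name -> {'release', 'duration', 'due'} (integers).
    Returns (name, finish_time) pairs in order of completion."""
    remaining = {name: t["duration"] for name, t in tasks.items()}
    now, finished = 0, []
    while remaining:
        ready = [n for n in remaining if tasks[n]["release"] <= now]
        if not ready:
            now += 1          # idle until the next task is released
            continue
        # Work for one time unit on the earliest-due available task;
        # re-picking every unit is what makes this preemptive.
        n = min(ready, key=lambda n: tasks[n]["due"])
        remaining[n] -= 1
        now += 1
        if remaining[n] == 0:
            del remaining[n]
            finished.append((n, now))
    return finished

tasks = {
    "long job":   {"release": 0, "duration": 4, "due": 10},
    "urgent job": {"release": 1, "duration": 1, "due": 2},
}
print(preemptive_edd(tasks))
# → [('urgent job', 2), ('long job', 5)]: the long job is interrupted
#   at time 1, the urgent job runs, then the long job resumes.
```

The preemptive Shortest Processing Time variant is the same loop with `min(ready, key=...)` picking the smallest *remaining* processing time instead of the earliest due date.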

“…even if you don’t know when tasks will begin, Earliest Due Date and Shortest Processing Time are still optimal strategies, able to guarantee you (on average) the best possible performance in the face of uncertainty. If assignments get tossed on your desk at unpredictable moments, the optimal strategy for minimizing maximum lateness is still the preemptive version of Earliest Due Date—switching to the job that just came up if it’s due sooner than the one you’re currently doing, and otherwise ignoring it. Similarly, the preemptive version of Shortest Processing Time—compare the time left to finish the current task to the time it would take to complete the new one—is still optimal for minimizing the sum of completion times. In fact, the weighted version of Shortest Processing Time is a pretty good candidate for best general-purpose scheduling strategy in the face of uncertainty. It offers a simple prescription for time management: each time a new piece of work comes in, divide its importance by the amount of time it will take to complete. If that figure is higher than for the task you’re currently doing, switch to the new one; otherwise stick with the current task. This algorithm is the closest thing that scheduling theory has to a skeleton key or Swiss Army knife, the optimal strategy not just for one flavor of problem but for many. Under certain assumptions it minimizes not just the sum of weighted completion times, as we might expect, but also the sum of the weights of the late jobs and the sum of the weighted lateness of those jobs.”
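The “skeleton key” prescription in that last quote is a single comparison. A minimal sketch of the decision rule (function and parameter names are my own):

```python
def should_switch(current_weight, current_time_left,
                  new_weight, new_duration):
    """Preemptive weighted Shortest Processing Time: when new work
    arrives, switch iff its importance-per-unit-time beats that of
    the current task, measured against the time *left* on it."""
    return new_weight / new_duration > current_weight / current_time_left

# Low-importance task underway (weight 1, 4 hours left); a new task
# arrives that is three times as important and takes 2 hours:
print(should_switch(1, 4, 3, 2))   # → True: drop what you're doing
print(should_switch(3, 2, 1, 4))   # → False: stick with the current task
```

Using the remaining time (rather than the original duration) means a nearly finished task gets ever harder to displace, which matches the intuition that you shouldn't abandon work at the ninety-percent mark.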

“…preemption isn’t free. Every time you switch tasks, you pay a price, known in computer science as a context switch. When a computer processor shifts its attention away from a given program, there’s always a certain amount of necessary overhead. […] It’s metawork. Every context switch is wasted time. Humans clearly have context-switching costs too. […] Part of what makes real-time scheduling so complex and interesting is that it is fundamentally a negotiation between two principles that aren’t fully compatible. These two principles are called responsiveness and throughput: how quickly you can respond to things, and how much you can get done overall. […] Establishing a minimum amount of time to spend on any one task helps to prevent a commitment to responsiveness from obliterating throughput […] The moral is that you should try to stay on a single task as long as possible without decreasing your responsiveness below the minimum acceptable limit. Decide how responsive you need to be — and then, if you want to get things done, be no more responsive than that. If you find yourself doing a lot of context switching because you’re tackling a heterogeneous collection of short tasks, you can also employ another idea from computer science: “interrupt coalescing.” If you have five credit card bills, for instance, don’t pay them as they arrive; take care of them all in one go when the fifth bill comes.”
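The credit-card-bill trick amounts to batching interrupts instead of servicing them one by one. A toy sketch of the coalescing idea (the batching-by-count policy mirrors the “when the fifth bill comes” example; a real system might also flush on a timer):

```python
def coalesce(events, batch_size=5):
    """Interrupt coalescing: group arriving events into batches so
    each batch costs only one context switch instead of batch_size."""
    batches, pending = [], []
    for event in events:
        pending.append(event)
        if len(pending) == batch_size:   # the 'fifth bill' arrives
            batches.append(pending)
            pending = []
    if pending:                          # handle any leftover events
        batches.append(pending)
    return batches

bills = ["bill1", "bill2", "bill3", "bill4", "bill5", "bill6", "bill7"]
print(coalesce(bills, batch_size=5))
# 7 interruptions become 2 sit-downs:
# → [['bill1', ..., 'bill5'], ['bill6', 'bill7']]
```

Seven separate interruptions collapse into two sittings, trading a little responsiveness (bills wait until the batch fills) for throughput, exactly the negotiation the quote describes.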