Benartzi (and Lehrer’s) The Smarter Screen: Surprising Ways to Influence and Improve Online Behaviour


Jason Collins


January 10, 2018

The replication crisis has ruined my ability to relax while reading a book built on social psychology foundations. The rolling sequence of interesting but small-sample and possibly non-replicable findings leaves me somewhat on edge. Shlomo Benartzi’s (with Jonah Lehrer) The Smarter Screen: Surprising Ways to Influence and Improve Online Behavior (2015) is one such case.

Sure, I accept there is a non-zero probability that a 30 millisecond exposure to the Apple logo could make someone more creative than exposure to the IBM logo. Closing a menu after making my choice might make me more satisfied by giving me closure. Reading something in Comic Sans might lead me to think about it in a different way. But on net, most of these interesting results won’t hold up. Which? I don’t know.

That said, like a Malcolm Gladwell book, The Smarter Screen does have some interesting points and directed me to plenty of interesting material elsewhere. Just don’t bet your house on the parade of results being right.

The central thesis in The Smarter Screen is that since so many of our decisions are now made on screens, we should invest more time in designing these screens for better decision making. Agreed.

I saw Benartzi present about screen decision-making a few years ago, when he highlighted how some biases play out differently on screens compared to other mediums. For example, he suggested that defaults were less sticky on screens (we are quick to un-check the pre-checked box). While that particular example didn’t appear in The Smarter Screen, other examples followed a similar theme.

As a start, we read much faster on screens. Benartzi gives the example of a test with a written instruction at the front telling subjects not to answer the questions that follow. The failure rate more than doubled when subjects took the test on a computer, from around 20% to 46%, as they skipped over the instruction and answered questions they should not have answered.

People are also more truthful on screens. For instance, people report more health problems and drug use to screens. Men report fewer sexual partners, women more. We order pizza closer to our preferences (no embarrassment about those idiosyncratic tastes).

Screens can also exacerbate biases as the digital format allows for more extreme environments, such as massive ranges of products. The thousands of each type of pen on Amazon or the maze of online healthcare plans are typically not seen in stores or in hard copy.

The choice overload experienced on screens is a theme through the book, with many of Benartzi’s suggestions focused on making the choice manageable. Use categories to break up the choice. Use tournaments where small sets of comparisons are presented and the winners face off against each other (do you need to assume transitivity of preferences for this to work?). All sound suggestions worth trying.
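On the transitivity question, here is a toy sketch of my own (not from the book). It assumes the simplest version of a tournament: options are compared in pairs and winners advance. The pen names, the utility numbers and the function names are all made up for illustration. With transitive preferences the same option wins however the bracket is seeded. With a preference cycle, the "winner" depends entirely on the seeding.

```python
from itertools import permutations
from typing import Callable, List, Sequence, Set

# Hypothetical options and preferences, purely for illustration.
OPTIONS = ["pen_a", "pen_b", "pen_c"]
UTILITY = {"pen_a": 3, "pen_b": 2, "pen_c": 1}  # transitive ranking
CYCLE_WINS = {("pen_a", "pen_b"), ("pen_b", "pen_c"), ("pen_c", "pen_a")}


def transitive_prefer(a: str, b: str) -> str:
    """Pick the option with the higher (made-up) utility."""
    return a if UTILITY[a] >= UTILITY[b] else b


def cyclic_prefer(a: str, b: str) -> str:
    """Intransitive chooser: a beats b, b beats c, c beats a."""
    return a if (a, b) in CYCLE_WINS else b


def run_tournament(options: Sequence[str],
                   prefer: Callable[[str, str], str]) -> str:
    """Single elimination: compare adjacent pairs, winners advance."""
    if not options:
        raise ValueError("need at least one option")
    current: List[str] = list(options)
    while len(current) > 1:
        next_round = [prefer(current[i], current[i + 1])
                      for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            next_round.append(current[-1])  # odd one out gets a bye
        current = next_round
    return current[0]


def possible_winners(prefer: Callable[[str, str], str]) -> Set[str]:
    """Winners across every possible seeding of the bracket."""
    return {run_tournament(order, prefer) for order in permutations(OPTIONS)}


if __name__ == "__main__":
    print(possible_winners(transitive_prefer))
    print(possible_winners(cyclic_prefer))
```

So the tournament still returns an answer without transitivity, but that answer is then an artefact of the order in which the options were presented rather than a "best" option.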

One interesting complaint of Benartzi’s is about Amazon’s massive range. They have over 1,000 black roller-ball pens! An academic critiquing one of the world’s largest companies, built on offering massive choice (and with a reputation for A/B testing), seems somewhat suspect. Maybe Amazon could be even bigger? (Interestingly, after critiquing Amazon for not allowing “closure” and reducing satisfaction by suggesting similar products after purchase, Benartzi suggests Amazon already knows about this issue.)

The material on choice overload reflects Benartzi’s habit through the book of giving a relatively uncritical discussion of his preferred underlying literature. Common examples such as the jam experiment are trotted out, with no mention of the failed replications or the meta-analysis showing a mean effect of changing the number of choices of zero. Benartzi’s message that we need to test these ideas covers him to a degree, but a more sceptical reporting of the literature would have been helpful.

Some other sections have a similar shallowness. The material on subliminal advertising ignores the debates around it. Some of the cited studies have all the hallmarks of a spurious result, with multiple comparisons and effects only under specific conditions. For example, people are more likely to buy Mountain Dew if the Mountain Dew ad played at 10 times speed is preceded by an ad for a dissimilar product like a Honda. There is no effect when an ad for a (similar) Hummer is played first. Really?

Or take disfluency and the study by Adam Alter and friends. Forty students were exposed to two versions of the cognitive reflection task. A typical question in the cognitive reflection task is the following:

A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?
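For anyone who hasn't seen it before, the intuitive answer of 10 cents is wrong. Writing the ball's price as b (my notation, not the study's), the working is:

```latex
\begin{align*}
b + (b + 1.00) &= 1.10 \\
2b &= 0.10 \\
b &= 0.05
\end{align*}
```

The ball costs 5 cents and the bat $1.05. The task scores whether people override the quick answer and check it.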

The two versions differed in that one used a small light grey font that made the questions hard to read. Those exposed to the harder-to-read questions achieved higher scores. Exciting stuff.

But 16 replications involving a total of around 7,000 people found nothing (Terry Burnham discusses these replications in more detail here). Here’s how Benartzi deals with the replications:

It’s worth pointing out, however, that not every study looking at disfluent fonts gets similar results. For reasons that remain unclear, many experiments have found little to no effect when counterintuitive math problems, such as those in the CRT, are printed in hard-to-read letters. While people take longer to answer the questions, this extra time doesn’t lead to higher scores. Clearly, more research is needed.

What is Benartzi’s benchmark for accepting that a cute experimental result hasn’t stood up to further examination and that we can move on to more prospective research? Sixteen studies involving 7,000 people in total showing no effect, one study with 40 people showing a result. The jury is still out?

One feeling I had at the end of the book was that the proposed solutions were “small”. Behavioural scientists are often criticised for proposing small solutions, which is generally unfair given the low cost of many of the interventions. The return on investment can be massive. But the absence of new big ideas at the close of the book raised the question (at least for me) of where the next big result can be.

Benartzi was, of course, at the centre of one of the greatest triumphs in the application of behavioural science - the Save More Tomorrow plan he developed with Richard Thaler. Many of the other large successful applications of behavioural science rely on the same mechanism, defaults.

So when Benartzi’s closing idea is to create an app for smartphones to increase retirement saving, it feels slightly underwhelming. The app would digitally alter portraits of the user to make them look old and help relate them to their future self. The app would make saving effortless through pre-filled information and the like. Just click a button. But you first have to get people to download it. What is the marginal effect on these people already motivated enough to download the app? (Although here is some tentative evidence that at least among certain cohorts this effect is above zero.)

Other random thoughts: