Based on review of the NYT Bestseller “Everybody Lies” by Seth Stephens-Davidowitz
By Gaurav Sood
Do you have an inkling that perhaps the world doesn’t operate based on the rules that you apply to yourself and those around you? That maybe there are deeper and subtler drivers behind people’s contemporary behaviours, the formation of cultures, and societal norms, some of which can be traced with more certainty to a specific historical event?
Maybe your job requires management of a political event, speech, press release, or economic policy change. Would you want a statistical basis for your strategy and a way to verify it had the intended impact on your audience? How can you be one step ahead of social trends in real-time to evaluate, understand and predict the implications of your strategy?
These and many other curiosities about the human psyche and our world are explored in the book ‘Everybody Lies’ by Seth Stephens-Davidowitz. A Harvard-trained economist, former Google data scientist and New York Times columnist, Seth argues that much of what we thought about people has been dead wrong. The reason? People lie – to doctors, lovers, friends and themselves.
There is a range of well-documented cognitive biases that most of us fall prey to and that researchers have to account for. However, without sufficient data to prove that a bias exists, we remain dependent on antiquated survey methods that suffer from many of the biases inherent in in-person reporting. One contributing cause of a ‘lie’ is social desirability bias: respondents shift the severity or frequency of a response into a socially acceptable range, or blatantly respond in a way that increases their desirability and connection with the interviewer.
Take the example of a startup that improves care management for the elderly by creating a holistic wellness score from a range of biometric, psychological, movement and outcome data. In an initial group survey of executives to assess sales, each executive said they could see the value, liked the technology and would use it. However, at the end of the meeting one executive, who happened to be male, said that while he liked it, he was pretty fit and healthy and would probably not buy or use it. Slowly, each of the executives around the room came up with a variation of this reason and dropped out of buying the product. After the meeting, a female executive approached the startup founder directly and said that despite being healthy, she would like to continue the trial and purchase the product. How often does this happen? A lot, from what we can understand.
Seth examines a range of ‘new’ data that either never existed before or was never available for research purposes, and which can redefine the way we develop hypotheses about the world around us. He uses data from Google, Facebook, Wikipedia, IRS, StormFront, Microsoft, PornHub and others to investigate a range of human behaviours and psychology. The data science tools and techniques Seth applies are well established. What is ingenious is how he tests several questions by triangulating datasets to separate correlation from causation. He shares which approaches he took and what didn’t work. For those who may venture down this path, Seth’s personal site shares the data and notes for aspiring data scientists and companies that may want to evaluate his models and build upon these insights.
Applying this to healthcare opens many avenues. Let us touch on four key insights.
Human Computer Language Interfaces
Seth looked at Obama’s presidential speech and corresponding Google search activity to see what type of messaging activated desired behaviour from the US population. This has immediate application to the design of text and voice-based human computer interfaces (HCI) used in patient engagement systems that are being developed to create patient activation and behaviour change. The promise of these newly available data sources is to understand effective messaging strategies across all types of communication channels and industries. In health it may lead to harnessing their power for better A/B testing in AI chatbots being used by insurers and providers to improve personalised care and risk stratification (e.g. Boundlss).
Predictive Performance Models
Health providers, policy makers, insurers and doctors face a major challenge in measuring and assessing the efficacy of treatment. There are many variables that influence the clinical decisions taken at any point in time. Even when treatment and outcomes improve in a specific patient cohort, it remains difficult to disseminate that information so that it becomes institutionalised and can then be applied in a data-driven (probabilistic) way to optimise treatment for another patient based on how closely they match an underlying cohort.
Seth delves into the field of sabermetrics, which has used statistical analysis to understand, optimise and predict the performance of baseball players since the days of Earnshaw Cook in the 1950s. The advancements in this field have been possible due to the end-to-end capture of baseball game and player information that can be harnessed by analytical models. PECOTA, developed by Nate Silver, is a relatively recent addition to this field. It aims to forecast a baseball player’s performance using nearest neighbour analysis, essentially by finding the player’s “doppelgänger”.
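The nearest neighbour idea behind a “doppelgänger” forecast can be illustrated with a small sketch. This is not PECOTA’s actual method, which is not described in this review. It is a minimal example under stated assumptions: the feature choice (age and resting heart rate), the outcome score and all of the records are made up for illustration, and the function names are hypothetical.

```python
import math

def nearest_neighbours(target, candidates, k=3):
    """Return the k candidates whose feature vectors are closest to
    `target`, using plain Euclidean distance."""
    def distance(features):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(target, features)))
    return sorted(candidates, key=lambda c: distance(c["features"]))[:k]

def forecast(target, candidates, k=3):
    """Forecast an outcome for `target` as the mean outcome of its
    k nearest neighbours (its "doppelgängers")."""
    neighbours = nearest_neighbours(target, candidates, k)
    return sum(n["outcome"] for n in neighbours) / len(neighbours)

# Made-up historical records: features are (age, resting heart rate),
# outcome is a hypothetical next-year wellness score between 0 and 1.
history = [
    {"features": (30.0, 60.0), "outcome": 0.80},
    {"features": (32.0, 62.0), "outcome": 0.70},
    {"features": (55.0, 80.0), "outcome": 0.40},
    {"features": (58.0, 85.0), "outcome": 0.30},
]

# Forecast for a new person from their two closest matches.
prediction = forecast((31.0, 61.0), history, k=2)
print(prediction)
```

In practice the features would need to be scaled to comparable ranges before computing distances, and the choice of k and of the features themselves would need validating against held-out data.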
With the increasing capture of human data through wearables, mobile applications, EMRs and HIEs, PECOTA offers a great case study in applying predictive models to healthcare optimisation. The opportunities in healthcare, and in education and skills development for career planning, are tremendous.
The potential for healthcare to apply similar data science methods relies on ‘collaborative big data’. Healthcare and research suffer the curse of siloed systems and varied coding standards that prevent interoperable data exchange. Furthermore, a lack of clear partnering and commercialisation incentives often dissuades companies from developing collaborations, which prevents healthcare from moving at Internet speed.
Seth’s foray into mining new internet datasets for insights into human behaviour is an exemplar for health organisations that want to better understand their customers and patients. By shedding light on the sources, depth and applications of these data, Everybody Lies may propel more managers to develop collaborative data partnerships that lead to improvements in value-based care throughout the patient care journey.
While the basic revelations on human behaviour and insecurities unravelled in Everybody Lies are to some degree universal, the research is primarily focused on data from the USA. Thus, there is an exciting possibility for researchers to conduct studies in their own countries. For countries with public health systems, as well as emerging and developing countries, I envisage that insights from internet data can enable better design of policy and systems than traditional survey methods allow. This may lead to better population health management and measurement as internet-based technologies gradually replace traditional polling techniques.
In closing, many internet companies are already doing A/B testing to build products based on experiments. However, the quality of those experiments and the speed at which useful products are designed can only be improved by a deeper and more intuitive understanding of human nature.
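For readers unfamiliar with how an A/B test is evaluated, the sketch below shows one common approach, a two-proportion z-test. It is an illustration only and is not taken from the book: the function name and the counts are hypothetical, and real experiments also need decisions about sample size and stopping rules made in advance.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    conv_a, conv_b: number of users who took the desired action
    n_a, n_b: number of users shown each variant
    Returns the z statistic and the two-sided p-value.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B perform equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up counts: message A engaged 100 of 1000 users, message B 150 of 1000.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(z, p)
```

A small p-value suggests the difference between the two messages is unlikely to be chance alone. It does not say why one message worked better, which is where the understanding of human nature comes in.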
On a personal note, if you are wondering how this applies to you and those around you going through different life challenges, here are three take-aways:
Seth believes his book may raise the standard of rigour in the social sciences, which in future will rest on hard, validated, data-driven methods. Everybody Lies manages to unravel micro-trends through statistical approaches that were only considered at a macro level in the past. Inspired by Freakonomics and a long list of leaders in research, economics and data science, he distils years of collaboration and research into an exciting page-turner suited to the curious. Everybody Lies deftly highlights the potential of data science to better unravel human behaviour across industries. No doubt this is important in healthcare, where it is great data that drives decisions.
We certainly look forward to his sequel, tentatively titled “Everybody (Still) Lies”.