The mistake healthcare makes when measuring engagement

Kelly Bryant

While social media may play a role in the success of any smartphone app, you probably wouldn’t rely on Facebook “likes” to decide the success or failure of a behavior change program. Yet the healthcare technology industry still uses outdated and ill-suited metrics spawned by social gaming, regardless of the intended purpose or audience of the product. That’s right: the gold standard by which we measure all smartphone apps, daily active users (DAU) and monthly active users (MAU), came about as a means of measuring the success of social games like FarmVille and Cafe World.

What is “engagement”?
DAU, which represents the average number of users who interact with a product in any way each day, is usually divided by MAU to determine the “stickiness” of a product, or how many people use the app consistently. The closer DAU is to MAU (that is, the closer the ratio is to 1), the more people are using an app on a daily basis. These metrics, along with a handful of similar numbers, are often collectively referred to as “engagement.” It doesn’t seem like an inherently flawed system, but equating app engagement with potential healthcare outcomes is highly problematic.
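To make the ratio concrete, here is a minimal sketch of the DAU/MAU calculation described above. The event format (pairs of user ID and date), the function name `stickiness`, and the sample data are all assumptions made for illustration; real analytics tools may define an "active" user or the averaging window differently.

```python
from collections import defaultdict
from datetime import date


def stickiness(events, days_in_period):
    """Return average DAU divided by MAU for one reporting period.

    events: iterable of (user_id, day) pairs, one per recorded interaction
            (an assumed log format, not a standard one).
    days_in_period: number of calendar days in the period, e.g. 30.
    """
    users_by_day = defaultdict(set)
    period_users = set()
    for user_id, day in events:
        users_by_day[day].add(user_id)   # distinct users seen on each day
        period_users.add(user_id)        # distinct users seen in the period

    if not period_users or days_in_period <= 0:
        return 0.0

    # Average daily actives over the whole period, including days with no activity.
    avg_dau = sum(len(users) for users in users_by_day.values()) / days_in_period
    mau = len(period_users)
    return avg_dau / mau


# Made-up data: "a" opens the app every day, "b" on three days, "c" once.
events = [("a", date(2024, 4, d)) for d in range(1, 31)]
events += [("b", date(2024, 4, d)) for d in (1, 2, 3)]
events += [("c", date(2024, 4, 15))]

ratio = stickiness(events, days_in_period=30)
print(f"DAU/MAU stickiness: {ratio:.2f}")
```

Note that the calculation counts any interaction at all. Nothing in it distinguishes a user who logs a meal from one who opens the app and closes it again.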

The problem with measuring “active users”
For starters, DAU and MAU were developed to gauge the success of apps that rely on advertising, according to Artem Petakov, Noom’s President and Co-Founder. Certainly, when your business model simply requires ad impressions, the quality of user engagement is irrelevant — all that counts is whether users see ads and whether they keep coming back. But if your business requires any other type of user action, engagement alone won’t tell you much. “Pure engagement is a very blunt instrument. If you could only tell me one thing about an app’s metrics, I’d probably ask you [DAU over MAU], but if I knew anything more about the product, I would ask for a more relevant business metric,” says Petakov.

A behavior change intervention, by contrast, requires more than passive engagement. You can have very engaged users who aren’t doing the things required to be healthier. “100% engagement with 10% success is a waste of resources,” says Noom’s Chief Medical Officer, Dr. Kit Farr. “10% engagement with 100% success is an efficient and successful intervention.” Dr. Farr points out that it’s often been assumed that the more interaction one has with a patient, the better the outcome. We’ve seen in our own metrics time and again that this is a fallacy, and a potentially harmful one at that.

For example, if we as software developers rely on engagement alone to tell us whether our product development is moving in the correct direction, it’s possible to move overall engagement up, even as we move quality engagement down. (Think: You implement repetitive notifications prompting people to use the app; you see more engagement, but in fact people are just getting annoyed and turning off notifications or eventually uninstalling.) But there is a greater threat, as Petakov points out. “We’re seduced by vanity metrics; they drive those in healthcare to interventions that seem really good, but don’t actually work.” While evidence-based research plays a significant role in many aspects of the healthcare industry (pharmaceuticals, care protocols), this level of scrutiny is not yet widely applied to software such as behavior change technology.

Applying the evidence-based paradigm to mHealth products
While the enormous quantities of data that apps capture could help physicians, lifestyle coaches, and developers design more effective protocols, stakeholders still count on basic engagement numbers to determine if a program is “working,” both at the individual and population-wide level. As a result, these programs are a black box; user data goes in, and outcomes (either good, bad, or neutral) come out, but as a rule, developers aren’t capturing and learning from this data. With limited resources — both human and financial — it’s harmful both to healthcare businesses and their end users to develop interventions based on vanity metrics rather than real outcomes.

It’s been suggested that health and fitness apps don’t inspire healthier behaviors. Possibly this has less to do with an inherent flaw in the medium and more to do with directing product development according to ineffective metrics. Whether you choose to purchase behavior change products from an external vendor or develop them internally, it’s imperative that product development and progress are judged by meaningful engagement metrics — not just “engagement” alone.

How does one determine what meaningful engagement is? Petakov lays out a framework for others to learn the way Noom does:

  1. Complete an outcomes-oriented randomized controlled trial (RCT). Whether the goal is to publish the results or simply learn, complete a study that will unequivocally measure patient progress toward your targeted outcomes.
  2. Collect every data point. This ought to go without saying, but far too often, behavior change programs collect information in formats that are difficult, if not impossible, to analyze in bulk (for example, open text input fields make it significantly harder to analyze meal logging data). It’s important that your system collect structured inputs that can be analyzed at scale.
  3. Analyze structured data points and work backward. Start with your most successful users. What do they have in common? What are the behaviors that actually drive health outcomes, and what’s simply noise?
  4. Create metrics that track meaningful engagement. Once you know the behaviors that matter, develop metrics that reflect those behaviors and track them religiously.
  5. Design and test micro-interventions. A/B testing is a common practice in web and mobile development, and once you know the metrics that truly matter, you can test each new change to your product.

This process can help healthcare businesses focus their software on the interventions that actually make people healthier. “If you have a limited amount of patient mindspace, you need to be able to surgically select the most necessary behaviors,” says Petakov. If you want your interventions to work, you need to be selective about doing only the things that have the highest rate of success with the greatest outcome.
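Steps 3 through 5 of the framework can be sketched in a few lines of code. This is an illustration under stated assumptions, not Noom's actual method: the field names (`meals_logged`, `app_opens`, `reached_goal`), the threshold of 7 logged meals, and all the numbers are invented, and the A/B comparison uses a standard pooled two-proportion z-test as one possible choice.

```python
import math

# Hypothetical structured records, one per study participant (step 2).
# Field names and values are invented for illustration.
users = [
    {"meals_logged": 14, "app_opens": 20, "reached_goal": True},
    {"meals_logged": 12, "app_opens": 5,  "reached_goal": True},
    {"meals_logged": 10, "app_opens": 30, "reached_goal": True},
    {"meals_logged": 2,  "app_opens": 25, "reached_goal": False},
    {"meals_logged": 1,  "app_opens": 10, "reached_goal": False},
    {"meals_logged": 3,  "app_opens": 22, "reached_goal": False},
]


def mean(values):
    return sum(values) / len(values)


def behavior_gap(users, behavior):
    """Step 3: mean count of a behavior among successful users minus the rest."""
    hit = [u[behavior] for u in users if u["reached_goal"]]
    miss = [u[behavior] for u in users if not u["reached_goal"]]
    return mean(hit) - mean(miss)


def meaningful_engagement_rate(users, behavior, threshold):
    """Step 4: share of users who perform the behavior at least `threshold` times."""
    return sum(u[behavior] >= threshold for u in users) / len(users)


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Step 5: pooled two-proportion z statistic for variant B versus variant A."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


# Work backward from outcomes: which behavior separates successful users?
for behavior in ("meals_logged", "app_opens"):
    print(behavior, round(behavior_gap(users, behavior), 2))

# Track the behavior that matters rather than raw opens.
print("meaningful engagement:", meaningful_engagement_rate(users, "meals_logged", 7))

# Compare the metric between two made-up product variants of 200 users each.
print("z statistic:", round(two_proportion_z(40, 200, 60, 200), 2))
```

In this toy data, app opens barely differ between the two groups while meal logging differs sharply, which is the kind of signal step 3 looks for. A gap like this is only a correlation. The trial in step 1 and the tests in step 5 are what show whether changing the behavior changes the outcome.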