How to evaluate clinical outcomes: Three questions employers should ask vendors

Published on

May 2, 2025

July 30, 2018

At Virta Health, our values include being evidence-based and prioritizing data and science over opinion in our decision-making. But how does this apply to the data we provide employers? Here are three questions we think employers should be asking healthcare providers and vendors offering health solutions to make smarter data-driven decisions (and some examples of data that doesn’t stand up to scrutiny).

An Ocean of Products, a Drop of Evidence

From health apps to workplace wellness solutions, and even clinical care, we live in an ocean of product offerings, often with only a drop of evidence of true benefit. Did you know there are an estimated 318,000 mobile health apps, but a 2017 industry report found only 571 published studies testing these apps in trials? A 2016 review focused on apps for weight loss found only 65 publications. The workplace wellness industry has a reputation for under-delivering on its promises. Companies are not seeing a return on investment for worksite health promotion programs; that was the conclusion of two systematic reviews published in 2011 and 2014. There are also numerous examples of medical interventions assumed to be beneficial, from diagnostic screenings to surgeries, showing no benefit or even harm on closer examination. Changing medical recommendations based on more definitive evidence has been termed “medical reversal.” Numerous nutritional recommendations are in various stages of reversal, including the supposed safety of trans-fat consumption, the supposed danger of dietary cholesterol, and the misguided promotion of low-fat diets.

Defining Levels of Clinical Evidence

Evidence-based medicine has created a hierarchy that ranks evidence from clinical trials and meta-analyses as more convincing than evidence from laboratory models and case reports.

Virta’s founders had been conducting clinical trials for decades before the company began, and there are currently 27 randomized controlled trials (RCTs) and 8 meta-analyses that support aspects of our nutritional recommendations for type 2 diabetes (T2D) treatment. For our first sponsored trial, Virta and our partner Indiana University Health conducted a prospective, open-label, non-randomized, controlled trial of nearly 500 patients with T2D or pre-diabetes that combined nutritional ketosis with our technology-enabled continuous remote care. All patients have now completed two years of the five-year trial. Journals have published the first three of many anticipated peer-reviewed papers from the trial, including papers reporting one-year outcomes for diabetes status and cardiovascular markers.

Unpublished data, such as marketing claims, company white papers, and testimonials, are not considered in evidence-based medicine. But since such information is sometimes all that is available from providers and healthcare solutions, especially for new technologies, I’ll discuss it here too.

Here are three questions employers should ask providers.

#1. Are you measuring what matters?

The Streetlight Effect occurs when people only search for something where it's easiest to look, like someone who lost their keys and searches only under the streetlight. We need to be wary of latching onto a metric because it can be measured rather than because it's meaningful to health. With the availability of miniature accelerometers, measuring steps has taken off as a corporate wellness metric without much evidence of its value. A 2013 Cochrane Systematic Review found very limited evidence that pedometer programs resulted in health benefits. A 2017 systematic review found a very modest weight loss effect (less than 4 pounds). The idea that 10,000 steps a day is beneficial actually began as a 1960s Japanese marketing campaign not based on any science and is usually justified by an overly simplified and discredited “eat less, exercise more” view of human metabolism. Exercise in general, while very beneficial for health, is usually an ineffective approach to weight loss.

A key metric for Virta’s treatment is hemoglobin A1c (HbA1c), a standard metric reflective of “glucose levels over time,” where a level greater than or equal to 6.5% is considered diagnostic for diabetes. In our clinical trial, we showed a statistically significant mean decrease from 7.5% to 6.2%, with 70% of patients reaching a sub-diabetic level. (Because HbA1c isn’t perfect, we also reported another half-dozen metrics related to diabetes status that support the HbA1c result.) In addition to averages, the Virta one-year T2D outcomes paper provided a scatter-plot graphic of all patient HbA1c outcomes, as well as the statistic that 94% of patients who were on insulin were able to reduce or eliminate their insulin after one year on Virta. Our commercial contracts reflect our confidence in these outcomes, with a substantial portion of our fees at risk based on improved A1c status at one year for each patient.

Ask: “Are you measuring a metric tied to health outcomes and are you willing to report those results and stand behind them?”

#2. Have you conducted a clinical trial?

Clinical trials, conducted by medical and research professionals trained in human subjects research, differ profoundly from other means of data collection including customer surveys and records reviews. Trials require approval by an institutional review board (IRB) or ethical review board, and often require patient informed consent and advance registration—see for example our Virta-IUH trial clinicaltrials.gov registration. Among many advantages of trial data is the concept of prospective versus retrospective analysis. In a prospective analysis, the primary and secondary outcome measures are defined before the trial is conducted. In a retrospective analysis, an investigator can interrogate the data from any number of perspectives—an approach that can illuminate interesting patterns but can also become a data dredging or so-called “p-hacking” exercise looking for any good news.
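To see why unrestricted retrospective analysis is risky, it helps to do the arithmetic. The sketch below is illustrative only: it assumes the comparisons are independent, that no real effect exists, and that each uses the conventional 0.05 significance threshold. The function name is made up for this example.

```python
def chance_of_false_positive(num_comparisons: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_comparisons` independent
    comparisons looks 'significant' at threshold `alpha` when there is
    no real effect in any of them."""
    return 1.0 - (1.0 - alpha) ** num_comparisons


if __name__ == "__main__":
    # One pre-specified outcome versus a retrospective search across many.
    for k in (1, 5, 20):
        print(f"{k:>2} comparisons: {chance_of_false_positive(k):.1%}")
```

Under these assumptions, a single pre-specified outcome has a 5% chance of a spurious "finding," while a search across 20 outcomes or subgroups is more likely than not to turn up at least one. That is the practical value of registering outcome measures in advance.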

One retrospective approach is called a post-hoc subgroup analysis. For example, a workplace wellness company, in their unpublished marketing materials, retrospectively focused on members with elevated risk factors on first measurement, highlighting that “63% lowered their triglyceride (TG) levels” but made no mention of how much levels dropped or what happened to those who started with lower TG levels. Nor were any statistics provided. One feature of measuring variable biomarkers is that outliers (very high or low values) often don’t repeat—so that follow-up measurements are closer to the population average. This feature of data alone could explain the lower follow-up TG levels. Examining all enrolled participants in a prospective design with statistics would avoid this flawed analysis.
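This effect, known as regression to the mean, can be demonstrated with a small simulation. The sketch below is a hypothetical model, not the vendor's data: every number in it (the population average, the spread, the measurement noise, and the 150 mg/dL cutoff for "elevated") is an assumption chosen for illustration, and no intervention is applied at all.

```python
import random


def share_lowered_without_treatment(n_people: int = 10_000,
                                    cutoff: float = 150.0,
                                    seed: int = 0) -> float:
    """Simulate two triglyceride readings per person with NO intervention,
    then report the share of 'elevated at baseline' people whose second
    reading is lower than their first."""
    rng = random.Random(seed)
    elevated = 0
    lowered = 0
    for _ in range(n_people):
        usual_level = rng.gauss(130.0, 30.0)          # person's stable average (assumed)
        baseline = usual_level + rng.gauss(0.0, 25.0)  # first reading, with day-to-day noise
        follow_up = usual_level + rng.gauss(0.0, 25.0)  # second reading, same noise, no treatment
        if baseline >= cutoff:                          # post-hoc subgroup: elevated at baseline
            elevated += 1
            if follow_up < baseline:
                lowered += 1
    return lowered / elevated


if __name__ == "__main__":
    print(f"Share who 'lowered' their level: {share_lowered_without_treatment():.0%}")
```

In this setup, well over half of the people selected for a high first reading show a lower second reading, even though nothing was done for them. A headline such as "63% lowered their levels" therefore cannot be interpreted without a comparison group or an analysis of everyone enrolled.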

Clinical trial design is particularly challenging for studying behavioral interventions including “real world” nutrition. Pharmaceutical trials lend themselves to RCTs where patients (and even investigators) can be blinded to whether the patient is receiving the active ingredient or the placebo. Blinding is not possible with a therapy requiring behavior change. There is also a substantial risk in long-term RCTs that patient behavior will change over time with patients in the control arm self-administering aspects of the intervention if they believe it is beneficial. We therefore undertook a non-randomized design that mimics real-world patient choice for our first trial. While comparisons between intervention and control groups require greater care in interpretation, we avoided important confounders related to patient recruiting and behavior.

Clinical trials require substantial resources, commitment, and expertise but are becoming easier to administer with online tools that streamline design, IRB communications and patient consent.

Ask: “Have you conducted a clinical trial to test your product or service, and is it registered at clinicaltrials.gov with prospectively stated outcome measures?”

#3. Have you published your results?

“Sunlight is said to be the best of disinfectants.” - Louis Brandeis.

Despite its many challenges, peer review holds all scientists publishing results in journals to a higher standard because they expect that flaws in their methods, reasoning, and writing will be uncovered. Under ideal circumstances, shoddy data collection and poor statistical methods will result in rejection, lack of clarity in writing will require revision, and the final published product will stand the test of time. While the process is not always this ideal, there is no doubt that peer-reviewed publications are head and shoulders above most corporate marketing materials. For example, a population health company claimed on their website that 75% of their customers “lost between 3.2 and 33.6% of their total body weight”. But what does this really mean? Fortunately, the underlying results were published in a peer-reviewed journal. The true result was much less exciting: the overall mean weight loss at four months was a disappointing 3.23%, and only 28.6% of participants lost >5% of their body weight. By contrast, weight loss in Virta’s published T2D trial averaged 12% (~30 pounds) at one year, with 86% of patients losing >5% of their body weight.
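A toy calculation shows how a range-style headline can sit alongside unimpressive summary statistics. The eight values below are entirely hypothetical (they are not the company's data); they were chosen only so that the best-performing 75% span a wide range while few participants clear the 5% mark.

```python
from statistics import mean

# Hypothetical percent body-weight loss for eight participants
# (negative values mean the participant gained weight).
loss_pct = [-4.0, -1.0, 3.2, 3.5, 4.0, 4.5, 6.0, 33.6]


def summarize(values, threshold=5.0):
    """Return the 'headline' range for the top 75% of participants
    alongside the mean and the share losing more than `threshold` percent."""
    ordered = sorted(values)
    top_75 = ordered[len(ordered) // 4:]  # drop the worst-performing quarter
    return {
        "headline_range": (top_75[0], top_75[-1]),
        "mean_loss": mean(values),
        "share_over_threshold": sum(v > threshold for v in values) / len(values),
    }


if __name__ == "__main__":
    print(summarize(loss_pct))
```

Here the headline could truthfully say that 75% of participants lost between 3.2% and 33.6% of their body weight, yet only a quarter lost more than 5% and the upper end of the range comes from a single outlier. The mean and the share above a meaningful threshold, reported for everyone enrolled, are the numbers to ask for.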

Peer-reviewed publication does take time, usually many months. Reviewers have their own biases and blind spots. Multiple rounds of rejection, resubmission, and revision can delay publication of results by over a year in many cases. Fortunately, more journals are allowing scientists to post their submitted manuscripts online in a “preprint archive”. The most popular such service for biomedical research, biorxiv.org (pronounced bio-archive) at the Cold Spring Harbor Laboratory, now receives over 1,000 new manuscripts a month, nearly a 100-fold increase since its launch in late 2013.

Ask: “Have your results been published in a peer-reviewed journal or is a submitted manuscript available from a preprint server?”

Research at Virta is focused on providing evidence for the efficacy, safety, sustainability, clinical scope and real-world impact of the Virta intervention. We are currently submitting papers on secondary one-year outcomes for peer review, including non-alcoholic fatty liver disease (NAFLD) and sleep quality, and analyzing two-year trial data. Watch for more publications soon and make sure to ask us the tough questions like these:

Are you measuring a metric tied to health outcomes and willing to stand behind results?
Have you conducted a prospective clinical trial of your intervention and is it registered?
Have your results been published in a peer-reviewed journal or submitted and available?

About Virta

Virta is a clinically proven treatment to safely and sustainably reverse type 2 diabetes and other chronic metabolic diseases without the risks, costs or side effects of medications or surgery.

Virta is a nationwide specialty medical provider licensed in all 50 states. Patients work with a Virta-employed physician and health coach via telemedicine, along with individualized education, biometric feedback and an online community.

This blog is intended for informational purposes only and is not meant to be a substitute for professional medical advice, diagnosis, or treatment. Always seek the advice of your physician or other qualified health provider with any questions you may have regarding a medical condition or any advice relating to your health.