The research “impact” problem. How we got here and strategies to drive… | by Josh LaMar (He/Him)


Hopefully, you’ve read some things that either resonated with you so strongly you wanted to cry (me too), or angered you because it’s not your experience (calm down, I’m being provocative to illustrate the extreme scenario).

I’ve already mentioned the lack of clarity between Product vs. Business Impact and the nuanced differences in how Experience Metrics and Business Metrics inform Impact. Conflating these terms and using the same language to mean different things only exacerbates the communication problems between UX People and Business People, who already come from different interpretive universes.

Aside from the communication and terminology issues, three additional causes reinforce each other to perpetuate the Impact Problem:

  1. Competing Epistemological Systems
  2. Threats to Validity in Interpretation
  3. Systemic Incentivization and Gatekeeping

The philosophical branch of Epistemology asks two key questions: “How do we know what we know?” and “How can we be certain of what is True?”

Quantitative and Qualitative data come from two different systems of evaluating the nature of the evidence we use to say that a statement is True:

  • Qualitative data is non-numeric in nature; it can be observed, but not measured in the ways Business People think about measurement. Truth is determined through narrative, thematic analysis, and the coherence of explanations within the context of data collection as it pertains to the research questions.
  • Quantitative data is numeric in nature and can be measured and quantified. Truth is determined through statistical significance, reproducibility, predictive power, and correspondence to observable facts about the world.

Perhaps some of this sounds basic, but the differences between these two data types have deep implications for the criteria we use to evaluate the veracity of statements or beliefs, i.e., how we know what is True. And at the end of the day, agreeing on how we evaluate Truth is the key to unlocking the Impact Problem.

Cross-System Evaluation

Many of the situations qualitative Researchers face when attempting to prove their impact stem from the challenges that ensue when taking data from one system and the criteria of measuring Truth from the other.

Let’s start with some examples to show what this looks like in a real corporate setting, and yes, all of these have happened to me:

  • This is why qualitative researchers tire of hearing other disciplines say, “You only talked to 10 people, that isn’t statistically significant, therefore, I don’t need to act on it.”
  • After sharing an incredibly painful video of a participant completely lost and frustrated with your product, another discipline will invalidate that user and their experience by saying it was, “Just one person.”
  • It’s also why we groan, sometimes audibly, when another discipline talks about WHY customers are doing X and Y yet basing their statements on quantitative data, (which can only tell us how many people did something, not why they did it).
  • Saying that 80% of customers have a problem, when it was only 8 out of 10 of a representative sample that we tested qualitatively.

Perhaps you, too, get testy when percentages are used outside the context of statistical significance, or with small samples rather than populations. You’re not alone.
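To make that concrete, here is a minimal sketch of the uncertainty hiding behind “8 out of 10.” The Wilson score interval below is one standard way to bound a proportion from a small sample; the function name and the numbers are purely illustrative, not from the article.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for an observed proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - margin, center + margin

low, high = wilson_interval(8, 10)
print(f"Observed 8/10 -> plausible population range: {low:.0%} to {high:.0%}")
# Roughly 49% to 94%: far too wide to claim "80% of customers have this problem."
```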

You can’t evaluate what Truth is, from an epistemological standpoint, with data from one system and the criteria of measuring Truth from the other. The two competing systems are constructivism and positivism/logical empiricism.

  • Constructivism, associated with qualitative data, views Truth as subjective and context-dependent
  • Positivism/Logical Empiricism, associated with quantitative data, tends towards an objective notion of Truth

Is there truly an ultimate, objective Truth? Or does each human experience the world in their own subjective way?

These grand philosophic questions are beyond the scope of this article; however, I pose them in order to get to the root of the Impact Problem:

Qualitative Researchers are being tasked with proving Business Impact using the rubric of quantitative data, which exists in a different epistemological universe from the initial qualitative data that was collected to impact the Product in the first place.

Not only does this cognitive dissonance fail to make logical sense, it doesn’t make pragmatic sense, either.

What should you do when you encounter situations of cross-system evaluation? How do you prevent them from happening in the future?

Recommendations & Discussion

  • Be honest about what types of conclusions we can and cannot draw from the type of data we’ve collected.
  • Educate others when they apply the criteria of Truth of one system to the data of another.
  • Triangulation is your friend. The most robust recommendations come from across multiple studies, data types, and sources.
  • Telling a compelling story across studies, data types, and disciplines will bring your stakeholder team together around the common goal of impacting the Product and Business.

A deeper implication is the underlying equality of quantitative and qualitative data. Business People treat quantitative data as king, but both data types and criteria of Truth evaluation systems are needed to come to a holistic picture of Truth.

Competing epistemological criteria for evaluating Truth lead to blind spots, or “Threats to Validity,” when interpreting data. Even if we interpret the data appropriately for its epistemological system, problems can still arise in interpretation.

I want to further distinguish two additional types of “Impact,” because the way they are evaluated is unique and, like the nuanced distinction between Product vs. Business Impact, they are often conflated:

  • Prospective Impact is based on regression analysis; it looks ahead into the future and assumes that what was true in the past will continue to be true. Business People, who deal in the currency of quantitative data that inform Business Metrics, think of Impact prospectively.
  • Retrospective Impact is understood by looking backwards to measure before/after effects of a change. UX People, who deal in the currency of qualitative ideas that inform Experience Metrics, think of Impact retrospectively.

Problems with both approaches must be acknowledged. (See also Prospective and Retrospective Studies).

Prospective Impact & Black Swans

Prospective Impact uses regression analysis and other predictive models to forecast future trends based on historical data. It assumes observed and measured patterns from the past will continue into the future.

After 20 years in the industry, I’ve finally figured out how Business Metrics, and thus ROI, are calculated:

  1. Identify the variable(s) or KPIs you care about.
  2. Run a regression analysis and project into the future.
  3. If it goes up and to the right, you’re making the right decision: Success!
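As a minimal sketch of that three-step pattern (with made-up monthly KPI numbers, purely for illustration), a simple linear regression projected forward looks something like this:

```python
import numpy as np

rng = np.random.default_rng(0)
months = np.arange(12)                              # 12 months of history
kpi = 100 + 5 * months + rng.normal(0, 3, 12)       # a KPI trending upward (invented data)

# Step 2: fit a simple linear regression and project six months ahead.
slope, intercept = np.polyfit(months, kpi, deg=1)
future_months = np.arange(12, 18)
forecast = intercept + slope * future_months

# Step 3: the line goes up and to the right, so the plan "works".
# The catch: the forecast assumes the past trend simply continues (see Limitations).
print(np.round(forecast, 1))
```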

Advantages
Prospective Impact is based on quantitative Business Metrics and can be used to justify just about anything. The benefit of prospective impact lies in strategic planning for the future by identifying risks and growth areas. It can also help Businesses make decisions about resource allocation, product development, and market expansion.

Quantitative Researchers have a leg up here because their data more closely resembles the “rigorous” data that Business People are looking for as a way to show Impact through statistical significance.

Limitations
The biggest threat to validity is the assumption of continuity: that historical data trends will persist into the future. However, the predictive ability of Prospective Impact is only as good as the data it’s based on.

Changes in customer behavior, market conditions, global pandemics, and technological revolutions such as the explosion of AI can’t be accounted for. And therein lies the problem, which Nassim Nicholas Taleb calls a “Black Swan.”

Two black swans. Photo by Rodrigo Rodrigues | WOLF Λ R T on Unsplash

A Black Swan, as defined by Nassim Nicholas Taleb in his book by the same name, is an unpredictable event, like the COVID-19 Pandemic, that seemingly comes out of nowhere and has significant and widespread consequences.

“A black swan is an event, positive or negative, that is deemed improbable yet causes massive consequences.” — Nassim Nicholas Taleb

Additional examples of Black Swan events include the 2008 Financial Crisis, Brexit in 2016, and the American Capitol Riot on January 6, 2021.

Taleb criticizes the use of regression analysis and other statistical models that assume a normal distribution of data for their reliance on the continuity of past trends into the future. This creates a vulnerability in financial forecasting and planning, leading to the underestimation of the rarity of Black Swan events. We then use hindsight to show that we could have predicted the event, minimizing the unpredictability of the event in the first place and overestimating our ability to predict future events.

Business Metrics, and thus Business Impact, are based in a system that has massive problems when it comes to predicting the unpredictable. Since Black Swans cannot be predicted, we cannot be certain of our ability to predict the future. And yet, we continue to act as if we can.

Regression analysis and Business Metrics are not a foolproof way to predict Impact.

And, if you follow Taleb’s line of reasoning, regression analysis should be done away with altogether. But I’ll let you read his book on your own to come to your own conclusion; it’s a compelling read.

Retrospective Impact & Confounding Factors

Ok, so if we can’t project past data forward to show impact, what about a more conservative way of measuring impact: looking back at changes in the past?

Measuring Retrospective Impact involves analyzing historical data to understand the before/after effects of a specific change, such as usage metrics of your product before and after the launch of a new feature you recommended.
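As a minimal sketch (the daily usage numbers below are invented for illustration), a retrospective before/after comparison might look like this:

```python
import numpy as np
from scipy import stats

# Daily active users the week before and the week after a feature launch (hypothetical).
before = np.array([118, 122, 115, 130, 121, 119, 125])
after = np.array([131, 128, 135, 140, 129, 137, 133])

lift = after.mean() - before.mean()
t_stat, p_value = stats.ttest_ind(after, before, equal_var=False)  # Welch's t-test

print(f"Average lift: {lift:.1f} users/day (p = {p_value:.3f})")
# A before/after difference, even a statistically significant one, is not proof that
# your recommendation caused it; see the confounding factors discussed below.
```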

Advantages
Looking retrospectively allows for the creation of an evidence-based decision-making framework grounded in actual data and impact. You can use these learnings to offer insight into what did and didn’t work in your product, allowing you to inform future decisions and experimentation.

Limitations
Isolation of Variables: Product Metrics may be influenced by other factors, such as changing multiple things in your product at the same time… You haven’t done that, have you? 😉

With so many A/B tests (or “split tests”) running at the same time, it’s hard to pinpoint exactly which feature (or set of features) truly impacted customer behavior. (Only qualitative research can answer this question, by the way.)
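Here’s a hypothetical sketch of that confounding problem: two changes ship in the same window, and a naive before/after metric can’t say which one produced the lift (all numbers invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(7)
baseline = rng.normal(100, 5, 30)        # 30 days of a metric before any change

# Suppose feature A is worth +4/day and feature B is worth +6/day,
# but both launched in the same week.
post_launch = rng.normal(100 + 4 + 6, 5, 30)

observed_lift = post_launch.mean() - baseline.mean()
print(f"Observed lift: {observed_lift:.1f} per day")
# The ~10-point lift is real, but the split between A and B is invisible in the
# metric itself; attribution needs a controlled experiment or qualitative insight
# into why behavior changed.
```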

An additional challenge is having the data and conditions needed to “prove” impact. I put “prove” in quotes because we’re not using a true Experimental Design here: no randomized controlled trials, and no ability to claim causation.

The Post Hoc Fallacy is the assumption that simply because one thing follows another in sequence, the former caused the latter. At best, when we’re trying to show impact, we are hoping to show a positive correlation between our recommendations and Product or Business KPIs. Actually proving causation takes a lot more data and work, and it’s not the typical case in Big Tech.

Notice I’m not using “proof” anywhere else; we can’t really prove impact in the first place, as that is a fallacy on its own. I use the phrase “inform impact” elsewhere in this article for this reason.

And this leads us to our next challenge. But first some recommendations.

Recommendations & Discussion

  • Recognizing the limitations of both Prospective and Retrospective analyses is crucial to developing a nuanced understanding of Business and Product Impact.
  • Be honest about threats to the validity of our studies and critically evaluate what counts as “proof” and what doesn’t. Do this with others, ideally those coming from a different epistemological background than your own.
  • Define how success will be measured ahead of time, including the metrics you will measure and hypothesize will be affected. You will put yourself in a good position to evaluate them later on down the road, and the team will be bought in to tracking those metrics in advance (to go deeper, check out these articles on Logic Models: one and two).
  • Both Prospective and Retrospective approaches can be rigorous and lead to Truth. It’s just that in the Tech Industry, we often lack the level of time, budget, and control to do them well. Acknowledging this is also important.
  • Combine both approaches for a more comprehensive view of your Product success Metrics. This is really an argument for a Mixed Methods approach to all key Business decisions.

As a discipline, qualitative Researchers need to have a good picture of what impact can and should look like, on our own terms, using data we’ve collected, and interpreting it using our criteria of evaluating Truth.

If Researchers are required to prove Impact using the mental model of Business Metrics, we may need to up-level our skill at translating between the systems, being careful not to cross-system evaluate.

In addition to being asked to prove impact using data qualitative researchers didn’t collect and don’t have access to, the data that is needed to inform impact is being gatekept by people who aren’t incentivized to support them.

Philosophic issues aside, the system also needs to change.

The owners of the quantitative data are often the same people in charge of making the changes, so they just grab the data they need to prove they are right and do what they want.

Obviously, there is a huge threat here of confirmation bias when you only seek out data that supports your perspective and ignore data that may be contrary to your view.

Further, data owners aren’t incentivized to share their data or to spend time helping prove someone else is right. If you’re in a competitive (or “Toxic”) company culture, you have to do everything you can to get ahead, be seen, and shine above your peers. This competitiveness also leads to mental health issues for you and your colleagues.

Thus, the qualitative researcher is reliant on the gatekeepers who own the data in order to prove their impact.

By the time the data exists, many months have passed since the initial Recommendation. The Product may have also changed considerably over that time, and in the interim, everyone else has also tried to impact the Product.

It’s challenging to inform impact one way or the other; findings are often inconclusive because of the threats to validity mentioned above.

And everyone is fighting to be the person who gets the credit, and thus the good review, and the promotion.

Credit for Impact and Promotions

Research deals in the currency of ideas, which are hard to visualize. Design has an output that qualitative Research doesn’t have, and it’s tangible because it can be seen. Thus, Design has it easier when it comes to impact: they literally designed the feature, so credit for Impact is more easily attributed to them.

Impact = Credit for Positive Business Metric Movement

Time and again, someone else got the credit for my Recommendations, (and they have the Patent cubes to prove it!). Does it matter to the customer who made the recommendation as long as the product gets better? No, of course not.

But it matters to the researcher who is frequently relegated to the back of the room while the pissing match of American Corporate business politics rears its ugly head as the loudest person, with a heavy dose of toxic masculinity, takes the credit for the whole team of people who did the work.

I was at Microsoft when the yearly review changed from stack-ranking employees to asking how you helped impact the success of others. This small change had a massive impact on the culture, empowering people to collaborate in healthy ways instead of competing with each other.

Recommendations & Discussion

  • Don’t fall into the trap of confirmation bias, seeking only the data to prove yourself right. Instead, seek out data and perspectives that don’t align with your own. A good study should be able to equally prove or disprove your hypothesis.
  • Seek common ground between disciplines and collaborate with your stakeholder team to ensure that you are measuring the most important Business KPIs and defining and measuring your own Experience KPIs.
  • Befriend your Data Scientist. I’m not joking. Take them out for drinks and understand how they are being incentivized and what they need for a promotion. Then give them the one thing they don’t have access to: WHY your customers do what they do.

The system for evaluating what impact looks like for Qualitative Researchers needs to change to be more reflective of the ideas and narratives that qualitative research provides.

We need to incentivize collaboration over competition in yearly reviews. When disciplines are incentivized to help each other out, there is more opportunity for holistic discussions about what is True for Users and the Product and the opportunity for deeper discussions about what success looks like for the Experience and the Business.
