In Ray Bradbury’s Something Wicked This Way Comes, a mysterious carnival comes to town and promises to grant the secret desires of the residents…at a price. Researchers have these desires, too. We are searching for the next method, the next data set that will yield wonderful findings and allow us to have an impact on the world.
For at least the past five years, I've been hearing about Qualitative Comparative Analysis (QCA). It's spoken about in hushed tones as researchers scheme about how to make better sense of their qualitative data. It speaks to our most base desires: "You can have it all…with small sample sizes!"
For a while, I was entranced. I studied Kahwati et al. I read Schneider and Wagemann’s Set-Theoretic Methods for the Social Sciences. At my previous job, I even sent two of our associates to a three-day workshop to learn how to do it with grand plans to implement QCA in all of our evaluations.
“Here,” I imagined myself boldly proclaiming to our stakeholders, “We have established CAUSALITY!”
And yet, nothing really happened on my end. More broadly, is anything happening with QCA? I did a quick Twitter search for #qca and found three (!) tweets about using QCA, one of which I wrote. I found a big fat donut, literally zero posts, on Medium about QCA. On the surface, it seems like this might be a methodological dead end.
In this blog, I’ll talk about what QCA is, how it’s supposed to be used, what PubTrawlr says about people publishing on QCA, and what I believe are some of the barriers to use.
Qualitative Comparative Analysis is a methodological approach that uses set theory and logic to identify causal conditions. Basically, by identifying the presence or absence of certain conditions, and the combinations in which they occur, researchers can identify conditions that are necessary and/or sufficient to produce the outcome they are studying.
Let’s take a few examples. Running is a sufficient condition for exercise. If I am running, I am exercising. There are other ways that I could be exercising, like biking and lifting weights, so it is not necessary for me to run to exercise. It is sufficient, though.
Breathing oxygen is necessary for human life. I can't be living unless I'm breathing. Now, there are other necessary conditions for life, like food, rest, and so on (I guess we could go full Maslow's hierarchy here). So, while breathing is necessary for life, it is not sufficient. Basically, QCA looks to arrange variables to find out:
- What variables have to be in place to get to an outcome (necessity)
- What variables could be enough to get to an outcome (sufficiency; see the sketch below)
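To make that logic concrete, here's a minimal sketch in Python, using made-up cases and variable names. With crisp sets, a condition is sufficient if every case with the condition shows the outcome, and necessary if every case with the outcome shows the condition.

```python
# Toy illustration of necessity and sufficiency with crisp (0/1) sets.
# Cases and values are invented purely for illustration.
cases = [
    {"running": 1, "exercising": 1},
    {"running": 0, "exercising": 1},  # biking instead
    {"running": 0, "exercising": 0},
    {"running": 1, "exercising": 1},
]

def is_sufficient(condition, outcome, data):
    """Every case with the condition also shows the outcome."""
    return all(c[outcome] == 1 for c in data if c[condition] == 1)

def is_necessary(condition, outcome, data):
    """Every case with the outcome also shows the condition."""
    return all(c[condition] == 1 for c in data if c[outcome] == 1)

print(is_sufficient("running", "exercising", cases))  # True: running always means exercising
print(is_necessary("running", "exercising", cases))   # False: you can exercise without running
```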
Setting up the inputs to these models takes work. There are lots of nuances in how the initial variables are converted from qualitative information into quantitative levels for the analysis. This is where crisp and fuzzy sets come in. Crisp sets are straight binary (present/absent) conditions. Fuzzy sets allow graded membership, with scores ranging from 0 to 1.
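For the curious, here's a rough sketch of what that conversion can look like for fuzzy sets, following my understanding of Ragin's direct method of calibration: pick three anchors (full non-membership, the crossover point, and full membership), rescale raw scores to log-odds, and pass them through a logistic function. The anchors below are hypothetical.

```python
import math

def calibrate(raw, full_non, crossover, full_in):
    """Direct calibration: map a raw score to a 0-1 fuzzy membership.
    Deviations from the crossover anchor are rescaled so the membership
    anchors land at log-odds of +/-3, then run through a logistic."""
    if raw >= crossover:
        log_odds = 3 * (raw - crossover) / (full_in - crossover)
    else:
        log_odds = -3 * (crossover - raw) / (crossover - full_non)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical anchors for, say, "high readiness" rated on a 1-9 scale
for raw in [2, 4, 5, 7, 9]:
    print(raw, round(calibrate(raw, full_non=3, crossover=5, full_in=8), 2))
```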
The converted variables get put together into a "truth table" that lays out the conditions and the outcome of interest (see below). The ultimate goal here is to reduce combinations of variables to find the one(s) that actually made a difference.
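Mechanically, building a crisp-set truth table is a group-by. Here's a minimal pandas sketch with made-up condition and outcome names: each row of the result is one configuration, with a case count and a consistency score (the share of those cases showing the outcome).

```python
import pandas as pd

# Hypothetical crisp-set data: three conditions and one outcome per case
df = pd.DataFrame({
    "leadership":  [1, 1, 0, 1, 0, 1, 0, 0],
    "funding":     [1, 0, 1, 1, 0, 1, 1, 0],
    "champion":    [1, 1, 1, 0, 0, 1, 0, 1],
    "implemented": [1, 1, 1, 0, 0, 1, 0, 0],
})

# One truth-table row per observed configuration of conditions
truth_table = (
    df.groupby(["leadership", "funding", "champion"])["implemented"]
      .agg(n="size", consistency="mean")
      .reset_index()
)
print(truth_table)
```

From there, dedicated tools (the QCA package in R, for instance) handle the Boolean minimization that collapses configurations differing only on conditions that turn out not to matter.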
As a relevant aside, you're not going to find really good videos about QCA on YouTube, unless watching recorded lectures is your thing. Someone should really get on that.
On a call back in 2015, researchers at UNC recommended that we readiness researchers use QCA to determine which parts of readiness were more important than others. And I've seen QCA pop up in a few research posters and presentations at implementation conferences (notably a presentation of the study that became this article by Vera Yakovchenko). However, I haven't really come across a sustained case for why QCA could be a powerful path to answering the types of questions that matter to social scientists. So why do I keep hearing about it?
I popped "Qualitative Comparative Analysis" into PubTrawlr and got the following result. That's a pretty substantial spike over the past year! Further, most of the articles are published in PLOS ONE and Frontiers in Psychology.
And it looks like QCA is covering lots of new topic areas. I ran an LDA topic model and found 20 distinct clusters. In the topic model graph below, we can see meat consumption, water contamination, HIV, maternal and child health, and some larger-scale policies with USPHS. QCA seems to have legs beyond the original political science applications.
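The topic model itself is nothing exotic; a scikit-learn sketch looks roughly like this, where the `abstracts` list stands in for the article text pulled from the search.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Stand-in corpus; in practice this holds the abstracts from the search
abstracts = [
    "fuzzy set QCA of water contamination policy adoption",
    "qualitative comparative analysis of HIV program implementation",
    "crisp set QCA of maternal and child health outcomes",
    # ...the rest of the retrieved abstracts
]

vectorizer = CountVectorizer(stop_words="english", min_df=2)
dtm = vectorizer.fit_transform(abstracts)

lda = LatentDirichletAllocation(n_components=20, random_state=42)
lda.fit(dtm)

# The top words per topic are what get labeled "meat consumption," "HIV," etc.
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-8:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")
```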
But still, there’s a few ifs here. If QCA gets at causation, and if it is well suited for qualitative data, and if it can be applied to small sample sizes, why isn’t it used more often? I was on a webinar with Larry Palinskas and asked his thoughts about why this hasn’t taken off. He wrote about QCA in an article on mixed methods in The Annual Review of Public Health. His responses:
- Get some training in its application
- Do it as a team approach, and then use it to provide an assessment of decision-making
Well, that gives us some meta-advice on how to learn in general, but doesn't speak to the barriers specific to QCA. In a way, this non-advice reinforces some of my thinking: QCA isn't straightforward. Lacking expert advice on these barriers, I have my own thoughts on what is preventing QCA from breaking through.
- Logical notation is intimidating. I'm guessing most academics working in the social sciences haven't thought deeply about logic since undergraduate Introduction to Philosophy (mine was more ethics than logic). So, just as my eyes sometimes glaze over at tough notation when researching NLP models, the same happens here (there's a short notation sketch after this list).
- Constructing a truth table and defining sets is non-intuitive. Crisp sets are pretty straightforward: a variable is there or it isn't. But how often in implementation contexts do we deal with binary conditions? Not often. So it becomes necessary to qualitatively specify and convert narrative information into different levels. This is trickier than it looks, and in many cases requires breaking down variables into pretty distinct, behaviorally observable parts, which can multiply the number of variables we're dealing with. I tried this in a few other settings when constructing Innovation Configuration Maps. It's tough work. Which leads directly into:
- There is a higher chance of finding something when there isn't anything. Also known as Type I error, the so-called cardinal sin of analysis. Simon Hug talks about it at length in this article. Basically, it boils down to how the strong assumptions of QCA make it prone to inflating findings. The measurement error that no doubt creeps in while setting up truth tables can carry over into false findings (the toy simulation after this list plays this out).
- No outcomes! This is maybe a barrier that is too specific to me. A large portion of what I do is formative evaluation, so most of our targets are process outcomes. While we could treat processes as outcomes, it just hasn’t been a priority on projects that are mostly focused on contextual influence. I imagine this might be a challenge in other areas. You sort of need a clear, compelling, and plausible causal chain with associated variables to do the work. And sometimes, we don’t have that.
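On that first barrier, a small demystifier: the notation is mostly compact Boolean algebra. A hypothetical solution term (not from any real study) reads like this in the standard convention, where multiplication is AND, addition is OR, and ~ marks absence:

```latex
% Hypothetical QCA solution: "A and not-B, or C, is sufficient for Y"
A \cdot {\sim}B + C \rightarrow Y

% The minimization step rests on Boolean absorption:
A \cdot B + A \cdot {\sim}B = A \cdot (B + {\sim}B) = A
```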
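And on the Type I error point, a toy simulation makes Hug's concern easy to feel. Under assumptions I picked arbitrarily (12 cases, four random binary conditions, a 0.8 consistency threshold, a two-case frequency cutoff), pure-noise data still "finds" at least one apparently sufficient configuration in a large share of trials:

```python
import random

random.seed(1)

def spurious_sufficiency(n_cases=12, n_conditions=4, threshold=0.8,
                         min_n=2, trials=1000):
    """Share of pure-noise trials where some configuration of conditions
    clears the sufficiency-consistency threshold anyway."""
    hits = 0
    for _ in range(trials):
        # Random cases: a tuple of 0/1 conditions plus a 0/1 outcome
        data = [(tuple(random.randint(0, 1) for _ in range(n_conditions)),
                 random.randint(0, 1)) for _ in range(n_cases)]
        # Truth table: configuration -> (case count, outcome count)
        table = {}
        for config, y in data:
            n, s = table.get(config, (0, 0))
            table[config] = (n + 1, s + y)
        if any(s / n >= threshold for n, s in table.values() if n >= min_n):
            hits += 1
    return hits / trials

print(spurious_sufficiency())  # a worryingly large fraction, not 0.05
```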
As someone who loves qualitative data and wants to see it used more robustly, I find QCA flashy, but it's hard to say it grants our wishes yet. Then again, like with almost everything in this world, I'm just a novice at this, so I look forward to studying and learning more and then finding the right application case. Just hoping it's not a trip forward on the carousel.