Robin Hanson replied here to my original post challenging him on health care here.

On Straw-Manning

Robin thinks I’m straw-manning him. He says:

Scott then quotes 500 words from a 2022 post of mine, none of which have me saying all medicine is useless on all margins. Even so, he repeats this claim several times in his post. If Scott had doubts, he could have asked me. Or, consulted our 2018 book The Elephant in the Brain (52K copies sold; I’m sure he knows of it):

» “Our ancestors had reasons to value medicine apart from its therapeutic benefits. But medicine today is different in one crucial regard: it’s often very effective. Vaccines prevent dozens of deadly diseases. Emergency medicine routinely saves people from situations that would have killed them in the past. Obstetricians and advanced neonatal care save countless infants and mothers from the otherwise dangerous activity of childbirth. The list goes on. …

» “We will now look to see if people today consume too much medicine. … we’re going to step back and examine the aggregate relationship between medicine and health. … We’re also going to restrict our investigation to marginal medical spending. It’s not a question of whether some medicine is better than no medicine—it almost certainly is—but whether, say, $7,000 per year of medicine is better for our health than $5,000 per year, given the treatment options available to us in developed countries.…

» [Re] the medicine consumed in high-spending regions but not consumed in low-spending regions, … the research is fairly consistent in showing that the extra medicine doesn’t help. … Still, these are just correlational studies, leaving open the possibility that some hidden factors are influencing the outcomes. … To really make a strong case, then, we need to turn to the scientific gold standard: the randomized controlled study.”

We seem pretty clear to me there. There’s also my 2007 article Cut Medicine in Half where I say:

» “In the aggregate, variations in medical spending usually show no statistically significant medical effect on health. … the tiny effect of medicine found in large studies is in striking contrast to the large apparent effects we find even in small studies of other influences.”

Obviously, if I thought medicine was useless at all margins, I’d have said to cut it all, not just cut it in half.

I acknowledge he’s the expert on his own opinion, so I guess I must be misrepresenting him, and I apologize. But I can’t figure out how these claims fit together coherently with what he’s said in the past. So I’ll lay out my thoughts on why that is, and he can decide if this is worth another post where he clarifies his position.

The marginal unit of health care doesn’t come clearly marked. If we want to cut the marginal unit of health care (for example, following Robin’s recommendation to cut health care in half) we need to cut specific things. If you would otherwise get ten treatments in a year, you need to cut out five if you want to halve health care like Robin suggests. Which five? You could make the decision centrally (the medical establishment decides some interventions are less valuable than others, and insurance stops covering those) or in a decentralized free-market way (customers get less insurance, increasing the cost of medical care and causing them to make harder trade-offs about when to get it), but somebody has to make this decision at some point. On what basis do they make it?

One possible reasonable position might be “obviously the cancer stuff and the antibiotics are important, so definitely keep those. Find stuff which seems frivolous, and then cut that.”

My impression is that Robin has very clearly rejected this position. For example, from here:

It is clearly not the case that marginal care contains mostly treatments that doctors know to be less useful and more frivolous, while the serious situations where doctors know medicine is very valuable are usually in common care. In fact, doctors do not seem to see a difference between common and marginal care. So, if common care is much more useful on average than the useless-on-average marginal care, it must be because each patient somehow knows something that doctors cannot see about when he really needs to see a doctor. Now how likely is that?

We just don’t know which treatments are useful vs. bad? But don’t we know, for example, that antibiotics are good? Robin again:

But what about those miracles of modern medicine we have all heard so much about? Did not the introduction of antibiotics, for example, dramatically reduce death rates for key diseases? Well, not much actually.

Public health measures? Sanitation? Clean water? Robin again:

If medicine for treating individuals is not quite the miracle we have heard, does public health make up the difference? Have not we all heard how the introduction of modern water and sewer systems greatly improved our ancestors’ health? Well, a century ago the U.S. cities with the most advanced water and sewer systems had higher death rates than the other cities. Also, we can look today at how the death rates of individual households correlates with the water sources and sewer mechanisms used by those households. Even in poor countries with high death rates, once we control for a few other variables like social status we usually find that water and sewer parameters are unrelated to death rates. Well we must live longer now for some reason, right? Yes . . . but the fact is that we just do not know why we now live so much longer.

(Robin says good things about sanitation elsewhere, so maybe he’s changed his mind. This is my whole problem; he says a lot of seemingly contradictory things.)

Okay, then how do we halve medical care? In his CATO Unbound article, he said it didn’t matter which parts you cut, because “most any way to implement such a cut would likely give big gains.”

Am I straw-manning him again here? Doesn’t he obviously think we should spend some time figuring out which medical treatments are good and effective (cancer care? vaccines?) so we don’t accidentally cut those?

In a podcast, he mocked other economists who said that you needed to be really careful and work hard to figure out which parts of medicine were good, so you could make sure to cut only the useless parts:

I did a Cato Unbound forum about 10 years ago where my starting essay was cut medicine in half, and a number of prominent health economists responded there. None of them disagreed with my basic factual claims about the correlation of health and medicine and other things, but, still, many of them were reluctant to give up on many medicine. They said, “Well, yes, on average, it doesn’t help, but some of it must be useful and, uh, we shouldn’t cut anything until we figure out what the useful parts are,” and I make the analogy of that with a monkey trap.

In many parts of the world, there are monkeys that run around, and you might want to eat one. To do that, you need to trap one, and a common way to trap a monkey is you take a gourd, that is, a big container that’s empty, and you put a nut on the inside of that gourd, and the monkey will reach into the gourd and put his fist around the nut and try to pull his hand out because then mouth is too small to get his hand up, and he will not let go of that nut.

Robin thinks it’s a “monkey trap” to try to cut the bad parts of medicine but not the good. This seems consistent with his claim that you can’t distinguish good from marginal care, and with his claim that he’s not sure antibiotics or public sanitation are good.

It seems to me that if we were to cut medicine in half, figuring out which half to cut would be among the most consequential decisions in history. If we did it foolishly - for example, cut out all treatments that start with letters A - M - then we would lose antibiotics, appendectomies, AIDS medications, etc. I would expect even small mistakes in this process to cause more deaths than 9/11, the Iraq War, or other things we think of as greatly consequential.

But Robin doesn’t seem to think this matters very much, and his antibiotics and public health comments make it sound like this is because he’s not particularly sure that any kind of medicine works. This is the context in which I cited his casino quote (source) in the original article:

Imagine someone claimed that casinos produce, not just entertainment, but also money. I would reply that while some people have indeed walked away from casinos with more money than they arrived with, it is very rare for anyone to be able to reasonably expect this result. There may well be a few such people, but there are severe barriers to creating regular social practices wherein large groups of people can reasonably expect to make money from casinos. We have data suggesting such barriers exist, and we have reasonable theories of what could cause such barriers. Regarding medicine (the stuff doctors do), my claims are similar.

In the context of everything else, I can’t help but interpret this as suggesting that medical care is net neutral, or even (like casinos) net negative, and that just as there is no specific slot machine that you know will work at a casino, and you would do best avoiding all of them, so there’s no one medical treatment that we know is positive in expectation, and cutting any of it would be fine.

Maybe this is all straw-manning, all of this is taken out of context, and the only place that Robin says his true opinion is in his book. But in that case I feel like this is a pretty extreme failure of communication that’s not entirely my fault. Also, other people seem to interpret it the same way I do:

If I thought medical care was mostly effective and just needed to be trimmed around the margin, and my readers were posting that I thought medical care was “useless”, or that “ALL health spending is wasteful”, or that “medicine is net neutral for health” - I would be horrified and try to clear it up as quickly as possible. Somehow in fifteen years this hasn’t happened. I guess if I can get Robin to make this clarification - even if it turns out I’m totally wrong and misunderstanding everything he says - then maybe this post will have been worthwhile.

So in the interests of getting a clearer understanding, I’ll pose Robin a trilemma:

  1. Either we can’t distinguish between good and bad medical interventions, but the average intervention is net positive in expectation (in which case it seems like we should keep the amount of medicine we have now, since we assess each treatment equally and they’re all net positive)

  2. Or we still can’t distinguish between good and bad medical interventions, but the average intervention is, after you count the monetary cost, net neutral or negative in expectation (in which case one should be equally skeptical of everything, including antibiotics and cancer treatment, and I don’t understand how saying this is a straw man)

  3. Or we can distinguish between good and bad medical interventions, and we should throw out the bad ones and keep the good ones (in which case why does Robin keep saying the opposite, why does he call this a “monkey trap”, etc? And wouldn’t it be better for Robin to frame his position as “medicine generally works well, but there are some interventions that aren’t evidence-based enough”, which is the consensus medical position?)

If this is a false trichotomy, Robin should tell me how!

Let’s Do Near Mode!

I should mention that, despite disagreeing about health care, I have a huge amount of respect for Robin. He’s developed or popularized many of the ideas that still shape my thinking. One of them is “near mode vs. far mode”, his take on construal level theory. I find it helpful at times like this to try to go as Near Mode as possible.

For example, in one of the papers I linked above, Robin writes:

Unfortunately, even if you believe everything that I have said, your behavior will probably not change much as a result. You will still spend nearly as much on medicine for yourself and your family, and spend much less effort on the more effective ways to increase lifespan. After all, your sick family would consider it the worst kind of betrayal if you did not “do something,” and give them all the medicine that your doctor recommends (Hanson, 2002). Alas, the problem of the fear of death muddling our thinking is so much worse than we imagined.

I interpret this as him saying that if you were smart, had the courage of your convictions, and weren’t so obsessed with signaling, then you, the literal reader, would cut your individual health care expenses right now after learning about his theory.

I, like many people, would like to spend less money on health care without my health being negatively affected in any way. The Nearest way possible to approach this is to think about how Robin’s theory suggests that I act. Here are some categories of health problem that I might one day have to think about:

  1. A heart attack or stroke (going to the hospital)

  2. Cancer (going to an oncologist, complying with their recommendations)

  3. A bacterial infection, eg pneumonia, sinusitis, meningitis, etc (going to a doctor / urgent care / ER, taking antibiotics if recommended)

  4. A chronic disease like Type II diabetes (going to the doctor, following their recommendations about glycemic control, taking medicine if recommended)

  5. New-onset unexplained but serious-seeming symptoms, eg sudden intense abdominal pain, or suddenly feeling very dizzy (getting checked out by a doctor or hospital).

  6. New-onset unexplained but mild-seeming symptoms, eg mild abdominal pain or suddenly feeling slightly dizzy (getting checked out by a doctor or hospital).

  7. Acting erratically, hallucinating, saying things that don’t make sense (going to a psychiatrist or mental hospital).

  8. Feeling very depressed or anxious, so much so that it’s hard to get through the day or do your usual work (going to a psychiatrist)

  9. A middle-aged person with a family history of cardiovascular problems who hasn’t gotten a checkup in a while (going to a doctor, taking statins/ACEIs/etc if recommended)

  10. Just sort of feeling blah all the time, eg tired, joints ache, etc (going to the doctor and getting checked out).

So my second question for Robin is: how do you recommend I proceed? Do I avoid going to the doctor for some specific subset of these categories, like 5-10? Are all the categories equal, such that I should flip a coin each time I get an illness, and only go to the doctor if the coin comes up heads? Get certain care in some categories, flip the coin for others? This isn’t intended to be a rhetorical question. I’m hoping it clarifies what it means to “cut the marginal unit of health care”, what a reader who didn’t have “the fear of death muddling [their] thinking” would do, and how much Robin believes we can distinguish between good and bad treatments.

Actually, we can get even Nearer. My wife and I recently took my four-month-old son to the pediatrician. The pediatrician said he had a mis-shapen head, and referred him to a head specialist for a second opinion. The specialist said yup, looks pretty mis-shapen, and referred us to a helmet-maker. The helmet-maker said yeah, definitely mis-shapen, and wants us to pay $300 for a helmet to correct it so my son doesn’t get stuck permanently looking like Frankenstein when he grows up.

(there are some studies, neither obviously wrong nor obviously unimpeachable. They say the helmet works, but not necessarily better than “repositioning therapy”. Our son refuses repositioning therapy, so for us it’s the helmet or nothing.)

That helmet is probably our “marginal” health care expense, in the sense that it’s less obviously important than the other two things we’ve used healthcare for this year (childbirth, a scare with our son’s breathing). So, if we’re trying to cut the marginal health care expense, should we skip the helmet? Maybe we should skip it - I never see any adults with obviously mis-shapen heads out there, and surely they didn’t all get $300 helmets as kids. Maybe it’s all a racket.

(for pictures of people with mis-shapen heads, see here, dead dove, do not eat)

How would Robin recommend I make this decision, if not by consulting the studies? Should I just base it off the RAND experiment? Are we sure that there were any babies with mis-shapen heads in RAND at all? Did RAND even use “has non-Frankenstein head shape” as an endpoint? Should I still go off RAND? Or should I just trust Johns Hopkins and all the specialists when they say this is good?

This is as Near as I can get. What now? Is Robin’s advice just aimed at some hypothetical dumb person who constantly gets healthcare for stupid reasons and never considered stopping? Or should I personally try not to get health care in situations like this one? If I value my son looking like a normal human being when he grows up at (let’s say) $30,000, then it seems like I should only need a 1% credence that this therapy works before I spend $300 on it (and my real credence is much higher than that). Is this Pascal’s Wager? But doesn’t all medicine have those kinds of odds? What do I do?
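
The break-even arithmetic behind that 1% figure can be sketched as follows. This is only an illustration: the function names are made up for this example, the $30,000 valuation is the "let's say" number from above, and the 50% credence is a purely hypothetical stand-in for "much higher than 1%".

```python
def break_even_credence(cost, value_if_works):
    """Smallest probability of success at which paying `cost` breaks even."""
    return cost / value_if_works


def expected_net_benefit(credence, cost, value_if_works):
    """Expected dollar gain from buying the treatment at a given credence."""
    return credence * value_if_works - cost


helmet_cost = 300        # price quoted by the helmet-maker
value_if_works = 30_000  # the "let's say" valuation from the text

threshold = break_even_credence(helmet_cost, value_if_works)
print(f"break-even credence: {threshold:.1%}")

# 0.5 is a hypothetical credence, used only to illustrate "much higher than 1%".
print(expected_net_benefit(0.5, helmet_cost, value_if_works))
```

Under these assumptions, any credence above the break-even threshold makes the helmet positive in expectation, which is why the question of whether this counts as Pascal's Wager comes up at all.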

On Cancer, Heart Attack, Etc Survival Rates

Moving through more specific sections of his post.

Hanson:

First, my claim of a near zero marginal health gain from more medicine on average is consistent with some particular kinds of medicine having a positive marginal gains. We name some plausible candidates in our book. Cancer and heart attacks could also be among them. Or maybe just childhood cancer.

I don’t understand how this isn’t the CATO economist position. “Keep the good stuff that we know works, and look for likely-not-to-work forms of treatment around the edges that we can cut”. I certainly have some guesses about forms of medicine that don’t work - most surgeries for back pain are in this category. Do they add up to half of all medicine? I’m not sure, but if Robin agrees that this is the discussion, we can compare our lists and try to figure it out.

Second, to be relevant to my claim these treatments need to be of the sort that many people get but many others do not. I’m willing to presume that cancer and heart attack treatment fall into this category, but Scott doesn’t show this.

Again, if Robin’s claim is that medicine is only useless on the margins, we’re much closer to agreement. But I don’t know how that meshes with saying that maybe antibiotics don’t help, or that we can’t possibly distinguish marginal from core, or that health spending is mostly signaling (as opposed to a mix of people correctly spending money on health because they know it’s great and will help them, plus some extra from people not being scientists and not knowing which treatments are good or bad).

Third, Scott is well aware that many others attribute much of these changes to the population getting generally healthier over time, and thus better able at each age to deal with all disease, and also to earlier screening, which catches cases that would never get very bad. He judges:

» Although some of this is confounded by improved screening, this is unlikely to explain more than about 20-50% of the effect. The remainder is probably a real improvement in treatment.

But he seems well aware that many other specialists judge differently here.

But the link is to a blog post where I examine this and find many studies showing, I think very clearly, that it really is medicine and not screening! Yes, other people think differently, but the link you’re using is a post about why they’re wrong!

But also, why is Robin objecting to this! I thought he was admitting that cancer treatment is maybe potentially good! This is why I find this conversation so frustrating. Mention a medical treatment to Robin, even one of the “good ones” like cancer or antibiotics, and he’ll try to argue that maybe the evidence it works is being misinterpreted, and in fact it’s unclear how well it works. Then I say he thinks this stuff might not work, and he accuses me of straw-manning him.

I would have no objection if there were, in fact, some evidence that cancer treatment was useless, and he was trying to bring it to my attention. But all he’s doing is linking my post showing that it’s not true, plus the article I started my post with as an example of the false narrative I’m trying to correct.

On Insurance Experiments

I don’t see Hanson responding to my main point, which is that the insurance experiments show signs of having their power fail at random points in the causal chain, rather than showing anything about medicine. Just to rehash this for people who forgot:

  • The Karnataka experiment couldn’t show that insurance made people more likely to give birth in hospitals, or more likely to have a doctor tell them that their blood pressure was too high, or basically any outcome related to how much care they were getting, let alone whether that care worked.

  • The Oregon experiment found people got more diabetes drugs, but not that they had less diabetes. However, if you do a power calculation based on the increase in diabetes drugs and the known effect of diabetes drugs, you find that the experiment wouldn’t have detected a reduction in diabetes even if there was one.

  • …and the same is true of hypertension and most of the other things they measured.

  • The RAND outcomes I found were mostly things that doctors had no medicine for in the 1970s when the study was conducted. For example, they measured the effect of health care on obesity, but there were no good obesity drugs in the 1970s.
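
To make the power point concrete, here is a rough sketch of how an effect gets diluted when insurance only slightly changes how many people receive a drug. Every number below is a hypothetical placeholder, not a figure from Oregon, Karnataka, or RAND, and the helper name `power_two_proportions` is invented for this example. It uses the standard normal approximation for a two-sided test of two proportions.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treated * (1 - p_treated) / n_per_arm)
    shift = abs(p_treated - p_control) / se
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# All values below are hypothetical, chosen only to illustrate dilution.
baseline_rate = 0.05    # share of the control arm with the bad outcome
drug_effect = 0.30      # relative risk reduction among people who take the drug
extra_uptake = 0.10     # extra share of the insured arm that ends up on the drug
n_per_arm = 6000

# If everyone in the insured arm took the drug:
undiluted = baseline_rate * (1 - drug_effect)
# If insurance only moves a small share of people onto the drug:
diluted = baseline_rate * (1 - extra_uptake * drug_effect)

print(power_two_proportions(baseline_rate, undiluted, n_per_arm))
print(power_two_proportions(baseline_rate, diluted, n_per_arm))
```

With these made-up inputs the same drug, working equally well per patient, is easy to detect in the first case and very hard to detect in the second, which is the sense in which the power can fail partway along the causal chain.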

Instead, he discusses small quibbles with how I describe certain results. I’ll go through these quibbles, but I want to make it clear I don’t think they matter very much, and I would much rather talk about the main point. Going through the quibbles:

Scott sees the first three as too underpowered to find interesting results. He found the results of RAND “moderately surprising”, but thinks “it’s a stretch to attribute [p = 0.03 blood pressure result] to random noise”, even if its the only result out of 30 at p<0.05.

I find it unfair to present this claim without presenting my reasoning, which is that there’s a whole other paper, How Free Care Reduced Hypertension In The RAND Health Insurance Experiment, which runs various sanity checks on this result, finds that it holds up, and finds related claims with lower p-values.

Scott calls Karnataka a “study where the intervention didn’t affect the amount of medical care people got very much” as “they were unable to find a direct effect of giving people free insurance on those people using insurance, at all, in the 3.5 year study period!” But I see the study as reporting big utilization effects:

» The average annual insurance utilization rate at 18 months (3.5 years) is 13.46% (2.56%) in the free-insurance arms versus 7.72% (0.64%) in the control arm. On average this effect amounts to a 74.35% (400%) increase in insurance utilization at 18 months (3.5 years).

I said in my original post that utilization rates increased when spillovers were taken into account, but did not directly increase for the insured individuals. He is quoting the first half of a section that then goes on to say “Spillovers play an important role in boosting utilization. At 18 months and 3.5 years, ITT estimates of the direct effects of insurance access are not significant.”

So this was exactly what I said in my post, except that Robin takes out my explanation and quotes only half of the section, so that I look like a moron who didn’t read the paper.

This seems to me a non-trivial constraint on medical effectiveness:

» We cannot rule out clinically-significant health effects, on average equal to 11% (8.8%) of the standard deviation for each health outcome in ITT (CATE) analyses.

(They can rule out larger effects.)

Again, this isn’t just about the effects of medicine. The outcomes Hanson is talking about include many things like giving birth in a hospital, or having surgery, or being told you have arthritis. If insurance can’t improve the likelihood of these things, that shows it’s failing to connect with the medical system at all; it isn’t evidence that medicine doesn’t work.

On the Goldin paper:

Now while their OLS estimate of the effect of treatment on mortality is only significant at the 1% level (and that exaggerated by selection bias), their OLS estimate of the effect of more insurance on mortality looks much stronger. At least if we could believe their Table IV which gives an estimate there of -0.026 and a standard error 0.001, for a ratio of 26! But as they never even discuss this crazy huge significance in the text, I have to suspect that this is just a table typo.

Other people have pointed this out and I’m not sure what’s going on here. Cremieux thinks all of Goldin might be a Lindley’s paradox situation (suggesting it isn’t a real effect), and I’m trying to clear this up with him and figure out if he’s right. I think my case is still strong if we stick with the lower treatment effect or ignore Goldin entirely.

As Scott knows, we have a huge problem of selective publication and specification search (“p-hacking”), especially in medicine, which is why I’m suspicious of the few “quasi-experimental studies” that find big health gains from medicine.

In 2004, the International Committee of Medical Journal Editors released a statement saying they would no longer accept non-pre-registered studies that started after July 1, 2005. Since then, all studies have had to pre-specify their protocol, making p-hacking much harder. You can see the results on trial composition here:

…and on p-hacking levels here:

Obviously this doesn’t mean that there’s no possible way medical studies could ever be biased. But I worry that people act like “studies can be p-hacked” is some sort of secret knowledge that elevates them above domain experts. Medical evidence is processed by groups like NICE and the Cochrane Collaboration that have been worrying about p-hacking since 2004, trying to factor its existence into their recommendations, and organizing pretty successful campaigns among journals and other stakeholders to minimize it. This doesn’t mean everything is perfect, but I think we’re beyond the level where you can say “what about p-hacking” and use it to throw out every clinical study in favor of three social science experiments that explicitly admit they don’t have enough power to test these kinds of questions.

I know that typical regressions of health on medicine find no effect, and also that medical errors and prescription drugs cause huge numbers of deaths. Thus I focus on our few best studies: randomized experiments.

The first link goes to a study that does not try to quantify the number of deaths from medical error. The second goes to a claim that prescription drugs are the third leading cause of death. For the latter, I would recommend reading Medical Error Is Not The Third Leading Cause Of Death and “Medical errors are the third leading cause of death” and other statistics you should question. These numbers usually come from massively overcounting medical errors from studies not intended to quantify them, from calling any death that happens after a medical error a result of a medical error, and from ignoring the many more sober estimates of medical error fatality rate that have been published. This isn’t to say that medical errors aren’t real and serious, just that I don’t think many people now continue to defend that particular claim.

While many studies claim to show otherwise for specific treatments, those tend to be quite biased, pushing me to focus on our best studies: randomized trials of aggregate medicine. I say that they still consistently fail to find clear effects.

I’ve tried to explain how thoroughly I disagree with this claim, but let me try one more time.

Suppose we want to know something simple, like whether being shot with a gun can kill someone.

One option is that we get trials where we shoot a thousand people with guns, shoot another thousand people with placebo guns (blanks), and see how many in each group die. Maybe we could do this a hundred times, for every different type of gun. Maybe we’d even find that some guns (eg BB guns) don’t kill people, and we could replicate that a dozen times. I believe this method would be very decisive.

But maybe this would be “biased”. Maybe the only unbiased way to test this, for some reason, is to give a thousand people a special voucher that they can use to buy guns, and leave another thousand people as a control. Then we wait two years, and see whether the voucher group gets convicted of murder more often than the control group.

And maybe in fact we do this, and we find that there are three more convicted murderers in the voucher group, but this isn’t statistically significant.

Do we conclude that “being shot with a gun can’t kill you”? Or “the marginal gun can’t kill you?”

No. Among many other possible ways this could go wrong, we might find that only three extra people in the voucher group tried shooting someone. This exactly corresponds to our three extra deaths, consistent with a 100% death rate from being shot. But if you don’t ask this question, and you just stop at “well, there were only three extra murders, which isn’t statistically significant”, then it looks like getting shot with a gun can’t kill you.
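
Here is a small sketch of that thought experiment with numbers filled in. The counts are hypothetical: I assume 2 convicted murderers among the 1,000 controls and 5 among the 1,000 voucher recipients (three extra), and that all three extra deaths trace back to three extra shooters. The Fisher exact test is written out by hand with `math.comb`, and the function name is invented for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events landing in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(col1, row1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Hypothetical counts, for illustration only.
voucher_murders, voucher_n = 5, 1000
control_murders, control_n = 2, 1000

p_value = fisher_exact_two_sided(
    voucher_murders, voucher_n - voucher_murders,
    control_murders, control_n - control_murders,
)

extra_shooters = 3  # assumed: only three extra people actually shot someone
extra_deaths = voucher_murders - control_murders
lethality_among_shot = extra_deaths / extra_shooters

print(p_value)               # compare against the usual 0.05 threshold
print(lethality_among_shot)  # death rate among the extra shootings
```

Under these assumed counts the voucher comparison comes nowhere near conventional significance, even though every extra shooting in the example was fatal.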

I don’t understand why you would prefer the second form of study over the first, especially if you are going to summarize its results as “guns aren’t dangerous”.

(…the marginal gun isn’t dangerous? Some guns are dangerous, but we can’t tell which ones? Some guns are dangerous, we can tell which ones, and we should just focus on those?)

Maybe I’m still misunderstanding Robin. I look forward to him clarifying his position further.

In case my own position isn’t clear: I think lots of medicine is useless, and that most doctors would agree with this. We over-order tests when we don’t need them, we do a lot of ineffective stuff to please patients (starting with antibiotics for viral illnesses, but sometimes going up to surgeries that have only placebo value), and we do lots of treatments that we know fail >90% of the time, like certain kinds of rehab for drug addiction (we tell ourselves we’re doing it because the tiny number of people who do benefit deserve a chance, but a rational health bureaucrat who wants to save money might not see it that way). Does all this add up to half? I’m not sure. But I think we can work on cutting back on this stuff without saying things like “maybe medicine is just about signaling” or “how do we know if any of it works?” or “you can’t trust clinical trials because they’re all biased”, and that it very very much matters which parts of medicine we cut.

I also don’t think the insurance studies tell us anything one way or the other here, and I have no confidence that the things they cast doubt upon are the things we should really be doubting.