Highlights From The Comments On The Lab Leak Debate
Original post here. Table of contents below. I want to especially highlight three things.
First, Saar wrote a response to my post (and to zoonosis arguments in general). I’ve put a summary and some of my responses at 1.11, but you can read the full post on the Rootclaim blog.
Second, I kind of made fun of Peter for giving some very extreme odds, and I mentioned they were sort of trolling, but he’s convinced me they were 100% trolling. Many people held these poorly-done calculations against Peter, so I want to make it clear that’s my fault for mis-presenting it. See 3.1 for more details.
Third, in my original post, I failed to mention that Peter also has a blog, including a post summing up his COVID origins argument.
Thanks to some people who want to remain anonymous for helping me with this post. Any remaining errors are my own.
1: Comments Arguing Against Zoonosis
— 1.1: Is COVID different from other zoonoses?
— 1.2: Were the raccoon-dogs wild-caught?
— 1.3: 92 early cases
— 1.4: COVID in Brazilian wastewater
— 1.5: Biorealism’s 16 arguments
— 1.6: DrJayChou’s 7 arguments
— 1.7: How much should coverup worry us?
— 1.8: Have Worobey and Pekar been debunked?
— 1.9: Was there ascertainment bias in early cases?
— 1.10: Connor Reed / Gwern on cats
— 1.11: Rootclaim’s response to my post
2: Comments Arguing Against Lab Leak
— 2.1: Is the pandemic starting near WIV reverse correlation?
3: Other Points That Came Up
— 3.1: Apology to Peter re: extreme odds
— 3.2: Tobias Schneider on Rootclaim’s Syria Analysis
— 3.3: Closing thoughts on Rootclaim
4: Summary And Updates
1: Comments Arguing Against Zoonosis
1.1: Is COVID different from other zoonoses?
Simon stats wrote:
It is important to consider how different SARS-CoV-2 is to other zoonoses. I have a challenge for zoonosis proponents to find me a zoonosis with all of the following features
- Spillover occurred after 2000 when sequencing became much cheaper
- There were more than a hundred human cases
- There are zero infected animals.
This characterises SARS-CoV-2, but no other zoonosis meets these criteria. Why?
hmm answers the challenge:
2013-16 West African Ebola outbreak; almost 30k cases, no animal intermediate.
This is probably the most notable zoonotic episode of the last 20 years apart from SARS-2, I’m surprised you missed it.
There are also 7 other Ebola outbreaks that match your criteria.
We’ve been studying Ebola for over 40 years and have yet to determine the animal reservoir. It took 20 years to identify the reservoir for HIV-1’s progenitor. Sometimes finding the reservoir is easy, sometimes it’s hard. Typically it is easy when you have lots of cases and the virus is not very efficient at human-to-human transmission, because that necessitates lots of separate zoonotic events, which necessitates lots of infected animals. For something that spreads fast (i.e., the kind of virus likely to start a pandemic), you don’t need a big reservoir, so you have a smaller target. For example, we did find the reservoir for the 2009 flu pandemic, but it took 7 years: https://elifesciences.org/articles/16777
HKU1 might also fit these criteria. It’s a coronavirus discovered in 2004 that seems to have spilled over in China and spread globally (it’s fine; it just causes yet another subtype of common cold). The exact animal reservoir has never been identified, although Wikipedia says it “likely originated from rodents”.
The classic case where we did find infected animals was SARS (which came from civets). It took six months of careful research. Most of the civet farms and civet wet markets were negative. Even at farms with some positive civets, other civets were negative. Twenty years later, it’s still not obvious civets were the definitive intermediate host, rather than some other animal that got caught in the crossfire.
Meanwhile, a few weeks after COVID was discovered, China killed all the animals in the market without testing any raccoon-dogs for COVID. Then they told all nearby raccoon-dog farms to kill all their raccoon-dogs too. Then they banned Chinese scientists from researching the origins of COVID. Probably this is part of why we eventually found an intermediate host for SARS1 and not COVID.
Simon objects that although it was hard to find the exact civets responsible for SARS, we did later find that lots of civet handlers had SARS antibodies (even though they didn’t remember getting sick).
There’s different types of evidence for infected civets, some of which comes from higher seroprevalence among civet traders.
In May 2003, Guan et al (2003) identified SARS-CoV-like virus in animals in a live-animal market in Shenzhen, Guangdong Province, China. Guan et al (2003) also tested for antibodies among workers in the market. They note that “8 out of 20 (40%) of the wild-animal traders and 3 of 15 (20%) of those who slaughter these animals had evidence of antibody, only 1 (5%) of 20 vegetable traders was seropositive.” This suggests that the majority of the infections of the 11 people with close contact with animals were zoonotic. Among 508 animal traders, 66 (13%) tested positive for IgG antibody to SARS associated coronavirus by ELISA, while the control groups including hospital workers, Guangdong CDC workers, and healthy adults at clinic had an antibody prevalence of 1–3%.
I agree this is an important point. During the debate, Peter said that if we had tested lots of raccoon-dog handlers using the Guan et al methodology, we might have found they also had COVID antibodies. Unfortunately nobody did, and it’s too late, because by now the raccoon-dog handlers have probably gotten COVID the normal way.
1.2: Were the raccoon-dogs wild-caught?
Simon stats wrote:
What you say about raccoon dogs here is mistaken. Raccoon dogs are not a plausible intermediate host for sars-cov-2 on the basis of information that has been known since 2021. There are several considerations.
1. Xiao et al (2021) - https://www.nature.com/articles/s41598-021-91470-2 , which includes a co-author of Worobey et al (2022), a leading zoonosis paper states in table 1 that the raccoon dogs were wild caught in Hubei, not farmed as you assert in the piece. This alone rules out raccoon dogs as plausible hosts for two independently sufficient reasons. Firstly, there is unanimity in the literature that the bat ancestral virus to SARS-CoV-2 is in southern Yunnan or South East Asia. Everyone agrees with this, including Shi Zhengli. If a species was wild caught in Hubei, then there would be no explanation of how it acquired the ancestral bat virus, given that Hubei is 1000 miles from southern Yunnan.
Secondly, a mystery of sars-cov-2 is how it acquired the furin cleavage site that makes it so transmissible. There are 850 known sars-like coronaviruses, and only one with a furin cleavage site. According to private messages exchanged by proponents of zoonosis, the furin cleavage site could not have been acquired in the market because the density of animals was too low (only 3-4 per cage). When avian influenza acquires a furin cleavage site that occurs on farms with thousands of chickens densely packed, i.e. not in the wild and not when there are a handful of animals in cages in a market. https://usrtk.org/covid-19-origins/visual-timeline-proximal-origin/
2. Wang et al (2022) https://academic.oup.com/ve/article/8/1/veac046/6601809 also confirms that the raccoon dogs were wild caught in Hubei. What’s more, Wang et al (2022) tested 15 wild raccoon dogs of suppliers of Wuhan markets, including the Huanan market, in January 2020 and found them to be negative for SARS-CoV-2. On average, 38 raccoon dogs were sold across the four markets in Wuhan from 2017 to 2019. So, the 15 raccoon dogs likely comprised nearly the whole inventory of raccoon dogs that would have been supplied to the Huanan market at the time […]
Xiao et al (2021) has a list of species sold at the Huanan market. I would encourage you to read that list and suggest which animals you think are plausible, and I will tell you why they are not actually plausible.
Xiao (2021) Table 1 only says that some raccoon dogs in Wuhan had wounds, suggesting they were wild-caught. It makes no claims that all raccoon-dogs were wild-caught. There are dozens of raccoon-dog farms in the same province as Wuhan.
IIUC Wang (2022) says that 38 raccoon dogs were sold in Wuhan per month, not 38 during the whole two-year study period, so the claim that the traders in Wang represent the whole supply fails. [EDIT: Possibly I misunderstood this, see here]
The raccoon dogs were tested for active infection, not serology, so you would have had to catch a raccoon-dog in the act of having COVID to see anything. Remember, during SARS, scientists kept testing the farms that supplied the wet markets where it was spilling over, and most farms were negative.
The wildlife trade in China is complicated, and sometimes involves a permeable barrier between farm and wilderness. Farmed and wild animals are kept in the same pens, packed in the same crates, and sold at the same stalls.
Suppose you know that one of the animals in the middle crate on the right was caught in some safe, disease-free way, 500 km away, three months ago. How confident does that make you feel?
To answer the question about which animals in the Xiao paper are plausible: at least civets and bamboo rats. SARS spread back and forth in some kind of weird net between civets, raccoon dogs, and a bunch of made-up-sounding animals like “ferret-badgers” and “greater hog-badgers”. For all we know, COVID could have done likewise.
If all of this sounds desperate and wishy-washy, imagine an alien who comes to Earth, hangs out at Area 51, and catches COVID. She theorizes that she got it from humans. She’s heard that the humans at Area 51 came from schools, so she abducts fifteen humans from a nearby school and gives them COVID throat swabs. None of them are positive, so she announces that humans can’t be a COVID intermediate host. Other aliens suggest further testing, but she has already vaporized Earth, just in case, so the further testing never gets done.
Simon added:
Even the strongest proponents of the raccoon dog hypothesis have walked back their bold claims that raccoon dogs are the host.
I asked a scientist whose name is on some of the original raccoon-dog papers if this was true. He said:
I secretly root for other intermediate hosts. Bamboo rats or civets would be really fascinating and have flown under the radar. But it’s been really hard to bet against raccoon dogs. First we learn they can transmit and the virus didn’t change when transmitted between them (Freuling 2020)? Then turns out they’re sold in the market (Xiao 2021)? Then it turns out they’re freaking everywhere in the genetic data from the market, the most common mammal detected? Then it turns out the market animals aren’t from northern China fur farms? It’s been a tough road for those betting against them….
1.3: 92 Early Cases
There was a long multi-branching thread of arguments centered around 92 early cases, for example here:
My understanding of the situation: the first officially-confirmed case of COVID started December 11, 2019. Later in the pandemic, in 2021, the World Health Organization wanted to figure out if that was really the first, or whether there had been earlier ones. They scoured Chinese hospital records for illnesses that might be COVID during the two months before the official discovery (ie early October to early December). In particular, they asked Wuhan hospitals for records of any cases of fever, flu, respiratory illness, and pneumonia. The hospitals gave them 76,253 cases, because China is big and flu is common. This was slightly more cases than usual, but there was a normal flu spreading too, so the researchers didn’t find this very compelling.
Then they narrowed these cases down to those that were “clinically compatible” with COVID, and ended up with 92.
Then they went over those 92 more carefully, including “review by the external multidisciplinary clinical team” and blood draws from the former patients. They were able to track down 67 of the 92. The clinical team decided none of those 92 cases really resembled COVID, and the blood draws were all negative. They published this as the results of their study:
The retrospective search for cases compatible with COVID-19 illness identified 76 253 episodes with one of four indicator conditions. A rise in one of these conditions, [acute respiratory illness] (as well as [flu-like illness] and fever), was seen in this group of individuals in the over-60-year age group in early December. The clinical assessment of the 76,253 individuals revealed 92 cases clinically compatible with COVID-19. It is possible that the application of stringent clinical criteria, resulting in the identification of only 92 clinically compatible cases, may have decreased the possibility of identifying a group or groups of cases with milder illness. All the 92 cases were rejected as cases of SARS-CoV-2 infection on further clinical review.
None of these cases (where blood could be obtained) was positive on SARS-CoV-2 serological testing carried out more than 12 months later. The use of retrospective serological testing so long after the illness cannot be relied on to exclude the possibility of SARS-CoV-2 infection at the time of the presenting illness, given the possible drop in SARS-CoV-2-specific antibody over time and the associated reduced sensitivity of commercial assays. The possibility that earlier transmission of SARS-CoV-2 infection was occurring in this community cannot be excluded on the basis of this evidence.
In other words “we looked for early COVID, we didn’t find any, but we can’t promise we didn’t miss anything”.
On Twitter, Giles Demaneuf makes an interesting point. The researchers took the samples in 2021, when China was in Zero COVID. When the Wuhan outbreak was finally contained in early 2020, 4.4% of Wuhanites had contracted COVID. So isn’t it surprising that 0/67 of the former patients who the researchers tested had antibodies to COVID? The chance that 67 randomly-selected people in a population with 4.4% prevalence rate are all negative is only about 5%. Is this evidence of foul play?
No. See the conclusions section of the report, which said: “The use of retrospective serological testing so long after the illness cannot be relied on to exclude the possibility of SARS-CoV-2 infection at the time of the presenting illness, given the possible drop in SARS-CoV-2-specific antibody over time and the associated reduced sensitivity of commercial assays”.
You have a lot of COVID antibodies just after getting COVID. By a year or so afterwards, you might not have enough to detect. So it’s not surprising the WHO study didn’t detect any.
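For anyone who wants to check the "about 5%" figure, here is a minimal sketch. It assumes the 67 tested people are independent random draws from a population with 4.4% prevalence and that the test never misses a past infection. Both are simplifications, and the second is exactly what antibody decay breaks. The function name is mine, not anything from the WHO report.

```python
def prob_all_negative(n_tested: int, prevalence: float) -> float:
    """Chance that n_tested independent random draws are all negative.

    Assumes a perfectly sensitive test and independent sampling, which
    the real retrospective serology does not satisfy.
    """
    return (1.0 - prevalence) ** n_tested


# Figures quoted in the text: 67 people tested, 4.4% prevalence in Wuhan.
p_none = prob_all_negative(67, 0.044)
print(f"P(0 of 67 positive) = {p_none:.3f}")
```

If antibodies have decayed in most people after a year, the effective prevalence of detectable antibodies is well below 4.4%, and the same function gives a much less surprising answer.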
Why did they even try looking for antibodies? There seem to be two reasons not to: first, they should have known antibodies would decay after a year. Second, even if some of them did have antibodies, how would we know they weren’t just infected in spring 2020 like everyone else?
They don’t say. My guess: antibody decay is very variable. Some people’s antibodies might last more than a year. So if they found that way more than 4.4% of people had antibodies, that would be surprising and suggest that most of them had had COVID in autumn 2019. But instead they found that nobody had antibodies, which is consistent with one or two of them getting sick when everyone else got sick, and having their antibodies decay at the normal rate. But also, I think the antibodies were just intended to supplement the clinical review, and not be a very important part of their determination.
I think this study is moderately strong evidence that there wasn’t much COVID going around before December 2019. Doctors looked for cases, they winnowed them down into the cases that looked most like COVID, but when they examined those cases closely, they didn’t look enough like COVID to be interesting. I don’t think the antibody tests add or subtract much from this assessment.
I would be fine if someone else said they don’t think the WHO report provides much evidence either way. The main thing I want to insist on is that there’s no conspiracy to hide 92 previously-undiscovered cases. They searched really hard for potential cases, they subjected the most plausible candidates to further review, and then they decided those ones were not, in fact, COVID.
(You can read all of this here. It’s not a very good description and I’d be interested if someone has a more thorough writeup of the research.)
This was just one of many efforts that researchers made to try to identify pre-December-2019 COVID cases. For example, 30,000 people donated blood in autumn 2019, and the hospitals still had most of it. So they tested the blood samples for COVID antibodies and didn’t find any. I don’t think antibodies decay in stored blood samples (I might be wrong). There are 12 million people in Wuhan, so if even a few hundred people had COVID during that time, one of them should have turned up. None of them did.
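A rough way to put numbers on the blood bank argument, as a sketch under loud assumptions: donors are treated as a random sample of Wuhan’s 12 million residents, every infected donor is assumed to have detectable antibodies in the stored sample, and all 30,000 samples are assumed to have been tested. None of these is stated by the study; they are mine, and so are the names below.

```python
WUHAN_POPULATION = 12_000_000  # figure quoted in the text
DONORS_TESTED = 30_000         # figure quoted in the text


def prob_at_least_one_infected_donor(n_infected: int) -> float:
    """Chance that at least one of n_infected people is among the donors.

    Assumption: each infected person is independently a donor with
    probability DONORS_TESTED / WUHAN_POPULATION.
    """
    sampling_fraction = DONORS_TESTED / WUHAN_POPULATION
    return 1.0 - (1.0 - sampling_fraction) ** n_infected


# Hypothetical infection counts, chosen only to show the trend.
for n in (300, 1_000, 3_000):
    p = prob_at_least_one_infected_donor(n)
    print(f"{n:>5} infected -> P(at least one positive donor) = {p:.2f}")
```

With these inputs the signal is weak at a few hundred infections and becomes hard to miss once infections reach the low thousands, so the blood bank result bites hardest against claims of thousands of early cases.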
Finally, during COVID’s officially-recognized existence, its numbers doubled about once every 3.5 days. A month is about eight doubling times, so if COVID existed a month earlier than previously believed, it would be 2^8 = 256x more common than expected. This would be hard to miss! Nobody found evidence from excess mortality that COVID was 256x more common than expected.
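Here is the doubling arithmetic as a small sketch, so you can try other start dates. It assumes a constant 3.5-day doubling time, which is a simplification; the function name is hypothetical.

```python
DOUBLING_TIME_DAYS = 3.5  # figure quoted in the text


def growth_factor(days_earlier: float,
                  doubling_time_days: float = DOUBLING_TIME_DAYS) -> float:
    """How many times more cases you get if spread started days_earlier sooner.

    Assumes uninterrupted exponential growth at a fixed doubling time.
    """
    return 2 ** (days_earlier / doubling_time_days)


# Four weeks (roughly a month) and eight weeks earlier than believed.
for days in (28, 56):
    print(f"{days} days earlier -> {growth_factor(days):,.0f}x more cases")
```

Four weeks is exactly eight doublings at this rate, which is where the 256x figure comes from; pushing the start back eight weeks squares it.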
I’m using this version of the doubling time argument because it’s simple enough for me to understand, and I don’t have to worry about anyone trying to hide something in their complex model. It’s not exactly true, but it’s true enough to rule out COVID starting much before November 2019. If you want the fancy official version, it’s in Pekar 2021 and looks like this:
This alone isn’t fatal to lab leak. It’s perfectly possible for the lab to leak (let’s say) November 5th, the virus spreads a bit, and then a month later someone goes to the wet market, coughs on a vendor, and starts the officially recognized pandemic.
But if that were true, you’d expect (let’s say) 30 cases by early December. Let’s say the wet market vendor was exactly Case # 30. She infected the other wet market vendors, starting a pandemic with an obvious center at the wet market and lots of infected wet market vendors and patrons. What about Case # 29? If they were (let’s say) a barista, how come they didn’t infect people at their coffee shop? How come there wasn’t a second obvious cluster radiating out from a coffee shop, lots of coffee-shop-linked cases, etc? How come there weren’t 30 equally-sized clusters?
In order to avoid this, you either need to claim that the wet market was a perfect superspreader location, or that the pattern with lots of cases in the wet market and few-to-none anywhere else was a result of ascertainment bias. Saar made both those arguments during the debate, but I thought Peter rebutted them effectively.
1.4: COVID in Brazilian wastewater
Nicholas Halden (blog) writes:
What should we make of this study, which found the presence of covid in Brazilian wastewater in late 2019?
Consider the doubling times.
The study says that scientists working in late 2020 found COVID in samples of Brazilian wastewater from November 27, 2019. This was long before the first detected case of transmission in Brazil on March 13, 2020.
Between November 27, 2019 and March 13, 2020 is about 15 weeks, so roughly 30 COVID doubling times. Thirty doubling times with no lockdown is enough time for COVID to infect every single person in Brazil. If COVID had infected everyone in Brazil before the first recognized case, we would have noticed.
(again, COVID doubling time isn’t exactly invariably 3.5 days, but here we’re talking about numbers big enough that the exact details don’t matter very much)
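To see how much slack there is in this argument, here is a sketch that counts the days exactly and compares the number of available doublings with the number needed to reach everyone in Brazil from a single case. The population figure is my own round-number assumption (about 210 million), and the constant doubling time is the same simplification as before.

```python
import math
from datetime import date

DOUBLING_TIME_DAYS = 3.5          # figure quoted in the text
BRAZIL_POPULATION = 210_000_000   # rough assumption, not from the study

# Dates quoted in the text: wastewater sample and first detected transmission.
days = (date(2020, 3, 13) - date(2019, 11, 27)).days
doublings = days / DOUBLING_TIME_DAYS

# Doublings needed for one case to grow to the whole population.
doublings_to_saturate = math.log2(BRAZIL_POPULATION)

print(f"{days} days = {doublings:.1f} doubling times")
print(f"doublings needed to reach everyone: {doublings_to_saturate:.1f}")
```

The gap between the two numbers is the margin of error: the doubling time could be noticeably slower than 3.5 days and the conclusion would still hold.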
So if COVID was in Brazil on November 27, it must have fizzled out instead of going pandemic. How likely is that? If one person had COVID, it’s not too unlikely - not all COVID cases transmit it forward. If (let’s say) twenty people had COVID, it’s very unlikely - at that point, the law of large numbers takes over, and it would take a freak coincidence for every single patient to fail to infect anyone else. So almost certainly fewer than 20 people in Brazil had COVID on November 27.
So which is more likely - that somehow 20 people had COVID long before the virus was officially detected, and on a totally different continent, yet somehow a scientist looking through wastewater found the water from exactly those people and managed to detect the virus? Or that there was a sampling error, which happens all the time in these kinds of things?
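The "law of large numbers" step can be made concrete with a toy calculation. Suppose each early case independently fails to start a sustained chain with some probability q. The values of q below are hypothetical, picked to show the shape of the argument, not estimates of anything; the independence assumption is also mine.

```python
def prob_all_fizzle(n_cases: int, p_single_fizzle: float) -> float:
    """Chance that every one of n_cases independently fails to spark an outbreak."""
    return p_single_fizzle ** n_cases


# Hypothetical per-case fizzle probabilities, for illustration only.
for q in (0.5, 0.7, 0.8):
    one = prob_all_fizzle(1, q)
    twenty = prob_all_fizzle(20, q)
    print(f"q = {q}: 1 case fizzles {one:.2f}, all 20 fizzle {twenty:.6f}")
```

Even when a single case is quite likely to go nowhere, requiring twenty independent cases to all go nowhere drives the probability down fast, which is why a tiny seeded outbreak is the only version of the Brazil story that avoids a pandemic starting there.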
Peter wrote a blog post on some of these issues. He found that there were positive tests from wastewater samples as early as March 2019, which doesn’t fit anyone’s timeline, including lab leakers’. And most of these positives (including the Brazilian sample) contained later strains of the virus with mutations it picked up late in 2020. So these were almost certainly false positives from contamination.
1.5: Biorealism’s 16 arguments
Biorealism has a list of sixteen arguments, which he liked so much that he posted it three times in the ACX comments, twice on Less Wrong, twice on Manifold, and about a dozen times on Twitter under multiple account names. Some posts were slightly different from others, but a typical version is:
Importantly, Miller incorrectly claimed the N501Y mutation would result from passage in hACE2 mice (mixed them up with BALB/c mice). The major papers Miller relied on have been seriously challenged since the debate. See Stoyan and Chiu (2024), Weissman (2024), Bloom (2023) and Lv et al (2024). Overall the circumstantial evidence makes lab v plausible:
Peter admitted getting this wrong during the debate. I think this very minor point about mice mutations was approximately his only mistake in 15 hours of debating, and he admitted it as soon as he noticed. Biorealism somehow heard about this (obviously not through watching the debate, as we’ll see in a moment), then left about 20-30 comments starting with it, under various accounts, on various platforms, as if it somehow discredited Peter. This is making me somewhat less charitable to him and his 16 arguments than I would be otherwise.
1. Chinese researchers Botao & Lei Xiao observed lab origin was likely given the nearest known relatives to SARS-CoV-2 were far from Wuhan. Wuhan Institute of Virology (WIV) sampled SARS-related bat coronaviruses where the nearest relatives are found in Yunnan, Laos and Vietnam ~1500km away. They refuse to share their records.
The ancestral viruses of SARS were found equally far from where SARS spilled over into humans, so we know it’s possible (and likely) for viruses to travel that far.
2. Patrick Berche, DG at Institut Pasteur in Lille 2014-18, notes you would expect secondary outbreaks if it arose via the live animal trade. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10234839/
There are constant outbreaks of weird coronaviruses in animal handlers. See eg this paper, which estimates about 60,000 of these per year. None of these ever go anywhere, because the farmers are in rural areas that aren’t dense enough to sustain a high R0, and the epidemic fizzles out after a single digit number of cases. Any early outbreaks of COVID would have vanished into this long and mostly unnoticed list.
3. Molecular data: Only sarbecovirus with a furin cleavage site. Well adapted to human ACE2 cells. Low genetic diversity indicating a lack of prior circulation (Berche 2023).
Restriction site SARS-CoV-2 BsaI/BsmBI restriction map falls neatly within the ideal range for a reverse genetics system and used previously at WIV and UNC. Ngram analysis of the codon usage per Professor Louis Nemzer
https://twitter.com/BiophysicsFL/status/1667232580255490053?t=IJgitS5cw364ioclzVWxaA&s=19
The SARS2 backbone is very low in CG and CpG. While the 12-nt insert that gives it the FCS is extremely high in both. Almost as if it was some kind of chimera of a consensus sequence and a codon-optimized polybasic cleavage site?
https://twitter.com/BiophysicsFL/status/1752800486837678377?t=EpIRgyybJVaPgeMP5xdstA&s=19
Most of this was discussed extensively in the second session of the debate, which I recommend.
The CGG-CGG arginine codon usage is particularly unusual but used in synthetic biology.
I asked a synthetic biologist about this. He said:
“Nope. I would literally never do this if I was designing a small insert (maybe I wouldn’t notice if it happened by chance with ~1 in 25 odds in a naive codon optimization algorithm as part of a larger sequence). High GC% is bad. Tandem repeat is worse. Several other perfectly fine arginine codons. And I wouldn’t engineer a viral genome using human codon usage. An engineer would not do it.”
4. DEFUSE full proposal: virus 20% different from SARS1, consensus seq assembled with 6 segments, without disrupting coding seq, BsmBI order, FCS. SARS2: 20% different than SARS1, 6 evenly spaced fragments w BsmBI and BsaI restriction sites, FCS.
Jesse Bloom, Jack Nunberg, Robert Townley, Alexandre Hassanin have observed this workflow could have lead to SARS-CoV-2. Work often begins before funding sought or goes ahead anyway.
Re: 4 - Also scattered across second section of debate, also not going to retread
5. Market cases were all lineage B. Lv et al (2024) indicates there was a single point of emergence and A came before B. So market cases not the primary cases. See also Bloom (2021), Kumar et al (2022). Peter Ben Embarek said there were likely already thousands of cases in Wuhan in December 2019. https://t.co/50kFV9zSb6
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/34398234/
https://academic.oup.com/bioinformatics/article/38/10/2719/6553661
There was a Lineage A sample in the market, lab leak proponents just try to ignore/dismiss/conspiracize it away. The first two known Lineage A cases were very close to the market. Lv (is this even a real name? It sounds like Roman numeral? But I guess that’s what you expect in a country ruled by someone named Xi) found some weird COVID variants in Shanghai that might or might not mean anything; you can see some discussion of the implications here, but I don’t think they’re strong evidence either way. If A was first, it means some really weird coincidences have to happen to give us the spread rates and genetic clock data we get, but they’re not necessarily weirder in the zoonosis hypothesis than the lab leak one.
The claim that there were “thousands of cases in Wuhan in December 2019” is very easy to disprove by doubling rate arguments like the one above, by the blood bank study mentioned above, by the WHO’s failed case search, and by many other lines of argument.
6. Evidence for lineage A in the market is based on a low quality sample according to Liu et. al. (2023).
I really think lab leakers need to decide whether they think China is a sinister actor trying to cover up the truth, or whether they should trust every offhand comment by Chinese government officials as gospel. Dr. Liu doesn’t explain in what sense he thinks the Lineage A sample is “low-quality”, and the Western scientists who I asked about this said they didn’t understand this complaint and that the sample was fine. A Western team re-analyzing the same sample describes it as “conclusively contain[ing] Lineage A.” I think most lab leakers have switched from trying to deny the genetics to claiming that this was “contamination”, which also doesn’t make sense (the sample is genetically very early). Note that aside from this sample, the first two Lineage A cases discovered were both very close to the wet market.
7. Bloom (2023) shows market samples do not support market origin. There is also no evidence of transmission in the claimed susceptible animals elsewhere. https://academic.oup.com/ve/advance-article/doi/10.1093/ve/vead089/7504441
Discussed extensively in my article as well as the first section of the debate.
8. Lineage A and B only two mutations apart. François Ballox, Bloom and Virginie Courtier-Orgogozo note this is unlikely to reflect two separate animal spillovers as opposed to incomplete case ascertainment of human to human transmission (Bloom 2021).
Discussed extensively in my article as well as the first section of the debate.
9. Sampling bias. George Gao, Chinese CDC head at the time, acknowledged to the BBC stating they may have focused too much on and around the market and missed cases on the other side of the city. David Bahry outlines the documented bias. Michael Weissman has shown this mathematically.
https://journals.asm.org/doi/10.1128/mbio.00313-23
https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnae021/7632556
Re: Dr. Gao, see above comment about Chinese officials. See the section Ascertainment Bias below for why I disagree with this specific claim, which also addresses the Michael Weissman argument.
10. Spatial statistics experts show the Worobey claim the market was the early epicentre was flawed.
https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954
Re: 10 - See Confirmation Of The Centrality Of The Huanan Market Among Early COVID-19 Cases, a response to the paper you cite:
The centrality of Wuhan’s Huanan market in maps of December 2019 COVID-19 case residential locations, established by Worobey et al. (2022a), has recently been challenged by Stoyan and Chiu (2024, SC2024). SC2024 proposed a statistical test based on the premise that the measure of central tendency (hereafter, “centre”) of a sample of case locations must coincide with the exact point from which local transmission began. Here we show that this premise is erroneous. SC2024 put forward two alternative centres (centroid and mode) to the centre-point which was used by Worobey et al. for some analyses, and proposed a bootstrapping method, based on their premise, to test whether a particular location is consistent with it being the point source of transmission.
We show that SC2024’s concerns about the use of centre-points are inconsequential, and that use of centroids for these data is inadvisable. The mode is an appropriate, even optimal, choice as centre; however, contrary to SC2024’s results, we demonstrate that with proper implementation of their methods, the mode falls at the entrance of a parking lot at the market itself, and the 95% confidence region around the mode includes the market. Thus, the market cannot be rejected as central even by SC2024’s overly stringent statistical test.
I think this response is pretty strong. In one analysis, they show that even though the other paper’s methodology is worse than theirs, if you apply it correctly (instead of inappropriately excluding various cases like the paper’s authors did), the center of all early cases in Hubei province lands on the wet market parking lot. In another analysis, they show that the other paper’s recommended tests wouldn’t have correctly pointed to the offending water pump in the famous John Snow cholera outbreak, but theirs would have.
Still, I think it’s useful to supplement fancy statistics with normal common sense, so I recommend just looking at the map of early cases:
…and deciding whether you think the assumptions behind a specific statistical test are likely to debunk the idea that cases are centered around the wet market.
11. Wuhan used as a control for a 2015 serological study on SARS-related bat coronaviruses due to its urban location.
I don’t know why this point is supposed to matter. If you mean that Wuhan isn’t directly exposed to bats, nobody ever said it was. The zoonotic theory is that wildlife carted in from other areas of China started the pandemic in the wet market.
12. Superspreader events also seen at wet markets in Beijing and Singapore (Xinfadi and Jurong).
This was discussed very extensively in the debates, both in section 1 and section 3. Wet markets weren’t “superspreader locations” - in fact, the disease spread no more quickly there than anywhere else. They were the first place in those cities that the pandemic started, due to contaminated animal products. If anything, this supports zoonosis. See also my discussion with Saar on this point below.
13. WIV refuse to share their records with NIH who terminated subaward in 2022. Wider suspension over biosafety concerns. https://www.bloomberg.com/news/articles/2023-07-18/us-suspends-wuhan-institute-funds-over-covid-stonewalling
Although WIV has not been especially forthcoming, some of their databases were leaked in various ways and showed that they did not have any viruses capable of transforming into COVID.
14. PLA involvement at WIV and MERS research prior to SARS-COV-2. MERS features several similarities with SARS-CoV-2.
I can’t even tell what conspiracy theory you’re trying to propose with this one; if you spell it out I can try to explain why it might be false.
15. SARS1 leaked several times and SARS-COV-2 has leaked from a BSL-3 lab in Taiwan.
Agreed that SARS leaked several times. It also spilled over from animals several times. During the debate, a lab leak rate of once per lab per 500 years was proposed (everyone agreed to steelman this by 10x for WIV numbers); I would be interested to know whether anything about the study of SARS challenges that number.
16. Unpublished infectious clone identified from Wuhan contradicting arguments such reverse genetics systems would be published.
https://www.biorxiv.org/content/10.1101/2023.02.12.528210v1.full
I asked some scientists about this paper and here’s what they told me. Wuhan University sequenced some rice. In the middle of the sequence, there’s an unexpected sequence from a common coronavirus, HKU4. The most likely explanation is that someone else in Wuhan was working on the coronavirus and there was cross-contamination. Plausibly this is Wuhan Institute of Virology, who is known to work with coronaviruses. This is cool detective work, but it’s not clear what it’s supposed to prove. I think some lab leakers are using it to prove that WIV can do reverse genetics, but they admitted this already in a published paper so that’s not too helpful. I think others are using it to prove WIV had “secret viruses” in their catalogue, but the rice virus wasn’t secret, it was HKU4, which is common and which WIV has already published papers about.
1.6: DrJayChou’s 7 Arguments
Once again, I cannot stress enough how much better a take you might have on this debate if you watch it.
-
“The first known case predates the market outbreak by a month” - this is not the consensus position. I cannot say for sure what Dr. Chou means by this, but I suspect he’s referring to one of the many claims to this effect that Peter effectively debunked during the debate (Connor Reed, Mr. Chen, the 92 cases, Brazil, etc).
-
“Genetic analyses put the realistic start date around Sept/Oct” - see the section on Brazil above for the many reasons this is impossible. Pekar, the most-cited genetic analysis, puts the origin in November. Dr. Chou doesn’t cite his sources, so I don’t know what he’s referring to, but it certainly hasn’t entered the knowledge of the reality-based community.
-
“The wet market cases were concentrated around a mahjong room”. CTRL+F “mahjong room” in the original post. The mahjong room itself tested negative, and the “epicenter” mechanism isn’t fine-grained enough to be useful (CTRL+F “Central Park” in the original post for a discussion of why this is).
-
“No animals at the market (or in Wuhan) tested positive.” No raccoon-dogs were tested. In SARS1, which we know was zoonotic, they also never got positive animal tests at several wet markets where they knew spillovers had occurred. Again, imagine the alien, coming to Earth and taking a dozen randomly selected cats (not even humans this time!) and finding none of them had COVID.
-
“No raccoon-dogs anywhere on the planet have tested positive, beyond those being forcibly infected to do experiments”. False, this paper discusses an outbreak of COVID among raccoon-dogs on a farm in Poland.
-
“They aren’t capable of catching or spreading COVID”. False, here’s a paper on the subject which says that “Raccoon dogs are susceptible to and efficiently transmit SARS-CoV2”.
-
“The clustering around the wet market in Wuhan . . . was just a product of oversmoothing”. Here is a map of December 2019 COVID cases. I recommend ignoring the contour lines and just looking at the dots. How could dots be oversmoothed?:
-
“At the time of the wet market outbreak, COVID was already spreading around the world”. Dr. Chou doesn’t give a source for this, but I think it’s referring to the Brazilian data already discussed earlier.
I think he had more of these somewhere else on the subreddit, but I’m not feeling like this is extremely worth my time.
1.7: How much should coverup worry us?
GStew writes:
I personally agree it was not a lab leak but a pretty important was lost in the debate (or at least poorly factored in). Namely China was hiding evidence. While this may impact priors…the bigger impact is that, if it was a lab leak we only know what information was released (which almost certainly would be anything that boosted their preferred narrative) and do not have all the evidence that was presumably withheld (which would be all the evidence they could suppress that went against the preferred narrative).
This was discussed a bit, and Peter’s position is that China was a bad actor, but it wasn’t specifically trying to suppress lab leak / favor zoonosis. As often as not, it was trying to suppress zoonosis, or just swat anyone who spoke up about anything.
During SARS, the international health community criticized China for having wet markets where zoonotic spillovers could happen. China promised to clean them up, then mostly didn’t (for example, the raccoon-dog vendor at Wuhan was fined a few times, but kept operating). China’s first priority was to prevent people from accusing them of failing to clean up wet markets. For example, here’s what happened to Li Wenliang, the first person to raise the alarm about a mysterious new epidemic centered around the wet market, (source):
On 3 January 2020, police from the Wuhan Public Security Bureau investigating the case interrogated Li, issued a formal written warning and censuring him for “publishing untrue statements about seven confirmed SARS cases at the Huanan Seafood Market.” He was made to sign a letter of admonition promising not to do it again. The police warned him that any recalcitrant behavior would result in a prosecution.
Because of that, a lot of what we know about the possible zoonotic origins of the epidemic is in spite of China, not because of them. For example:
-
China killed all the animals at the market after the pandemic started, without telling anybody which ones they were. We know more about them because a Chinese researcher had been documenting them for an unrelated project about tick-borne diseases. He sent his work to Western journals, and after a mysterious delay they eventually published it.
-
China denied that there were raccoon-dogs at the market. In addition to the researcher’s data, we know they were lying because we had virologist Eddie Holmes’ travel photos of them.
-
We have WIV’s catalog of viruses because they tried to publish it in a Western journal just before the pandemic, the journal rejected it, and then years later they realized what they had.
My impression is that China (realistically Wuhan City Government, I don’t think Xi would have been involved at this early stage) made a vague attempt to cover up the wet market early on - but that it wasn’t their Department Of Covering-Up’s finest work. For example, when the WHO asked for files on early cases, China gave them what they wanted, and then Western scientists were able to plot their addresses and find that they centered on the wet market.
Is it possible that China was trying to cover up a lab leak, and, in order to fool outsiders, pretended to be covering up the wet market, while actually feeding international observers datasets massaged to make the wet market look more likely? Anything is possible. But as a sign of the Chinese government’s level of competence, remember that they didn’t put a travel ban on Wuhan until January 23, ie after many Wuhanites had left to visit family for the Lunar New Year holiday. So they would have to be executing their brilliant fake-cover-up-to-detract-from-the-real-coverup scheme while also being too stupid to prevent Wuhanites from taking the train to Beijing.
Two more short points:
First, when the debate came to the question of China’s cover-up competence, Peter presented this photo:
This is the Wuhan Institute of Virology’s coronavirus research group, out for a team dinner at a local restaurant on January 15th 2020 (ie a month after the pandemic started). This isn’t the most rational probabilistic evidence in the world. But we’ve already seen people take the rational probabilistic evidence twenty different directions. So let’s ask the same question Peter did - do these look like people who secretly know they just started the worst pandemic in modern history?
If they secretly knew they’d just started the worst pandemic in modern history, wouldn’t they at least be wearing masks?
I think China, WIV, etc, were as clueless as the rest of us, at least at the beginning of the pandemic when a lot of this origins evidence was being collected. They tried to shove the raccoon-dogs under the bed, to prevent anyone from accusing them of bungling their SARS commitments. But they weren’t really up to anything else.
A more thorough argument would go over specific pieces of evidence, examine when they were collected (ie whether it was before or after China started caring enough about COVID to get their competent people involved), and how China could have rigged each.
Second, Peter (privately) discussed a Chinese conspiracy theory of his own with me by email.
Here’s an article from a random Chinese blog (you’ll have to Google Translate it). It describes China’s preferred theory of COVID origins: it was started by imported lobster from Maine (really!) The lobster arrived in the wet market, the wet market got sick, and diabolical Americans trying to hide their own complicity blamed it on raccoon-dogs and lab leaks and what-have-you.
The article includes this graphic:
It’s a map of which vendors at the wet market got COVID and where their stalls are. In many ways, it matches the maps that China gave to Western scientists. In other ways, it’s better - it includes information that Western scientists only inferred months after this article came out. But also, unlike the maps provided to Western scientists, it says the raccoon-dog vendors got COVID - something China has previously denied, and which would significantly raise the odds of a natural origin.
Is this China’s internal record of what really happened at the wet market? Did they fail some kind of critical communication about how classified it should be, so that a guy in their propaganda department accidentally released it publicly in a stupid article about lobsters? That would be so embarrassingly weird that Peter didn’t even try bringing it up in the debate. But in a response to a question about coverups, sure, let’s get conspiratorial.
1.8: Have Worobey and Pekar been debunked?
Worobey and Pekar are the two most prolific pro-zoonosis scientists, and many of the points in Peter’s argument were based on them. Several people criticized my writeup for not mentioning that these were “debunked”, for example:
Worobey and Pekar have about a million papers, each of which makes many different points, so I don’t know for sure what these are referring to. But a few other people make more specific claims, and I’ll respond to them here:
-
Pekar’s paper on the two lineages originally estimated 99-1 odds of double spillover. Someone found a coding error that reduced it to 6-1 odds, Pekar admitted the error, and the paper has been updated. Other people have made other criticisms which I haven’t investigated in depth and am agnostic on. I don’t think my argument depends too much on the details of this paper. The argument for B earlier than A is that it infected twice as many people and has more genetic diversity. It’s possible these things happened by chance and A preceded (and mutated to) B. In that case, I still think the most likely scenario is that A was released at the wet market, infected a customer or two, mutated to B, and infected a vendor. A then spread among the neighborhoods near the market, and B spread among market vendors.
-
Worobey’s paper includes an erratum saying he messed up the names of some of the supplementary data files. One of the data files also had the wrong number of samples somehow, but giving it the right number of samples didn’t affect results in any way. Lots of lab leak proponents have been tweeting that “Worobey finally admitted his paper was wrong!”, but I think they just mean this erratum.
-
Several people accused Pekar of ignoring intermediate lineages. Peter addressed this by finding these were mostly sequencing errors. There’s a very new paper about potential intermediate lineages which might change this debate; my provisional assessment is that it’s boring but I’m waiting to see if other people have more thoughts on it.
-
Several people linked to Biorealism’s 16 arguments as examples of Pekar/Worobey being “debunked”. I tried to address these above.
-
Several people claimed there was ascertainment bias in their papers. I try to address this below.
-
If there are other claims about Pekar and Worobey being “debunked”, I don’t know them.
In general, I find claims about “debunking” annoying even when they’re made by Important People who theoretically have the authority to make pronouncements. I think they’re even more annoying when they’re made by self-styled rebels who admittedly disagree with the scientific consensus.
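For anyone who wants to see what the coding-error correction in the Pekar bullet above amounts to in probability terms, here is a minimal sketch. It only converts the two odds figures quoted above (99-1 and 6-1) into probabilities and takes their ratio. The function name is mine, and it assumes "N-1 odds" means N to 1 in favor of double spillover. It is not a reconstruction of Pekar's actual model.

```python
from fractions import Fraction


def odds_to_probability(in_favor, against):
    """Convert 'in_favor to against' odds into a probability (hypothetical helper)."""
    return Fraction(in_favor, in_favor + against)


# Odds of double spillover as quoted in the bullet above.
original = odds_to_probability(99, 1)   # before the coding error was fixed
corrected = odds_to_probability(6, 1)   # after the fix

# How much weaker the corrected evidence is, measured as a ratio of odds.
shrink = Fraction(99, 1) / Fraction(6, 1)

print(f"original:  {float(original):.3f}")   # 0.990
print(f"corrected: {float(corrected):.3f}")  # 0.857
print(f"odds shrank by a factor of {float(shrink)}")  # 16.5
```

So the correction takes this one piece of evidence from about 99% to about 86%, assuming you read the odds the way I do here. That is weaker, but it still points the same direction.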
1.9: Was there ascertainment bias in early cases?
observeralt writes:
The judges put huge weight on early cases being near the market. Michael Weissman’s recent paper showing ascertainment bias in early case data is also significant as Miller relies on the sampling being random. Chinese CDC head at the time George Gao acknowledged this to the BBC last too. They focused too much on and around the market and missed cases on the other side of the city.
Here’s the Worobey map everyone is debating:
Before going further, I recommend reading page 8 of the supplementary text of Worobey’s paper, titled “Robustness Of Statistical Test Results To Ascertainment Bias”, or pages 14-17, “Additional Data Related To Case Ascertainment Biases”, which explain all the reasons he thinks this isn’t true. I promise you aren’t the first person to think that maybe Worobey could be contaminated by ascertainment bias.
If that still doesn’t help, Worobey talks more about his strategy for avoiding ascertainment bias here. Most important, he counted only cases from December; the market connection was discovered December 30 and added to diagnostic criteria January 3. This doesn’t mean bias is impossible - some of these points are people who caught COVID on December 31, but only got diagnosed January 4 after the new diagnostic criteria were added. But most cases are pre-criteria. And Worobey looked at various subsets of pre-criteria cases and found they were all at least as market-focused as the overall set. For example, he looked at the earliest COVID records in one Wuhan hospital system:
10 of these hospitals’ 19 earliest COVID-19 cases were linked to Huanan Market (∼53%), comparable both to Jinyintan’s 66% (of 41 cases) (4) and to the WHO-China report’s 33% of 168 retrospectively identified cases within Wuhan across December 2019 (1). Regarding cases at the Wuhan Central Hospital and HPHICWM, patients with a history of exposure at Huanan Market could not have been “cherry picked” before anyone had identified the market as an epidemiologic risk factor. Hence, there was a genuine preponderance of early COVID-19 cases associated with Huanan Market.
Likewise, a study conducted January 2 (so not impacted at all by the January 3 criteria) found that 27 of 41 known patients had market links.
Likewise, the first five cases were all detected in the market, and it doesn’t even make sense to talk about ascertainment bias for these.
What is the Weissman paper that observeralt is talking about? It argues: if the pandemic started at the market, each seemingly non-market-linked case must ultimately derive from a market-linked case. Therefore, we should expect non-market-linked cases to require more steps than market-linked cases. Therefore, they should be further away. But if we look at the map above, we see that not-market-linked cases are closer to the market than market-linked cases. So something must be wrong, and that something might be ascertainment bias.
(at least this is my interpretation of Weissman’s argument, which is more mathematical; read the paper to make sure I’m getting it right).
This is a weirdly spherical-cow view of an epidemic, worthy of a physicist. It’s easy to think of reasons the linked-cases-should-be-closer rule might not hold. For example, suppose that on their lunch break, market vendors go have lunch at restaurants surrounding the market. They infect people in these restaurants, who then infect their friends and family. But these people never went to the market themselves. Now there are a bunch of non-market-linked cases immediately surrounding the wet market.
But also - of all markets in Wuhan, Huanan sold the most weird wildlife. Suppose someone in the boonies gets a craving for raccoon-dog one day, their local convenience store doesn’t have it, so they hop on a bus and go downtown to the city’s main wet market. Then they get infected with COVID. Now there’s a wet-market-linked case in the boonies.
In other words, we should expect two modes of spread: general geographic diffusion from the epicenter, and people from far away who made specific trips.
If this still doesn’t seem obvious to you, consider - usually when COVID first arrived in America or Brazil or wherever, they were able to trace it back to a specific person from Wuhan who visited the country. If I was the first person in America to get COVID, I could usually say “Oh, it must have been my business meeting with Mr. Chin from Wuhan”. At the same time, if someone from the next town over from Wuhan got COVID, they probably couldn’t trace it back to a specific Wuhanite - everyone from Wuhan is coming and going so often that their town is just full of COVID in general.
So I don’t think Weissman’s paper proves anything, and I think the general pattern of blue and orange dots suggests ascertainment bias wasn’t playing a role.
So why does George Gao say that there was ascertainment bias? I looked for the direct source of the Gao quote and couldn’t find it; if someone else is able to, please let me know, since I’d be interested in exactly what he thinks about this.
1.10: Connor Reed / Gwern on cats
Gwern wrote:
Yes, I don’t understand this (paraphrased) claim by Peter:
> He also told the Mail that his cat got the coronavirus too, which is impossible.
‘Impossible’, thus implying the man was lying? I was under the impression that, quite aside from cats having tons of coronaviruses in general (FCoV being a particularly serious threat to young cats, which also seems to be a remarkable case study of the harms of the FDA), that it was not just not ‘impossible’ for domestic pet cats to get the coronavirus too, it was routine for them to get COVID-19, and even other cat species in zoos have tested positive and this was true very early in the COVID-19 pandemic and quite well publicized and well known (eg April 2020 https://www.nationalgeographic.com/animals/article/tiger-coronavirus-covid19-positive-test-bronx-zoo ). This was a topic of interest to me at the time because I like cats and have a cat and was wondering what the implications of me being inevitably infected might be for my cat, and so I remember this quite well despite my general attempt to remain ignorant of as many COVID-19 matters as possible… And double-checking now to see if all of these reports were somehow false positives or faked, I continue to see everyone like the CDC stating that it is still totally possible and routine for cats in close contact with infected humans (you know, like a pet cat) to be infected with COVID-19: https://www.cdc.gov/healthypets/covid-19/pets.html
Given that Peter has supposedly spent years autistically researching every last detail and this detail in particular in order to discredit that British dude, I’m experiencing sudden Gell-Man Amnesia here about the rest of his claims, as well as the supposed experts evaluating Peter’s claims if they didn’t flag that (I have not checked).
This is in the context of Connor Reed, a British man who claimed to have gotten COVID on November 25 - which, if true, would be surprisingly (though not impossibly) early according to the zoonosis narrative. Peter argued Connor’s story didn’t hold up, and one of Peter’s points centered around Connor’s claim that his cat might have caught COVID from him and died.
Unfortunately, I mis-quoted Peter. I said Peter argued it was impossible for his cat to get COVID-19 (false). His actual statement was that it’s extremely rare for a cat to die of COVID-19.
Peter, Gwern, and I then proceeded to get very confused about the exact claims and timeline, which I think is because Connor said totally different things in different interviews:
-
In an interview with Wales Online on 2/4/2020, he said that “my kitten caught the feline coronavirus and developed pneumonia and died, but I don’t think I caught it from her. I think that was just coincidence.”
-
In an interview with the Daily Mail on 3/4/20, he said that his kitten died, after a two-day illness, on the ninth day of him (Connor) having COVID. He said “I don’t know whether it had what I’ve got, or whether cats can even get human flu” (speaking as if it was him in the past, who thought he had flu because he hadn’t heard of COVID yet).
This is a weird inconsistency! In the Wales interview, the cat got it before him (at least that’s how I interpret “I don’t think I caught it from her”). In the Mail interview, he got it nine days before the cat.
In the Wales interview, it’s “the feline coronavirus”. In the Mail interview, he doesn’t know what the cat got and speculates that it might have been COVID. But also, if it was “the feline coronavirus”, how would he know? Wouldn’t you need a vet to diagnose that? But in the Mail interview, he said he didn’t leave the house for a week around the time his cat was sick. So how did he go to the vet?
It gets worse. In the Mail interview, he gave a day-by-day account of his sickness. On Day 12, he goes to Zhongnan University Hospital. He says:
As soon as I get there, a doctor diagnoses pneumonia. So that’s why my lungs are making that noise. I am sent for a battery of tests lasting six hours.
And then says that he went home either that day, or the next.
Day 13: I arrived back at my apartment late yesterday evening. The doctor prescribed antibiotics for the pneumonia but I’m reluctant to take them
But an interview with FOX News said:
He said he went to the hospital after he struggled to breathe and experienced a bad cough, both of which are signs of the pneumonia-like illness.
“I was stunned when the doctors told me I was suffering from the virus. I thought I was going to die but I managed to beat it,” he told the outlet, adding he was hospitalized at Zhongnan University Hospital for two weeks following his diagnosis.
In his earlier story, he was at the hospital for less than a day. Now it’s two weeks.
But also, the doctors “told [him he] was suffering from the virus”, but this is impossible - the virus hadn’t been discovered yet. The whole point of Saar bringing him up is that he’s a supposed anomalous case before the official pandemic. So how did the doctors tell him this?
In the Mail interview, he tells a different story of how he learned he’d had COVID:
Day 52: A notification from the hospital informs me that I was infected with the Wuhan coronavirus. I suppose I should be pleased that I can’t catch it again — I’m immune now.
Day 52 would be January 11th. So I think he’s saying that, a month after he recovered, the hospital “informs” him it was coronavirus. Charitably, maybe they kept his samples (really?), then re-tested them after COVID was discovered, found he had it, and told him. But, at a time when the eyes of the world medical establishment were fixated on Wuhan and its new pandemic, didn’t they think to tell anybody that they’d confirmed a case two weeks before any other known cases? Just called Connor and said “Hey, you’re the first ever COVID patient, congrats” and did absolutely nothing else? And then he didn’t show up in any of those WHO searches for early cases?
There’s one more weird inconsistency. Connor said in his interview that he thinks he might have gotten COVID at “the fish market”:
Maybe I caught the coronavirus at the fish market. It’s a great place to get food on a budget, a part of the real Wuhan that ordinary Chinese people use every day, and I regularly do my shopping there. Since the outbreak became international news, I’ve seen hysterical reports (especially in the U.S. media) that exotic meats such as bat and even koala are on sale at the fish market. I’ve never seen that.
This sounds to me like a reference to the Huanan Seafood Wholesale Market, ie the wet market with the raccoon-dogs where the first confirmed cases were found. He says “the fish market” like he expects us to know which one he means, and adds that “since the outbreak became international news”, he’d seen “hysterical reports” in US media about it. US media was covering the Huanan Market because that’s where the pandemic was first found; it didn’t cover any other fish market in Wuhan.
During the debate, Saar objected that Connor lived on the opposite side of Wuhan from the wet market; it would have taken him about an hour to get there. It would be weird to “regularly” do your shopping somewhere an hour away. Saar speculated that Connor meant somewhere closer to his home.
I can’t deny that it’s weird to do your regular shopping at a market an hour away, but here it really sounds like he’s referring to the wet market where all the cases started.
But also, isn’t it weird that the first ever coronavirus case is a white person? And that he’s 25 years old, yet was hospitalized with COVID (about 1% of people in their 20s with COVID require hospitalization)?
I think the best explanation for all of this is that Connor was making this all up. He told whatever story sounded cool at the time, and all of his stories ended up contradicting each other or making no sense. This would also explain why he said he had COVID at a time when, by the standard narrative, it either didn’t exist yet or was confined to a single-digit number of people.
1.11: Rootclaim Response
Saar and Rootclaim wrote a response to my earlier post. You can read it at COVID origins debate: response to Scott Alexander. I’ll post the introduction and first summary; you can go to the link for the rest of the case, and I’ll respond to parts I disagree with below.
We were initially excited to have Scott cover the story, hoping that someone with an affinity to probabilities would like to dig into our analysis and fully understand it. Sadly, Scott seemingly hadn’t enough time to do so and our exchange focused on fixing factual mistakes in earlier drafts of his post and explaining why rules-of-thumb in probabilistic thinking that he proposed do not work in practice. We did not get to discuss the details of our analysis, resulting in a post that is essentially a repeat of the judges’ reports with extra steps.
His post has two main messages:
It’s hard to get probabilistic inference right – we fully agree with this and ironically his post is a great example, containing many probabilistic inference mistakes, some of which are listed below. While we agree it’s hard, our experience taught us that it is far from impossible.
Zoonosis is a more likely hypothesis due to being better supported by the evidence – This is completely untrue, but to fully understand it one has to commit to learning how to do probabilistic inference correctly, which Scott could not free enough time to do.
Instead of explaining the whole methodology and how it applies to Covid origins, which will take too long, we will focus on the main mistake in all the analyses in Scott’s post – believing that the early cluster of cases in the Huanan Seafood Market (HSM) is strong evidence for zoonosis. Scott prepared a very useful table comparing the probabilities various people gave to the evidence about Covid origins (discussed later in more details). It nicely shows how the zoonosis conclusion stands on this single leg, and once it is removed, lab-leak becomes the winning hypothesis (Scott specifically will flip to 94% lab-leak).
Having explained this many times in many ways, we realize by now that it is not easy to understand, but we promise that those who make the effort will be rewarded with a glimpse of how much better we can all be at reasoning about the world, and will be able to reach high confidence that Covid originated from a lab.
Given this point’s importance, we will explain why HSM is negligible as evidence, using three levels of detail: a simple version, a summarized version and a detailed version.
Simple Version
1. The zoonosis hypothesis fully depends on the claim that it is an extreme coincidence that the early Covid patients were in HSM – a market with wildlife – unless a zoonotic spillover occurred there.
2. The rest of the evidence strongly supports the lab-leak hypothesis, so if this claim is mistaken, lab-leak becomes the most likely hypothesis.
3. There are multiple cases where a country has had zero Covid cases for a while, and then a cluster of cases appears in a seafood market. In all these outbreaks, there is no contention that the source is not zoonotic, as it is genetically descended from the Wuhan outbreak.
4. Since zero Covid periods are fairly rare, it is impossible to have so many market outbreaks unless there is something special about these locations. We discuss below what that may be, but whatever it is, it likely also applies to HSM, which is the largest seafood market in central China.
5. This collapses the ‘extreme coincidence’ claim, which as explained above, turns lab-leak into the leading hypothesis.
My strongest disagreement is with his Point 3 - the inference from other seafood-market-based COVID spread events. Saar writes:
A common objection to this method is that these outbreaks are caused by cold-chain products brought into these markets. However, this still fails to explain why markets form these early clusters and not the many other places where cold chain products are delivered to. Additionally, this only demonstrates the importance of cold wet surfaces in preserving SARS2 infectivity, further strengthening the hypothesis in method 1 that a crowded location with many wet surfaces like HSM is highly conducive for rapid SARS2 spread. Last, it also opens the possibility that the HSM outbreak was also caused by cold-chain products. This would reduce the significance of Wuhan being the outbreak location (as the product could have come from anywhere), but since the other evidence for lab-leak is so strong, Wuhan can be given no weight and still lab-leak would be highly likely – Rootclaim’s conclusion will only drop from 94% to 92%.
Most of these outbreaks have been traced back to either a migrant worker (eg a fisherman from a country with COVID sells fish at the market of a country with Zero COVID) or a cold chain product. For example, here’s Dai et al on the Xinfadi outbreak, the most important event of this type:
According to a joint publication by the Beijing CDC and 13 research institutions, the outbreak at Xinfadi Market was likely to be initiated by fomite transmission from contaminated foods imported via cold-chain logistics (Pang et al., 2020; Beijing Daily, 2020b). Based on the epidemiological investigations at the Xinfadi Market, the researchers preliminarily concluded that booth #S14 in the aquaculture product selling area on the basement floor of the primary trading hall was the source of the initial transmission. Specifically, five customers were tested positive for IgG/IgM antibodies against SARS-CoV-2 in serological screenings, all of whom visited booth #S14 on May 30 and 31, 2020. On May 30, 2020, the owner of booth #S14 procured imported and fully packaged salmon from a company’s cold storage warehouse, then cut and processed the salmon for sale at the Xinfadi Market. Laboratory tests showed that sample swabs from five salmon fish from this supplier were tested positive by examining all salmon in the original sealed packages (n = 3582) in the cold storage facility. Viral genome sequencing showed that the viral strain isolated from one of the positive salmon swabs was homologous to that isolated from the infected persons and environmental samples at the Xinfadi Market (Beijing Daily, 2020b). The joint study reported that an ancestral strain isolated from the Xinfadi Market in Beijing was markedly different from the strains identified in two preceding outbreaks in China and the sequences obtained in March 2020 in Beijing. Phylogenetic analysis assigned the ancestral Xinfadi strain to clade B.1.1. Given the fact that the ancestral sequences were mainly identified in Europe, the strain was more likely to be imported to Beijing rather than derived from strains previously circulating in China (Pang et al., 2020).
I know China has a bias towards believing frozen food COVID explanations, but this all sounds pretty convincing to me.
Why is it more often markets than other places with cold chain products? Partly it’s the migrant workers - a lot of seafood markets are right next to seaports, and the contact tracing eventually traces back to a fisherman who came in through the seaport - I don’t think this is any more mysterious than epidemics often starting via airport or any other transportation hub. But even just keeping the focus on cold chain products, there have also been outbreaks in seafood distribution warehouses, on docks, and in a seafood processing work area. Markets have many more people than any of those locations, and maybe (total speculation) cutting on cutting boards could aerosolize bits of fish.
The strongest evidence that the Wuhan / Huanan Seafood Market epidemic wasn’t caused by migrant workers or imported seafood products is that there was no previous COVID-infected source of workers or seafood. If there had been, we would have noticed when the outbreak there spread (see Section 1.4 on Brazil).
Responses to a few of Saar’s other points below:
How many locations other than markets provide an interface with wildlife? Were markets actually identified in advance to be high-risk spillover locations or only in retrospect?
I think scientists had called wet markets as an especially dangerous potential transmission location in advance. See for example Infectious Diseases Emerging From Chinese Wet Markets, published in 2006, which says:
» “In Chinese wet-markets, unique epicenters for transmission of potential viral pathogens, new genes may be acquired or existing genes modified through various mechanisms such as genetic reassortment, recombination and mutation. The wet-markets, at closer proximity to humans, with high viral burden or strains of higher transmission efficiency, facilitate transmission of the viruses to humans.”
In 2004, a paper on an emerging bird flu expressed hope that it would not spread too widely, but concluded that:
» “Even in the event of yet another lucky escape, more measures must be taken to limit the amplification of viruses with pandemic potential in the wet markets around the world.”
In 2007, Reuters published an investigation: Chinese Markets May Be Breeding Ground For Deadly Viruses, which said things like:
» “We face similar threats from other viruses and such epidemics can happen because we continue to have very crowded markets in China,” said Lo Wing-lok, an infectious disease expert in Hong Kong. “Even though official measures are in place, they are not faithfully followed. We are not talking about just civet cats, but all animals,” he added.”
Wet Markets, A Continuing Source Of SARS And Influenza, published 2004, is admittedly focusing on the next SARS1 outbreak instead of on SARS2, but gets bonus points for mentioning both wet markets and labs as likely causes of the next pandemic:
» “Will SARS reappear? This question confronts public-health officials worldwide, particularly infectious disease personnel in those regions of the world most affected by the disease and the economic burden of SARS, including China, Taiwan, and Canada. Will the virus re-emerge from wet markets or from laboratories working with SARS CoV, or are asymptomatic infections ongoing in human beings? Similar questions can be asked about a pandemic of influenza that is probably imminent. Knowledge of the ecology of influenza in wet markets can be used as an early-warning system to detect the reappearance of SARS or pandemic influenza.”
Saar mentions that there are several other possible sources like restaurants or farms. I think Peter demonstrated during the debate that pandemics are unlikely to start in rural areas, so farms aren’t that important. Restaurants mostly source their products from wet markets. During SARS1, some pandemics started in restaurants because they kept the civets in cages next to the diners (like how some Western restaurants keep lobsters). After SARS1, restaurants stopped doing that and became a less likely spillover location.
Saar again:
Scott quotes Peter, who implies that under the lab-leak hypothesis, we would expect the confirmed early cases to be centered around the WIV. However, cases are not expected to center on the lab. The lab is not spraying viruses into the air or hosting thousands of locals daily. If a worker gets infected, they spread the virus to their friends and family at completely different locations.
In most places with an outbreak of known origin, epidemics show some geographic clustering. This has been true ever since the very beginning of epidemiology, when John Snow successfully traced a cholera outbreak back to its origin at a contaminated water pump by taking the center of the map of cholera cases.
This isn’t a 100% law of nature; an infected lab worker might get lucky and not pass it to any of his lab co-workers. Still, we might expect him to infect his family, the stores he went to, or the restaurants he went to.
If he lived near his workplace, these might also be near the lab. If he didn’t - let’s say he lived on the other side of town and had a long commute - he would start a cluster near his house, or his favorite store, or his favorite restaurant. Then the people there would infect their families/co-workers/stores/restaurants. The cluster would start somewhere! Sure, some people would infect nobody close to their work or home, and instead just infect one person a hundred miles away who they breathed on during a trip - but this is the exception, not the rule.
So you wouldn’t expect a totally random distribution of cases all around Wuhan. There would be one center, or maybe several centers.
But none of the claims that COVID was quietly spreading for months before the wet market have pointed to some alternate center of cases. If COVID was spreading for months before the market, it somehow spread in a completely diffuse geographical pattern, with people exactly as likely to infect people far away from them as close to them - until it reached the wet market in December, and then spread in the normal center-radiating-outward way that every other infection spreads.
All the evidence trying to support a spillover at the market is based on complex models with many single points of failure, built from unreliable and biased data. Therefore, it is difficult to give this evidence significant weight as there is always a possibility of errors in the data or its interpretation. More on this in the UFO comment below.
Disagree. “First known case was at a wet market” is as simple as it comes. Certainly it’s less complex than “the virus has a 12 nucleotide insertion at the furin cleavage site, and even though those sometimes happen by natural recombination probably this one didn’t, and even though it looks out of frame maybe there was some weird thing going on with serine that made it in frame this one time only”, which is Saar’s star piece of evidence.
I understand Saar thinks he can come up with lots of objections to “seen near wet market is suspicious for wet market origin”, then claim that getting over those objections requires “complexity”. But if Peter had no dignity, he could also come up with lots of objections to “seen in same city as Wuhan Institute of Virology is suspicious”. He could say that maybe the civet farms of Hubei province were uniquely blah blah blah, and then Saar would have to prove that the civet farms weren’t uniquely blah blah blah, and then he could say “Oh, sure seems like you have a complex model with lots of unique points of failure, it all depends on fifty facts about the regulation of civet farms.”
To illustrate what a market looks like in a real zoonotic pandemic, consider this study from SARS1. The researchers went to a random market and sampled the wildlife sold there. 4 of 6 civets sampled were positive, and 3 of them were phylogenetically distinct (i.e. infected in completely different places).
A scientist I talked to says the 3 phylogenetically distinct lineages were most likely sampling errors. Still, this seems irrelevant to me since, again, no raccoon-dogs were tested.
Scott explains that Covid’s closest known relative, BANAL-52, is rare and so it’s highly unlikely the WIV would’ve had it available as the starting point to engineer Covid . . . This is a basic mistake. SARS2 is not based on BANAL-52 but a relative of it. There is nothing unlikely here.
No BANAL-52 relative close enough to create COVID from has ever been discovered.
By mentioning BANAL-52, I was trying to be maximally charitable to the lab leak side. In order to create COVID, they would need a virus very close to COVID. But in years and years of searching, nobody has ever discovered a virus like this. Therefore it must be rare. As a way of bounding how rare, let’s see how rare the closest virus ever discovered is. That’s BANAL-52. It is very rare. Therefore, the COVID ancestor must be rarer than that.
I don’t know how strong this argument is, because maybe there are millions of rare viruses capable of becoming pandemics, such that getting any one of them is very easy, even though each one individually is rare. The version of this I find convincing is that it should be a probabilistic cost to say that WIV did gain-of-function on a seemingly undiscovered and so-far-very-hard-to-discover rare virus instead of on any of the usual SARS-like viruses that people do their gain-of-function research on.
Overall, all attempts to portray [Connor Reed] as an unstable, delusional person were unsuccessful. He is an ordinary person who very accurately described Covid-19 symptoms in real-time and claims to have received a positive test result. The timing and location matches the lab leak hypothesis and is impossible for the HSM claim. Therefore, they must discredit him.
It is worth noting here the biased evidentiary standards used by zoonotic proponents. Reed’s testimony about his sickness, given on camera to multiple outlets, is deceitful and should be ignored. Yet, an anonymous voice testimony in one Chinese publication is definitely identified as Mr. Chen (another possible pre-HSM case) and should be considered reliable.
See above for why I don’t trust Connor Reed.
I’m not sure why Saar attributes Mr. Chen to “an anonymous voice testimony in one Chinese publication”. When I looked for Chen information, I got this thread, where it’s attributed to two Chinese hospital doctors, cross-checked with the Chinese COVID data repository, and double-cross-checked with the supplementary table in a peer-reviewed paper published by a team of Wuhan doctors.
To understand how ridiculous the claim is that the HKU1 insertion looks just as engineered as SARS2’s, here are their alignments. Hopefully that should be enough.
COVID:
HKU1:
I’m not a virologist, but I question how this comparison works.
Surely HKU1 got its insert on some specific day. If you take the virus the day before, and then the other virus the day after, there will be no differences except the insert, and it will look just like COVID (ie an insert without many other mutations).
The fact that the COVID comparison has few mutations, and the HKU1 insert has many mutations, just shows that whatever older virus we chose to compare HKU1 to is more distant from HKU1 than BANAL-52 (or whatever) is from COVID.
Or am I missing something here?
[The evidence that China tried to cover up zoonosis from the start] is untrue. They clearly said from the start this is a zoonotic spillover at HSM, and at least part of the government went to immense efforts to identify the animal, close farms, etc. (and of course couldn’t find any infected animal).
Only in late 2020 did they start suspecting an import from cold-chain products after having multiple outbreaks that seem related to cold-chain products.
From a Vox article from March 2023:
From the start, the Chinese government interfered with efforts by both Chinese and international experts to study the pandemic, including its origins. Reporting by the AP found that even as WHO officials were publicly praising China’s cooperation, behind the scenes they were complaining about lack of access and a refusal to share data.
Within months of the beginning of the pandemic, the Chinese government imposed restrictions on academic research into the origins of the novel coronavirus … China’s intransigence wasn’t unusual — countries are rarely eager to confirm that they’re the source of a deadly disease — but it went beyond the norm. International investigators weren’t permitted to see the market until more than a year after the pandemic began and a WHO-affiliated team was allowed a highly choreographed and controlled visit.
The resulting report that came out of the Wuhan visit, which dismissed the possibility of a lab origin, pointed the finger at some kind of zoonotic spillover while concluding that it was unlikely that the spread started at the market, which surprised many experts.
It also found that it was “possible” that the virus had been introduced via contaminated frozen food products from abroad. While few experts took that possibility seriously, it fit a narrative the Chinese government had been pushing, against nearly all evidence, that the pandemic had in fact not originated in China.
“China just doesn’t want to look bad,” Filippa Lentzos, a biosecurity expert at King’s College London, told Science last August. “They need to maintain an image of control and competence. And that is what goes through everything they do.”
[…] it seems clear that with more cooperation, scientists could have been looking at raccoon dogs a year or more ago.
“The big issue right now is that this data exists and that it is not readily available to the international community,” Maria Van Kerkhove, the WHO’s Covid-19 technical lead, told reporters on Friday. “This is first and foremost absolutely critical, not to mention that it should have been made available years earlier, but that data needs to be made accessible to individuals who can access it, who can analyze it and who can discuss it with each other.”
The irony is that by making it so difficult to properly investigate a zoonotic origin of Covid, the Chinese government has created a vacuum that has been filled by claims on all sides, including the much more damning accusation that the pandemic was the result of a lab error at the Wuhan Institute of Virology.
For what it’s worth, my timeline of Chinese denials and coverups looks like this:
December: COVID doesn’t exist, it’s all lies
Early January: Fine, it exists, but it’s just some wet market thing that can’t spread from person to person
Late January: Fine, it can spread from person to person, but we’ve got it under control now.
February: Fine, it’s out of control, but you would not believe how great our response was. We’re basically heroes.
March: COVID was a US bioweapon, or possibly came from Italy.
April: Chinese people are banned from researching the origins of COVID without government permission.
2: Comments Arguing Against Lab Leak
2.1: Is the pandemic starting near WIV reverse correlation?
randomstringofcharacters wrote:
Isn’t [the pandemic starting near the lab] a reverse correlation issue? The lab is situated there because it’s an area where coronaviruses were found in the past.
Many people had this question, but Wuhan Institute of Virology was founded in 1956, didn’t originally focus on coronaviruses, and isn’t in a coronavirus hot spot. Most of WIV’s coronavirus samples come from Yunnan, about a thousand miles away. COVID’s closest relatives were found in Laos, almost two thousand miles away.
During the debate, both Saar and Peter calculated the odds of a natural pandemic arising in Wuhan by dividing the population of Wuhan by the total urban population of East Asia (Saar) or South China (Peter). Saar got 1.5%, Peter got 3% (he later said this could be as high as 10% because it was a central hub in the wildlife trade).
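The population-share arithmetic above can be sketched as follows. This is only an illustration: the 11 million figure for Wuhan and the function names are my assumptions, not numbers either debater gave, and the reference populations are back-solved from the stated percentages rather than taken from any source.

```python
def location_prior(city_population: float, reference_population: float) -> float:
    """Chance a pandemic starts in this city if it is seeded in proportion to population."""
    if reference_population <= 0 or city_population < 0:
        raise ValueError("populations must be positive")
    return city_population / reference_population


def implied_reference_population(city_population: float, prior: float) -> float:
    """Back out the denominator that a stated prior implies."""
    if not 0 < prior <= 1:
        raise ValueError("prior must be in (0, 1]")
    return city_population / prior


# ASSUMPTION: roughly 11 million people in Wuhan; not a figure from the debate.
WUHAN_POPULATION = 11_000_000

for name, prior in [("Saar", 0.015), ("Peter", 0.03)]:
    denominator = implied_reference_population(WUHAN_POPULATION, prior)
    print(f"{name}: a {prior:.1%} prior implies a reference population "
          f"of about {denominator / 1e6:.0f} million")
```

Under that assumption, halving the reference population (all of East Asia versus just South China) is what doubles the prior from 1.5% to 3%. Peter's 10% figure is a further adjustment for the wildlife trade, not a population ratio, so it is left out of the sketch.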
This isn’t an Official Position and I don’t think anyone else shares it, but during the debate Peter pointed out a few times that there are plenty of disease-ridden bats in Hubei (the province Wuhan is in), and that it’s not impossible that a bat virus currently known only in Laos could be active in Hubei. Still, this is the minority viewpoint and most scientists just think it involved something about the wildlife trade.
3: Other Points That Came Up
3.1: Apology to Peter re: extreme odds
quiet_NaN wrote:
Hot take: Peter clearly failed to convince anyone.
The lab leak odds, in log10 (i.e. orders of magnitude), are:
Peter -20.7
Saar 2.7
Eric -3.1
Will -2.5
Scott -1.2
Daniel -1.4
One of these numbers is clearly an outlier. Scott mentions it and calls it “trolling”, I would argue that it is debating in bad faith. 2e-21 is a ratio which is just silly. For one thing, the gain of function at WiV pathway is not the only pathway towards a lab leak. The WIV could also have released a naturally occurring coronavirus at the wet market. At 2e-21 odds, we would probably have to consider the possibility that the WIV built a time machine and went back in time to infect the wet market.
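For readers who want to check the conversion between the log10 figures in this comment and plain odds or probabilities, here is a minimal sketch. The numbers are copied from the comment; the function names are mine, and I am assuming the figures are log10 of the odds ratio in favor of lab leak, which is how the comment reads.

```python
# log10 odds for lab leak, as listed in the comment above
LOG10_ODDS = {
    "Peter": -20.7,
    "Saar": 2.7,
    "Eric": -3.1,
    "Will": -2.5,
    "Scott": -1.2,
    "Daniel": -1.4,
}


def odds_from_log10(log10_odds: float) -> float:
    """Turn orders of magnitude back into a plain odds ratio."""
    return 10.0 ** log10_odds


def probability_from_odds(odds: float) -> float:
    """Odds of a:1 correspond to a probability of a / (1 + a)."""
    return odds / (1.0 + odds)


for name, value in LOG10_ODDS.items():
    odds = odds_from_log10(value)
    print(f"{name}: odds {odds:.3g}, probability {probability_from_odds(odds):.3g}")
```

This recovers the commenter's figure: 10 to the power of -20.7 is about 2e-21.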
I might have screwed up here - or at least I should have emphasized the “trolling” part. Peter complained about my presentation of his extreme-odds slide, saying:
This is basically accurate. During the debate, Saar gave lots of different numbers. I don’t want to say exactly what the different numbers meant, because in earlier drafts of my post, Saar said I misunderstood them. My impression was that some of his numbers were conservative, others were central, others were extreme, others were adjusted-for-out-of-model-error, others were not-adjusted, etc.
In an early draft of the post, I gave higher numbers for Saar. Saar asked me to replace them with the numbers I ended up using. I decided to agree, because I wanted to represent Saar fairly with the numbers he most centrally believed, but also because these were closest to the numbers on his Rootclaim site so it wasn’t like he was making them up just to fool me.
Peter didn’t argue quite as hard, and also he didn’t have anything like the Rootclaim site, so I just took his first set of numbers.
Trying to piece things together, I think a reasonable summary would be:
- During the debate, Saar mentioned 700-million-to-one odds in favor of lab leak, not because he thought this was plausible, but just as a discussion of where the situation would end up if you didn’t adjust for human fallibility.
- On his site, he properly adjusted for human fallibility.
- Peter, very reasonably responding to the numbers Saar gave during the debate and not the numbers he had elsewhere, trolled him by giving a set of numbers that came out to 10^25-to-one against lab leak.
- I put the numbers everyone had actually given into my spreadsheet.
- Saar asked me to replace them with his adjusted numbers, which he conveniently had in a canonical location. Peter had never bothered coming up with adjusted numbers (because he wasn’t as interested in probabilistic analysis) and didn’t ask me to change anything, so I didn’t.
- The post made it look like Saar’s numbers were reasonable and Peter’s were crazy.
- In the part about why Saar thought the debate was unfair, I repeated his argument against Peter’s crazy numbers. And because I thought it was an interesting and true rationality point, I went over it myself and endorsed it separately from “it’s a thing Saar said”.
- This was unfair to Peter.
Over the past few weeks, I exchanged ~100 emails with Peter and Saar, and made dozens of tiny changes like this in response to one side or the other thinking my portrayal of them was unfair. Eventually I decided I would go crazy if I spent one more second talking to either of them and hit PUBLISH. This was unfair to them, and let a couple of smaller or harder-to-untangle misrepresentations get through, which I regret. But not as much as I would have regretted continuing the discussion.
3.2: Tobias Schneider on Rootclaim’s Syria Analysis
Tobias Schneider wrote:
I have no horse in this particular race, but I do have a lot of expertise in some of the areas rootclaims “investigates” (especially the stuff related to Syria and chemical weapons) - where their analysis is so shoddy and laughable it’s indistinguishable from Youtube conspiracies - and the biggest surprise to me here is that anybody really bothers with rootclaim in the first place? The more you learn…
Tobias is talking about a Rootclaim analysis on who perpetrated a deadly chemical weapons attack in Syria (Rootclaim says it was rebels; Tobias presumably thinks it was the government).
This analysis is at the top of Rootclaim’s Track Record page, which says:
Rootclaim’s conclusion contradicted all Western intelligence agencies, but years later was shown to be correct. This demonstrates that superior inference methodologies are far more important than privileged access to information.
Apparently Tobias disagrees that it “later was shown to be correct”. Of note, a Tobias Schneider is listed as editor of Syria in Context, a set of “weekly briefs covering key humanitarian, stabilization, and security policy developments in and around Syria”. Wikipedia and all Western governments agree with Tobias and not Saar.
After Saar repeated that his analysis turned out to be “spot on”, Joshua E objected:
If you are going to claim your analysis is spot on, please link to a credible independent source. Otherwise this comes across as we believe this unlikely thing and used our analysis you find shoddy to conclude we were right so you should not consider our analysis shoddy.
And Saar again:
Follow the link [to a discussion on Rootclaim]. It describes external forensic work which you can verify yourself . . . [Media] outlets have no incentive to publish such findings. Sadly these are the kinds of things you need to verify yourself.
The Rootclaim analysis says:
The new findings are a result of what we believe to be the most impressive independent open-source investigation in history. It was initiated nearly a year ago by several volunteers who reviewed all the evidence from the attack and managed to uncover incontrovertible evidence implicating an opposition faction, confirming Rootclaim’s conclusion.
Which links to this report, somehow affiliated with Rootclaim and written by “Michael Kobs, Chris Kabusk, Adam Larson, and many helpful citizen investigators.” It looks like Saar and some other people who didn’t believe the standard theory worked together to do some video forensics, which they published in a report. But governments, intelligence agencies, the media, Wikipedia, etc, haven’t noticed the report or updated on it.
So when Saar says that his method has a great track record, what he means is that when he looks into it further, he becomes even more convinced of his previous position. He doesn’t mean that any kind of external consensus has shifted towards his results over time.
During my email discussions with Saar, he kept insisting his position was obviously right. He would send me emails like (not exact quotes) ‘Now that I’ve demolished all the evidence for zoonotic transmission, you have to agree with me, right?’ or ‘You must secretly agree I’m right, it’s just hard for you to admit.’ I’m sympathetic to this way of thinking - my beliefs also intuitively feel so obvious that nobody could possibly disagree. But I eventually learned real life didn’t work this way; I think Rootclaim would benefit from a similar lesson.
3.3: Closing thoughts on Rootclaim
In my post, I suggested that if Saar wants to convince people that Rootclaim works, instead of sponsoring debates he should train more people to use it, then test whether there’s inter-rater reliability (eg five people, each doing an independent Rootclaim analysis, all get similar probabilities on COVID origins). Saar responded:
We don’t think this would be convincing to a wide audience outside people who think like Scott. However, we don’t really have any better ideas, and would love to hear ideas from readers.
I don’t think you should do this for me or people who think like me. I think you should do it for yourself.
Have you ever done a Rootclaim analysis on Rootclaim itself? Probability that Rootclaim works significantly better than a smart person using their normal intuitive reasoning methods? Why not?
Maybe this is wrong, but still. You’ve got to be curious if it works, right? And short of an oracle, proving inter-rater reliability is the best you’re going to get.
Even if you don’t want to convince yourself, this is the correct next step. Again by analogy to Tetlock - if he had started with just one superforecaster, and his thesis was “this guy is really smart, but I refuse to prove it”, nothing would have changed. Instead, his theory of change goes through publishing in a bunch of papers, to identifying other superforecasters, to teaching general principles of superforecasting, to superforecasting as a service (either through specific superforecasters at GJO, or through projects that seek to emulate them like Metaculus, FutureSearch, etc). If Rootclaim doesn’t scale, it either dies with Saar, or at best Saar lives a long life and puts out a few more dozen Rootclaim analyses but nothing else comes of it. You’ve got to start training other people eventually, and part of that process involves demonstrating you did it right, and that’s going to involve inter-rater reliability.
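To make the inter-rater reliability test concrete, here is a minimal sketch of how one might score it. Everything in it is an assumption for illustration: the five probabilities are made up, and the spread in log10 odds is a crude stand-in I chose for simplicity, not a standard reliability statistic and not anything Rootclaim uses.

```python
import math
import statistics


def log10_odds(probability: float) -> float:
    """Express a probability as orders of magnitude of odds."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return math.log10(probability / (1.0 - probability))


def rater_spread(probabilities: list[float]) -> tuple[float, float]:
    """Return (range, standard deviation) of the raters' log10 odds."""
    logs = [log10_odds(p) for p in probabilities]
    return max(logs) - min(logs), statistics.pstdev(logs)


# HYPOTHETICAL: five independent analysts' probabilities for the same question.
hypothetical_raters = [0.80, 0.85, 0.90, 0.75, 0.88]

spread, deviation = rater_spread(hypothetical_raters)
print(f"range: {spread:.2f} orders of magnitude, std dev: {deviation:.2f}")
```

If independent analysts land within a fraction of an order of magnitude of each other, the method is at least reproducible; if they are scattered across several orders of magnitude, the output mostly reflects the analyst rather than the method.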
4: Summary And Updates
I don’t like getting in fights, and boy was this a fight.
And I don’t like making sweeping generalizations about The Nature Of Pseudoscience - it’s too likely to be incredibly embarrassing if it later turns out I was the one in the wrong.
But I do feel like there’s a method going on here. It’s nothing sinister, just that the lab leak people have 100x more zealots and energy, and there are some strategies that make sense in that position, which no single individual necessarily chooses, but which are very noticeable from the other side.
The most glaring is the constant focus on “as of one minute ago, the case for the opposite side is in SHAMBLES”. The “as of one minute ago” makes it hard to trust institutional consensus or published papers - what if they just haven’t caught up to the new evidence? The “in SHAMBLES” is always a couple of papers that have “now debunked” the best papers of the other side. These come out on a regular schedule. They’re usually by people in unrelated fields - the ones I saw on COVID origins were by computer scientists, physicists, and agricultural scientists. They’re usually either preprints, or published in weird journals in unrelated fields. But they sure do look like Scientific Papers and have lots of equations in them, and they always end with “…and therefore this one peripheral argument in So-And-So Et Al 2020 is wrong.”
Once you collect, I don’t know, ten of these, you can spam a bunch of opposing discourse with “This didn’t even consider these ten new papers, all converging upon the fact that this case has now been debunked”. The very prestigious researchers who wrote the original paper probably won’t respond, because they don’t have time to respond to pre-prints by agricultural scientists. So it does kind of look, to an outsider, like all of the top papers of the side with more institutional support are debunked. Even if you spend hours and hours talking to the scientists involved and trying to figure out the flaws, it doesn’t matter, because there will be a new set of papers like that a few weeks later.
Some of this is inevitable - and it’s also what correct people have to do when arguing against incorrect papers. When people were still treating eg stereotype threat as state-of-the-art, I would respond to people talking about it by listing some of the papers that had debunked it, and probably that was annoying to supporters who didn’t want to have to defend it. So I don’t want to claim it’s all inexcusable bad behavior. And one insulting person in the middle of ten thoughtful people with good responses still poisons the barrel and makes things feel hostile. Still, this is something I’m more sensitive to now.
A lot of these problems center on a failure mode of hyperfocusing on little details. What if the exact set of cases Worobey et al uses is contaminated by ascertainment bias? Then we have to throw it out, right? Compare Worobey’s analysis in the supplementary text, where he shows robustness to up to half of the cases being totally false. Or consider his search for alternative uncontaminated datasets and then showing how those demonstrate the same pattern. Or consider that the first five cases ever detected were all from the market, and even that’s enough to prove at least a pared-down version of Worobey’s thesis. It’s fine to also want to make sure the official argument is exactly right and not biased. But just doing that seems like a failure of focus.
This is also how I feel about “There was a Brazilian case in November!” or “There was a weird case that might have been something in Italy in October!” or whatever. Instead of focusing on the exact sentences in a paper with “PEER-REVIEWED!” on the header, think about the big picture. If there had been lots of COVID in Brazil in November 2019, why didn’t it spread? Why wasn’t Rio locking down at the same time Wuhan was? Why didn’t governments notice and start banning flights from Brazil? If there were a bunch of cases floating around Wuhan in autumn 2019, why didn’t any of them form noticeable case clusters, the same way COVID did everywhere else? Sure, the 30,000 negative blood tests already refute this, but you shouldn’t need those!
(and this is also how I feel about all the “A/B intermediate found in Malaysia! A/B intermediate found in Shanghai!” claims. Okay, so there were 1 billion cases of A, 2 billion cases of B, and . . . a single-digit number of cases of the intermediate, detected months later? Why? I’m not saying this is unanswerable, I’m saying that the fact that lab leak doesn’t even wonder about this or start making arguments for it is why I feel like they don’t have a story - just a stamp collection of anomalies that fade away under closer observation).
I know this comments post won’t be the end of the story. I know that (just as with every other one of my posts, I’m not blaming origins debaters in particular here) someone’s going to go “Sure, Scott confronted 489 arguments. But he failed to confront the strongest argument against his case - this one obscure article in a Nepalese journal that nobody except me has ever heard of. That means he’s a bad-faith actor strawmanning everyone he disagrees with!” I know that someone will find some detail I’m wrong about and spam it all over Twitter with “Scott didn’t realize that a 91Q mutation is different from a ZY6 mutation, how can you ever trust anything he says?” And I know that next month, someone will come up with another SMOKING GUN! - and if I don’t respond to it immediately they’ll say I’m scared and know I’ve lost and am refusing to admit I’m wrong out of sheer stubbornness, and twist some quote of mine to show I’ve admitted I’ve changed my mind.
(The one argument I know about, haven’t responded to, and it really is because I’m lazy and scared and bad is Michael Weissman’s Bayesian analysis here. It’s 25,000 words and uses a bunch of logits and calculus. Sorry, pass.)
If it helps, I’m currently working out terms for a 6-digit lab leak bet of my own (no guarantee this will come to fruition, most of these fall apart in the resolution criteria stage). I feel bad for not being willing to answer every possible lab leak argument going forward, but hopefully offering lab leakers a few hundred thousand dollars if I’m wrong will be a suitable consolation prize.
For now, I’m still at 90-10 zoonosis.