Did COVID-19 come from a laboratory?
47% chance
Rootclaim debate released: -13.0%
ACX article published (https://www.astralcodexten.com/p/practically-a-book-review-rootclaim): -12.0%

This market resolves once we have a definitive answer to this question. (i.e. "I've looked at all notable evidence presented by both sides and have upwards of 98% confidence that a certain conclusion is correct, and it doesn't seem likely that any further relevant evidence will be forthcoming any time soon.")

This will likely not occur until many years after Covid is no longer a subject of active political contention, motivations for various actors to distort or hide inconvenient evidence have died down, and a scientific consensus has emerged on the subject. For exactly when it will resolve, see /IsaacKing/when-will-the-covid-lab-leak-market

I will be conferring with the community extensively before resolving this market, to ensure I haven't missed anything and am not being overconfident in one direction or another. As some additional assurance, see /IsaacKing/will-my-resolution-of-the-covid19-l

(For comparison, the level of evidence in favor of anthropogenic climate change would be sufficient, despite the existence of a few doubts here and there.)

If we never reach a point where I can safely be that confident either way, it'll remain open indefinitely. (And Manifold lends you your mana back after a few months, so this doesn't negatively impact you.)

"Come from a laboratory" includes both an accidental lab leak and an intentional release. It also counts if COVID was found in the wild, taken to a lab for study, and then escaped from that lab without any modification. It just needs to have actually been "in the lab" in a meaningful way. A lab worker who was out collecting samples and got contaminated in the wild doesn't count, but it does count if they got contaminated later from a sample that was supposed to be safely contained.

In the event of multiple progenitors, this market resolves YES only if the lab leak was plausibly responsible for the worldwide pandemic. It won't count if the pandemic primarily came from natural sources and then there was also a lab leak that only infected a few people.

I won't bet in this market.


Dr Jane Qiu has gone from being very dismissive of any lab leak scenario to now penning a piece in the Guardian saying it's not a conspiracy theory and blaming the likes of Peter Daszak for damaging trust in science. Quite a turnaround. https://www.theguardian.com/commentisfree/2025/jun/25/covid-lab-leak-theory-right-conspiracy-science

@MikePa67d Until the other day, Jane Qiu was part of the conspiracy theory in lab leak world. Now that she and Peter Daszak have had some sort of falling out over a movie, we get an odd opinion piece that adds nothing to the debate and absolutely does not say that Jane Qiu thinks lab leak is likely.

But, given that there's no actual evidence to support any "lab leak" theory, the lab leak hive mind is jumping on this as vindication:

Same dude, back when Qiu was part of the global coverup conspiracy:

In the real world, this shows that there is no global coverup of "lab leak" being a likely origin. For people who thought that was somehow plausible a week ago, your inferred likelihood of lab leak should drop. For the rest of us, it's a boring spat between a journalist, a scientist, and a filmmaker spilling over in public.

That kind of spillover is a bit more common than the ones that cause pandemics.

@zcoli Aren't you the guy who gets paid to harass people who disagree with you by emailing their employers?

@Marion8w2 I sent a message (not by email) to the organization whose website lists, under "our work", a book that says a paper I co-authored might be fraudulent but doesn't explain why -- Bratlie has two employers and I didn't contact the irrelevant one. I asked if someone could explain what the issue is so that I could answer it; apparently and unsurprisingly, Bratlie can't back up what she writes and put on some performance on X claiming it was censorship to ask a question. If an organization employing someone says that a book is "our work" and the book is on the same topic as the work they do for the organization, I assume it's work for hire.

Ironically, the relevant part of her book is a call to remove articles, including one I co-authored, from the scientific literature.

The last thing I heard from her was her posting an email of mine to X, including my email address, but cropping out the bottom of the email. Here it is -- the question she's so incapable of answering that she doesn't want her audience on X to know that it was asked.

The rest of the email can be seen here -- https://xcancel.com/sigridbratlie/status/1932786213762564129#m

Part of being a scientist is responding to criticism and that's what I tried to do there. This will be the end of responding to anonymous trolls coming over from X. I recommend taking it with a grain of salt when conspiracy theorists claim to be victims of censorship. Inevitably, claimed "censorship" is criticism that they can't respond to.

https://www.who.int/news/item/27-06-2025-who-scientific-advisory-group-issues-report-on-origins-of-covid-19

The WHO Scientific Advisory Group for the Origins of Novel Pathogens (SAGO), a panel of 27 independent, international, multidisciplinary experts, today published its report on the origins of SARS-CoV-2, the virus responsible for the COVID-19 pandemic.

SAGO has advanced the understanding of the origins of COVID-19, but as they say in their report, much of the information needed to evaluate fully all hypotheses has not been provided.

“I thank each of the 27 members of SAGO for dedicating their time and expertise to this very important scientific undertaking over more than three years,” said Dr Tedros Adhanom Ghebreyesus, WHO Director-General. “As things stand, all hypotheses must remain on the table, including zoonotic spillover and lab leak. We continue to appeal to China and any other country that has information about the origins of COVID-19 to share that information openly, in the interests of protecting the world from future pandemics.”

@George The report found that there's the same level of support for an engineering origin of SARS-CoV-2 that there is for other Intelligent Design origin theories.

The report describes the consensus best supported theory by scientists:

While available data support that the HSM played a significant role in early transmission and amplification, it is not conclusive that the HSM was where the virus first spilled over into the human population, or if it occurred through upstream infected humans or animals at the market.

The report then talks about additional evidence that could possibly be collected that could support this theory further or support something else. The paper on market environmental samples from Crits-Christoph et al on this subject says the same thing:

Any hypothesis of COVID-19’s emergence has to explain how the virus arrived at one of only four documented live wildlife markets in a city of Wuhan’s size at a time when so few humans were infected [3]. Human introductions linked to the animal trade offer one explanation for this, and the introduction of the virus by an animal trader or farmer cannot be excluded, but these hypotheses are challenged by phylodynamic evidence for multiple spillovers [11].

When it comes to lab leak on the other hand, there is no specificity at all about how the evidence demanded could test any lab leak theory. This would be impossible, because there's no falsifiable lab leak theory presented in the report. The report can't even settle on which lab to investigate. The requested data is basically all biosafety data and occupational health data for two large organizations, plus access to open-ended interviews of everyone there. By definition, nothing that's requested could falsify lab leak theories because the underlying assumption is that everyone who might spill the beans now has been lying for over five years as part of a perfect coverup. A failure to find lab leak evidence would be rejected by anyone who finds it plausible now that there was a lab leak and a massive cover up to suppress evidence of it.

It's telling that the access requested to vaguely investigate lab leak isn't requested for investigating wildlife origins -- because there's just no need to look for what evidence might exist were it not covered up; the evidence that's not covered up is strong enough.

But you can tell from SAGO wasting time humoring the "MA-30" theory that someone susceptible to lab leak nonsense has influence in the report. Plausibly that might be the person who spoke on behalf of a report that concluded one scenario out of four was the only one with supporting data, yet decided to lead with your "all hypotheses must remain on the table."

bought Ṁ50 NO

Can anyone point to a scientific manuscript that explains the available data and finds it more likely than not using any quantitative method that the COVID-19 pandemic originated in a lab?

It doesn't need to be peer reviewed -- anything will do. What is the best example you know of?

@zcoli Sure. I'm sorry it's a bit long but for a contentious issue with a variety of relevant evidence that's needed. I've gone over it with some very serious stats and virology folks, and its >10k readers include some intense zoonosis types, who have helped by finding some errors, now fixed. It's been stable for many months now.
https://michaelweissman.substack.com/p/an-inconvenient-probability-v57

It's got very extensive references.
Several other much shorter blogs on my substack deal with narrower parts of the question: the gross math errors in Pekar 2022, the improper Bayes methods used by Scott Alexander, etc.

@MichaelWeissman You’re pretty critical of people on one side of the issue yet cite Jesse Bloom’s deleted sequences paper extensively. What’s up with that? Omitting critical data in a paper seems a bit worse than anything anyone said on Slack that you quote.

https://academic.oup.com/mbe/article/42/6/msaf109/8158640

I don’t know how many qualified people you talked to, but I think it’d be worth taking the time to look at the primary data yourself for some of the many inaccurate things here you’re getting from others. For one example, the discussion of D614G. This didn’t quickly dominate because it happened many times — it quickly dominated because it was followed up by another important mutation to make B.1 and then another one to make B.1.1 — lineage A without D614G was more prevalent than lineage A with D614G until about April 2021.

For another example, you cite a plasmid encoding spike using CGG. It makes sense to use human codons for expressing a protein from a plasmid in human cells. It makes sense to use human coronavirus codons for engineering a human coronavirus. It makes no sense and it’s implausible that an engineer would use human codons to engineer a human coronavirus, rather than human coronavirus codons.

Your other example here is “plasmid primers” (itself a nonsensical term) in this table:

That underlined bit is an EcoRI site. The CG at the end is added to the primer because it increases efficiency of digestion to add a few nucleotides to the end. Then, they’re lost when this is digested and ligated with something else digested by EcoRI.
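To make the restriction-site point concrete, here is a minimal Python sketch. The primer sequence is hypothetical (it is not copied from the table or the cited paper); the only fixed facts are that EcoRI recognizes GAATTC and cuts between the G and the first A. Assuming the CG pad sits directly 5' of the site, the sketch shows how the pad plus the site's leading G reads as CGG in the primer text, and that the pad is gone after digestion.

```python
from typing import Optional

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts the top strand after the first G (G^AATTC)


def describe_ecori_primer(primer: str) -> Optional[dict]:
    """Locate the first EcoRI site in a primer and split it around the cut."""
    primer = primer.upper()
    start = primer.find(ECORI_SITE)
    if start == -1:
        return None
    cut = start + ECORI_CUT_OFFSET
    return {
        "site_start": start,
        "pad": primer[:start],         # extra 5' nucleotides ahead of the site
        "after_digest": primer[cut:],  # top-strand sequence kept after cutting
    }


# Hypothetical primer: a CG pad, the EcoRI site, then a made-up annealing region.
example = "CG" + ECORI_SITE + "ATGGCC"
info = describe_ecori_primer(example)
print("CGG" in example)               # the pad plus the site's G reads as CGG
print(info["pad"], info["after_digest"])
print("CGG" in info["after_digest"])  # checks whether any CGG survives digestion
```

A text search for CGG would flag this made-up primer even though the match straddles the pad and the restriction site rather than an arginine codon.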

Someone spent a whole lot of time cherry picking papers tangentially related to WIV and ctrl+F’d for CGG in the text. Someone whose familiarity with molecular cloning doesn’t extend to literally the most common technique is the person informing you on what is and isn’t evidence of engineering. Knowing what this is is typically part of undergrad bio curricula.

Fully documenting every example of this sort of thing in your document would take hours, and it’s trivial to find examples. The very serious stats and virology folks are either not reading closely at all, not as serious as you say, or happy to have you make these nonsensical arguments.

@zcoli Zach- At a couple of points I think you've got the logic screwed up.
On using D614G as evidence of the recency of the FCS insert, that was explicitly to increase the likelihood of a zoonotic CGGCGG based on recent insert sequence properties. If it's not a recent insert then the CGGCGG probability falls to the typical rate for Asian coronaviruses, 0.0001, and the sequence becomes a smoking gun against zoonosis. You seem to assume that my arguments must all be against zoonosis but this one was intended to give it the fairest break possible. Your argument is backwards.

On your disputes with Bloom about the most likely MRCA, my blog already said "(These groups also suspect that the MRCA differed from A by an additional nt shared with wild relatives but not with B. There is some reason to doubt that conclusion since A differs from the main suspect by a T→C mutation, much less common at this stage than a C→T mutation, although non-reversionary mutations are much more common than reversionary ones.)". So you're arguing about a point that I explicitly don't use.


So one of your arguments is irrelevant and the other has the wrong sign of effect on the odds for your case.


Your claim that no engineer would use CGGCGG at least has the right sign for your case. Readers can compare my arguments (based on points made by people who engineer sequences) with yours and try to make their own rough estimates of the odds for that particular factor, the fourth most important of the likelihood factors used.

@MichaelWeissman I'm not discussing your quantitative argument at all. I'm demonstrating how your post is full of things you say are facts that are untrue or conclusions from unreliable sources such as Bloom's deleted sequences paper. It's relevant that you have no expertise in this and you are basing your argument on people who are at best very wrong. It makes no sense to discuss the logic applied to things that aren't facts. If it's true that it's a point that you don't use, it also makes no sense to discuss logic hidden in between irrelevant points.

If you had concluded zoonosis based on the same sort of nonsense I would be saying the same things. I don't care what direction the argument is in.

What in the world is "the typical rate for Asian coronaviruses" ? I promise you that alphacoronaviruses and betacoronaviruses sampled in Asia are less similar in every way than betacoronaviruses sampled inside and outside of Asia.

The fact is that the composition of the FCS is evidence against an engineering origin because no engineer would choose it, but natural selection doesn't care about codon usage tendencies that take hundreds of years to approach what we observe today. Might be worth a rethink on what timescale you're talking about here with "recency".

If the "plasmid primers" thing falls in the category of "points made by people who engineer sequences" then those people are lying to you in one way or another.

@zcoli Here's the relevant passage from my argument "In a broader set of relatives, the fraction of ArgArg pairs coded CGGCGG ranges from 0 outside Africa and Asia to 1/10790 in Asia to 1/5493 in Africa."
The broader set is betacoronaviruses.
https://www.preprints.org/manuscript/202110.0080/v2
I agree with your statement "I'm not discussing your quantitative argument at all." since you instead make ad hominem remarks and discuss factors that end up not being used.
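For readers who want to see how a fraction like the one quoted above could be tallied, here is a minimal Python sketch. It is not the method used in the linked preprint; the function name and the toy sequence are made up for illustration, and it assumes an in-frame coding sequence. It counts adjacent arginine-arginine codon pairs and how many of them are coded CGGCGG, then prints the two quoted fractions as decimals.

```python
from typing import Tuple

# The six standard arginine codons.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def arg_pair_counts(cds: str) -> Tuple[int, int]:
    """Return (number of adjacent Arg-Arg codon pairs, number coded CGGCGG)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    total = 0
    cgg_cgg = 0
    for first, second in zip(codons, codons[1:]):
        if first in ARG_CODONS and second in ARG_CODONS:
            total += 1
            if first == "CGG" and second == "CGG":
                cgg_cgg += 1
    return total, cgg_cgg


# Toy in-frame sequence (hypothetical): ATG CGG CGG GCA AGA AGG TAA
toy = "ATGCGGCGGGCAAGAAGGTAA"
print(arg_pair_counts(toy))

# The fractions quoted in the comment, expressed as decimals.
for label, denom in (("Asia", 10790), ("Africa", 5493)):
    print(label, f"{1 / denom:.2e}")
```

Whether pooling by continent is the right reference class is exactly what the two commenters dispute; the sketch only shows the counting.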

@zcoli Here's your other engineering example for FCS insertion with one CGG:

In the one example of which I’m aware in which a collaborator of the WIV group added a 12nt code for an FCS to produce a viral protein via a plasmid (reminiscent of the 12nt addition in SC2) they only used CGG for one of its three Arg’s.

Let's check out the abstract; nope, this isn't correct at all. It's a plasmid for producing antibodies.

Your analysis is based on "facts" from people who are habitually wrong and/or lying (I think this one comes from Yuri?). Seriously, just slow down and pick any one thing and dig into the primary data yourself. Start with what you think is the most important factor. In this case, all it took was reading the abstract of the paper you linked or looking at any of the figures to realize this was nonsense. And it's so wildly nonsensical that whoever you heard it from should be ignored on everything else as well.

@MichaelWeissman BTW I see the paper cited for "plasmid primers" also has primers including tandem CGG-CGG. You write "these are for plasmid work and thus subject to substantially different optimization criteria" -- there's no optimization of that sequence; it's the sequence you find in bovine herpesvirus-1 isolates.

I suppose at one point this was yet another smoking gun for a synthetic origin of the FCS? It's something somehow associated with someone at WIV with an RRAR and the RR encoded by CGG-CGG?

A quick search of X showed that this was right: Yuri posted about it in Sept 2023 and you misinterpreted this as being a synthetic CGG-CGG as a choice:

It would kind of make sense as a choice for that type of virus, by the way -- CGG is one of two common arginine codons (herpesvirus also doesn't have the same codon usage as hosts).

The person Yuri credited with this had proposed another smoking gun just a couple months earlier for exactly the same thing:

How many smoking guns do people need to claim they found before you realize it's a trivial creative writing exercise? Pick any old random natural virus that's somewhat rare in nature and you can do exactly the same sort of cherry picking.

@zcoli To the limited extent that example could have affected P(CGGCGG|LL) it would have lowered it, since 1/3 is less than the rates used in the somewhat more relevant examples. You've got an unerring sense of how to find the most irrelevant tangents. For somebody interested in the passage you're criticizing, here it is

"In the one example of which I’m aware in which a collaborator of the WIV group added a 12nt code for an FCS to produce a viral protein via a plasmid (reminiscent of the 12nt addition in SC2) they only used CGG for one of its three Arg’s. Other plasmid primers from WIV use high fractions of CGG, including CGGCGG dimers, but again these are for plasmid work and thus subject to substantially different optimization criteria."


I can drop the word "viral" without changing the fact that these data are mentioned only for completeness and explicitly not used.


I think you consistently misunderstand the need for probabilistic reasoning. None of those features are used as "smoking guns" for LL. E.g. RRAR is not used as a signature of LL. The point is just that it's among the many reasonable possibilities for LL, just as it's among the many possibilities for ZW, and thus provides no reliable factor either way.
Likewise the preceding P in PRRAR could easily swing either way, so not used.

@MichaelWeissman None of the smoking guns proposed for the FCS are remotely plausible. A lot of impossible things don't add up to one possible thing. They fall into a few categories:

  • Misunderstanding expectations from natural selection e.g. David Baltimore and CGG-CGG, and Nicholas Wade's inane argument that inserts can only be acquired from closely related organisms.

  • "Underpants Gnomes" theories (look it up) missing a step without a plausible way that step could happen e.g. Sachs & Harrison ignore the P in PRRA, and Lisewski ignores the A in PRRA. Because there's no plausible explanation for either in their theories. Step 1: Notice homology and copy FCS around it, Step 2: ?, Step 3: Pandemic.

  • Blatantly cherry picked nonsense that's statistically insignificant e.g. "Adrian J" googling "WIV & PRRA", HIV inserts cherry picked from highly variable regions in single patients, and the Moderna patent thing that Daoyu Zhang cherry picked and some scientists, I think, stole the idea and lied about how they found it in a paper. Interesting aside: one of those scientists was once the world's youngest doctor!

That's a condensed history of FCS "smoking guns" and I'm missing some -- I think you mention Yuri arguing it's there as some sort of marker? Alina Chan once said the proline was there to introduce a restriction site. It's impossible to describe just how stupid all of this is.

So you've got a scale and on one side is natural selection and on the other side is a bunch of the worst examples of Intelligent Design that anyone's ever come up with. You reason that on balance, this means that it "provides no reliable factor either way." Gotta recalibrate your scale.

@zcoli again, you criticize many things I don't say.
E.g. I agree that Wade's claim that inserts only come from related organisms is false and therefore do not use it.
E.g. I don't use any of the "HIV inserts" stuff because there seems to be a consensus that it's BS, presumably because of the multiple comparisons issue.

So you've started with criticisms of things that I do not use and do not believe and then layered on all sorts of emotional verbiage.

@MichaelWeissman Neither of those is any more or less false than Bruttel's theory. In case anyone forgot, Bruttel responded to contradictory evidence by claiming the Chinese were monitoring his tweets and fabricating genomes to stay ahead of him. He stopped saying that when I told him it required time travel. He also thinks Ebola is a lab leak and some other lab leaks I can't keep track of.

Your article cites several of the orthogonal synthetic FCS theories that are all equally false. I forgot to include Yuri's cherry picked sick cat; that'd be in the last category I guess.

If you aren't going to do any of the work yourself to learn what's true and false, there's no point in going beyond that.

@zcoli Your "orthogonal" remark provides a sort of lead-in to a discussion of compound hypotheses. Having several possible ways something can happen under one broad hypothesis is normal. It applies particularly to natural evolution, so that gives a nice way to illustrate.

E.g. The FCS in SC2 looks like an improbable event. Early attempts to give a zoonotic account relied on point mutations and small inserts. Later attempts involved an insert in a bat. Or maybe in one of a variety of intermediate species. Or maybe in an immunocompromised person.
This is not a logical contradiction. In the limit of low probabilities for each account, the net probability here is just the sum of those. It's small, but not due to any problem with the parallel-possibility logic.
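The "sum of small probabilities" step can be checked numerically. A minimal Python sketch, assuming the alternative accounts are independent; the three probabilities are placeholders for illustration and were not proposed by anyone in the thread.

```python
import math
from typing import Sequence


def any_pathway_exact(probs: Sequence[float]) -> float:
    """P(at least one pathway occurs), assuming the pathways are independent."""
    return 1.0 - math.prod(1.0 - p for p in probs)


def any_pathway_sum(probs: Sequence[float]) -> float:
    """Small-probability approximation: just add the individual probabilities."""
    return sum(probs)


# Placeholder numbers for illustration only.
pathways = [1e-3, 2e-3, 5e-4]
exact = any_pathway_exact(pathways)
approx = any_pathway_sum(pathways)
print(f"exact={exact:.6f} sum={approx:.6f} gap={approx - exact:.2e}")
```

If the accounts are mutually exclusive the sum is exact; if they are independent the sum overestimates slightly, by terms of the order of the pairwise products. Either way the result depends entirely on the individual probabilities fed in.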

@MichaelWeissman Again, the net probability of one of a set of impossible things being possible is zero. There are countless plausible, low probability pathways via nature to get from the (recombinant) common ancestor of SARS-CoV-2 and RaTG13, BANAL-52, and MP789 to SARS-CoV-2, including acquiring an insert with or without subsequent adaptation to give the observed polybasic S1/S2 cleavage site.

I don't think Gallaher's HKU9 hypothesis is particularly plausible, by the way. Nor would I have agreed with him in 2009 that a lab origin of 2009 H1N1 was plausible. In general, anyone who says they can tell a story of where the insert came from (and if/how it subsequently adapted or expanded or shrunk) is almost certainly wrong.

For what it's worth, my guess is that, among the plausible mechanisms, the more likely ones are those involving the repeat ahead of the S1/S2 insert, which is found in SARS-CoV-2 but so far absent in the handful of related viruses sampled (TCAGACTCAGACT vs TCAGACTCAAACT).
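The two strings in the parentheses can be compared directly. A minimal Python sketch, assuming the first string is the SARS-CoV-2 version and the second is the version in the related viruses (the order in which the comment lists them); the function names are made up for illustration.

```python
from typing import List, Tuple

SC2 = "TCAGACTCAGACT"      # assumed: SARS-CoV-2 sequence ahead of the insert
RELATED = "TCAGACTCAAACT"  # assumed: same region in the related viruses


def mismatches(a: str, b: str) -> List[Tuple[int, str, str]]:
    """List (0-based position, base in a, base in b) where the strings differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def has_tandem_repeat(seq: str, unit_len: int) -> bool:
    """True if some unit of length unit_len is immediately followed by itself."""
    last_start = len(seq) - 2 * unit_len
    return any(
        seq[i:i + unit_len] == seq[i + unit_len:i + 2 * unit_len]
        for i in range(last_start + 1)
    )


print(mismatches(SC2, RELATED))
print(has_tandem_repeat(SC2, 6), has_tandem_repeat(RELATED, 6))
```

The strings differ at a single position, and only the first contains a 6-nucleotide unit repeated back to back.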

The most plausible theory of the SARS-CoV-2 S1/S2 site being an engineered insert goes like this:

  1. WIV sampled something that was almost identical to SARS-CoV-2, but lacked the FCS. Experiments showed that its RBD bound hAce2. This virus was sequenced and all of the related experiments have been perfectly covered up.

  2. WIV sampled something else that was very, very similar to that, but a little bit more diverged from SARS-CoV-2 so that experiments showed its RBD didn't bind hAce2; this virus was sequenced and all of the related experiments have been perfectly covered up.

  3. WIV made a recombinant virus identical to the first one and has successfully covered up its existence--all of the associated sequencing and experiments and so on. It didn't grow all that well despite hAce2 binding.

  4. WIV made another recombinant virus with the observed, predicted cleavage site at S1/S2 (we will ignore the problem that the R-R-A-R sequence doesn't match R-X-[R/K]-R that WIV planned to search for). The construction and existence of this was also perfectly covered up.

No one proposes this because it's less plausible than skipping steps 2 through 4 and simply finding SARS-CoV-2 in nature in step 1. No one proposes simply finding SARS-CoV-2 in nature because mad scientist theories are attractive to people with bad intuition who are incapable of recognizing that the supporting evidence is a 5.5-year-long creative writing exercise based on being able to Google "WIV and RRAR", for example, and to elide from the origin story the immeasurable multiplicity of hypotheses you tested in doing that.
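The motif mismatch mentioned in step 4 can be checked mechanically. A minimal Python sketch, assuming X stands for any amino acid and writing the pattern R-X-[R/K]-R as a regular expression; the function name and the extra test peptide RRKR are made up for illustration.

```python
import re

# R-X-[R/K]-R as written in step 4, with X read as "any residue" (assumption).
MOTIF = re.compile(r"R.[RK]R")


def contains_motif(peptide: str) -> bool:
    """Return True if the peptide contains R-X-[R/K]-R anywhere."""
    return MOTIF.search(peptide.upper()) is not None


for peptide in ("RRAR", "PRRAR", "RRKR"):
    print(peptide, contains_motif(peptide))
```

RRAR fails the pattern because its third position is A rather than R or K, which is the mismatch the step sets aside.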

You might want to check out the history of alternative proposals to explain global warming. There are many -- they're all wrong -- and there could be 100 times more equivalent examples of people writing vaguely about "What if there's a cycle lasting X years that we just lack the knowledge to explain? We need more data to be sure enough to act! Imagine the cost if we're wrong!" It wouldn't reduce the likelihood that greenhouse gas emissions are responsible. These alternative theories are all in the "what about blah blah blah?" genre and fail to explain all of the observed data. They all come hand in hand with questioning the legitimacy of published data, often arguing that there's a global conspiracy of relevant experts to suppress the truth. Sound familiar?

https://doi.org/10.1093/molbev/msaf109

In 2021, Jesse Bloom published a study addressing why the earliest SARS-CoV-2 sequences in Wuhan from late December 2019 were not those most similar to viruses sampled in bats. The study concluded that recovered partial sequences from Wuhan and annotation of Wuhan links for other sequences increased support for one genotype as the progenitor of the SARS-CoV-2 pandemic. However, we show that the collection date for the recovered sequences was January 30, 2020, later than that of hundreds of other SARS-CoV-2 sequences. Mutations in these sequences also exhibit diversity consistent with SARS-CoV-2 sequences collected in late January 2020. Furthermore, we found that Wuhan exposure history was common for early samples, so Bloom's annotation for a single familial cluster does not support that an early genotype was undersampled in Wuhan. Both the recovered partial sequences and additional annotation align with contemporaneous data rather than increase support for a progenitor. Our findings clarify the significance of the recovered sequences and are supported by additional data and analysis published since mid-2021.

New work published. It shows that an earlier paper relied on unreported data exclusion and selective annotation to conclude that the first human SARS-CoV-2 infection or infections that led to the pandemic were not viruses with lineage A or lineage B genomes. Both lineage A and lineage B were found in Huanan market, consistent with the pandemic originating in that market's trade in live animals susceptible to SARS-CoV-2 infection.

© Manifold Markets, Inc.