The Roman philosopher Lucius Annaeus Seneca (4 BCE-65 CE) was perhaps the first to note the universal trend that growth is slow but ruin is rapid. I call this tendency the "Seneca Effect."

Monday, May 1, 2023

When Science Fails: Surrogate endpoints and wrong conclusions


Galileo Galilei and Anthony Fauci are linked to each other by a chain of events that started at the beginning of modern science, during the 17th century. But the Science that Fauci claimed to represent is very different from that of Galileo. While Galileo studied simple linear systems, modern science attempts to study complex, multi-parameter systems, where the rigid Galilean method just cannot work. The problem is that, while it is obvious that we can measure only what we can measure, that's not necessarily what we want, or need, to measure. Tests based on "surrogate endpoints" may well be the best we can do in medicine and other fields, but we should understand that the results are not, and cannot be, a source of absolute scientific truth.

The Scientific Method

Galileo Galilei is correctly remembered as the father of modern science because he invented what we call today the "scientific method," sometimes still called the "Galilean method." It is supposed to be the basis of modern science; the feature that makes it able to be called "Science" with a capital first letter, as we were told over and over during the Covid pandemic. But what really is this scientific method that's supposed to lead us to the truth?

Galileo's paradigmatic idea was an experiment about the speed of falling objects. It is said that he took two solid metal balls of different weights and dropped them from the top of the Pisa Tower. He then noted that they arrived at the ground at about the same time. That allowed him to lampoon an ancient authority such as Aristotle for having said that heavier objects fall faster than lighter ones (*). There followed an avalanche of insults to Aristotle that continues to this day. Even Bertrand Russell fell into the trap of poking fun at Aristotle, accused of having said that women have fewer teeth than men. Too bad that he never said anything like that.

It may well be that Galileo was not the first to perform this experiment, and it is not even clear that he actually performed it, but that's a detail. The point is that the result was evident, clear-cut, and irrefutable. Later, Newton started from this result to arrive at the assumption that the same force that acted on an apple falling from a tree in his garden was acting on the Moon and the planets. From then on, science was supposed to be largely based on laboratory experiments or, anyway, experiments performed in tightly controlled conditions. It was a major change of paradigm: the basis of the scientific method as we understand it today.

The Pisa Tower experiment succeeded in separating the two parameters that affect a falling body: the force of gravity and the air drag. That was relatively easy, but what about systems that have many parameters affecting each other? Here, let me start with the case of health care, which is supposed to be a scientific field, but where the problem of separating the parameters is nearly impossible to overcome.

The surrogate endpoint in medicine

How can you apply the scientific method in medicine? Dropping a sick person and a healthy one from the top of the Pisa Tower won't help you much. The problem is the large number of parameters that affect the nebulous entity called "health" and the fact that they all strongly interact with each other.

So, imagine you were sick, and then you feel much better. Why exactly? Was it because you took some pills? Or would you have recovered anyway? And can you say that you wouldn't have recovered faster had you not taken the pill? A lot of quackery in medicine arises from these basic uncertainties: how do you determine what is the specific cause of a certain effect? In other words, is a certain medical treatment really curing people, or is it just their imagination that makes them think so?

Medical researchers have worked hard at developing reliable methods for drug testing, and you probably know that the "gold standard" in medicine is the "Randomized Controlled Trial" (RCT). The idea of RCTs is that you test a drug or a treatment by keeping all the parameters constant except one: taking or not taking the drug. It is designed to avoid the effect called "placebo" (the patient gets better because she believes that the drug works, even though she is not receiving it) and the one called "nocebo" (the patient gets worse because he believes that the drug is harmful, even though he is not receiving it).

An RCT involves a complex procedure that starts with separating the patients into two similar groups, making sure that none of them knows to which group she belongs (the test is "blinded"). Then, the members of one of the two groups are given the drug, say, in the form of a pill. The others are given a sugar pill (the "placebo"). After a certain time, it is possible to examine if the treatment group did better than the control group. There are statistical methods used to determine whether the observed differences are significant or not. Then, if they are, and if you did everything well, you know if the treatment is effective, or does nothing, or maybe it causes bad effects.  
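As an illustration of the kind of statistical check mentioned above, here is a minimal sketch of a permutation test on the difference in recovery rates between the two groups. It is only one of several possible methods, not necessarily the one used in any specific trial. The function name `permutation_test`, its parameters, and the outcome counts are hypothetical, invented only for this example.

```python
import random

def permutation_test(treated, control, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in recovery rates.

    `treated` and `control` are lists of 0/1 outcomes (1 = recovered).
    Returns the observed difference and an approximate p-value.
    """
    rng = random.Random(seed)
    n_treated = len(treated)
    n_control = len(control)
    observed = sum(treated) / n_treated - sum(control) / n_control

    # If the pill does nothing, the group labels are arbitrary, so we
    # reshuffle them many times and see how often chance alone gives a
    # difference at least as large as the observed one.
    pooled = list(treated) + list(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = (sum(pooled[:n_treated]) / n_treated
                - sum(pooled[n_treated:]) / n_control)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_permutations

# Hypothetical data: 60 of 100 recover with the drug, 45 of 100 with placebo.
treated = [1] * 60 + [0] * 40
control = [1] * 45 + [0] * 55
difference, p_value = permutation_test(treated, control)
```

A small p-value says only that the measured difference is unlikely to be a fluke of how patients were assigned to the groups. It says nothing about whether the quantity being measured was the right one.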

For limited purposes, the RCT approach works, but it has enormous problems. A correctly performed RCT is expensive and complex, its results are often uncertain and, sometimes, turn out to be plain wrong. Do you remember the case of "Thalidomide"? It was tested, found to work as a tranquilizer, and approved for general use in the late 1950s in Europe. It was later discovered that it had teratogenic effects on fetuses, and some 10,000 babies in Europe were born without arms and legs before the drug was removed from the market. Tests on animals would have shown the problem, but they were not performed or were not performed correctly.

Of course, the rules were considerably tightened after the Thalidomide disaster and, nowadays, testing on animals is required before a new drug is tested on humans. But let's note, in passing, that in the case of the mRNA Covid vaccines, tests on animals were performed in parallel with (and not before) testing on humans. This procedure exposed volunteers to risks that normally would not be considered acceptable in drug testing. Fortunately, it does not appear that mRNA vaccines have teratogenic effects.

Even assuming that the tests are complete, and performed according to the rules, there is another gigantic problem with RCT: What do you measure during the test?  Ideally, drugs are aimed at improving people's health, but how do you quantify "health"? There are definitions of health in terms of indices called QALY (quality-adjusted life years) or QoL (quality of life). But both are difficult to measure and, if you want long-term data, you have to wait for a long time. So, in practice, "surrogate endpoints" are used in drug testing.  

A surrogate endpoint aims at defining measurable parameters that approximate the true endpoint -- a patient's health. A typical surrogate endpoint is, for instance, blood pressure as an indicator of cardiovascular health. The problem is that a surrogate endpoint is not necessarily related to a person's health and that you always face the possibility of negative effects. In the case of drugs used to treat hypertension, negative effects exist and are well known, but it is normally believed that the positive effects of the drug on the patient's health outweigh the negative ones. But that's not always the case. A recent example is how, in 2008, the drug bevacizumab was approved in the US by the FDA for the treatment of breast cancer on the basis of surrogate endpoint testing. It was withdrawn in 2011, when it was discovered that it was toxic and that it didn't lead to improvements in cancer progression (you can read the whole story in "Malignant" by Vinayak Prasad).

Consider now another basic problem. Not only are the parameters affecting people's health numerous, but they also strongly interact with each other, as is typical of complex systems. The problem may take the form called "polydrug use," and it especially affects old people who accumulate drugs on their bedstands, just like old cars accumulate dents on their bodies. An RCT that evaluates one drug is already expensive and lengthy; evaluating all the possible combinations of several drugs is a nightmare. If you have two drugs, A and B, you have to go through at least three tests: A alone, B alone, and the combination of A+B. If you have three drugs, you have seven tests to do (A, B, C, AB, BC, AC and ABC). And the numbers grow rapidly. In practice, nobody knows the effects of these multiple drug uses, and, likely, nobody ever will. But a common observation is that when the elderly reduce the number of medicines they take, their health immediately improves (this effect is not validated by RCTs, but that does not mean it is not true. I noted it for my mother-in-law, who died at 101).
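The counting above (three tests for two drugs, seven for three) follows the pattern 2^n - 1, the number of non-empty subsets of n drugs. Here is a minimal sketch that enumerates them. It assumes each combination needs exactly one test and ignores doses, timing, and order of administration, all of which would make the real number larger. The function names are hypothetical.

```python
from itertools import combinations

def drug_combinations(drugs):
    """Return every non-empty subset of `drugs`, smallest subsets first."""
    return [
        combo
        for size in range(1, len(drugs) + 1)
        for combo in combinations(drugs, size)
    ]

def count_tests(n_drugs):
    """Number of non-empty subsets of n drugs: 2**n - 1."""
    return 2 ** n_drugs - 1

# The three-drug case from the text: A, B, C, AB, AC, BC, ABC.
three_drug_tests = drug_combinations(["A", "B", "C"])
```

With ten drugs, `count_tests(10)` gives 1023 separate tests, which gives an idea of why nobody runs them.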

The case of Face Masks 

Some medical interventions have specific problems that make RCTs especially difficult. An example is that of face masks to prevent the spread of an airborne pathogen. Evidently, there is no way to perform a blind test with face masks, but the real problem is what to use as a surrogate endpoint. At the beginning of the Covid pandemic, several studies were performed using cameras to detect liquid droplets emitted by people breathing or sneezing with or without face masks. That was a typical "Galilean," laboratory approach, but what does it demonstrate? Assuming that you can determine if and how much a mask reduces the emission of droplets, is this relevant in terms of stopping the transmission of an airborne pathogen? As a surrogate endpoint, droplets are at best poor, at worst misleading.

A much better endpoint is the PCR (polymerase chain reaction) test that can directly detect an infection. But even here, there are many problems. As an example, consider an often-touted study performed in Pakistan that claimed to have demonstrated the effectiveness of face masks. Let's assume that the results of the study are statistically significant (really?) and that nobody tampered with the data (and we can never be sure of that in such a heavily politicized matter). Then, the best you can say is that if you live in a village in Pakistan, if there is a Covid wave ongoing, if the PCR tests are reliable, if the people who wore masks behaved exactly like those who didn't, and if random noise didn't affect the study too much, then by wearing a mask you can delay being infected for some time, and maybe even avoid infection altogether. Does the same result apply to you if you live in New York? Maybe. Is it valid for different conditions of viral diffusion and epidemic intensity? Almost certainly not. Does it ensure that you don't suffer adverse effects from wearing face masks? Duh! Would that make you healthier in the long run? We have no idea.

The Pakistan study is just one example of a series of studies on face masks that were found to be ill-conceived, poorly performed, inconclusive, or useless in a recent rigorous review published in the Cochrane Library. The final result is that no one has been able to detect a significant effect of face masks on the diffusion of an airborne disease, although we cannot say that the effect is actually zero.

The confusion about face masks reached stellar levels during the COVID-19 pandemic. In 2020, Tony Fauci, director of the NIAID, first advised against wearing masks, then he reversed his position and publicly declared that face masks are effective and even that two masks are better than just one. Additionally, he declared that the effectiveness of masks is "science" and, therefore, cannot be doubted. But, nowadays, Fauci has reversed his position, at least in terms of mask effectiveness at the population level. He still maintains that they can be useful for an individual "who religiously wears a mask." Now, imagine an RCT dedicated to demonstrating the different results of "religiously" and "non-religiously" wearing a mask. So much for science as a pillar of certainty. 

Surrogate endpoints everywhere

Medicine is a field that may be defined as "science" since it is based (or should be based) on data and measurements. But you see how difficult it is to apply the scientific method to it. Other fields of science suffer from similar problems. Climate science, ecosystem science, biological evolution, economics, management, policies, and others are cases in which you cannot reproduce the main features of the system in a laboratory and which, at the same time, involve a large number of parameters interacting with each other in a non-linear manner. You could say, for instance, that the purpose of politics is to improve people's well-being. But how could that be measured? In general, it is believed that the Gross Domestic Product (GDP) is a measure of the well-being of the economy and, hence, of all citizens. Then, it is concluded that economic growth is always good, and that it should be stimulated by all possible interventions. But is it true? GDP growth is another kind of surrogate endpoint used simply because we know how to measure it. But people's well-being is something that we don't know how to measure.

Is a non-Galilean science possible? We have to start considering this possibility without discarding the need for good data and good measurements. But, for complex systems, we have to move away from the rigid Galilean method and use dynamic models. We are moving in that direction, but we still have to learn a lot about how to use models and, incidentally, the Covid-19 pandemic showed us how models can be misused and lead to various kinds of disasters. But we need to move on, and I'll discuss this matter in detail in an upcoming post.


(*) From Aristotle's "Physics" (Book VIII, chapter 10), where he discusses the relationship between the weight of an object and its speed of fall:

"Heavier things fall more quickly than lighter ones if they are not hindered, and this is natural, since they have a greater tendency towards the place that is natural to them. For the whole expanse that surrounds the earth is full of air, and all heavy things are borne up by the air because they are surrounded and pressed upon by it. But the air is not able to support a weight equal to itself, and therefore the heavier bodies, as having a greater proportion of weight, press more strongly upon and sink more quickly through the air than do the lighter bodies."



  1. Well my friend, the biological warfare people for decades now have built a pretty clear set of protocols. Respirators, bubble suits etc are how they deal with the materials in the lab and in the field. Nobody in that community would tell you a cloth face mask was any good at stopping nanoparticle materials. In the recent past a lot of medical professionals gave opinions without training or sufficient knowledge. An odd thing to see in such a coordinated way. Many environmental exposure experts were marginalized so as to favor the mask as inconclusive... or it can't hurt. But it can and it did. Even today many ill-informed people put confidence in them. If you review the work prior to 2019 the results were pretty consistent... nothing better than the noise floor could support the idea. Then voilà, in 2020 we started to see published work that suggests "some" efficacy. Now add the ever popular "meta-analysis" and you get "we cannot discount the value." LOL, that's not how the scientific method really works. Not to mention the squelching of opposition as to bacterial loading of the masks, CO and CO2 re-aspiration and on and on. It has never in all of history been this easy to create hordes of useful idiots.

  2. "Biden Energy Secretary Jennifer Granholm, a Canadian born lawyer with no military background, testified Wednesday to the Senate Armed Services Committee that she supported requiring the United States military to move to an all-electric vehicle fleet by 2030."

    A lawyer "clueless about energy" Energy Secretary of the Absurdistan formerly known as the US of A, clueless about the military, makes pronouncements to lawmakers about energy and the military. But she does know who butters her bread.

    1. What’s easier to bring to the battlefield diesel fuel or electricity?

    2. Hmmm, interesting question. Apart from the limited range, potential for spectacular battle damage ignition and the slow recharge rate (unless you have JP8 powered gensets around) the trade-offs don't look so appealing for batteries. Fossil fuels are horrible to re-supply too. One thing for sure, Ms Granholm can only parrot a position, as her public press Q&As reveal she has no idea of what she's talking about!

    3. One outcome of the US military going all-electric will be the eventual forced closure/abandonment of all overseas US bases and a return to a strictly defensive role. Might not be a bad thing!

  3. Galilei wasn't corrupt and working for some Big Industry. Galilei didn't cause the Earth to go around the Sun either.

    1. Maybe Galilei and Kepler were part of a cabal that caused the heliocentric situation we now find ourselves in - up until the 17th century, we were doing just fine on a flat disk held aloft by four giant elephants on the back of a turtle! My point is that you're conflating motivation and moral standing with achievement and causality.

      For example, there's evidence that Erwin Schrodinger was a paedophile, but that doesn't undermine the validity of his work.


  4. "Ibn al-Haytham was an early proponent of the concept that a hypothesis must be supported by experiments based on confirmable procedures or mathematical reasoning—an early pioneer in the scientific method five centuries before Renaissance scientists" -

    Mass extraction of finite fossil fuels starting 300 years ago - has actually killed Science - if we remember that "the scientific method" has already been arrived at - 10 centuries ago.

    In the last two weeks or so, youtube has been sinking in well-financed and professionally produced posts - all assert that the Big Bang is dead...

    Well, the big-bang theory was more or less hypothesised when oil in Iraq had not yet been touched.

    Today, the big-bang theory proves shaky - yet oil in Iraq is almost all gone - and we as if defaulting back to the basics - if not to the Science of the Islamic era in Space and Cosmology - ha ha ha

    Now humans will have to wait until crude oil reserves deplete to the ground - to get real Science - back on earth - but that is much easier said than what it tragically means

    "No system of energy can deliver sum useful energy in excess of the total energy put into constructing it.
    This universal truth applies to all systems.

    Energy, like time, flows from past to future".


    1. Thinking about your musical chairs theory of demand destruction based on nation state borders, just looking around, that seems to be what might be going on. Some countries that have been largely cut off from oil: Venezuela, Syria, Sri Lanka, Lebanon, maybe Pakistan is next.

      I guess that Mexico has recently stopped all oil exports.

      It will be interesting to see how de-dollarization and the establishment of new and varied trading blocs, such as BRICS, affect the process.

    2. The "demand-destruction" may happen at the oil-well - but not on the other side of the world - where Control is managed and weapons, etc are manufactured 24/7 to execute and orchestrate that assumed elusive demand-destruction - which will increase demand - naturally....

      In the result - no demand destruction is achieved - and extracting fossil fuels always remains an operation that requires minimum number of humans dedicated to the task - all demanding fossil fuels - up and up*...

      Today, that minimum number of humans required for fossil fuels extraction - is 8 billion, flat out, 24/7 - strong...

      Our Western Civilisation has burned most of fossil fuels to the ground since Jevons - simply to secure dealing with fossil fuel reserves to a Central Plan

      Our naïve Central Plan cannot understand that destroying-demand at the oil-well creates more demand somewhere else, and depopulating people at the oil-well makes more people emerge somewhere else...

      No matter how the extraction of fossil fuels seems totally mechanised and only a tiny minority of people is physically involved in it on the ground - the process actually involves - behind scenes - all humans - as if fossil fuels are extracted by hands with shovels, buckets and ropes....

      The rest of humans that don't come to the oil-field hand-picking fossil fuels - are working no less hard - to protect fossil fuels from falling into any other than the hands of our Central Plan - i.e to protect fossil fuels from themselves - ensuring the flow of fossil fuels continues uninterrupted...

      What is called US Dollar, BRICS and Digital Currencies - are no more than tools to protect fossil fuels from falling in the hands of others - which implies - no extraction will be sustained - a reality humans feel crazy against it...

      This is regardless of how Control wants the choreography to look Hollywood, dramatic and polarised - going from US Dollars to what's called Alice in the BRICS wonderland ...

      Our outgoing Western Civilisation - and all what it was and is doing - is simply no more than a fossil fuels-extraction operation - ha ha ha

      * This far, individual voices in the Middle East are calling for re-settling 20 million Iraqis (43 million population, up from just 10 millions in 1980 when Iraq-Iran War erupted) immediately - into those nations that have destroyed and destroying Iraq, importing today most of its oil - mainly the US, Canada, Europe and Japan. This is to avoid the explosion in Iraq's population pushing a chaotic China-assisted construction boom that will end up no different from Shanghai's, and imminently later Shanghai's brutal gulag-type - lockdowns - if B-52s and Sukhois don't like to give a hand...
      The campaign's slogan is "so traverse ye through its tracts";
      "It is He Who has made the earth manageable for you, so traverse ye through its tracts" - Quran

    3. I kind of think that's the point of demand destruction, no matter how it might be carried out, or where various boundaries might be drawn.

      Someone will use the oil, if at all possible. But less (or none) for you, is more for me. That kind of thing.

  5. Broad subject. Anyhoo, I think that both modern medicine and modern science are like Siamese Twins, two heads on one body. One head screams "health and long life", while the other yells and babbles about "knowledge, truth and rationality". But when the body gets hungry both cry in unison, "feed me".

  6. The way I like to think about this: classical physics and chemistry are "hard science". Everything else, including quantum physics, is something qualitatively different. Therefore, it's a category error to apply the methods of "hard science" in other domains. More specifically, you can try those methods but you have to understand they won't give definitive answers.

  7. When you think of it, you realize that Newtonian physics is not even able to solve a problem as "simple" as the problem of N bodies in which:
    there is only one force (gravity),
    there is no friction, nor electromagnetic forces,
    there is no dust between the bodies,
    the speed of the bodies is low compared to the speed of light in the vacuum,
    the gravitational fields are weak,
    and so on.

    The first thing to admit is the weakness of the present science, even the so-called hard science. The second is that one can still be enthusiastic to have such a vast field of exploration.

  8. One thing facemasks did do was make pedestrian tracking more difficult and manufactured a false consensus of public opinion. Take away masks and I doubt belief in a pandemic could have gotten traction. If one looks to China I believe they have face id as payment verification so maybe the fed wasn't ready.

    I feel a joke must already exist along the lines of two doctorates being in a bar when one warns the other he has been hitting on a married woman and now there is a very angry man approaching. He responds that he'll need to see something peer reviewed before he's willing to duck.

  9. "Fortunately, it does not appear that mRNA vaccines have teratogenic effects. "
    Are you sure about that? I guess death is not considered a birth defect, so maybe that's true.

  10. Hello Ugo and all, somewhat related is an article appearing in the New York Times, notable mostly because it was in the Times. Attempting to share ...
    If that didn't work ... a peer reviewed paper by well published scientists suggesting that scientists should be objective and that "a gay white man should get the same results from an experiment as a heterosexual black woman" (example) was rejected... I couldn't even make this up. ArtDeco

  11. A quote from the above paper... explaining why it was rejected...

    “What’s being advocated are philosophies that are explicitly anti-scientific,” Anna Krylov, a chemistry professor at the University of Southern California and one of the paper’s authors, told me. “They deny that objective truth exists.” Having grown up in the Soviet Union, where science was infused with Marxist-Leninist ideology, Krylov is particularly attuned to such threats. And while she has advocated on behalf of equal treatment for women in science, she prefers to be judged on the basis of her achievements, not on her sex. “The merit of scientific theories and findings do not depend on the identity of the scientist,” she said in a phone interview. ArtDeco