The Most Heavily-Referenced SARS-CoV2 Model is Now Systematically Underestimating the Virus' Impact
I don't recall another time of my life when computer models have occupied so much of the national conversation. The model we talk about most often was created by The Institute for Health Metrics and Evaluation of The University of Washington (IHME). In the US it is given credit, more so than any other model, for helping shape US policy regarding SARS-CoV2.
It's not universally loved. Many have criticized it for having made the pandemic seem like a larger problem than it really was. Others respond that the pandemic didn't become a larger problem specifically because of the actions we took based on the model's findings. I doubt we'll ever see agreement on that. Regardless, few will say it hasn't been the most influential model, which is why this article merits your attention. Bluntly, the model's results are becoming increasingly misleading. From here on out the model will significantly underestimate the impact of SARS-CoV2 in terms of the duration of the pandemic, the hospital resources required, and the human cost.
This isn't obvious right now because the model's overall results have concealed the existence of two systematic errors that have offset one another. Simply put, the model overestimates deaths in states that have not yet hit their peak, and it underestimates deaths in states past their peak.
I wrote last week that New York State will not hit a clear peak on April 9, as IHME's model had predicted. Rather, I said that deaths would stay at about the same level. That day 799 New Yorkers died. Since then deaths have averaged 758 a day. Yesterday (April 15) New York lost 775 of its own -- 251 more than IHME's estimate of 524. IHME revises its estimates constantly. It currently projects 297 SARS-CoV2 deaths in New York State this coming Monday (April 20). That's unlikely.
In the paragraphs that follow, I'll present a number of analyses that progressively demonstrate the almost inevitable conclusion that deaths will soon be significantly greater in the US than IHME's model indicates. Then I'll suggest what could be done to remedy the situation.
Demand for Hospital Resources in New York and Louisiana
IHME's frequent model updates make it impossible, for example, to determine from their website how accurately its model from two weeks ago is predicting what is happening today. This type of information is critical in evaluating a predictive model. Fortunately, they have enabled these types of analyses by making available the data from past model iterations. The following analysis employs data from two of these iterations: March 25 (their first published model) and April 9 (the model produced from data available a week ago).
The diagram below charts IHME's estimates for the number of required ICU beds in New York State from March 21 - April 21. It also plots the data point representing the actual number of ICU beds in use on March 26. Here's what's important to know:
The original March 25 Model (Orange Line) started out in the wrong place. Its estimate for March 26 (4,123) was more than three times the actual number of ICU rooms in use that day (1,290, represented by the red diamond), an overstatement of roughly 220%.
The April 9 Model (Blue Line) fixed that problem by re-anchoring the model on the actual number of ICU rooms in use on March 26.
The re-anchored April 9 model showed much lower overall ICU room usage and a later peak than did the original March 25 model.
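As a quick check on the size of that starting-point error, the two March 26 figures quoted above can be compared directly. This is only a sketch of the arithmetic; the function name `overstatement` is mine and has nothing to do with IHME's own code.

```python
def overstatement(estimate, actual):
    """Return (estimate / actual, percent by which the estimate exceeds the actual)."""
    ratio = estimate / actual
    percent_over = (estimate - actual) / actual * 100
    return ratio, percent_over

# March 26 figures quoted in the text: modelled ICU rooms vs. rooms actually in use.
ratio, percent_over = overstatement(estimate=4123, actual=1290)
print(f"estimate is {ratio:.1f}x the actual, i.e. {percent_over:.0f}% too high")
```

The estimate comes out at about 3.2 times the actual figure. Put differently, it was a little over 300% of the real number, or roughly 220% too high.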
It would have been better, of course, had the March 25 Model started out in the right place, but modelling is time consuming and complicated. While the initial error is regrettable, IHME appeared to have corrected it.
The problem is it looks like they only corrected the starting point. When they reran the model on April 9, they should have updated the historical data through that date. But it doesn't look like they did that.
You can see this in the chart below that adds in the data for actual ICU room counts through April 14 (red line). Notice that at no point prior to April 10 do actual ICU room counts match those of the April 9 model, despite the fact these data would have been known at the time the model was run.
The only time the model results and the actuals meet is around April 12 when the model transitioned from overestimating ICU room usage to underestimating it. Given the slope of the actual and modelled lines, the gap between model results and actual ICU usage will continue to increase.
Subsequent updates have not fixed this discrepancy. As of last night (April 15) IHME's website (screenshot below) indicated 4,194 ICU beds would be needed in New York on April 14. In reality 5,225 were in use.
New York isn't the only state where estimates for critical hospital resources don't match what the states are reporting. Last night IHME indicated 244 ventilators are currently required in Louisiana. Louisiana's website, on the other hand, indicates there are presently 425 patients on ventilators.
New York and Louisiana are not isolated examples. And the problems with the estimates are not confined to the usage of hospital resources. Since April 9, IHME's model has underestimated the number of deaths in New York and Louisiana by 14% and 41% respectively. I'll discuss this in greater detail in the section after next.
Underestimate of Deaths in European Post-Peak Countries
IHME has recently started modeling the data of many countries other than the United States. Because many of these countries are much farther past their peaks than are any US state, they provide a very useful way to demonstrate how the models are likely to be working in the US in a few weeks' time.
Above is a snapshot I took last night of actual and IHME projected deaths for Italy. The solid line indicates actuals through April 12. IHME's projections are represented by the dotted line. Italy hit its peak 19 days ago. Since then the number of deaths hasn't been a smooth graceful curve downward. In fact, deaths in Italy have recently hovered around 600 per day.
Now look at IHME's projections. Their projections from April 13 onward look nothing like the prior 19 days. From the look of it they are just applying a set curve to Italy's data without consideration as to what occurred in the prior days and weeks. Their estimate for the number of deaths yesterday was 160. In reality there were 578 deaths. The estimate was low by 418.
A week from today (4/23) IHME's model estimates there will be 25 deaths due to SARS-CoV2 in Italy. I just have to say it. That's preposterous.
Their estimates for Spain are no more realistic. Yesterday Spain lost 557 souls. IHME currently estimates that a week from today the virus will only take 49 more. Unfortunately, the toll is going to be higher than that.
OK, one more: IHME's estimate for deaths in France yesterday was 355. The actual number recorded was 1,438.
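To put the single-day misses on a common footing, here is a small sketch that lines up the April 15 estimates and actuals quoted in this article for New York, Italy, and France. The dictionary and the `shortfall` helper are my own illustrative names. The numbers are only the ones cited in the text, not a pull from IHME's data files.

```python
# (IHME estimate, actual deaths) for April 15, as quoted in this article.
single_day = {
    "New York": (524, 775),
    "Italy": (160, 578),
    "France": (355, 1438),
}

def shortfall(estimate, actual):
    """Return deaths the estimate missed, and the estimate as a share of the actual."""
    return actual - estimate, estimate / actual

results = {place: shortfall(est, act) for place, (est, act) in single_day.items()}

for place, (missed, share) in results.items():
    print(f"{place}: estimate low by {missed}, covering {share:.0%} of actual deaths")
```

For these three comparisons the shortfalls come to 251, 418, and 1,083 deaths. Expressing each estimate as a share of the actual makes the misses comparable across places of very different size.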
Understating Deaths in the US
Those were some large underestimates. How's it going in the US? There are 11 states for which IHME had assigned a peak on or before April 9. IHME estimated these states would experience 6,594 deaths between April 10 and April 15. Actual deaths over that period of time were 24% higher at 8,169. So, it's possible we're going down the same road as Italy, Spain, and France.
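The 24% figure is simple to reproduce from the two totals above. The sketch below shows the calculation, and also the same gap measured against actual deaths rather than against the estimate, since the choice of base changes the headline number. The variable and function names are mine, chosen for illustration.

```python
def percent_gap(actual, estimate, base):
    """Gap between actual and estimate, as a percent of the chosen base."""
    return (actual - estimate) / base * 100

# Totals for the 11 post-peak states, April 10 through April 15, as quoted above.
estimated_deaths = 6594
actual_deaths = 8169

gap = actual_deaths - estimated_deaths
pct_of_estimate = percent_gap(actual_deaths, estimated_deaths, base=estimated_deaths)
pct_of_actual = percent_gap(actual_deaths, estimated_deaths, base=actual_deaths)

print(f"gap: {gap} deaths")
print(f"actuals were {pct_of_estimate:.0f}% higher than the estimate")
print(f"the estimate missed {pct_of_actual:.0f}% of actual deaths")
```

Measured against the estimate, actual deaths were about 24% higher. Measured against the actuals, the model missed about 19% of the deaths that occurred. Both describe the same gap of 1,575 deaths.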
The chart below compares IHME's estimated deaths in New York State through April 30 along with actuals through April 15. The blue bars represent actual deaths, while the orange bars indicate IHME's estimates. Notice how the drop of the estimates is not matched by the actuals.
More importantly, look at the total number of estimated deaths on April 30. IHME's April 9 model indicated New York would experience 12 deaths that day. IHME's website now pegs that number at 21. That's 14 days from now. With over 18,000 New Yorkers hospitalized and over 5,000 in an ICU, that is simply not going to happen.
Recall from above there are a total of 11 states that have passed their peaks. The chart below shows the difference between actual deaths and IHME's estimates for each state.
To again take New York as an example, from April 10 - April 15, New York lost 4,550 to SARS-CoV2. Over that same time frame, IHME estimated they'd lose 3,917, a difference of 633 lives. That's the figure you'll find in the chart below.
IHME underestimated the number of deaths that would occur in 9 out of the 11 states. Five states had underestimates of over 150 deaths: Illinois, Louisiana, Michigan, New Jersey, and New York. Going the other way, the state with the greatest overestimate was Washington State with 46.
To complete the picture, the chart below shows the figures for the 36 states whose peaks are projected to occur on or after April 15. IHME overestimated the number of deaths in 31 of these 36 states. The total number of overestimated deaths was 1,086. Maryland and Pennsylvania were the only two notable exceptions to this general rule.
As time goes by and these states approach and pass their peaks, they will compound the problem of the IHME model underestimating the near-term severity of the pandemic.
The nation needs a more accurate forecast that it can rely on to allocate resources and set policy. There are three alternatives for a near-term fix.
Modify IHME's model to utilize a more realistic curve for hospitalization, mortality, and related trends
Identify another existing model that better fits the way the pandemic is developing in the US and elsewhere
Develop a new model, paying particularly close attention to ensuring the model accurately predicts near-term results and is based on up-to-date information
In the intermediate term, the nation needs to create a more refined, localized model based on more granular data like new case counts, the duration of individual hospitalizations (including the ICU), the pre-existing conditions of the ill, etc.
This work should in turn form part of the basis for an early analytic warning system to identify when and where flare ups of the pandemic are occurring.