On Tuesday, April 28, the seminar talk, “Coping with Heterogeneity and Uncertainty of COVID-19 Datasets,” was held via Zoom. The seminar is the third in UC Santa Barbara’s interdisciplinary COVID-19 webinar series, “Issues, Approaches, and Consequences of the COVID-19 Crisis,” sponsored by UCSB, Novim and Cottage Health.

Measuring uncertainty with confidence intervals in the IHME model. Courtesy of Yu-Xiang Wang.

The speakers of the seminar were Professor Ambuj K. Singh and Assistant Professor Yu-Xiang Wang, both from the UCSB Computer Science department. 

Professor Singh opened the seminar by drawing lessons from three historical epidemics (the cholera outbreak of 1854, the Spanish flu of 1918-1919 and the Ebola outbreak of 2013-2016) to understand how data and models have informed people’s decisions in the past.

More specifically, Professor Singh mentioned that scientists “[were] able to do contact tracing” back in the cholera outbreak of 1854. He also stated that “the prediction was 10 times worse than what actually happened” during the Ebola outbreak and that, through the analysis of Ebola models, people found that the error was caused by the fact that “human behavior changed at a much faster rate than what the models predicted.”

After a discussion from the historical aspect, Professor Singh introduced three features of COVID-19 datasets: heterogeneous, uncertain and dynamic.  

According to Professor Singh, COVID-19 datasets are heterogeneous in the sense that “the risk factors vary at the level of a single person, a country or a region based upon geography, culture and the human element.”

The datasets are also uncertain. “Testing errors are there, we are not certain about what is the rate, what is the dynamics of the disease itself,” said Professor Singh. How long it takes for someone to recover, for example, is “hard to gauge.”

Lastly, Professor Singh mentioned, “COVID-19 datasets are highly dynamic in the sense that there are frequent tightening and loosening of interventions, change of behavior and the disease dynamic itself changes at the level of a single person or the level of the population.”

In the last part of Professor Singh’s talk, he shared his personal point of view on the steps that need to be taken before reopening the campus. First, “we need to make [sure] contact tracing works on campus. We can ask students, staff and faculties whether they want to be a part of it, and people who opted in will be using a phone app or a manual system to trace their contacts.” 

On top of contact tracing, Professor Singh added, “We need to be doing three things: We need to have a network in which we can predict, and we need to have a system for testing, and we need to build a way of [treating] people that have fallen sick — and we need to have alternate plans if things don’t work out. If we can make those three things happen, it is feasible a campus like UCSB can open.”

After Professor Singh finished his part, Professor Wang took over for the remainder of the seminar.

Professor Wang first presented the graph “Measuring uncertainty with confidence intervals, in the IHME model,” pictured above. The red solid line “comes from the actual realized history. We are plotting the number of death[s] per day.”

“From this point onward, the shaded area is the forecasting that is made under the IHME model about what is going to happen in the future,” Professor Wang explained.

“Luckily, we can see that we already passed the peak, and if the model is predicting correctly, we will see a steady decline of the number of death[s] per day, and, hopefully by June, everything will be settled, and we will no longer have any additional cases in COVID-19.”

Professor Wang also presented a graph of the “Predictions of SB cases with fitted SIR model.” The green line plots the actual cases in Santa Barbara County, and the blue and yellow lines are predictions from two different methods. The shaded blue area marks the range of predicted values.


Prediction of Santa Barbara COVID-19 cases with fitted SIR model. Courtesy of Yu-Xiang Wang.

Professor Wang closed the seminar by quoting George E. P. Box, saying, “All models are wrong, but some are useful.”

“Obviously, we can never really model the world exactly, but the actual measure of whether a model is useful or not is by how much impact this model [has] in terms of decision making and how we can quantify the usefulness of the model,” Wang said. 

“As in today, for decision-making purposes, we should make decisions in ranges rather than based on a single number,” concluded Professor Wang.