Political Calculations
Unexpectedly Intriguing!
September 29, 2020

Getting the results of coronavirus tests has been a chronic problem throughout much of the 2020 pandemic.

Mufid Majnun: Laboratory worker takes a swab test, via Unsplash at https://unsplash.com/photos/oI20ehIGNd4

Early on, simply getting things like an adequate supply of nasal swabs to collect samples for all those seeking COVID tests was a huge problem. Later, testing labs entrusted with determining whether a person either tested positive for the SARS-CoV-2 coronavirus infection or not became a problem when many couldn't keep up with the demand for tests.

The answer to the first challenge was to increase the supply of the test kit components that were in short supply. Engineers deployed 3-D printing technology to make the nasopharyngeal swabs used in COVID test kits. Earlier this month, a study found that one of the more widely used 3-D printed swabs that was developed to cope with the shortage "work as well, and safely, as the standard synthetic flocked nasal swabs".

The answer to the second challenge has been more difficult. The standard answer for the testing labs has been to add more testing equipment to process test results. Unlike nasal swabs, however, this equipment cannot be easily produced using 3-D printers, and qualified equipment and related testing supplies have also come to be in short supply.

That shortage can be seen in the experience of testing labs in states coping with surges in coronavirus infections. Many have had to wait weeks to get new equipment they ordered, only to then face further delays as they needed more time to clear the huge backlogs of past test results after it arrived.

These delays have made it difficult for testing labs to get caught up enough to make testing an effective way to monitor and control the rate of spread of coronavirus infections as envisioned by public health officials. In the absence of sufficient testing capacity, many politicians have stepped in to impose restrictions on commerce and other activities in a bid to slow the spread of infections, but their reactionary policies have wrought considerable damage.

Damage that might be avoided if only COVID-19 testing can be done in a much more timely manner.

That's where the idea of pooling test samples makes a lot of sense. Take a portion of the individual test samples that have been collected and combine them together to perform a single test. If the test on the combined sample comes back negative, then all the individuals whose samples were pooled together this way can be cleared in the time it takes to perform a single test. If the combined test sample comes back positive, then individual tests might be performed to identify the individuals who are infected.

The following diagram provides an example of how pooled testing works in the case of testing for HIV infections in multiple individuals using collected blood samples.

Horemheb-Rubio et al (September 2017). Figure 1. Diagram of pooling method for serum samples from blood donors

In this example, we can see how pooling test samples can reduce the number of required tests from 20, one for each individual sample, down to as few as 9 tests, a 55% reduction. Applied to COVID-19, pooled testing could greatly amplify the capacity of testing labs while reducing their immediate needs to add expensive equipment.

That's the promise of pooled testing, but the reality hinges on a number of factors. How prevalent are coronavirus infections among the population being tested? How many people's tests can be batched and usefully processed and tracked together using this testing method? What's the optimal size of a testing subgroup?

Fortunately, if you have an idea of what the answer for the first two of these questions might be, there's math to answer the third question! Math that we've deployed in the following tool. If you're accessing this article on a site that republishes our RSS news feed, please click through to our site to access a working version.

Testing Parameters
Input Data Values
Total Number of Individual Samples in Testing Pool
Probability of a Positive Sample in the Tested Population

Optimal Subgroup Size for Pooled Testing
Calculated Results Values
Optimal Size of Pooled Testing Subgroup
Benefits of Using Pooled Testing
Expected Number of Tests with Pooled Testing
Percentage of Tests Saved Using Pooled Testing

Without pooled testing, the number of tests that would otherwise need to be performed would be equal to the total number of individual samples.

For the default scenario in the tool, that would be 20 tests. With pooled testing, and assuming that 10% of the population would test positive, the required number of tests to identify all those with positive results would drop to 9, a 55% reduction. That was to be expected, seeing as the default scenario presented in the tool matched the example in the diagram.

But how might the results change if you increased the number of individual samples? What would happen if the expected test positivity rate was 5% instead of 10%? Being able to answer questions like those is why we built this tool!
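One way to explore those questions by hand is with the textbook two-stage pooling model usually credited to Robert Dorfman, sketched below in Python. This is a swapped-in illustration and not necessarily the math running behind the tool, so its expected values will not match the 9 tests from the diagram's example, which describes one particular outcome and not an average. The function names are hypothetical, and the sketch assumes that samples are independent, that the test is perfectly accurate, and that the samples divide evenly into subgroups.

```python
def expected_tests(total_samples, positive_rate, subgroup_size):
    """Expected test count for two-stage (Dorfman) pooling.

    Stage 1 tests each pooled subgroup once. Stage 2 retests every
    member of any subgroup whose pooled test came back positive.
    Assumes independent samples and a perfectly accurate test.
    """
    if subgroup_size == 1:
        # No pooling: one test per individual sample.
        return float(total_samples)
    # Chance that a subgroup holds at least one positive sample.
    chance_pool_positive = 1.0 - (1.0 - positive_rate) ** subgroup_size
    # Each sample carries 1/k of a pooled test, plus one individual
    # test whenever its subgroup turns out positive.
    per_sample = 1.0 / subgroup_size + chance_pool_positive
    return total_samples * per_sample


def best_subgroup_size(total_samples, positive_rate):
    """Subgroup size with the lowest expected number of tests."""
    sizes = range(1, total_samples + 1)
    return min(sizes, key=lambda k: expected_tests(total_samples, positive_rate, k))


if __name__ == "__main__":
    for rate in (0.10, 0.05):
        k = best_subgroup_size(20, rate)
        tests = expected_tests(20, rate, k)
        print(f"positivity {rate:.0%}: subgroup size {k}, "
              f"about {tests:.1f} expected tests for 20 samples")
```

Under this simplified model, lower positivity rates favor larger subgroups, which is the general pattern to look for when changing the inputs.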

For what it's worth, the Food and Drug Administration first approved pooled testing for a COVID-19 test developed by Quest Diagnostics that would group four individuals at a time back on 19 July 2020. On Monday, 28 September 2020, the FDA granted an emergency use authorization to Hologic for a new COVID-19 test that would increase the number of individuals in a pooled test group to five.

That's the size of pooled COVID-19 saliva testing now being conducted at the University of Tennessee. Starting with 574 individual samples, divided into 115 pools for testing (114 pools made from 5 individual samples and 1 pool made from the remainder), university researchers found 21 pooled samples with positive results, with individual tests to be conducted on 105 to identify students actually testing positive. Adding 115 tests to the 105 tests to be conducted, pooled testing will have reduced the amount of needed testing to find students testing positive by over 61% from what would have been needed if each student had to have their samples tested individually.
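The University of Tennessee arithmetic can be checked with a short Python sketch. The function name is hypothetical, and the sketch assumes every positive pool holds a full five samples, which is what the 105 follow-up tests quoted above imply.

```python
import math


def pooled_test_count(samples, pool_size, positive_pools):
    """Tally tests for one round of pooled testing with follow-ups.

    Assumes each positive pool contains a full pool_size samples.
    """
    pools = math.ceil(samples / pool_size)    # includes the short remainder pool
    follow_ups = positive_pools * pool_size   # individual retests
    total = pools + follow_ups
    saved = 1 - total / samples               # share of tests avoided
    return pools, follow_ups, total, saved


pools, follow_ups, total, saved = pooled_test_count(574, 5, 21)
print(pools, follow_ups, total, f"{saved:.1%}")  # 115 105 220 61.7%
```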

Pooled testing for COVID-19 is looking to be super beneficial indeed.


Horemheb-Rubio, Gibran & Ramos-Cervantes, Pilar & Arroyo Figueroa, Hugo & Ávila-Ríos, Santiago & García Morales, Claudia & Reyes-Teran, Gustavo & Escobedo, Galileo & Estrada, Gloria & García-Iglesias, Trinidad & Muñoz-Saucedo, Claudia & Kershenobich, David & Ostrosky-Wegman, Patricia & Ruiz-Palacios, Guillermo. (2017). High HPgV replication is associated with improved surrogate markers of HIV progression. PLoS ONE. 12. e0184494. DOI: 10.1371/journal.pone.0184494.

Summer J Decker et al, 3D Printed Alternative to the Standard Synthetic Flocked Nasopharyngeal Swabs Used for COVID-19 testing, Clinical Infectious Diseases (2020). DOI: 10.1093/cid/ciaa1366.

Image Credit: Photo by Mufid Majnun on Unsplash


September 28, 2020

The market for new homes in the U.S. is continuing its recent torrid pace. Based on the latest sales data reported by the U.S. Census Bureau, the preliminary nominal estimate of the market capitalization for new homes was $30.6 billion in August 2020.

Taking the trailing twelve month average of the market cap for new homes to factor out seasonality in the data while factoring in data revisions in previous months, we estimate August 2020's adjusted market cap to be $25.76 billion. In nominal terms, this is the highest this figure has been since August 2006, which can be seen in a chart showing the historical market cap data going back to January 1976.

Trailing Twelve Month Average New Home Sales Market Capitalization, January 1976 - August 2020

Perhaps more remarkably, the median sale price of new homes sold in the U.S. fell to an initial estimate of $312,700 in August 2020. The initial estimate of the average sale price of a new home sold in the U.S. in August 2020 is $369,000. Overall, new home sale price data have been slowly trending downward since peaking in January 2018, as shown in the following chart presenting median and average new home sale prices since January 2000.

Median and Average New Home Sale Prices, January 2000 - August 2020

With the trend for the sale prices of new homes generally flat to slightly falling over the last two years, the only way the market cap of new homes could increase is because of rising sales volumes. That fact may be confirmed in the next chart showing the trailing twelve month average of the annualized number of new home sales in the U.S. from January 1976 through August 2020.

The preliminary data for the months from April 2020 through August 2020 suggest the number of new home sales is rising at one of the fastest paces on record. Since the data for the last few months of this period may be subject to revision over the next several months, we won't be able to confirm if it's the fastest until later this year.


September 27, 2020

The S&P 500 (Index: SPX) has come to revolve as much around the miscellaneous pronouncements of various minions of the Federal Reserve as it does about their expectations for the fundamental future business prospects of the 500 largest publicly-traded U.S. companies.

The latest sign of how deeply dependent investors have become on those pronouncements came on Tuesday, 22 September 2020. Speaking to a virtual forum of Official Monetary and Financial Institutions, Chicago Fed President Charles Evans 'accidentally' set a new expectation that the Fed's future monetary policy would be less expansionary than it had previously communicated in announcing its new average inflation target policy.

For the dividend futures-based model we use to project the future potential levels of the S&P 500, that kind of change alters the model's amplification factor, which we think shifted from +1.0 to +1.5 as a direct consequence of Evans' statement. We've visually indicated that shift in the latest update of the alternative futures chart indicating the model's future projections.

That change also occurs as investors would seem to have shifted their forward-looking focus from 2020-Q4 toward the more distant quarter of 2021-Q1, which began last week. We think that shift can be best understood as the market starting to pay much closer attention to the 2020 election, whose outcome will have considerable impact on the future for the U.S. government's fiscal policies. We anticipate investors may switch their focus back and forth between 2020-Q4 and 2021-Q1 several times before the end of the 2020 calendar year.

We've described Evans' rate hike statement as 'accidental' since he attempted to walk it back on the next day, though the level of the S&P 500 indicates his effort, combined with the statements of other Fed officials, was unsuccessful.

Speaking of which, there was quite a lot of noise coming from the Fed's minions in the trading week ending on 25 September 2020, mostly calling for the U.S. government to step up its fiscal stimulus efforts. There was other stuff too, but that's what stood out to us in reviewing what we consider to be market-moving headlines from the week's newstream.

Monday, 21 September 2020
Tuesday, 22 September 2020
Wednesday, 23 September 2020
Thursday, 24 September 2020
Friday, 25 September 2020

Barry Ritholtz presents the positives and negatives he found in the past week's economics and markets news over at The Big Picture.

Finally, for those looking for a primer of how the outcome of an election can alter the future expectations of investors, be sure to review the history of 2012's Great Dividend Raid, which we had the pleasure of documenting in real time as it happened!

Update 1 October 2020: Following the close of trading on 30 September 2020, we've tweaked the alternative futures chart to refine the start of the next redzone forecast range, which we're now showing as beginning on 1 October 2020. Here's the updated chart:

Alternative Futures - S&P 500 - 2020Q3 - Standard Model (m=+1.5 from 22 September 2020) - Snapshot on 30 Sep 2020

On Monday, 28 September 2020, investors shifted their forward-looking focus from 2021-Q1 back toward 2020-Q4. The new redzone forecast range assumes investors will hold their attention on 2020-Q4 through 28 October 2020. We anticipate investors may continue to shift their investing time horizon between these two quarters in the weeks ahead. While a strong focus on 2020-Q4 would see the trajectory of the S&P 500 run within the forecast range, shifts toward 2021-Q1 would coincide with the trajectory of the S&P 500 running toward the low end of the forecast range, if not falling below it.


September 25, 2020
Like sands through the hourglass, so are the coins in our jars...

How much do you need to set aside each payday to save up for a big ticket item you will need to buy a few years from now?

Sure, you could do what a lot of people do and just pull out your credit card when it is time for you to buy that big ticket item, then spend the next several years paying for it and the interest your credit card company will charge you. But what if you would rather only pay once for what you know well in advance that you're going to be buying?

Better still, what if you set aside money every payday and earned a little bit of interest on it from that savings account at your bank? You wouldn't need to set aside quite so much, but all the money you would need would be ready when you're ready to pull the trigger on your planned purchase.

That's where our latest tool might be really helpful for you. Just enter the indicated data for your future purchase in the input fields below, and it will work out how much you will need to set aside from each of your paychecks until you have saved enough! [If you're reading this article on a site that republishes our RSS news feed, please click here to access a working version of the tool at our site.]

Big Ticket Item Price and Savings Information
Input Data Values
How much money are you looking to save?
What is the interest rate for your savings account?
Over how many years will you save before buying?
How often do you receive a paycheck?

Savings To Set Aside With Each Paycheck
Calculated Results Values
Amount to Save From Each Paycheck

For our default calculation, if you placed $126.69 out of every paycheck in your savings account and earned 0.8% interest on it, you would have $10,000 saved up after 3 years. If you change the interest rate to 0%, you'll find that the amount you need to save rises by $1.52 per paycheck. Put another way, earning that 0.8% interest reduces what you need to set aside by $1.52 per paycheck, which doesn't sound like much, but that's a $118.56 savings for you over three years.

If you can get a higher interest rate on your savings account, then the savings math may become more compelling. Alternatively, if you can find a way to get a discount on what you're looking to buy and are willing to adjust the timing of when you plan to buy, that will work in your favor too.

If you're wondering about the math behind the tool, it's the same math that big corporations use when they plan to set aside funds to pay dividends to their shareholding owners or to pay back money they've borrowed. They call these special purpose savings accounts "sinking funds", although we have yet to find a compelling explanation for why they're called that.
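For readers who want to see that math spelled out, here is a minimal Python sketch of the standard sinking fund payment formula. It is an illustration under stated assumptions and not the tool's actual code: the function name is hypothetical, deposits are assumed to land at the end of each pay period, interest is assumed to compound once per pay period, and the default of 26 paychecks per year is inferred from the default figures quoted above.

```python
def paycheck_savings(goal, annual_rate, years, paychecks_per_year=26):
    """Deposit needed each payday to reach a savings goal.

    Standard sinking fund formula: payment = goal * r / ((1 + r)**n - 1),
    where r is the interest rate per pay period and n is the number of
    pay periods. Assumes end-of-period deposits.
    """
    periods = years * paychecks_per_year
    rate = annual_rate / paychecks_per_year
    if rate == 0:
        # With no interest, the goal is simply split evenly.
        return goal / periods
    return goal * rate / ((1 + rate) ** periods - 1)


with_interest = paycheck_savings(10_000, 0.008, 3)
without_interest = paycheck_savings(10_000, 0.0, 3)
print(f"${with_interest:.2f} per paycheck at 0.8% interest")
print(f"${without_interest:.2f} per paycheck at 0% interest")
```

Changing the rate, the number of years, or the number of paychecks per year shows how sensitive the required deposit is to each input.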

But you have to admit, they're an excellent way to ensure you have the money you will need when it is time to buy that costly, not-so-everyday item. Not to mention being easier to manage than three years' worth of loose change tossed in a jar!

Image credit: Photo by Michael Longmire on Unsplash


September 23, 2020

How has the cumulative distribution of income in the United States changed over the last six years?

Since the U.S. Census Bureau published its income distribution data for American individuals, households, and families for the 2019 calendar year on 15 September 2020, we thought it would be interesting to show how the distributions for each of these subgroups of the nation's population have changed over time. To do that, we've put together several animated charts to show the evolution of the distribution of income in the U.S. from 2014 through 2019.

If you're reading this article on a site that republishes our RSS news feed, you may want to click through to the original version of this article at our site to see the animations in action.

Here is the animated chart showing the shifting distribution of total money income for individual Americans:

Animation: Cumulative Distribution of Total Money Income for U.S. Individuals, 2014-2019

Next, let's look at the animated chart for the total money income distribution of American households:

Animation: Cumulative Distribution of Total Money Income for U.S. Households, 2014-2019

Finally, here is the animated chart showing the ongoing development of total money income for American families:

Animation: Cumulative Distribution of Total Money Income for U.S. Families, 2014-2019

The U.S. Census Bureau distinguishes families from households by recognizing that families are made up of individuals who live together and are related to each other by birth, marriage, or adoption. Households may consist of people who are related or who are not related as a family.

Each of the animated charts shows the distribution of nominal cumulative total money income for American individuals, households, or families, which has not been adjusted for inflation. For each of these groupings, the animations show the distribution of income in the U.S. shifting toward the right, with Americans' incomes rising over time. The rate of increase has accelerated considerably in the most recent years, although we should note that the data for 2019 was collected in early March 2020, before any negative impacts from the Coronavirus Recession would be observed.

The animations also show the shifting distribution of income in the U.S. resulting in a falling percentage of Americans with the lowest incomes during these years, while the percentage of Americans at the upper end of the nation's income spectrum has been rising.

If you would like to keep track of the latest trends for median household income, we estimate that vital statistic each month. Our next update will come on 1 October 2020, when we'll present the latest updates through August 2020. Until then, our estimate of median household income for July 2020 is the latest entry in the series.


September 22, 2020

From time to time, we test drive new forecasting methods for stock prices to see how they perform.

Back in early November 2018, we presented a prediction for what would happen with the share price of General Electric (NYSE: GE) based on a relationship between its expected future dividends and its market cap. Let's quickly recap that old forecast:

Now that General Electric (NYSE: GE) has slashed its quarterly dividend by 91%, from $0.12 to $0.01 per share, which we estimate is about 50% more than what investors had already priced in to the stock, what can they expect next for the company's share price?

Based on the historic relationship that investors have set between the company's market capitalization and its aggregate forward year dividends since 12 June 2009, we would anticipate GE's share price falling to somewhere within a range of $3 to $7.

General Electric Market Capitalization versus Forward Year Aggregate Dividends at Dividend Declaration Dates from 12 June 2009 through 30 October 2018

Almost two years later, we can see how well the forecasting method we were testing worked, updating that original chart to show how history played out. The new chart catches the data up through 3 September 2020 to coincide with the date of GE's most recent dividend declaration:

General Electric Market Capitalization versus Forward Year Aggregate Dividends at Dividend Declaration Dates from 12 June 2009 through 3 September 2020

It didn't take long for GE to prove that prediction right, with its market cap dropping within the top end of our target range in a month's time.

But that outcome didn't last very long. Soon, GE's stock price rose above the range we had projected using nine years' worth of historical data, staying well above it for a prolonged period of time. It took the onset of today's Coronavirus Recession to drop GE's market cap back within that target range. Even so, GE's market cap hasn't fallen below the midpoint of that range during any of that time, which means the method we used set the target too low.

Being able to connect a company's expected forward year aggregate dividends to its market capitalization to forecast its stock price could be a valuable way of determining whether its current share price presents a buying or a selling opportunity. Doing a better job in setting the target would better indicate which kind of investing opportunity might exist at any given time.

We have an idea for how to improve our result, which we'll explore more in upcoming weeks. To test it out though, we'd like to look at some other stock than GE, since we don't expect the company will be changing its dividend payouts anytime soon. If you have a candidate for us to consider, please drop us a line!


Dividend.com. General Electric Dividend Payout History. [Online Database]. Accessed 22 September 2020.

Ycharts. General Electric Market Cap. [Online Database]. Accessed 22 September 2020.

Yahoo! Finance. General Electric Company Historical Prices. [Online Database]. Accessed 22 September 2020.


September 21, 2020

What effect does going back to school have on the spread of COVID-19 coronavirus infections?

The possible answers to that question have greatly concerned many parents and policymakers around the world. To find out the possible effect, we've turned to data from the state of Arizona where a combination of demographic data from the state's Department of Health Services and its three major public universities provides a window into seeing what that effect might be.

Arizona's universities started their Fall 2020 sessions by delivering course content online in August, but began providing either hybrid or traditional classroom instruction at their campuses in late-August or early September. Since we're mainly interested in how returning to the classroom might affect the spread of COVID-19 among the student age population, we looked at the total number of positive coronavirus tests reported for the Age 0 to 44 population across the state and just by Arizona State University (ASU), the University of Arizona (UA), and Northern Arizona University (NAU) at two points in time: 3 September 2020 and 18 September 2020.

We've presented the results of that exercise in the following chart showing the change in the number of coronavirus cases between these two points in time, where we find a mixed picture. Please click here to access a full-size version of the chart. [Update 26 September 2020: We identified an error in our original presentation that undercounted the number of cases in the Age 0-19 portion of the population. The original chart we presented is here; the following chart has been corrected. Corrections in our analysis below are indicated with boldface font.]

Corrected - Back to School In Arizona: Change in Number of Reported COVID-19 Positive Test Results for Age 0-44 Age Group Between 3 September 2020 and 18 September 2020

Between 3 September 2020 and 18 September 2020, the total number of positive test results reported by Arizona's Department of Health Services increased by 6,568. Of this figure, 37% of the reported increase in cases originated at Arizona's three major public universities. The other 63% represents the increase in cases reported among the rest of the state's Age 0 to 44 population.

Here is the breakdown for the three public universities:

  • Arizona State University: 627 cases (10%)
  • University of Arizona: 1,554 cases (24%)
  • Northern Arizona University: 273 cases (4%)
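The percentages above can be checked with a few lines of Python. The variable names are ours, and the only inputs are the case counts and the 6,568 increase already quoted in this analysis.

```python
# Increase in positive tests between 3 and 18 September 2020, as quoted above.
statewide_increase = 6568
new_cases = {
    "Arizona State University": 627,
    "University of Arizona": 1554,
    "Northern Arizona University": 273,
}

# Each university's share of the reported increase.
for campus, cases in new_cases.items():
    print(f"{campus}: {cases:,} cases ({cases / statewide_increase:.0%})")

# Combined share for the three universities, and the remainder.
university_total = sum(new_cases.values())
university_share = university_total / statewide_increase
print(f"All three universities: {university_total:,} cases ({university_share:.0%})")
print(f"Everyone else: {1 - university_share:.0%}")
```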

The University of Arizona's high case count stands out because it is utilizing rapid antigen tests, which differ from the established COVID-19 tests performed at the other universities and at testing sites across the state. After long excluding results from these tests in its daily reported case counts, Arizona's DHS began including results from antigen tests in its statewide tallies on 17 September 2020. UA's antigen test results have had issues with false positives.

The reported data is limited because it is silent about where an individual with a positive COVID-19 test result may have been exposed to the SARS-CoV-2 coronavirus. For example, a student's exposure to the viral infection may have taken place in a classroom, a campus facility, a dorm, or even off campus. We should also note that the positive COVID-19 results for university students, faculty and staff members may also include individuals older than Age 44.

That's why breaking out data for a university's faculty and staff may be valuable, since the incidence of cases would come primarily through contact in classroom and on-campus facilities. Here, data from ASU indicates that students account for 99% of the reported cases, while faculty and staff account for just 1% of the new cases reported in our period of interest. Data going back to 1 August 2020 indicates 2% of all ASU's reported cases have been among faculty and staff members.

All these institutions are operating with classes set up to minimize the potential spread of coronavirus infections. The exceptionally low number of new positive test results among faculty and staff suggests those approaches are effective at protecting the health of older individuals who are much more at risk of health complications from COVID-19 than the student-age population, who make up the vast majority of infections on campus.

For all the testing the universities are doing, two of the three are reporting comparatively low test positivity rates, consistent with levels indicating the spread of coronavirus infections is manageable. Both ASU and NAU report their test positivity rates are below a 5% threshold.

By contrast, UA reports a 15.5% rate from its antigen tests, prompting the university to tell students on 14 September 2020 to "shelter-in-place" for two weeks. The action is expected to bring the spread of infections back down to manageable levels.

Meanwhile, falling rates of COVID-19 hospitalizations are continuing to be reported for the state. The Age 0-44 demographic is also the least likely to experience health complications from the viral infection requiring admission to hospitals, which may account for why these numbers have not been rising.

Perhaps the most significant factor behind the pattern we see in the incidence of COVID-19 infections at Arizona's major college campuses is whether or not each campus's local community has already had significant numbers of cases. Arizona became a hotbed for infections during June 2020 and peaked in July 2020, with over 60% of its reported cases concentrated in the Phoenix metropolitan area (where ASU is located). Meanwhile, the Tucson metro area (the home of UA) had a moderate number of cases and greater Flagstaff (home to NAU) has had relatively low numbers.

As we've previously observed, COVID is very much a geographic phenomenon, tending to spread most where it hasn't previously been in great numbers, where local herd immunity hasn't developed. We suspect that dynamic lies behind the high number of cases at UA in Tucson now being recorded, and we fear NAU in Flagstaff may have a surge in cases in its future.

The patterns we've described for Arizona would seem to have direct bearing on the "going back to school" season for college students in other states. Dave Tufte describes what he's observing with a marked surge in COVID-19 cases now taking place in Utah. Looking over the state's data, we think the sharp increase in number of new coronavirus infections in the state may be tied to an initial exposure event coinciding with the late start of classes at Brigham Young University, which was then amplified and spread to students' home towns during the Labor Day holiday weekend a week later. Like Arizona's UA outbreak, it seems to be spreading in areas that haven't previously seen high levels of infections.

That still leaves us with one big question needing more information to be answered. Do high numbers of cases among college students take place in places that have already experienced high epidemic numbers? Arizona's data hints the answer is no, but the sample size of universities running in-person classes across the country is still pretty small.


We've used contemporary news reports to compile the COVID-19 data for the universities, and Arizona's DHS COVID-data dashboard for the state's overall figures for the Age 0-44 population.

Arizona Department of Health Services. COVID-19 Data Dashboard. [Online Application/Database]. Accessed 19 September 2020.

Eltohamy, Farah. University of Arizona reports new daily high for positive COVID-19 tests. AZCentral. [Online Article]. 3 September 2020.

Hansen, Piper J and Myscow, Wyatt. There are 983 positive coronavirus cases within the ASU community. State Press. [Online Article]. 3 September 2020.

Ackley, Madeline. ASU has 112 new COVID-19 cases in past 3 days, while UA has 678. AZCentral. [Online Article]. 18 September 2020.

Northern Arizona University. Coronavirus updates and resources. [Online Article]. Accessed 18 September 2020.


September 20, 2020

The dividend futures-based model we use to project the future for the S&P 500 (Index: SPX) presents some unique challenges from time to time.

In 2020, one of those challenges has been coping with changes in the model's amplification factor (m) which, after more than a decade of holding a virtually constant value, suddenly became a variable. Add to that a bubble in stock prices inflated by a Japanese investment bank, and we've had our hands full in keeping up with the changes that have driven stock prices.

The unwinding of the one-sided trades launched by the Japanese investment bank's "NASDAQ whale", combined with statements by Federal Reserve officials on Wednesday and Thursday in the past week, however, provided us with an opportunity to calibrate the model and empirically determine the amplification factor. Assuming investors are continuing to focus on 2020-Q4 in setting current day stock prices, it seems to have settled at a positive value of 1.0.

That's less than the value of 1.5 that held in the period prior to the NASDAQ whale's influence, where the reduction from this level is consistent with the Fed adopting a more expansionary monetary policy. Since nobody outside of Japan's SoftBank had visibility on its role in the summer stock price rally, we had previously attributed the runup in the S&P 500 to investors responding to the Fed's signaling its increasing willingness to adopt a more 'dovish' policy. Now that the NASDAQ whale is out of the picture, so to speak, we can better quantify the contribution of the Fed's signaled policy change to the summer rally, where it would appear to account for 25% of the change in the amplification factor.

This past week is when that signal was set more definitively, although as you'll see in the headlines we plucked from the week's major market-moving newstream, the Fed is still really shaky on what that new policy means.

Monday, 14 September 2020
Tuesday, 15 September 2020
Wednesday, 16 September 2020
Thursday, 17 September 2020
Friday, 18 September 2020

Meanwhile, Barry Ritholtz succinctly summarized each of the positives and negatives he found in the past week's economics and markets news.


September 18, 2020

More or Less presenter Tim Harford talks about deliberately misleading statistical analysis in the following Numberphile video. If you have nine minutes, you'll find out why it's essential to maintain a healthy skepticism of both claims and counterclaims based on statistical analysis.

If you're anywhere but the U.S. or Canada today, Tim's newest book, How to Make the World Add Up, is now available for sale. If you're in the U.S. or Canada, you'll have to wait until February 2021 to get a copy that will carry a different title: The Data Detective: Ten Easy Rules to Make Sense of Statistics, which can be pre-ordered at Amazon today.


September 17, 2020
Benford's Law Leading Digit Distribution

Can you trust the numbers the U.S. government reports daily for the number of confirmed COVID-19 cases? Can you trust China's or Italy's figures? How about the case counts reported by Russia or other nations?

2020 has been a bad year for many people around the world, mainly because of the coronavirus pandemic and many governments' responses to it, which have made COVID-19 almost as much a political condition as a viral infection. Among the factors that make it a political condition are the apparent motives of political leaders to justify their policies in responding to the pandemic, which raise questions of whether they are honestly reporting the number of cases their nations are experiencing.

Telling whether they are or not is where Benford's Law might be used. Benford's Law describes the frequency with which each leading digit appears in sets of data where exponential growth is observed, as shown in the chart above. The pattern Benford's Law predicts for data that grows exponentially over time is strong enough that significant deviations from it can be taken as evidence that non-natural forces, such as fraud or manipulation for political purposes, are at play.
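For readers who want to see the pattern for themselves, here is a minimal Python sketch. It uses the standard statement of Benford's Law, in which the leading digit d appears with probability log10(1 + 1/d), and compares that against the leading digits of a made-up series that grows 7% per step. The function names and the synthetic series are our own illustrative assumptions and are not taken from either of the studies discussed below.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(value):
    """First significant digit of a positive number, read from scientific notation."""
    return int(f"{value:.15e}"[0])

def leading_digit_shares(values):
    """Observed share of each leading digit 1-9 among the positive values."""
    counts = Counter(leading_digit(v) for v in values if v > 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Hypothetical data: a quantity that starts at 1 and grows 7% per step.
synthetic = [1.07 ** n for n in range(1000)]

expected = benford_expected()
observed = leading_digit_shares(synthetic)

for d in range(1, 10):
    print(f"digit {d}: Benford {expected[d]:.3f}  observed {observed[d]:.3f}")
```

Under the formula, a leading 1 should turn up about 30% of the time and a leading 9 less than 5% of the time, and a steadily compounding series like this one should land close to those shares.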

Economists Christoffer Koch and Ken Okamura wondered if the data being reported by China, Italy and the United States for their respective numbers of reported cases was trustworthy and turned to Benford's Law to find out. We won't keep you in suspense: they found that the growth of each nation's daily COVID-19 case counts prior to their imposing 'lockdown' restrictions was consistent with the expectations of Benford's Law, leading them to reject the potential for the data having been manipulated to benefit the interests of their political leaders. Here's the chart illustrating their findings from their recently published report:

Koch, Okamura: Figure 2. First Digit Distribution Pre-Lockdown number of confirmed cases in Chinese Provinces, U.S. States and Italian Regions

But that's only three countries. Are there any nations whose leaders have significantly manipulated their data?

A preprint study by Anran Wei and Andre Eccel Vellwock also found no evidence of manipulation in COVID-19 case data by China, Italy and the U.S., and extends the list of countries with trustworthy data to include Brazil, India, Peru, South Africa, Colombia, Mexico, Spain, Argentina, Chile, France, Saudi Arabia, and the United Kingdom. However, when they evaluated COVID-19 case data for Russia, they found cause for concern:

Results suggest high possibility of data manipulations for Russia's data. Figure 1e illustrates the lack of Benfordness for the total confirmed cases. The pattern resembles a random distribution: if we calculate the RMSE related to a constant probability of 1/9 for all first digits, it shows that the RMSE is 20.5%, a value lower than the one related to the Benford distribution (49.2%).
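The comparison described in that passage can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the excerpt doesn't spell out how the authors scale their RMSE, so the plain root-mean-square error used here won't reproduce their 20.5% and 49.2% figures, and the "suspicious" digit shares below are hypothetical values made up for the example.

```python
import math

# Reference distributions for the leading digits 1 through 9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]
UNIFORM = [1 / 9] * 9

def rmse(observed, reference):
    """Plain root-mean-square error between two lists of digit shares."""
    squared = [(o - r) ** 2 for o, r in zip(observed, reference)]
    return math.sqrt(sum(squared) / len(squared))

def closer_reference(observed):
    """Name whichever reference distribution has the lower RMSE."""
    to_benford = rmse(observed, BENFORD)
    to_uniform = rmse(observed, UNIFORM)
    return "benford" if to_benford <= to_uniform else "uniform"

# Hypothetical leading-digit shares that look nearly flat (they sum to 1).
suspicious = [0.13, 0.12, 0.11, 0.11, 0.11, 0.11, 0.10, 0.11, 0.10]

print("RMSE vs Benford:", round(rmse(suspicious, BENFORD), 4))
print("RMSE vs uniform:", round(rmse(suspicious, UNIFORM), 4))
print("Closer to:", closer_reference(suspicious))
```

A set of digit shares that sits closer to the flat 1/9 line than to the Benford curve is the kind of result the authors flag for Russia's total confirmed cases.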

Wei and Vellwock also found issues with Russia's COVID-19 data for daily reported cases and deaths. Here is their chart summarizing the results for total confirmed COVID-19 cases for each of the nations whose data they reviewed:

Wei and Vellwock. Figure 1. Total confirmed cases for (a) the whole world and (b-q) selected countries. The black curve refers to Benford's Law probability.

They also found issues with Iran's daily confirmed cases and deaths, but not enough to verify the nation's figures have been manipulated.



Koch, Christoffer and Okamura, Ken. Benford's Law and COVID-19 Reporting. Economics Letters. Volume 196, November 2020, 109573. DOI: 10.1016/j.econlet.2020.109573.

Wei, Anran and Vellwock, Andre Eccel. Is the COVID-19 data reliable? A statistical analysis with Benford's Law. [Preprint PDF Document]. September 2020. DOI: 10.13140/RG.2.2.31321.75365.


About Political Calculations

Welcome to the blogosphere's toolchest! Here, unlike other blogs dedicated to analyzing current events, we create easy-to-use, simple tools to do the math related to them so you can get in on the action too! If you would like to learn more about these tools, or if you would like to contribute ideas to develop for this blog, please e-mail us at:

ironman at politicalcalculations.com

Thanks in advance!


The tools on this site are built using JavaScript. If you would like to learn more, one of the best free resources on the web is available at W3Schools.com.


Legal Disclaimer

Materials on this website are published by Political Calculations to provide visitors with free information and insights regarding the incentives created by the laws and policies described. However, this website is not designed for the purpose of providing legal, medical or financial advice to individuals. Visitors should not rely upon information on this website as a substitute for personal legal, medical or financial advice. While we make every effort to provide accurate website information, laws can change and inaccuracies happen despite our best efforts. If you have an individual problem, you should seek advice from a licensed professional in your state, i.e., by a competent authority with specialized knowledge who can apply it to the particular circumstances of your case.