Spreadsheet bibliography

Title What we don't know about spreadsheet errors today: The facts, why we don't believe them, and what we need to do
Authors Raymond R. Panko
Year 2015
Type Proceedings
Publication EuSpRIG
Series July

Research on spreadsheet errors is substantial, compelling, and unanimous. It has three simple conclusions:

  • The first is that spreadsheet errors are rare on a per-cell basis, but in large programs, at least one incorrect bottom-line value is very likely to be present.
  • The second is that errors are extremely difficult to detect and correct.
  • The third is that spreadsheet developers and corporations are highly overconfident in the accuracy of their spreadsheets.

The disconnect between the first two conclusions and the third appears to be due to the way human cognition works. Most importantly, we are aware of very few of the errors we make. In addition, while we are proudly aware of errors that we fix, we have no idea of how many remain, but like Little Jack Horner we are impressed with our ability to ferret out errors.

This paper reviews human cognition processes and shows first that humans cannot be error free no matter how hard they try, and second that our intuition about errors and how we can reduce them is based on appallingly bad knowledge. This paper argues that we should reject any prescription for reducing errors that has not been rigorously proven safe and effective.

The paper also argues that our biggest need, based on empirical data, is to do massively more testing than we do now. It suggests that the code inspection methodology developed in software development is likely to apply very well to spreadsheet inspection.

Full version Available
Cell error rates and probabilities of a bottom-line error
Spreadsheet configuration metamodel
The probability of a bottom-line error rises rapidly with the number of calculations that depend on precedent cells. A spreadsheet with 100 cascade cells and a 3% cell error rate has about a 95% probability of containing at least one error.
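
The 95% figure can be checked with a short calculation. The sketch below assumes that errors occur independently in each cascade cell, so the chance of at least one error among n cells with per-cell error rate e is 1 - (1 - e)^n. The independence assumption and the function name bottom_line_error_probability are illustrative choices made here, not details taken from the entry above.

```python
def bottom_line_error_probability(cell_error_rate: float, cascade_cells: int) -> float:
    """Probability that at least one of `cascade_cells` cells contains an error.

    Assumes (for illustration) that each cell is wrong independently with
    probability `cell_error_rate`.
    """
    if not 0.0 <= cell_error_rate <= 1.0:
        raise ValueError("cell_error_rate must be between 0 and 1")
    if cascade_cells < 0:
        raise ValueError("cascade_cells must be non-negative")
    # Chance that every cell is correct, then take the complement.
    all_correct = (1.0 - cell_error_rate) ** cascade_cells
    return 1.0 - all_correct


if __name__ == "__main__":
    # Show how quickly the risk grows with the length of the cascade.
    for n in (10, 50, 100, 200):
        p = bottom_line_error_probability(0.03, n)
        print(f"{n:>4} cascade cells at 3% per cell: {p:.1%}")
```

With 100 cascade cells and a 3% cell error rate the function returns roughly 0.952, which matches the caption. Under the same assumption, even a 1% cell error rate leaves a 100-cell cascade more likely than not to contain an error.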