Spreadsheet bibliography

Title: Measuring spreadsheet formula understandability
Authors: Felienne Hermans, Martin Pinzger, & Arie van Deursen
Year: 2012
Type: Proceedings
Publication: EuSpRIG
Series:
Abstract

Spreadsheets are widely used in industry because they are flexible and easy to use. Sometimes they are even used for business-critical applications. It is, however, difficult for spreadsheet users to correctly assess the quality of spreadsheets, especially with respect to their understandability. Understandability of spreadsheets is important, since spreadsheets often have a long lifespan, during which they are used by several users.

In this paper, we establish a set of spreadsheet understandability metrics. We start by studying related work and interviewing 40 spreadsheet professionals to obtain a set of characteristics that might contribute to understandability problems in spreadsheets. Based on those characteristics, we subsequently determine a number of understandability metrics.

To evaluate the usefulness of our metrics, we conducted a series of experiments in which professional spreadsheet users performed a number of small maintenance tasks on a set of spreadsheets from the EUSES spreadsheet corpus. We subsequently calculated the correlation between the metrics and the performance of subjects on these tasks.

The results clearly indicate that the number of ranges, the nesting depth and the presence of conditional operations in formulas significantly increase the difficulty of understanding a spreadsheet.

Full version: Available
Sample
Formula understandability metrics

We compiled a set of metrics for assessing the understandability of a formula.

Formula complexity:

  • The number of direct references.
  • The number of ranges in which the references are grouped.
  • The presence of conditional operations in a formula.
  • The nesting depth of formulas.
  • The length of the calculation chain.
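
To make the complexity metrics above more concrete, here is a minimal Python sketch that computes four of them from a single formula string. It is an illustration under stated assumptions, not the authors' implementation: the function name `complexity_metrics`, the list of functions treated as conditional, the choice to count each single-cell reference and each range as one reference, and the use of parenthesis depth as a stand-in for nesting depth are all hypothetical. The length of the calculation chain is left out because it depends on the whole workbook rather than on one formula.

```python
import re

# Assumption: only plain A1-style references are recognised (no sheet names,
# no named ranges). The lookarounds keep names such as LOG10( from matching.
CELL = r"(?<![A-Za-z0-9_])\$?[A-Za-z]{1,3}\$?\d+(?![A-Za-z0-9_(])"
RANGE_RE = re.compile(rf"{CELL}:{CELL}")
CELL_RE = re.compile(CELL)
FUNCTION_RE = re.compile(r"([A-Za-z][A-Za-z0-9.]*)\s*\(")

# Assumption: which functions count as "conditional operations" is our choice.
CONDITIONALS = {"IF", "IFERROR", "SUMIF", "COUNTIF", "CHOOSE"}


def complexity_metrics(formula: str) -> dict:
    """Return rough complexity measures for one formula string."""
    ranges = RANGE_RE.findall(formula)
    # Remove ranges first so their end points are not counted a second time.
    single_cells = CELL_RE.findall(RANGE_RE.sub(" ", formula))
    functions = [name.upper() for name in FUNCTION_RE.findall(formula)]

    # Parenthesis depth as a proxy for nesting depth. Parentheses inside
    # string literals are not treated specially in this sketch.
    depth = max_depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1

    return {
        "references": len(single_cells) + len(ranges),
        "ranges": len(ranges),
        "has_conditional": any(name in CONDITIONALS for name in functions),
        "nesting_depth": max_depth,
    }


metrics = complexity_metrics("=IF(SUM(A1:A10)>B2,MAX(C1:C5,D7),0)")
print(metrics)
```

For the example formula, the sketch finds two ranges (`A1:A10`, `C1:C5`) and two single-cell references (`B2`, `D7`), flags the `IF` as a conditional, and reports a parenthesis depth of 2.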

Formula placement:

  • Percentage of reverse references.
  • Percentage of references in the same row.
  • Percentage of references in the same column.
  • Percentage of distant references.
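
The placement metrics above can be sketched in the same spirit. The sample does not define "reverse" or "distant", so the definitions below are assumptions made only for illustration: a reference is treated as reverse when it lies after the formula cell in reading order (below it, or to its right in the same row), and as distant when it is more than `distant_threshold` rows or columns away. The names `Cell`, `placement_metrics`, and `distant_threshold` are hypothetical.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    row: int
    col: int


def placement_metrics(formula_cell: Cell, refs: list[Cell],
                      distant_threshold: int = 10) -> dict:
    """Return the percentage of references in each placement category."""
    if not refs:
        return {"reverse": 0.0, "same_row": 0.0,
                "same_column": 0.0, "distant": 0.0}

    def pct(count: int) -> float:
        return 100.0 * count / len(refs)

    same_row = sum(r.row == formula_cell.row for r in refs)
    same_col = sum(r.col == formula_cell.col for r in refs)
    # Assumed definition: the reference comes after the formula in reading order.
    reverse = sum(
        r.row > formula_cell.row
        or (r.row == formula_cell.row and r.col > formula_cell.col)
        for r in refs
    )
    # Assumed definition: more than `distant_threshold` rows or columns away.
    distant = sum(
        max(abs(r.row - formula_cell.row),
            abs(r.col - formula_cell.col)) > distant_threshold
        for r in refs
    )
    return {"reverse": pct(reverse), "same_row": pct(same_row),
            "same_column": pct(same_col), "distant": pct(distant)}


formula_cell = Cell(row=5, col=3)
refs = [Cell(5, 1), Cell(2, 3), Cell(8, 3), Cell(30, 20)]
print(placement_metrics(formula_cell, refs))
```

The categories are not mutually exclusive, so the percentages need not sum to 100: in the example, `Cell(8, 3)` is both in the same column as the formula and a reverse reference.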