i-nth logo

Authors

Chris Chambers & Martin Erwig

Abstract

We present a reasoning system for inferring dimension information in spreadsheets. This system can be used to check the consistency of spreadsheet formulas and thus is able to detect errors in spreadsheets.

Our approach is based on three static analysis components. First,the spatial structure of the spreadsheet is analyzed to infer a labeling relationship among cells. Second, cells that are used as labels are lexically analyzed and mapped to potential dimensions. Finally, dimension information is propagated through spreadsheet formulas.

An important aspect of the rule system defining dimension inference is that it works bi-directionally, that is, not only "downstream" from referenced arguments to the current cell, but also "upstream" in the reverse direction. This flexibility makes the system robust and turns out to be particularly useful in cases when the initial dimension information that can be inferred from headers is incomplete or ambiguous.

We have implemented a prototype system as an add-in to Excel. In an evaluation of this implementation we were able to detect dimension errors in almost 50% of the investigated spreadsheets, which shows (i) that the system works reliably in practice and (ii) that dimension information can be well exploited to uncover errors in spreadsheets.

Sample

Dimension inference results
Dimension inference results

This example shows the results of dimension inference on a spreadsheet. Inconsistent use of dimensions, such as miles and gallons, in formulae is highlighted.

Publication

2009, Journal of Visual Languages and Computing, Volume 20, Issue 4, August, pages 269-283

Full article

Automatic detection of dimension errors in spreadsheets