Authors

Wensheng Dou, Chang Xu, Shing-Chi Cheung, & Jun Wei

Abstract

Spreadsheets are widely used by end users for numerical computation in business settings. Cells whose computation follows the same semantics are often grouped in a row or column as a cell array.

When a spreadsheet evolves, the cells in a cell array can degenerate due to ad hoc modifications. The cells of a degenerated cell array no longer prescribe the same computational semantics, and the array is said to exhibit an ambiguous computation smell.

We propose CACheck, a novel technique that automatically detects and repairs smelly cell arrays by recovering their intended computational semantics. Our empirical study on the EUSES and Enron corpora finds that such smelly cell arrays are common. Our study also suggests that CACheck is useful for detecting and repairing real spreadsheet problems caused by smelly cell arrays.

Compared with our previous work, AmCheck, CACheck detects smelly cell arrays with higher precision and recall.
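
To make the notion of an ambiguous computation smell concrete, the following is a minimal sketch and not the CACheck algorithm itself. It assumes a cell array is given as a mapping from (column, row) positions to cell contents, rewrites each formula into offsets relative to its own cell, and reports cells whose pattern differs from the majority. The names `relative_form` and `find_deviating_cells` are hypothetical, and the reference regex is deliberately simple (it ignores absolute references, sheet names, and function names that end in digits).

```python
import re
from collections import Counter

# Matches plain A1-style references such as B2 or AA10 (a simplifying assumption).
CELL_REF = re.compile(r"\b([A-Z]+)([0-9]+)\b")


def col_to_num(col):
    """Convert a column label (A, B, ..., Z, AA, ...) to a 1-based index."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def relative_form(formula, host_col, host_row):
    """Rewrite each A1 reference as a row/column offset from the host cell."""
    def repl(match):
        dc = col_to_num(match.group(1)) - host_col
        dr = int(match.group(2)) - host_row
        return f"R[{dr}]C[{dc}]"
    return CELL_REF.sub(repl, formula)


def find_deviating_cells(cell_array):
    """Return (majority_pattern, positions that do not follow it).

    cell_array maps (col, row) -> cell content string. Plain values
    (no leading '=') have no formula pattern and always count as deviating.
    """
    patterns = {}
    for (col, row), content in cell_array.items():
        if content.startswith("="):
            patterns[(col, row)] = relative_form(content, col, row)
        else:
            patterns[(col, row)] = None
    counts = Counter(p for p in patterns.values() if p is not None)
    if not counts:
        return None, []
    majority, _ = counts.most_common(1)[0]
    deviating = sorted(pos for pos, p in patterns.items() if p != majority)
    return majority, deviating


# Hypothetical column D (index 4), rows 2 to 5: "amount = price * quantity".
example = {
    (4, 2): "=B2*C2",
    (4, 3): "=B3*C3",
    (4, 4): "120",         # formula overwritten by a hard-coded value
    (4, 5): "=B5*C5+10",   # ad hoc modification
}
majority, deviating = find_deviating_cells(example)
```

In this made-up example, rows 4 and 5 are the cells that deviate from the shared pattern. A simple majority vote is only an illustration; it cannot decide which semantics was intended when no pattern dominates.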

Sample

CACheck example

CACheck marks its detection results with three annotations:

  • Cell arrays that suffer from ambiguous computation smells are colored in yellow.
  • Spreadsheet comments are added to smelly cells to suggest repairs.
  • Conformance errors are colored in red with comments explaining their reasons.

These annotations help end users quickly validate the reported problems.

Publication

2016, IEEE Transactions on Software Engineering, Issue 99, June

Full article

CACheck: Detecting and repairing cell arrays in spreadsheets