On the empirical evaluation of similarity coefficients for spreadsheets fault localization

Authors

Birgit Hofer, Alexandre Perez, Rui Abreu, & Franz Wotawa

Abstract

Spreadsheets are by far the most prominent example of end-user programs of ample size and substantial structural complexity. They are usually not thoroughly tested, so they often contain faults. Debugging spreadsheets is hard because of their size and because their structure is usually not directly visible to the user, i.e., the functions are hidden and only the computed values are presented.

One way to locate faulty cells in spreadsheets is to adapt software debugging approaches developed for traditional procedural or object-oriented programming languages. One such approach is spectrum-based fault localization (SFL).

In this paper, we study the impact of different similarity coefficients on the accuracy of SFL applied to the spreadsheet domain. Our empirical evaluation shows that three of the 42 studied coefficients (Ochiai, Jaccard and Sorensen-Dice) require less effort by the user while inspecting the diagnostic report, and can also be used interchangeably without a loss of accuracy. In addition, we illustrate the influence of the number of correct and incorrect output cells on the diagnostic report.
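To make the idea of a similarity coefficient concrete, the following minimal sketch computes the three named coefficients using their standard textbook definitions and ranks cells by suspiciousness. It is an illustration only, not the paper's implementation: the function `rank_cells`, the `involvement` mapping (each output cell mapped to the cells it depends on), and the example cell names are all assumptions introduced here. The counts are `n11` (cell involved in an incorrect output), `n10` (involved in a correct output), and `n01` (not involved in an incorrect output).

```python
from math import sqrt


def ochiai(n11, n10, n01):
    # Standard Ochiai definition: n11 / sqrt((n11 + n01) * (n11 + n10))
    denom = sqrt((n11 + n01) * (n11 + n10))
    return n11 / denom if denom else 0.0


def jaccard(n11, n10, n01):
    # Standard Jaccard definition: n11 / (n11 + n01 + n10)
    denom = n11 + n01 + n10
    return n11 / denom if denom else 0.0


def sorensen_dice(n11, n10, n01):
    # Standard Sorensen-Dice definition: 2*n11 / (2*n11 + n01 + n10)
    denom = 2 * n11 + n01 + n10
    return 2 * n11 / denom if denom else 0.0


def rank_cells(involvement, incorrect_outputs, coefficient):
    """Rank cells from most to least suspicious (hypothetical helper).

    involvement: dict mapping each output cell to the set of cells it
                 depends on (an assumed representation of the spectrum).
    incorrect_outputs: set of output cells the user marked as wrong.
    coefficient: one of the similarity functions defined above.
    """
    cells = set().union(*involvement.values())
    scores = {}
    for cell in cells:
        n11 = sum(1 for out, deps in involvement.items()
                  if cell in deps and out in incorrect_outputs)
        n10 = sum(1 for out, deps in involvement.items()
                  if cell in deps and out not in incorrect_outputs)
        n01 = sum(1 for out, deps in involvement.items()
                  if cell not in deps and out in incorrect_outputs)
        scores[cell] = coefficient(n11, n10, n01)
    # Highest score first; ties broken by cell name for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Made-up example: three output cells, two of them marked incorrect.
involvement = {
    "D1": {"A1", "B1"},
    "D2": {"B1", "C1"},
    "D3": {"A1", "C1"},
}
incorrect = {"D1", "D2"}

for coeff in (ochiai, jaccard, sorensen_dice):
    print(coeff.__name__, rank_cells(involvement, incorrect, coeff))
```

In this toy example, B1 feeds both incorrect outputs and no correct one, so all three coefficients place it first. This small case only illustrates how such rankings are computed and says nothing about the empirical results reported in the paper.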

Sample

Influence of the number of correct and incorrect output cells on the ranking result for the EUSES spreadsheets.

Publication

2015, Automated Software Engineering, Volume 22, Issue 1, March, pages 47-74

Full article

On the empirical evaluation of similarity coefficients for spreadsheets fault localization