Authors
Felienne Hermans, Efthimia Aivaloglou, & Bas Jansen
Abstract
Spreadsheets are used heavily in many business domains around the world. They are easy to use and as such enable end-user programmers to build and maintain all sorts of reports and analyses. In addition to using spreadsheets for modeling and calculation, spreadsheets are often also used for creating reports and dashboards: combining data from different sources and creating overviews.
For this, lookup functions can be used: they search for a value in a range and return a corresponding row or column. Lookup functions are common: according to recent research the VLOOKUP
is the fifth most common Excel function.
In this paper we investigate the use of lookup functions in more detail. We analyze lookup functions within the newly released Enron spreadsheet corpus. The results show that:
- A minority of 43% of lookup formulas use the default setting where an approximate match may be returned.
- 77% of approximate matches are used unnecessary.
- 23% of approximate lookups is problematic: they search over unsorted ranges, while this is specifically advised against in the specification, and might lead to wrong results.
Sample
This VLOOKUP
function is approximately searching in an unsorted range, resulting in erroneous values.
The user has sorted the cells in F2:G6
on the names of the majors, rather than on the majors's code, hence the search range in F2:F6
is not sorted. The VLOOKUP
function in column D
returns a wrong major for code 4, as highlighted in red in cell D7
.
When this occurs, no error message or warning is given.
Publication
2015, Delft University of Technology, Report TUD-SERG-2015-011