|Title||Enron's spreadsheets and related emails: A dataset and analysis|
|Authors||Felienne Hermans & Emerson Murphy-Hill|
|Publication||International Conference on Software Engineering|
Spreadsheets are used extensively in business processes around the world and, as such, are a topic of research interest. Over the past few years, many spreadsheet studies have been performed on the EUSES spreadsheet corpus. While this corpus has served the spreadsheet community well, the spreadsheets it contains were mainly gathered with search engines and might therefore not be representative of spreadsheets used in companies.
This paper presents an analysis of a new dataset, extracted from the Enron email archive, containing over 15,000 spreadsheets used within the Enron Corporation. In addition to the spreadsheets, we also present an analysis of the associated emails, where we look into spreadsheet-specific email behavior.
Our analysis shows that:
Regarding the emails, we observe that spreadsheets:
This is a list of the 15 most used functions in the Enron corpus. There is little variety in the use of Excel's functions, with more than half of the spreadsheets using three or fewer functions.
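As a rough illustration of how such function-usage figures could be computed, the sketch below counts function calls in formula strings and reports the share of spreadsheets that use three or fewer distinct functions. It is not the authors' tooling: the regex-based extraction, the names `functions_in_formula` and `summarize`, and the input shape (a dict mapping a spreadsheet name to its list of formula strings) are all assumptions made for this example, and it counts total occurrences, which may differ from how the paper ranks functions.

```python
import re
from collections import Counter

# Assumption: a function call is an identifier immediately followed by "(".
FUNCTION_CALL = re.compile(r"\b([A-Z][A-Z0-9.]*)\(")
# Excel string literals, blanked out so quoted text is not parsed as code.
STRING_LITERAL = re.compile(r'"[^"]*"')


def functions_in_formula(formula):
    """Return the function names called in one formula string, in order."""
    without_strings = STRING_LITERAL.sub('""', formula)
    return FUNCTION_CALL.findall(without_strings.upper())


def summarize(corpus, top_n=15):
    """Return (top_n most used functions, share of sheets with <= 3 functions).

    `corpus` is a hypothetical mapping: spreadsheet name -> list of formulas.
    """
    usage = Counter()
    distinct_per_sheet = {}
    for name, formulas in corpus.items():
        seen = set()
        for formula in formulas:
            names = functions_in_formula(formula)
            usage.update(names)   # total occurrences across the corpus
            seen.update(names)    # distinct functions in this spreadsheet
        distinct_per_sheet[name] = len(seen)

    few = sum(1 for count in distinct_per_sheet.values() if count <= 3)
    share = few / len(distinct_per_sheet) if distinct_per_sheet else 0.0
    return usage.most_common(top_n), share
```

A regex is only an approximation of Excel's formula grammar; a study at the scale of the Enron corpus would more likely rely on a proper formula parser, so this should be read as a sketch of the counting logic only.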