Best Practices for Data Tables: Glossary

Key Points

Introduction
Tabular Data & Spreadsheets
  • Data organization starts at the sampling phase of a research project.

  • Spreadsheets are good for data entry, but we use them for a lot more, such as formatting tables for publication and making figures.

  • Not all data is tabular.

Formatting Data Tables
  • Computers need to be able to understand data tables.

  • Never modify your raw data.

  • Keep track of all of the steps you take to clean your data.

  • Organize your data according to tidy data principles.
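
As an illustration of the tidy data principle above, here is a minimal sketch that reshapes a "wide" table (one column per species) into a "long" table with one observation per row. The table contents, the column names (`plot_id`, `species_a`, `species_b`), and the helper `wide_to_long` are hypothetical and chosen only for this example. Note that the function builds a new list and leaves the original rows unchanged, in keeping with the advice never to modify raw data.

```python
# Hypothetical wide table: one row per plot, one column per species count.
wide_rows = [
    {"plot_id": "1", "species_a": "3", "species_b": "0"},
    {"plot_id": "2", "species_a": "5", "species_b": "2"},
]


def wide_to_long(rows, id_column, variable_name="species", value_name="count"):
    """Return one row per (id, variable) pair, so each row is one observation."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row instead
            long_rows.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return long_rows


tidy_rows = wide_to_long(wide_rows, id_column="plot_id")
for row in tidy_rows:
    print(row)
```

Each variable (plot, species, count) now has its own column, and the zero for `species_b` in plot 1 stays a recorded zero rather than a blank.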

Discussion Formatting Problems
  • Avoid using multiple tables within one spreadsheet.

  • Avoid spreading data across multiple tabs.

  • Record zeros as zeros.

  • Use an appropriate null value to record missing data.

  • Don’t use formatting to convey information or to make your spreadsheet look pretty.

  • Place comments in a separate column.

  • Record units in column headers.

  • Include only one piece of information in a cell.

  • Avoid spaces, numbers and special characters in column headers.

  • Avoid special characters in your data.

  • Record metadata in a separate plain text file.
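
The advice on column headers can be sketched in code. The helper below, `clean_header`, is a hypothetical function written for this example, not part of any library. It lowercases a header, replaces spaces and special characters with underscores, and keeps a unit that was written in parentheses as a suffix. Prefixing headers that start with a digit with `x` is an assumption made here for illustration; choose whatever convention suits your analysis software.

```python
import re


def clean_header(header):
    """Return a header with no spaces or special characters (hypothetical helper)."""
    name = header.strip().lower()
    # Collapse every run of characters other than a-z and 0-9 into one underscore.
    name = re.sub(r"[^a-z0-9]+", "_", name)
    name = name.strip("_")
    if name and name[0].isdigit():
        # Assumption for this sketch: avoid headers that begin with a number.
        name = "x" + name
    return name


for raw in ["Weight (g)", "Plot ID#", "2019 Count"]:
    print(repr(raw), "->", clean_header(raw))
```

Because only ASCII letters and digits are kept, accented or other non-ASCII letters in a header would also be replaced by underscores in this sketch.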

Dates as data
  • Treating dates as multiple pieces of data rather than one makes them easier to handle.
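
A minimal sketch of this idea, assuming dates arrive as ISO 8601 text (`YYYY-MM-DD`). The helpers `split_date` and `join_date` are hypothetical names used only here. Storing year, month, and day in separate columns avoids ambiguity about day and month order, and the parts can be reassembled into a single date when an analysis needs one.

```python
from datetime import date


def split_date(iso_text):
    """Split an ISO 8601 date string into separate year, month, and day values."""
    parsed = date.fromisoformat(iso_text)
    return {"year": parsed.year, "month": parsed.month, "day": parsed.day}


def join_date(year, month, day):
    """Rebuild a date from its parts; raises ValueError for impossible dates."""
    return date(year, month, day)


parts = split_date("2015-07-23")
print(parts)
print(join_date(parts["year"], parts["month"], parts["day"]).isoformat())
```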

Exporting data
  • Data stored in common spreadsheet formats will often not be read correctly into data analysis software, introducing errors into your data.

  • Exporting data from spreadsheets to formats like CSV or TSV puts it in a format that can be used consistently by most programs.
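
To show what such an export looks like, the sketch below writes a small table as CSV and as TSV text using Python's standard `csv` module, then reads the CSV back. The rows and the helper `to_delimited` are hypothetical and exist only for this example; `NA` stands in for whichever null value you have chosen.

```python
import csv
import io

# Hypothetical tidy rows; "NA" marks a missing count.
rows = [
    {"plot_id": "1", "species": "species_a", "count": "3"},
    {"plot_id": "2", "species": "species_b", "count": "NA"},
]


def to_delimited(table, delimiter=","):
    """Serialize a list of dicts as delimited plain text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(table[0].keys()),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(table)
    return buffer.getvalue()


csv_text = to_delimited(rows)
tsv_text = to_delimited(rows, delimiter="\t")

# Reading the CSV text back gives the same values, all as plain strings.
read_back = list(csv.DictReader(io.StringIO(csv_text)))
print(csv_text)
```

The exported text carries only the values, so any meaning that was stored as cell formatting in the spreadsheet does not survive the export.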

Glossary

FIXME