praftery edited this page Sep 5, 2015 · 17 revisions

Note that there are several example input files in the data subfolder that comes with the mave installation.

Metadata section

Up to the first 100 lines of the file can be used to describe the input file. If the metadata includes a 'country' field and the value below it is 'us', mave will also make standard United States federal holidays available as an input feature to the model.

Required input data

  • The first column must be named: 'LocalDateTime'. As the name implies, this column should contain the local datetime (sometimes called a timestamp) at which the other data was measured. The standard format that mave uses is '%m/%d/%y %H:%M', and datetimes that match this format are processed fastest. Other formats are still read successfully as long as the format is consistent and logical, though the pre-processing step is slightly slower.
  • The column containing the target data (e.g. energy consumption data) must be named: 'EnergyConsumption'.
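
The "standard format first, other formats second" behaviour described above can be sketched as follows. This is a minimal illustration and not mave's actual implementation: the function name parse_local_datetime and the list of fallback formats are assumptions made for this example, and only the standard format string comes from the text above.

```python
from datetime import datetime

# Format named in the text above; values matching it take the fast path.
STANDARD_FORMAT = "%m/%d/%y %H:%M"

# Hypothetical fallback formats, chosen for illustration only.
FALLBACK_FORMATS = ("%m/%d/%Y %H:%M", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M")


def parse_local_datetime(text):
    """Parse one LocalDateTime value, trying the standard format first."""
    text = text.strip()
    for fmt in (STANDARD_FORMAT,) + FALLBACK_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError("unrecognized datetime format: %r" % text)


print(parse_local_datetime("1/29/12 12:45"))    # matches the standard format
print(parse_local_datetime("1/29/2012 12:45"))  # four-digit year, uses a fallback
```

In this sketch a two-digit year matches the standard format directly, while a four-digit year only parses through one of the assumed fallback formats.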

Optional input data

  • To use outside air temperature data from a previous time period as an input feature for a model (where applicable), the column containing weather data must be named: 'OutsideDryBulbTemperature'. We recommend using this feature if you have outside air temperature data, as it can capture effects such as precooling or the impact of a warmer or cooler night on otherwise identical data later in the day.
  • Any other column that has a name (the name can be arbitrary) will also be used as an input feature. Examples might be 'number of occupants' or 'units produced'. Note that any column of data that is not named will be ignored and not used as an input feature.
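
As a rough illustration of the column rules above, the sketch below picks out the columns that would qualify as input features from a header line. The helper name feature_columns is hypothetical, and stripping whitespace around names is an assumption of this example rather than documented mave behaviour.

```python
# Column names required by the text above; they are not input features.
REQUIRED_COLUMNS = ("LocalDateTime", "EnergyConsumption")


def feature_columns(header_line):
    """Return the named columns that could serve as input features."""
    # Assumption: surrounding whitespace is not part of a column name.
    names = [name.strip() for name in header_line.split(",")]
    # Unnamed (empty) columns are dropped, as are the two required columns.
    return [n for n in names if n and n not in REQUIRED_COLUMNS]


header = "LocalDateTime,EnergyConsumption,OutsideDryBulbTemperature,building occupancy,"
print(feature_columns(header))
# ['OutsideDryBulbTemperature', 'building occupancy']
```

The trailing comma in the header creates an unnamed fifth column, which the sketch ignores.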

Example data

Opening an example csv file in any plain text editor will show something like this:

buildingID,zip,country,floorarea.SF,buildingtype.STR
cbe_08,94720,us,238270,education-college_university
some other information
even more information
etc. (up to 100 lines)

LocalDateTime,EnergyConsumption,OutsideDryBulbTemperature,building occupancy,
1/29/2012 12:45,107,59.381,112,
1/29/2012 13:00,106,60.15,108,
1/29/2012 13:15,107,60.759,102,
1/29/2012 13:30,113,60.469,99,
1/29/2012 13:45,109,60.525,104,
......
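
To make the layout above concrete, here is a minimal sketch that separates the metadata section from the data section using only the Python standard library. It is not how mave itself reads files. It assumes that the data section starts at the first line whose first column is 'LocalDateTime', and that the metadata names and values sit on the first two lines, as in the example. The names split_input and SAMPLE are hypothetical.

```python
import csv
import io

MAX_METADATA_LINES = 100  # limit stated in the metadata section above

# Shortened copy of the example file shown above.
SAMPLE = """buildingID,zip,country,floorarea.SF,buildingtype.STR
cbe_08,94720,us,238270,education-college_university
some other information

LocalDateTime,EnergyConsumption,OutsideDryBulbTemperature,building occupancy,
1/29/2012 12:45,107,59.381,112,
1/29/2012 13:00,106,60.15,108,
"""


def split_input(text):
    """Split file text into (metadata dict, list of data-row dicts)."""
    lines = text.splitlines()
    # Assumption: the data header is the first line starting with 'LocalDateTime'.
    header_idx = next(
        (i for i, line in enumerate(lines[:MAX_METADATA_LINES + 1])
         if line.split(",")[0].strip() == "LocalDateTime"),
        None,
    )
    if header_idx is None:
        raise ValueError("no 'LocalDateTime' header found")
    # Assumption: metadata names are on line 1 and their values on line 2.
    names = next(csv.reader([lines[0]]))
    values = next(csv.reader([lines[1]]))
    metadata = dict(zip(names, values))
    rows = list(csv.DictReader(io.StringIO("\n".join(lines[header_idx:]))))
    return metadata, rows


metadata, rows = split_input(SAMPLE)
# Per the metadata section, a 'country' value of 'us' enables US federal holidays.
use_us_holidays = metadata.get("country") == "us"
print(use_us_holidays, len(rows), rows[0]["EnergyConsumption"])
```

All values come back as strings here; converting them to numbers and datetimes is left out to keep the sketch short.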