ReadFromFile

The ReadFromFile function allows WEAP to read annual, monthly (or whatever your timestep is) or daily data from a delimited text file (typically comma-separated value (CSV) but any separator, including spaces, can be used) into any WEAP variable.  In cases where aggregate data needs to be disaggregated, such as generating a daily time series from monthly climate data, one of many disaggregation methods can be used.  Alternatively, daily data can be aggregated to match your timestep (e.g., weekly or monthly), or monthly data can be aggregated to annual, using one of many aggregation methods.  Missing values (either blanks or -9999) can be filled in using one of several missing value methods.  You may refer to files in any directory on your computer although it is best practice to place the files in the same directory as the WEAP area, or in a subdirectory of the WEAP area directory, so that the files will be included when the area is backed up or transferred to another computer.  In this case, do not include the full directory path ("absolute" reference)--just include the file name ("relative" reference).  The ReadFromFile Wizard is the easiest way to construct a ReadFromFile expression, and it helps you explore, preprocess and visualize the time series data.

A text file can contain one or more columns of data for each year, month or day. The format of the WEAP expression is:

  ReadFromFile(FileName) or
  ReadFromFile(FileName, DataColumnNumberOrName) or
  ReadFromFile(FileName, DataColumnNumberOrName, YearOffset) or
  ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter) or
  ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter, Missing Value Method, Missing Value Method Parameter)
  ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter, Missing Value Method, Missing Value Method Parameter, FirstYearToUse)
  ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter, Missing Value Method, Missing Value Method Parameter, FirstYearToUse, LastYearToUse)
  ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter, Missing Value Method, Missing Value Method Parameter, FirstYearToUse, LastYearToUse, CycleMethod)
 

If using the Divide into Clusters disaggregation method, there are two additional parameters before the optional Missing Value Method parameter:

  ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Divide into Clusters Disaggregation Method, Method Parameter, ClusterLength, ClusterVariance, Missing Value Method, Missing Value Method Parameter, FirstYearToUse, LastYearToUse, CycleMethod)

Only the FileName parameter is required; all others are optional.  If you do not specify a data column number or name, the first data column will be used.  The numbering of the data columns starts after the year and month columns.  Instead of using a number to refer to the data column, it is better to use its name (enclosed in quotes), because this makes the expression clearer and more resilient to later changes in the CSV file.  This requires specification of the column names with the $Columns directive of the CSV file (see below for info).  If the other parameters are omitted, the following defaults will be used: YearOffset = 0, Aggregation Method = Average, Disaggregation Method = Repeat, ClusterVariance = 1, Missing Value Method = Replace with 0, FirstYearToUse = first year in file, LastYearToUse = last year in file, CycleMethod = No Cycle.  Each parameter is described in detail below.  You can leave parameters empty, e.g., ReadFromFile(DailyPrecipitation.csv, , , Sum).  WEAP can export its results in a format compatible with ReadFromFile -- click the CSV button on the table toolbar when viewing tabular results in the Data or Results View and choose the ReadFromFile Format option.

Year Offset

The optional YearOffset parameter can be used to use data from different years, and is specified either as an offset from the Base Year or as the data year from the file to use for Base Year data. For example, to use historical stream flow data (starting in 1950) for future values (2005-2025), you would use either 1950 or -55 for the YearOffset.  Most people find it easier to give the YearOffset as a year rather than an offset.  When specified as an offset, the YearOffset should equal the first year of data in the CSV file minus the Current Account Year.  In the previous example, YearOffset = 1950 - 2005 = -55.  When WEAP is looking for 2005 data, it will offset -55 years, so will read in data 55 years previous to 2005 (1950).  When looking for 2006 data, it will read in data from 1951, etc.  If YearOffset is blank, it will default to an offset of 0, meaning that the data in the file corresponding to the Base Year will be used for Base Year data.

Aggregating daily data to monthly (or other timestep)

In cases where you have daily data but your WEAP model is not daily, you may specify an Aggregation Method.  Available methods are Sum, Average, Median, Minimum and Maximum.

Disaggregating monthly (or other timestep) data to daily

In cases where you have annual data but need a monthly time series for your WEAP model, or have monthly data but need a daily time series (e.g., for a daily model, or for the MABIA catchment method, which requires daily data), you may specify a Disaggregation Method.  Available methods are Interpolate, Repeat, Divide, and Divide with Gaps.

Handling Missing Data

In cases where there are gaps in the data, either because of missing rows (timesteps) in the text file, or from values marked as missing in the file (-9999), you need to tell WEAP how to interpret the missing data.  You may specify a Missing Value Method.  Available methods are Interpolate, Repeat, Replace, Mark and Error.  The method will default to Replace if none is specified, which means that missing values will be replaced with 0.

Note: If you are aggregating daily data into monthly values, or monthly data into annual values, WEAP will fill missing data using the Missing Value Method before aggregating.  For example, if you have daily temperature data that you need to aggregate to monthly averages, you might use the Interpolate Missing Value Method to fill in gaps in the daily data, all of which will all be averaged to get the monthly average temperature.  If the Missing Value Method is Mark, any missing daily values will cause the aggregated monthly value to be the MissingValue.

Limiting which Years to Use

In some cases, you may wish to choose a subset of years from the CSV file.  For example, a file with streamflow gauge records has data for 1900-2010, but you want only the period 1920-1960 because the years before 1920 were poor quality (many missing values) and after 1960 the river was severely impaired (you want a "natural" flow record).  Use the FirstYearToUse and LastYearToUse parameters to specify the first and last years to use.  If either are blank or 0, they will default to the first or last year of data for the selected column of data.  

Cycle Data

The optional 10th parameter to ReadFromFile is either "Cycle" or "No Cycle" (without the quotes).  If "Cycle" is specified, WEAP will wraparound from the end of the file back to the beginning (or from the LastYearToUse to the FirstYearToUse, if those parameters are specified).  For example, if the WEAP area time horizon is 2010-2040 and a CSV file has data from 1960-1969, WEAP would use data from 1960-1969 for years 2010-2019, then wraparound so that 1960-1969 would be used again for 2020-2029 and 2030-2039 (2040 would wraparound to use 1960 data). If the cycle parameter is not specified, it will default to "No Cycle."

Comments

Comment lines (any line that does not begin with a number) are ignored.  Also, any part of a line after a semicolon (;) or number sign (#) will be ignored, so can be used to include comments on the same line with data.  It is good practice to have a few comment lines at the beginning of the file, documenting when, how and where the data came from, units used, and which data are in each data column.

Column Descriptions

Use the optional $Columns directive to describe the contents and unit for each data column.  The optional unit, if included, is in square brackets after the name.  This is a very good way to document your data.  In addition, you can then refer to columns by their name instead of their number in the ReadFromFile expression -- not only will this make for a clearer expression, but the reference will still be correct if you later add or remove columns in the file.  Here is an example from a climate file named Climate.csv with 5 data columns (rainfall, min and max temperature, humidity and wind):

$Columns = Precipitation[mm], Min Temperature[C], Max Temperature[C], Relative Humidity[%], Wind Speed[m/s]

Here's how to refer to the Max Temperature column in the ReadFromFile expression:  ReadFromFile(Climate.csv, "Max Temperature[C]")      Note: you must enclose the column name in quotes.  The unit is optional and can be omitted, e.g., ReadFromFile(Climate.csv, "Max Temperature")

Optionally, you may list the time columns first. Some examples:

If any column names include the list separator character, you must enclose the name in quotes, e.g., "Daily Temperature, Average[C]"

The ReadFromFile and FDCShift Wizards will display information from the $Columns directive.

List separator

WEAP assumes that the values in the CSV file will be separated by the Windows List Separator character (set in the "Regional and Language Options" Windows Control Panel).  However, you can override this setting by placing a $ListSeparator directive into the CSV file, on a line before any data values.  (You may not use # as the list separator, because this denotes a comment line.)  For example, to specify "," (comma) as the separator, include the following line at the top of the file:  

$ListSeparator = ,

Because the list separator varies from country to country, it is a good practice to always include the $ListSeparator directive in the CSV file so that WEAP can read the file, regardless of the setting of the Windows List Separator character.  This is helpful if the dataset is shared between users in different countries.   If you will be editing your CSV file in Excel, you should enclose this entire directive in quotes so that Excel doesn't interpret the list separator character as an actual list separator, e.g.,    "$ListSeparator = ,"

Decimal Symbol

WEAP assumes that the numbers in the CSV file will use the Windows Decimal Symbol character (set in the "Regional and Language Options" Windows Control Panel).  However, you can override this setting by placing a $DecimalSymbol directive into the CSV file, on a line before any data values.  For example, to specify "." (period) as the symbol, include the following line at the top of the file:  

$DecimalSymbol = .

Because the decimal symbol varies from country to country, it is a good practice to always include the $DecimalSymbol directive in the CSV file so that WEAP can read the file, regardless of the setting of the Windows Decimal Symbol character.  This is helpful if the dataset is shared between users in different countries.  If you will be editing your CSV file in Excel, you should enclose this entire directive in quotes so that Excel doesn't interpret the decimal symbol character as the list separator (some countries use a comma as a decimal separator while others use it as a list separator), e.g.,    "$DecimalSeparator = ."

Date Format (for daily data)

By default, WEAP will use the Windows system "Short date" format (set in the "Regional and Language Options" Windows Control Panel) for interpreting daily dates, e.g., mm/dd/yyyy or dd/mm/yyyy.  However, because you may want to share your WEAP area with someone in another country (whose regional settings are different), it is best practice to include in the file a line that specifies the date format used.  To do this, place a $DateFormat directive into the CSV file, on a line at the top before the data values.   For example, to specify mm/dd/yyyy as the date format, include the following line at the top of the file:  

  $DateFormat = mm/dd/yyyy

Some other examples:

$DateFormat = d-m-y
$DateFormat = m d y
$DateFormat = d/m/y
$DateFormat = yyyy/m/d

When specifying the date format, you do not need to repeat the 'm', 'd', or 'y', .e.g., mm/dd/yyyy is equivalent to m/d/y.  The two separator characters must be the same, e.g., m-d-y is fine, but m/d-y is not.  For the date values in the CSV file, single-digit month and day numbers do not need to have a leading zero, but it is allowed, e.g., 1/3/1990 or 01/03/1990 are both fine.  Years must have 4 digits.  

Examples

  ReadFromFile(GroundwaterRecharge.txt)

will read in data from the first data column of file GroundwaterRecharge.txt in the directory for the current WEAP area (e.g., if the area was Weaping River Basin, the file would be C:\Program Files\WEAP21\Weaping River Basin\GroundwaterRecharge.txt).  

  ReadFromFile(DemandActivity.txt, 2)

will read in data from the second data column of file DemandActivity.txt.

  ReadFromFile(DemandActivity.txt, "Population", -50)

will read in data from a data column named "Population" of file DemandActivity.txt, shifting the years by 50 (year 1950 data in the file will be interpreted as 2000 data).

Aggregating Daily Data to Monthly (or other timestep)

  ReadFromFile(DailyTemperature.csv, 1, 0, Average)

will read in daily air temperature data from the first data column of file DailyTemperature.csv, not shifting the years at all (offset = 0), and deriving monthly values by averaging the daily temperature values.  Because the parameter values in this example are actually the same as the defaults, you could omit them and get the same result (monthly average of daily values):  ReadFromFile(DailyTemperature.csv)

  ReadFromFile(DailyPrecipitation.csv, , , Sum)

will read in daily precipitation data from DailyPrecipitation.csv and derive monthly values by totaling the daily precip values.  Because Column and YearOffset parameters were not specified, their defaults are used (Column = 1, YearOffset = 0).

  ReadFromFile(Streamflow.csv, , , Median)

will read in daily streamflow data from Streamflow.csv and derive monthly values by finding the median value for each month.

  ReadFromFile(Streamflow.csv, , , Minimum)

will read in daily streamflow data from Streamflow.csv and derive monthly values by finding the minimum value for each month.

  ReadFromFile(Streamflow.csv, , , Maximum)

will read in daily streamflow data from Streamflow.csv and derive monthly values by finding the maximum value for each month.

  ReadFromFile(Climate.csv, Precip[mm], , Average Of Nonzero Values)

will read in daily precipitation data from Climate.csv and calculate the monthly average of the non-zero values -- to help understand how intense the average rainstorm was.

  ReadFromFile(Climate.csv, T_Max[C], , Count Above, 32)

will read in daily temperature data from Climate.csv and calculate how many days of each month the temperature exceeded 32 C (90 F).

  ReadFromFile(Climate.csv, T_Min[C], , Count Below, 0)

will read in daily temperature data from Climate.csv and calculate how many days of each month the temperature exceeded 32 C (90 F).

  ReadFromFile(Climate.csv, Precip[mm], , Count Equal, 0)

will read in daily precipitation data from Climate.csv and calculate how many days of each month has no precipitation.

Disaggregating Monthly (or other timestep) Data to Daily

  ReadFromFile(Climate.txt, "Daily average temperature[C]", 0, Interpolate)

will read in monthly air temperature data from a data column named "Daily average temperature[C]" of file Climate.txt, not shifting the years at all (offset = 0), and deriving daily values by interpolating the monthly values

  ReadFromFile(Climate.txt, 2, 0, Divide with Gaps, 5)

will read in monthly precipitation data from the second data column of file Climate.txt, not shifting the years at all (offset = 0), and deriving daily values by splitting the monthly precipitation data into storms every 5 days

  ReadFromFile(Climate.txt, "Monthly Precip[mm]", 0, Divide)

will read in monthly precipitation data from a data column named "Monthly Precip[mm]" of file Climate.txt, not shifting the years at all (offset = 0), and deriving daily values by splitting the monthly precipitation data evenly across every day in that month (e.g., if total January rainfall was 31 mm, each day would have 1 mm)

  ReadFromFile(Climate.txt, "Wind Speed[m/s]", 0, Repeat)

will read in monthly mean wind speed data from a data column named "Wind Speed[m/s]" of file Climate.txt, not shifting the years at all (offset = 0), and deriving daily values by repeating the monthly value for every day in that month

Handling Missing Data

  ReadFromFile(Climate.txt, 1, 0, , , Interpolate)

will fill in any gaps in the data by a linear interpolation of the previous and next values.

  ReadFromFile(Climate.txt, 1, 0, , , Repeat)

will fill in any gaps in the data by repeating the previous value.

  ReadFromFile(Climate.txt, 1, 0, , , Replace, 2)

will fill in any gaps in the data with a 2.  (If the number is omitted, the default will be 0.)

  ReadFromFile(Climate.txt, 1, 0, , , Mark)

will fill in any gaps in the data with the MissingValue (-9999).  These will not show up on graphs, but will cause an error if the value is needed for a calculation.

  ReadFromFile(Climate.txt, 1, 0, , , Error)

will display an error if any gaps are found.

Limiting Which Years to Use and Cycling

  ReadFromFile(Climate.txt, , 1930, , , , , 1920, 1960, Cycle)

will use data from the range 1920-1960, starting with 1930.  If there more than 31 years (1930-1960 is 31 years of data) in the WEAP time horizon, WEAP will wraparound to 1920.

Annual Data

For annual data, each line of the file contains data for one year, in the format:

  Year, DataColumn1, DataColumn2, ..., DataColumnN

e.g.,

  2000, 15.123, 43.01
  2001, 10.321, 35.835
  2002, 12.423, 38.922
  ...

Data years must be in chronological order, although gaps are allowed (a zero will be added for missing values).

Monthly (or other timestep) Data

For monthly (or other timestep) data, each line of the file contains data for one month, in the format:

  Year, Month, DataColumn1, DataColumn2, ..., DataColumnN

e.g.,

  2000, 1, 44.29, 64.77
  2000, 2, 59.12, 74, 55
  ...
  2000, 12, 61.11.78.74
  2001, 1, 24.29, 44.77
  ...

Data months must be in chronological order, although gaps are allowed (a zero will be added for missing values).  

Daily Data

For daily data (which can be used if the WEAP's timestep is daily, or when aggregating daily data to monthly or other timestep, or when using the MABIA method for catchment runoff), each line of the file contains data for one day, in the format:

  Date, DataColumn1, DataColumn2, ..., DataColumnN

e.g.,

  1/1/2000, 44.29, 64.77
  1/2/2000, 59.12, 74, 55
  ...
  12/30/2000, 23.1, 88.22
  12/31/2000, 61.11.78.74
  1/1/2001, 24.29, 44.77
  ...

Note: See the "Date Format" section above for information about the format of the date (e.g., mm/dd/yyyy or dd/mm/yyyy).

Data for leap days (e.g., 2/29/2000) will be ignored unless the "Add Leap Days?" option is selected on the Years and Timesteps screen.  If data for leap days are missing and the "Add Leap Days?" option is selected, a zero will be used for those days.