ReadFromFile

The ReadFromFile function allows WEAP to read annual, monthly (or whatever your timestep is) or daily data from a delimited text file (typically comma-separated value (CSV) but any separator, including spaces, can be used) into any WEAP variable. In cases where aggregate data needs to be disaggregated, such as generating a daily time series from monthly climate data, one of many disaggregation methods can be used. Alternatively, daily data can be aggregated to match your timestep (e.g., weekly or monthly), or monthly data can be aggregated to annual, using one of many aggregation methods. Missing values (either blanks or -9999) can be filled in using one of several missing value methods. You may refer to files in any directory on your computer although it is best practice to place the files in the same directory as the WEAP area, or in a subdirectory of the WEAP area directory, so that the files will be included when the area is backed up or transferred to another computer. In this case, do not include the full directory path ("absolute" reference)--just include the file name ("relative" reference). The ReadFromFile Wizard is the easiest way to construct a ReadFromFile expression, and it helps you explore, preprocess and visualize the time series data.

A text file can contain one or more columns of data for each year, month or day. The format of the WEAP expression is:

ReadFromFile(FileName) or
ReadFromFile(FileName, DataColumnNumberOrName) or
ReadFromFile(FileName, DataColumnNumberOrName, YearOffset) or
ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter) or
ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter, Missing Value Method, Missing Value Method Parameter)
ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter, Missing Value Method, Missing Value Method Parameter, FirstYearToUse)
ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter, Missing Value Method, Missing Value Method Parameter, FirstYearToUse, LastYearToUse)
ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Aggregation or Disaggregation Method, Method Parameter, Missing Value Method, Missing Value Method Parameter, FirstYearToUse, LastYearToUse, CycleMethod)

If using the Divide into Clusters disaggregation method, there are two additional parameters before the optional Missing Value Method parameter:

ReadFromFile(FileName, DataColumnNumberOrName, YearOffset, Divide into Clusters Disaggregation Method, Method Parameter, ClusterLength, ClusterVariance, Missing Value Method, Missing Value Method Parameter, FirstYearToUse, LastYearToUse, CycleMethod)

Only the FileName parameter is required; all others are optional. If you do not specify a data column number or name, the first data column will be used. The numbering of the data columns starts after the year and month columns. Instead of using a number to refer to the data column, it is better to use its name (enclosed in quotes), because this makes the expression clearer and more resilient to later changes in the CSV file. This requires specification of the column names with the $Columns directive of the CSV file (see below for info). If the other parameters are omitted, the following defaults will be used: YearOffset = 0, Aggregation Method = Average, Disaggregation Method = Repeat, ClusterVariance = 1, Missing Value Method = Replace with 0, FirstYearToUse = first year in file, LastYearToUse = last year in file, CycleMethod = No Cycle. Each parameter is described in detail below. You can leave parameters empty, e.g., ReadFromFile(DailyPrecipitation.csv, , , Sum). WEAP can export its results in a format compatible with ReadFromFile -- click the CSV button on the table toolbar when viewing tabular results in the Data or Results View and choose the ReadFromFile Format option.

Year Offset

The optional YearOffset parameter can be used to use data from different years, and is specified either as an offset from the Base Year or as the data year from the file to use for Base Year data. For example, to use historical stream flow data (starting in 1950) for future values (2005-2025), you would use either 1950 or -55 for the YearOffset. Most people find it easier to give the YearOffset as a year rather than an offset. When specified as an offset, the YearOffset should equal the first year of data in the CSV file minus the Current Account Year. In the previous example, YearOffset = 1950 - 2005 = -55. When WEAP is looking for 2005 data, it will offset -55 years, so will read in data 55 years previous to 2005 (1950). When looking for 2006 data, it will read in data from 1951, etc. If YearOffset is blank, it will default to an offset of 0, meaning that the data in the file corresponding to the Base Year will be used for Base Year data.

Aggregating daily data to monthly (or other timestep)

In cases where you have daily data but your WEAP model is not daily, you may specify an Aggregation Method. Available methods are Sum, Average, Median, Minimum and Maximum.

Sum: Total the daily values for each month.
Average: Divide the monthly total of the daily values by the number of values to get the average.
Median: The 50% percentile value.
Minimum: The minimum daily value from each month. You can abbreviate this as "Min"
Maximum: The maximum daily value from each month. You can abbreviate this as "Max"
Average of Nonzero Values: Divide the monthly total of the daily values by the number of nonzero values to get the average. You can abbreviate this as "NonZeroAverage"
Count Above: The number of daily values in each month that are greater than or equal to the threshold given in the Method Parameter.
Count Below: The number of daily values in each month that are less than or equal to the threshold given in the Method Parameter.
Count Equal: The number of daily values in each month that are exactly equal to the threshold given in the Method Parameter.

Disaggregating monthly (or other timestep) data to daily

In cases where you have annual data but need a monthly time series for your WEAP model, or have monthly data but need a daily time series (e.g., for a daily model, or for the MABIA catchment method, which requires daily data), you may specify a Disaggregation Method. Available methods are Interpolate, Repeat, Divide, and Divide with Gaps.

Interpolate: Assume the monthly values are the values for mid-month; interpolate between them to derive daily values. Typical use would be to derive daily temperature or humidity values from monthly averages. For example, if the January value was 3 and the February value was 10, the "Interpolate" method would come up with the following daily values: January 16 = 3, January 17 = 3 + (1/29) * (10 - 3) = 3.24 (29 days between Jan 16 and Feb 14, so Jan 17 would be 1/29th of the way from the Jan 16 value to the Feb 14 value), January 18 = 3 + (2/29) * (10 - 3) = 3.48, ..., February 13 = 3 + (28/29) * (10 - 3) = 9.76, February 14 = 3 + (29/29) * (10 - 3) = 10.
Interpolate within Water Year: The same as Interpolate, unless data for an entire water year is missing, in which case it will not be interpolated. For example, if the water year starts in October, running from October to September, and if a file has a gap from June 1990 to March 1992, the water year in the middle (Oct 1990 – Sep 1991) will not be interpolated. Values for June - September 1990 will be the same value as May 1990; values for October 1991 - March 1992 will be the value from April 1992, and values from October 1990 - September 1991 will be marked as missing.
Repeat: Repeat the monthly value for each day.
Divide: Divide the monthly value evenly across each day of the month, for example, to divide the monthly precipitation evenly onto every day. For example, if the January rainfall was 62 mm, each day in January would have 2 mm.
Divide with Gaps: Divide the monthly value into a sequence of evenly spaced "events," such as rainstorms, each of which lasts one day. The frequency of events is specified by the Method Parameter. For example, if the January rainfall was 70 mm and the Disaggregation Method Parameter was 5 (rainstorms every 5 days), there would be rainfall on these days: Jan 1, Jan 6, Jan 11, Jan 16, Jan 21, Jan 26 and Jan 31. The 70 mm of January rainfall would be evenly split across these seven events; therefore, each event would have 10 mm. Note that the last event was January 31, which means that the next event would occur on February 5 (5 days later). The first event will always occur on the first day of the Current Accounts year. In the unlikely case that the frequency is larger than the length of each timestep (e.g., rainstorms every 10 days in a weekly model), WEAP will have a single event in the middle of the timestep (e.g., day 4 of the week for a weekly model).
Divide into Clusters: Divide the monthly value into a sequence of evenly spaced "events," such as rainstorms, each of which can last several days. The number of events per month is specified by the Method Parameter; the duration of each event is specified by the ClusterLength parameter. These two parameters are required. The values for each day of an event follow a curve of a normal (or Gaussian) distribution. This defaults to a standard normal distribution (variance = 1), but you can specify a different variance by the optional ClusterVariance parameter. In general, the smaller the variance, the steeper the curve (midpoint will be higher and start and endpoints will be lower). For example, if the January rainfall was 30 mm, the Disaggregation Method Parameter was 2 (two rainstorms that month), the ClusterLength was 5 (each rainstorm lasts five days), there would be a total of 15 mm of rain in each of the two rainstorms. The standard normal distribution of 15 mm over 5 days would be: 0.817mm, 3.663mm, 6.039mm, 3.663mm, 0.817mm. The first storm would start on January 6 and last through January 10; the second storm would start on January 22 and last through January 26. Using the same parameters but changing the variance from the default of 1.0 to 0.5, the normal distribution of 15 mm over 5 days would be: 0.155mm, 3.113mm, 8.463mm, 3.113mm, 0.155mm. Notice that the peak of the storm on day 3 is higher than with variance = 1, while the first and last days are lower. You can use the ReadFromFile Wizard to explore the effect of different parameter values.

Handling Missing Data

In cases where there are gaps in the data, either because of missing rows (timesteps) in the text file, or from values marked as missing in the file (-9999), you need to tell WEAP how to interpret the missing data. You may specify a Missing Value Method. Available methods are Interpolate, Repeat, Replace, Mark and Error. The method will default to Replace if none is specified, which means that missing values will be replaced with 0.

Interpolate: Linear interpolation between the previous and next (non-missing) values. You can abbreviate this as "Interp"
Repeat: Repeat the previous (non-missing) value.
Replace: The number given as the 7th parameter (Missing Value Method Parameter) will be used in place of the missing value. If the 7th parameter is not specified, 0 will be used. If no missing method is specified, this will be the default.
Divide:For daily CSV files with gaps, Divide will split a value over that date and all subsequent consecutive missing dates. For example, monthly precipitation which is given only for the first day of a month can be evenly divided over the month.
Mark: Will fill in any gaps in the data with the MissingValue (-9999). These will not show up on charts or tables. Note: most WEAP calculations will not accept MissingValue and will cause an error. For example, if your climate data has gaps, you must fill these gaps in order to use the data for catchment calculations. The gaps can be filled by one of the methods above (Interpolate, Repeat, Replace), or you can edit the file yourself to fill the gaps. One exception is the comparison of streamflow gauge data to calculated streamflow. No comparison will be made for any timestep for which the gauge data is missing.
Error: Will display an error if there are any missing data.

Note: If you are aggregating daily data into monthly values, or monthly data into annual values, WEAP will fill missing data using the Missing Value Method before aggregating. For example, if you have daily temperature data that you need to aggregate to monthly averages, you might use the Interpolate Missing Value Method to fill in gaps in the daily data, all of which will all be averaged to get the monthly average temperature. If the Missing Value Method is Mark, any missing daily values will cause the aggregated monthly value to be the MissingValue.

Limiting which Years to Use

In some cases, you may wish to choose a subset of years from the CSV file. For example, a file with streamflow gauge records has data for 1900-2010, but you want only the period 1920-1960 because the years before 1920 were poor quality (many missing values) and after 1960 the river was severely impaired (you want a "natural" flow record). Use the FirstYearToUse and LastYearToUse parameters to specify the first and last years to use. If either are blank or 0, they will default to the first or last year of data for the selected column of data.

Cycle Data

The optional 10th parameter to ReadFromFile is either "Cycle" or "No Cycle" (without the quotes). If "Cycle" is specified, WEAP will wraparound from the end of the file back to the beginning (or from the LastYearToUse to the FirstYearToUse, if those parameters are specified). For example, if the WEAP area time horizon is 2010-2040 and a CSV file has data from 1960-1969, WEAP would use data from 1960-1969 for years 2010-2019, then wraparound so that 1960-1969 would be used again for 2020-2029 and 2030-2039 (2040 would wraparound to use 1960 data). If the cycle parameter is not specified, it will default to "No Cycle."

Comments

Comment lines (any line that does not begin with a number) are ignored. Also, any part of a line after a semicolon (;) or number sign (#) will be ignored, so can be used to include comments on the same line with data. It is good practice to have a few comment lines at the beginning of the file, documenting when, how and where the data came from, units used, and which data are in each data column.

Column Descriptions

Use the optional $Columns directive to describe the contents and unit for each data column. The optional unit, if included, is in square brackets after the name. This is a very good way to document your data. In addition, you can then refer to columns by their name instead of their number in the ReadFromFile expression -- not only will this make for a clearer expression, but the reference will still be correct if you later add or remove columns in the file. Here is an example from a climate file named Climate.csv with 5 data columns (rainfall, min and max temperature, humidity and wind):

$Columns = Precipitation[mm], Min Temperature[C], Max Temperature[C], Relative Humidity[%], Wind Speed[m/s]

Here's how to refer to the Max Temperature column in the ReadFromFile expression: ReadFromFile(Climate.csv, "Max Temperature[C]") Note: you must enclose the column name in quotes. The unit is optional and can be omitted, e.g., ReadFromFile(Climate.csv, "Max Temperature")

Optionally, you may list the time columns first. Some examples:

Annual: $Columns = Year, Population
Monthly: $Columns = Year, Month, Burlington Brook USGS 1188000[CFS], Weber River USGS 1932000[CFS]
Daily: $Columns = Date, Precipitation[mm], Mean Temperature[C]

If any column names include the list separator character, you must enclose the name in quotes, e.g., "Daily Temperature, Average[C]"

The ReadFromFile and FDCShift Wizards will display information from the $Columns directive.

List separator

WEAP assumes that the values in the CSV file will be separated by the Windows List Separator character (set in the "Regional and Language Options" Windows Control Panel). However, you can override this setting by placing a $ListSeparator directive into the CSV file, on a line before any data values. (You may not use # as the list separator, because this denotes a comment line.) For example, to specify "," (comma) as the separator, include the following line at the top of the file:

$ListSeparator = ,

Because the list separator varies from country to country, it is a good practice to always include the $ListSeparator directive in the CSV file so that WEAP can read the file, regardless of the setting of the Windows List Separator character. This is helpful if the dataset is shared between users in different countries. If you will be editing your CSV file in Excel, you should enclose this entire directive in quotes so that Excel doesn't interpret the list separator character as an actual list separator, e.g., "$ListSeparator = ,"

Decimal Symbol

WEAP assumes that the numbers in the CSV file will use the Windows Decimal Symbol character (set in the "Regional and Language Options" Windows Control Panel). However, you can override this setting by placing a $DecimalSymbol directive into the CSV file, on a line before any data values. For example, to specify "." (period) as the symbol, include the following line at the top of the file:

$DecimalSymbol = .

Because the decimal symbol varies from country to country, it is a good practice to always include the $DecimalSymbol directive in the CSV file so that WEAP can read the file, regardless of the setting of the Windows Decimal Symbol character. This is helpful if the dataset is shared between users in different countries. If you will be editing your CSV file in Excel, you should enclose this entire directive in quotes so that Excel doesn't interpret the decimal symbol character as the list separator (some countries use a comma as a decimal separator while others use it as a list separator), e.g., "$DecimalSeparator = ."

Date Format (for daily data)

By default, WEAP will use the Windows system "Short date" format (set in the "Regional and Language Options" Windows Control Panel) for interpreting daily dates, e.g., mm/dd/yyyy or dd/mm/yyyy. However, because you may want to share your WEAP area with someone in another country (whose regional settings are different), it is best practice to include in the file a line that specifies the date format used. To do this, place a $DateFormat directive into the CSV file, on a line at the top before the data values. For example, to specify mm/dd/yyyy as the date format, include the following line at the top of the file:

$DateFormat = mm/dd/yyyy

Some other examples:

$DateFormat = d-m-y
$DateFormat = m d y
$DateFormat = d/m/y
$DateFormat = yyyy/m/d

When specifying the date format, you do not need to repeat the 'm', 'd', or 'y', .e.g., mm/dd/yyyy is equivalent to m/d/y. The two separator characters must be the same, e.g., m-d-y is fine, but m/d-y is not. For the date values in the CSV file, single-digit month and day numbers do not need to have a leading zero, but it is allowed, e.g., 1/3/1990 or 01/03/1990 are both fine. Years must have 4 digits.

Examples

ReadFromFile(GroundwaterRecharge.txt)

will read in data from the first data column of file GroundwaterRecharge.txt in the directory for the current WEAP area (e.g., if the area was Weaping River Basin, the file would be C:\Program Files\WEAP21\Weaping River Basin\GroundwaterRecharge.txt).

ReadFromFile(DemandActivity.txt, 2)

will read in data from the second data column of file DemandActivity.txt.

ReadFromFile(DemandActivity.txt, "Population", -50)

will read in data from a data column named "Population" of file DemandActivity.txt, shifting the years by 50 (year 1950 data in the file will be interpreted as 2000 data).

Aggregating Daily Data to Monthly (or other timestep)

ReadFromFile(DailyTemperature.csv, 1, 0, Average)

will read in daily air temperature data from the first data column of file DailyTemperature.csv, not shifting the years at all (offset = 0), and deriving monthly values by averaging the daily temperature values. Because the parameter values in this example are actually the same as the defaults, you could omit them and get the same result (monthly average of daily values): ReadFromFile(DailyTemperature.csv)

ReadFromFile(DailyPrecipitation.csv, , , Sum)

will read in daily precipitation data from DailyPrecipitation.csv and derive monthly values by totaling the daily precip values. Because Column and YearOffset parameters were not specified, their defaults are used (Column = 1, YearOffset = 0).

ReadFromFile(Streamflow.csv, , , Median)

will read in daily streamflow data from Streamflow.csv and derive monthly values by finding the median value for each month.

ReadFromFile(Streamflow.csv, , , Minimum)

will read in daily streamflow data from Streamflow.csv and derive monthly values by finding the minimum value for each month.

ReadFromFile(Streamflow.csv, , , Maximum)

will read in daily streamflow data from Streamflow.csv and derive monthly values by finding the maximum value for each month.

ReadFromFile(Climate.csv, Precip[mm], , Average Of Nonzero Values)

will read in daily precipitation data from Climate.csv and calculate the monthly average of the non-zero values -- to help understand how intense the average rainstorm was.

ReadFromFile(Climate.csv, T_Max[C], , Count Above, 32)

will read in daily temperature data from Climate.csv and calculate how many days of each month the temperature exceeded 32 C (90 F).

ReadFromFile(Climate.csv, T_Min[C], , Count Below, 0)

will read in daily temperature data from Climate.csv and calculate how many days of each month the temperature exceeded 32 C (90 F).

ReadFromFile(Climate.csv, Precip[mm], , Count Equal, 0)

will read in daily precipitation data from Climate.csv and calculate how many days of each month has no precipitation.

Disaggregating Monthly (or other timestep) Data to Daily

ReadFromFile(Climate.txt, "Daily average temperature[C]", 0, Interpolate)

will read in monthly air temperature data from a data column named "Daily average temperature[C]" of file Climate.txt, not shifting the years at all (offset = 0), and deriving daily values by interpolating the monthly values

ReadFromFile(Climate.txt, 2, 0, Divide with Gaps, 5)

will read in monthly precipitation data from the second data column of file Climate.txt, not shifting the years at all (offset = 0), and deriving daily values by splitting the monthly precipitation data into storms every 5 days

ReadFromFile(Climate.txt, "Monthly Precip[mm]", 0, Divide)

will read in monthly precipitation data from a data column named "Monthly Precip[mm]" of file Climate.txt, not shifting the years at all (offset = 0), and deriving daily values by splitting the monthly precipitation data evenly across every day in that month (e.g., if total January rainfall was 31 mm, each day would have 1 mm)

ReadFromFile(Climate.txt, "Wind Speed[m/s]", 0, Repeat)

will read in monthly mean wind speed data from a data column named "Wind Speed[m/s]" of file Climate.txt, not shifting the years at all (offset = 0), and deriving daily values by repeating the monthly value for every day in that month

Handling Missing Data

ReadFromFile(Climate.txt, 1, 0, , , Interpolate)

will fill in any gaps in the data by a linear interpolation of the previous and next values.

ReadFromFile(Climate.txt, 1, 0, , , Repeat)

will fill in any gaps in the data by repeating the previous value.

ReadFromFile(Climate.txt, 1, 0, , , Replace, 2)

will fill in any gaps in the data with a 2. (If the number is omitted, the default will be 0.)

ReadFromFile(Climate.txt, 1, 0, , , Mark)

will fill in any gaps in the data with the MissingValue (-9999). These will not show up on graphs, but will cause an error if the value is needed for a calculation.

ReadFromFile(Climate.txt, 1, 0, , , Error)

will display an error if any gaps are found.

Limiting Which Years to Use and Cycling

ReadFromFile(Climate.txt, , 1930, , , , , 1920, 1960, Cycle)

will use data from the range 1920-1960, starting with 1930. If there more than 31 years (1930-1960 is 31 years of data) in the WEAP time horizon, WEAP will wraparound to 1920.

Annual Data

For annual data, each line of the file contains data for one year, in the format:

Year, DataColumn1, DataColumn2, ..., DataColumnN

e.g.,

2000, 15.123, 43.01
2001, 10.321, 35.835
2002, 12.423, 38.922
...

Data years must be in chronological order, although gaps are allowed (a zero will be added for missing values).

Monthly (or other timestep) Data

For monthly (or other timestep) data, each line of the file contains data for one month, in the format:

Year, Month, DataColumn1, DataColumn2, ..., DataColumnN

e.g.,

2000, 1, 44.29, 64.77
2000, 2, 59.12, 74, 55
...
2000, 12, 61.11.78.74
2001, 1, 24.29, 44.77
...

Data months must be in chronological order, although gaps are allowed (a zero will be added for missing values).

Daily Data

For daily data (which can be used if the WEAP's timestep is daily, or when aggregating daily data to monthly or other timestep, or when using the MABIA method for catchment runoff), each line of the file contains data for one day, in the format:

Date, DataColumn1, DataColumn2, ..., DataColumnN

e.g.,

1/1/2000, 44.29, 64.77
1/2/2000, 59.12, 74, 55
...
12/30/2000, 23.1, 88.22
12/31/2000, 61.11.78.74
1/1/2001, 24.29, 44.77
...

Note: See the "Date Format" section above for information about the format of the date (e.g., mm/dd/yyyy or dd/mm/yyyy).

Data for leap days (e.g., 2/29/2000) will be ignored unless the "Add Leap Days?" option is selected on the Years and Timesteps screen. If data for leap days are missing and the "Add Leap Days?" option is selected, a zero will be used for those days.