Model data files
This documentation pertains to model builds up to build 178. It describes the data files, all of which are stored in the model’s data folder. All of the data files are in the CSV format.
The main files are:
- database.csv - the database file
- iotables.csv - the input/output tables for all regions
- setparameters.csv - user-defined parameters
- modpop.csv - population growth rates
- prodmat.csv - labor-augmenting productivity growth rates
- aeeinew.csv - autonomous energy efficiency improvements
Additional files are likely to be present in the data folder. These files are used to configure the baseline projections and to calibrate various parameters, depending on the specific G-Cubed model version. Commonly, these additional files will include a baseline_design.csv file, a modprod.csv file, and a product.csv file. Details of those files are provided in the documentation of the baseline projections.
The database file
The database file, database.csv, contains the model database. It has a row for each variable in the model and a value for each of the years covered by the database.
The first row of the data file contains the column headings. All other rows of the data file contain data, with one row for each variable in the model. The variables have a strict ordering that must correspond to the ordering of the variables produced by the SYM processor from the SYM model definition. This ordering of variables can be found in the model_<VERSION>_<BUILD>_vars.csv file produced by the SYM processor.
Following the variable names are the descriptions, units of measurement, and the G-Cubed region code for each variable.
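The ordering requirement can be checked mechanically before a model run. The following is a minimal Python sketch, not part of the model's own tooling. It assumes that the variable name sits in the first column of database.csv and that the expected names have already been read from the vars file into a list. The function name and the sample variable names are hypothetical.

```python
import csv
import io


def first_order_mismatch(database_csv_text, expected_names):
    """Return (row_index, found, expected) for the first ordering mismatch, or None."""
    rows = list(csv.reader(io.StringIO(database_csv_text)))
    # Assumption: row 0 is the header and column 0 holds the variable name.
    found_names = [row[0] for row in rows[1:] if row]
    for i in range(max(len(found_names), len(expected_names))):
        found = found_names[i] if i < len(found_names) else None
        expected = expected_names[i] if i < len(expected_names) else None
        if found != expected:
            return i, found, expected
    return None


# Hypothetical two-variable database with made-up names and values.
sample = "name,2018,2019\nVAR_A,1.0,1.1\nVAR_B,0.6,0.7\n"
print(first_order_mismatch(sample, ["VAR_A", "VAR_B"]))  # no mismatch
print(first_order_mismatch(sample, ["VAR_B", "VAR_A"]))  # mismatch at index 0
```

A missing or extra variable is reported with None in the found or expected position.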
The input/output tables file
The single iotables.csv file contains the Input/Output (IO) tables for all regions. Each region has an IO table. The IO tables are stacked vertically in the IO tables file.
The first column of the IO table, with the <REGION_CODE> in the first cell, must be in the first column of the CSV file.
Each Input/Output table has the following structure:
<REGION_CODE> | a01 | … | a0N | C | I | G | X | M |
---|---|---|---|---|---|---|---|---|
g01 | ||||||||
: | ||||||||
g0N | ||||||||
L | ||||||||
K | ||||||||
TAX |
The region code
The first row and the first column of the table are labels for the rows and columns respectively.
Columns of the table describe ‘uses’ by a particular sector or for:
- Consumption - C
- Investment - I
- Government spending - G
- Exports - X
- Imports - M
The sector labels must be the sector identifiers in the set of sectors in the SYM model definition. Typically these are a01 for sector 1, a02 for sector 2, etc.
Rows of the table describe ‘inputs’ by type of sectoral good (service) produced or:
- Labour - L
- Capital - K
- Tax - TAX
The goods labels must be the good identifiers in the set of goods in the SYM model definition. Typically these are g01 for sector 1, g02 for sector 2, etc.
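To illustrate the stacked layout, here is a minimal Python sketch that splits such a file into one table per region. It is an illustration under stated assumptions, not the model's loader: it treats any row whose last five non-empty cells are C, I, G, X, M as the header of a new table, and it assumes that every data cell is numeric. The region codes USA and JPN and all values are made up.

```python
import csv
import io

FINAL_USES = ["C", "I", "G", "X", "M"]


def split_io_tables(csv_text):
    """Return {region: {row_label: {column_label: value}}} from stacked IO tables."""
    tables = {}
    current = None
    columns = []
    for row in csv.reader(io.StringIO(csv_text)):
        cells = [cell.strip() for cell in row]
        while cells and cells[-1] == "":
            cells.pop()  # drop trailing empty cells
        if not cells:
            continue
        if cells[-5:] == FINAL_USES:
            # Assumed header row: region code in the first cell, then column labels.
            columns = cells[1:]
            current = {}
            tables[cells[0]] = current
        elif current is not None:
            values = [float(v) for v in cells[1:]]
            current[cells[0]] = dict(zip(columns, values))
    return tables


# Hypothetical one-sector example with two regions.
sample = (
    "USA,a01,C,I,G,X,M\n"
    "g01,1,2,3,4,5,6\n"
    "L,7,0,0,0,0,0\n"
    "K,8,0,0,0,0,0\n"
    "TAX,9,0,0,0,0,0\n"
    "JPN,a01,C,I,G,X,M\n"
    "g01,10,20,30,40,50,60\n"
    "L,70,0,0,0,0,0\n"
    "K,80,0,0,0,0,0\n"
    "TAX,90,0,0,0,0,0\n"
)
tables = split_io_tables(sample)
print(list(tables))
print(tables["USA"]["g01"]["C"])
```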
The user-defined parameters file
The setparameters.csv file contains user-defined parameters. This is the subset of the parameters in the model that users are encouraged to consider adjusting to change the model’s behaviour. Note that other parameters in the model are calibrated using information from the model’s database and IO tables.
The first column in the file contains the name of the parameter.
There is then one additional column in the file for each region in the model. The column label for each region’s column is the region identifier used in the SYM model definition.
The region columns must be in the same order as the regions are defined in the regions set in the SYM model definition.
A parameter value is required in each column, for each parameter listed in the file.
Some parameters are defined for different sectors as well as for different regions. These parameters have names that include the sector identifier, e.g. sigma_df(a01) for sigma_df for sector a01. For those parameters, the sectors must be in the same order as they are defined in the sectors set in the SYM model definition.
Optionally, the last row of the file can contain the word end in the parameter name column, with no values in any of the other columns. This optional last row is ignored.
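The layout rules above can be sketched as a small Python reader. This is an illustrative sketch only: the header text in the first column, the parameter example_param, the region codes, and all values are assumptions made for the example. Only the sigma_df(a01) naming pattern and the optional end row come from the description above.

```python
import csv
import io
import re

# Matches names such as sigma_df(a01): parameter name, then sector in parentheses.
SECTOR_PARAM = re.compile(r"^(\w+)\((\w+)\)$")


def load_parameters(csv_text):
    """Return (regions, {(name, sector_or_None): {region: value}})."""
    reader = csv.reader(io.StringIO(csv_text))
    regions = next(reader)[1:]  # header: one column per region
    params = {}
    for row in reader:
        if not row:
            continue
        name = row[0].strip()
        if name == "end":
            break  # optional terminating row, ignored
        if len(row) - 1 != len(regions):
            raise ValueError(f"{name}: expected {len(regions)} values")
        match = SECTOR_PARAM.match(name)
        key = (match.group(1), match.group(2)) if match else (name, None)
        # float("") raises ValueError, so an empty cell is reported as an error.
        params[key] = {r: float(v) for r, v in zip(regions, row[1:])}
    return regions, params


# Hypothetical file contents.
sample = (
    "parameter,USA,JPN\n"
    "example_param,1.0,2.0\n"
    "sigma_df(a01),0.5,0.6\n"
    "sigma_df(a02),0.7,0.8\n"
    "end,,\n"
)
regions, params = load_parameters(sample)
print(params[("sigma_df", "a01")]["JPN"])
```

Keying on the (name, sector) pair keeps the sector ordering of the file visible in the dictionary's insertion order.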
The population growth rates file
The population growth rates file, modpop.csv, records annual population growth rate data for all regions, expressed as percentages, so a value of 1 is a 1% growth rate. It has a row for each region in the model.
The header row contains a year label in YYYY format for each year in the population projections, out to the last projection year as specified in the model configuration.
The row labels must be the region identifiers used in the SYM model definition and they must be in the same order as the regions are defined in the SYM model.
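As a worked illustration of the percentage convention, the Python sketch below compounds the annual rates into a population index for each region. It is a sketch under assumptions, not the model's own calculation: it assumes the rate in a year's column applies to growth into that year, starts each index at 1.0 before the first year, and uses made-up region codes and rates.

```python
import csv
import io


def population_index(csv_text):
    """Return {region: {year: index}} by compounding percentage growth rates."""
    reader = csv.reader(io.StringIO(csv_text))
    years = next(reader)[1:]  # header row: YYYY labels after the first cell
    index = {}
    for row in reader:
        if not row:
            continue
        level = 1.0  # assumed base level before the first year
        path = {}
        for year, rate in zip(years, row[1:]):
            level *= 1.0 + float(rate) / 100.0  # a value of 1 means 1% growth
            path[year] = level
        index[row[0]] = path
    return index


# Hypothetical rates for two regions.
sample = "region,2020,2021\nUSA,1,2\nJPN,0,-0.5\n"
index = population_index(sample)
print(index["USA"]["2021"])  # 1.01 * 1.02, approximately 1.0302
```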
The productivity growth rates file
The productivity growth rates file, prodmat.csv, specifies the information needed to generate projections of labor-augmenting productivity growth rates.
When doing baseline projections with the model, the population and labour-augmenting productivity growth rate projections are combined into exogenous effective labour productivity growth rate projections.
The first row of the productivity growth rates file contains column labels. The first column label is region. The second column label is sector. The remaining column labels are the years from a year at or before the first projection year through to the last projection year.
The file contains:
- For each sector of the US, productivity growth in each year of the projection. A value of 1 implies a 1% simple annual growth rate.
- For each non-US region, for each sector, specify the starting period fraction of US productivity for the same sector (a value of 1 implies the same productivity). This is only required in the initial period.
- For each non-US region, for each sector, in each year of the projection, specify the catch-up rate as that non-US region’s sector catches up to the productivity of the same sector in the US. A value of 0.02 implies that the gap in productivity from the previous year declines by 2% to determine the new gap to US productivity. Note that this is not the same as the productivity growth rate information.
Each of these three elements is contained in the file, one after the other. The start of each element is identified by a text heading in column 1 of the row immediately before its data.
The labels for the three sections of this file are:
- productivity growth
- sector ratio to the USA leader
- catchup rate
These labels are case sensitive. They must be in the first column, and they are relied upon when loading the productivity data.
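The role of the three labels can be illustrated with a short Python sketch that splits the file into its sections. This is an assumption-laden illustration rather than the model's loader: the function name is hypothetical, the sample contents are made up, and rows are kept as raw strings without further validation.

```python
import csv
import io

SECTION_LABELS = (
    "productivity growth",
    "sector ratio to the USA leader",
    "catchup rate",
)


def split_sections(csv_text):
    """Return (header, {label: rows}) using exact, case-sensitive label matches."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    sections = {label: [] for label in SECTION_LABELS}
    current = None
    for row in reader:
        if not row:
            continue
        if row[0] in SECTION_LABELS:  # label must be in the first column
            current = row[0]
        elif current is not None:
            sections[current].append(row)
    return header, sections


# Hypothetical one-sector file with one non-US region.
sample = (
    "region,sector,2020,2021\n"
    "productivity growth,,,\n"
    "USA,,,\n"
    ",a01,1.5,1.5\n"
    "sector ratio to the USA leader,,,\n"
    "JPN,,,\n"
    ",a01,0.5,\n"
    "catchup rate,,,\n"
    "JPN,,,\n"
    ",a01,0.02,0.02\n"
)
header, sections = split_sections(sample)
print(sections["catchup rate"])
```

Because the match is exact, a heading such as Catchup rate would not start a new section in this sketch, which mirrors the case-sensitivity requirement.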
Productivity growth data for USA sectors
The first row of this section of data just contains the G-Cubed region identifier for the United States in the first column.
There is then one row for each sector in the SYM model definition (the sectors set): a01 and a02, if there are 2 such sectors. They must be in the same order as the sector set membership declaration in the SYM model definition.
For each sector row, the first column is blank and the second column is the sector identifier, e.g. a01. The row then contains a percentage growth rate for the sector in the column corresponding to each year column in the file.
Sector ratios to the United States
This section of the file has one table per non-United States region.
For each region, the table begins with a row that contains the G-Cubed region identifier in the first column and nothing else in any of the other columns.
There is then one row for each of the members of the sectors set in the SYM model definition, a01 through to a02, if there are 2 such sectors. They must be in the same order as the sector membership declaration in the SYM model definition.
For each sector’s row, the first column is blank and the second column is the sector identifier, e.g. a01. The third column is the productivity ratio to the same sector in the United States. Thus, a value of 0.5 would mean productivity for that region is half the productivity of the same sector for the United States. No other columns have values.
Catchup rates to the United States
This section of the file has one table per non-United States region, in the same order as the regions are declared in the regions set of the SYM model definition.
For each region, the table begins with a row that contains the region identifier in the first column and nothing else in any of the other columns.
There is then one row for each of the members of the sectors set in the SYM model definition, a01 through to a02, if there are 2 such sectors. They must be in the same order as the sector membership declaration in the SYM model definition.
For each sector’s row, the first column is blank and the second column is the sector identifier, e.g. a01. The row then contains a decimal catchup rate for the sector in the column corresponding to each year column in the file. Thus a value of 0.02 in a particular year means that 2% of the remaining gap in productivity is closed in that year.
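The catchup arithmetic can be made concrete with a short Python sketch. It is a worked illustration only and rests on an assumption that this documentation does not confirm: the gap is taken to be one minus the region's productivity ratio to the United States. The model may define the gap differently, so treat the numbers as indicative.

```python
def ratio_path(initial_ratio, catchup_rates):
    """Return the productivity ratio to the US after each year's catchup."""
    gap = 1.0 - initial_ratio  # assumption: gap = 1 - ratio to the US
    path = []
    for rate in catchup_rates:
        gap *= 1.0 - rate  # the rate closes that share of the remaining gap
        path.append(1.0 - gap)
    return path


# Starting at half of US productivity with a 0.02 catchup rate for two years.
print(ratio_path(0.5, [0.02, 0.02]))  # approximately [0.51, 0.5198]
```

Under this assumption a constant rate never closes the gap completely, because each year removes a fixed share of whatever gap remains.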
The autonomous energy efficiency improvements file
The aeeinew.csv file records data on Autonomous Energy Efficiency Improvement (AEEI): exogenous improvements in the way energy contributes to production in each sector and to consumption.
See McKibbin and Wilcoxen (2013) A global approach to energy and environment: the G-Cubed model for details of AEEI.
The first row of the file contains column labels. The first column label is blank. The remaining column labels are the years from a year at or before the first projection year through to the last projection year.
The first column contains row labels. All row labels are prefixed by aeei. The prefix is followed by an integer indicating the sector and then by the region identifier.
The rows must order the sectors in the same way that they are ordered when declared in the SYM model definition. They must also order the regions in the same way that they are ordered when declared in the SYM model definition.
The values in the remaining cells for each region are the percentage exogenous improvement in energy efficiency for that sector in that region for that year.
The file also contains a set of rows for consumption energy efficiency. These are the last rows in the file. There is one such row for each region. The row identifiers for these rows also start with aeei, followed by lowercase c for consumption, and then the region identifier.
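The row label convention can be sketched as a small Python parser. The exact label syntax is an assumption here: the sketch supposes that the parts are concatenated directly with no separators (for example aeei1USA and aeeicUSA) and that region identifiers begin with a letter. The region codes and the function name are hypothetical.

```python
import re

# Assumed forms: aeei<integer><region> for sectors, aeeic<region> for consumption.
SECTOR_ROW = re.compile(r"^aeei(\d+)([A-Za-z]\w*)$")
CONSUMPTION_ROW = re.compile(r"^aeeic([A-Za-z]\w*)$")


def parse_aeei_label(label):
    """Return (kind, sector_number_or_None, region) for an AEEI row label."""
    match = SECTOR_ROW.match(label)
    if match:
        return ("sector", int(match.group(1)), match.group(2))
    match = CONSUMPTION_ROW.match(label)
    if match:
        return ("consumption", None, match.group(1))
    raise ValueError(f"unrecognised AEEI row label: {label!r}")


print(parse_aeei_label("aeei1USA"))
print(parse_aeei_label("aeeicJPN"))
```

The sector pattern is tried first, so a label is read as a consumption row only when no integer follows the aeei prefix.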