Model data files
Table of contents
- The database file
- The input/output tables file
- The user-defined parameters file
- The labor-force growth rates file
- The labor productivity growth rate projections file
- The technology advancement rate projections file
- The technology gaps file
- The catchup rate projections file
- The autonomous energy efficiency improvements file
The data folder contains the data files for the model. All of the data files are in the CSV format.
The main files are:
- database.csv: the database file.
- iotables.csv: the input/output tables for all regions.
- user_parameters.csv: user-defined parameters.
- labor_force_growth_rates.csv: labor-force growth rates out to the last projection year.
- technology_advancement_rates.csv: the rate of advancement of the technological frontier in each sector in each year out to the last projection year.
- technology_gaps.csv: the gap behind the technological frontier for each sector in each region.
- technology_catchup_rates.csv: the rate at which the gap from the technological frontier is closed for each sector in each region in each year.
- autonomous_energy_efficiency_improvements.csv: autonomous energy efficiency improvements.
Additional files are likely to be present in the data folder. These files are used to configure the baseline projections and to calibrate various parameters, depending on the specific G-Cubed model version. Commonly, these additional files include baseline_design.csv, labor_augmenting_technical_change.csv, and baseline_exogenous_projections.csv. Details of those files are provided in the baseline projections explanation.
The database file
The database file, database.csv, contains the model database. It has a row for each variable in the model and a value for each of the years covered by the database.
The first row of the data file contains the column headings. All other rows contain data, with one row for each variable in the model. The variables have a strict ordering that must match the ordering of the variables produced by the SYM processor from the SYM model definition. This ordering can be found in the model_<VERSION>_<BUILD>_varmap.csv file produced by the SYM processor.
The variable names are followed by the description, units of measurement, and G-Cubed region code for each variable.
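The ordering requirement can be checked before running the model. The following is a minimal sketch, not part of the G-Cubed tooling. It assumes that the variable name is in the first column of both database.csv and the varmap file and that both files have a header row; the layout of the varmap file is not described here, so treat that as an assumption. The helper names `first_column` and `first_mismatch` are hypothetical.

```python
import csv
import io


def first_column(csv_text):
    """Return the first cell of every row after the header row."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    return [row[0].strip() for row in rows[1:]]


def first_mismatch(database_names, varmap_names):
    """Return the first index where the two orderings differ, or None if they agree."""
    for index, (db_name, map_name) in enumerate(zip(database_names, varmap_names)):
        if db_name != map_name:
            return index
    if len(database_names) != len(varmap_names):
        # One list is a prefix of the other: the mismatch is the first extra row.
        return min(len(database_names), len(varmap_names))
    return None
```

Passing the first columns of the two files to `first_mismatch` gives the row position to inspect when the orderings disagree.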
The input/output tables file
The single iotables.csv file contains the Input/Output (IO) tables for all regions. Each region has an IO table. The IO tables are stacked vertically in the IO tables file.
The first column of each IO table, with the <REGION_CODE> in the first cell, must be in the first column of the CSV file.
Each Input/Output table has the following structure:
<REGION_CODE> | a01 | … | a0N | C | I | G | X | M |
---|---|---|---|---|---|---|---|---|
g01 | | | | | | | | |
⋮ | | | | | | | | |
g0N | | | | | | | | |
L | | | | | | | | |
K | | | | | | | | |
TAX | | | | | | | | |
The <REGION_CODE> cell holds the region code.
The first row and the first column of the table are labels for the columns and rows respectively.
Columns of the table describe ‘uses’ by a particular sector or for:
- Consumption: C
- Investment: I
- Government spending: G
- Exports: X
- Imports: M
The sector labels must be the sector identifiers in the set of sectors in the SYM model definition. Typically these are a01 for sector 1, a02 for sector 2, etc.
Rows of the table describe ‘inputs’ by type of sectoral good (service) produced or:
- Labour: L
- Capital: K
- Tax: TAX
The goods labels must be the good identifiers in the set of goods in the SYM model definition. Typically these are g01 for sector 1, g02 for sector 2, etc.
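The stacked layout can be split back into one table per region. The following is a rough sketch, assuming that each table begins with a header row whose first cell is a region code and that the caller supplies the region codes from the SYM model definition. The function name `split_io_tables` is hypothetical and is not part of the model code.

```python
import csv
import io


def split_io_tables(csv_text, region_codes):
    """Split vertically stacked IO tables into {region_code: list of rows}.

    Assumes each table starts with a header row whose first cell is a region code.
    """
    tables = {}
    current = None
    for row in csv.reader(io.StringIO(csv_text)):
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines between tables
        label = row[0].strip()
        if label in region_codes:
            # A region code in the first cell marks the start of a new table.
            current = label
            tables[current] = []
        if current is None:
            raise ValueError(f"row {label!r} appears before any region header")
        tables[current].append([cell.strip() for cell in row])
    return tables
```

Each value in the returned dictionary keeps the header row first, so the column labels (a01, …, C, I, G, X, M) stay attached to their table.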
The user-defined parameters file
The user_parameters.csv file contains user-defined parameters. This is the subset of the parameters in the model that users are encouraged to consider adjusting. Note that the other parameters in the model are calibrated using information from the model’s database and IO tables.
The first column in the file contains the name of the parameter.
There is then one additional column in the file for each region in the model. The column label for each region’s column is the region identifier used in the SYM model definition.
The region columns must be in the same order as the regions are defined in the regions set in the SYM model definition.
A parameter value is required in each column, for each parameter listed in the file.
Some parameters are defined for different sectors as well as for different regions. These parameters have names that include the sector identifier, e.g. sigma_df(a01) for sigma_df for sector a01. For those parameters, the sectors must be in the same order as they are defined in the sectors set in the SYM model definition.
Optionally, the last row of the file can contain the word end in the parameter name column, with no values in any of the other columns. This optional last row is ignored.
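The layout rules for this file can be expressed as a small reader. The sketch below is illustrative only: it assumes that parameter values are numeric, and the function name `read_user_parameters` is hypothetical. It checks the region column order, requires a value in every region column, and skips the optional end row.

```python
import csv
import io


def read_user_parameters(csv_text, regions):
    """Return {parameter_name: {region: value}} after checking the layout rules."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header = [cell.strip() for cell in rows[0]]
    if header[1:] != list(regions):
        raise ValueError(f"region columns {header[1:]} do not match {list(regions)}")
    params = {}
    for row in rows[1:]:
        name = row[0].strip()
        values = [cell.strip() for cell in row[1:]]
        if name == "end" and not any(values):
            break  # optional terminator row carries no data
        if len(values) != len(regions) or not all(values):
            raise ValueError(f"parameter {name!r} needs a value for every region")
        # Assumption: every parameter value parses as a number.
        params[name] = {region: float(v) for region, v in zip(regions, values)}
    return params
```

The `regions` argument is the ordered list of region identifiers from the SYM model definition, for example `["USA", "ROW"]` for the 2R model.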
The labor-force growth rates file
The labor-force growth rates file, labor_force_growth_rates.csv, records annual labor-force growth rates for all regions. It has a row for each region in the model. The growth rates are expressed as percentages, so a value of 1 is a 1% growth rate.
The CSV file format for the 2R model is shown below:
| | 2018 | … | 2150 |
|---|---|---|---|
| USA | 1 | … | 0 |
| ROW | 2 | … | 0 |
The first row contains the projection year column labels in columns 2 onward in YYYY format. The following rows contain the labor force growth rates for each region. Each row of data has the SYM region code in the first column and the percentage growth rate for each year in the column corresponding to that year.
In the example above, the first projection year is 2018 and the last projection year is 2150. The USA labor force grows at 1% in 2018 and 0% in 2150. The ROW region labor force grows at 2% in 2018 and 0% in 2150.
The row labels should be the region identifiers from the SYM model definition. They must be in the same order as the regions are defined in the SYM model definition of the regions set.
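Because the rates are stored as percentages, it is easy to misread a value of 1 as 100%. The following sketch reads the file, checks the region row order, and converts the values to fractions. The function name `read_growth_rates` is hypothetical, and the three-year sample in the usage note is illustrative rather than real model data.

```python
import csv
import io


def read_growth_rates(csv_text, regions):
    """Return {region: {year: rate}} with rates converted from percent to fractions."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    years = [int(cell) for cell in rows[0][1:]]  # YYYY labels from column 2 onward
    labels = [row[0].strip() for row in rows[1:]]
    if labels != list(regions):
        raise ValueError(f"region rows {labels} do not match {list(regions)}")
    return {
        row[0].strip(): {year: float(v) / 100.0 for year, v in zip(years, row[1:])}
        for row in rows[1:]
    }
```

For example, calling `read_growth_rates(",2018,2019,2020\nUSA,1,0.5,0\nROW,2,1,0\n", ["USA", "ROW"])` maps the USA value of 1 in 2018 to a rate of 0.01.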
The labor productivity growth rate projections file
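The labor productivity growth rate projections are provided in three separate CSV files, described in the following sections: the technology advancement rates file, the technology gaps file, and the catchup rates file.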
The technology advancement rate projections file
technology_advancement_rates.csv is a CSV file that contains the rate of advancement of the technological frontier in each sector in each year out to the last projection year. Values are expressed as percentages, so a value of 2.0 means that the technology will advance by 2% in the associated year. The data is stored with sectors for rows and projection years for columns. The row labels are the SYM sector codes. The column labels are the projection years in YYYY format out to the last projection year.
For example, for the 2R model:
sector | 2018 | … | 2150 |
---|---|---|---|
a01 | 1.4 | … | 1.4 |
a02 | 1.4 | … | 1.4 |
The technology gaps file
technology_gaps.csv is a CSV file that documents the gap behind the technological frontier for each sector in each region. The gaps are expressed as percentages, so a value of 50 means that the sector in that region is 50% as efficient as is possible in the first projection year. The data is stored with sector rows and region columns. The row labels are the SYM sector codes. The column labels are the SYM region codes.
For example, for the 2R model:
sector | USA | ROW |
---|---|---|
a01 | 90 | 100 |
a02 | 100 | 90 |
Note that the maximum value is 100 and the minimum value must be positive.
It is not mandatory, but it is typical that at least one region is on the technology frontier, with a sector value of 100.
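The value constraints can be checked when the file is read. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the frontier check reads "at least one region is on the technology frontier" as applying to each sector separately, which is one interpretation of the text.

```python
import csv
import io


def read_technology_gaps(csv_text):
    """Return {(sector, region): percent}, rejecting values outside (0, 100]."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    regions = [cell.strip() for cell in rows[0][1:]]
    gaps = {}
    for row in rows[1:]:
        sector = row[0].strip()
        for region, cell in zip(regions, row[1:]):
            value = float(cell)
            if not 0 < value <= 100:
                raise ValueError(f"gap for {sector}/{region} must be in (0, 100]: {value}")
            gaps[(sector, region)] = value
    return gaps


def sectors_without_frontier(gaps):
    """List sectors where no region sits at 100; allowed, but not typical."""
    best = {}
    for (sector, _region), value in gaps.items():
        best[sector] = max(best.get(sector, 0.0), value)
    return [sector for sector, value in best.items() if value < 100]
```

With the 2R example above, every sector has one region at 100, so `sectors_without_frontier` returns an empty list.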
The catchup rate projections file
technology_catchup_rates.csv is a CSV file that documents the rate at which the gap from the technological frontier is closed for each sector in each region in each year, through all projection years. Values are expressed as percentages, so a value of 2.0 means that the technology gap will close by 2% in the associated year. The data is stored with the region in the first column and the sector in the second column. The remaining columns are the projection years in YYYY format out to the last projection year.
For example, for the 2R model:
region | sector | 2018 | … | 2150 |
---|---|---|---|---|
USA | a01 | 2 | … | 2 |
USA | a02 | 2 | … | 2 |
ROW | a01 | 2 | … | 2 |
ROW | a02 | 2 | … | 2 |
Note that the maximum value is 100 and the minimum value must be greater than -100.
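This file has two label columns rather than one, so a reader has to key each row on the region and sector pair. The following is a minimal sketch with a hypothetical function name, `read_catchup_rates`, that also applies the range rule stated above.

```python
import csv
import io


def read_catchup_rates(csv_text):
    """Return {(region, sector): {year: percent}} after range-checking every value."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    years = [int(cell) for cell in rows[0][2:]]  # year labels start in column 3
    rates = {}
    for row in rows[1:]:
        key = (row[0].strip(), row[1].strip())
        values = [float(cell) for cell in row[2:]]
        bad = [v for v in values if not -100 < v <= 100]
        if bad:
            raise ValueError(f"{key}: catchup rates out of range: {bad}")
        rates[key] = dict(zip(years, values))
    return rates
```

The values are kept as percentages here; any conversion to fractions is left to the caller.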
The autonomous energy efficiency improvements file
The autonomous_energy_efficiency_improvements.csv file records data on Autonomous Energy Efficiency Improvement (AEEI): exogenous improvements in the way energy contributes to production in each sector and to consumption. It records these improvements for all regions and all sectors within each region, and for consumption in each region, through the projection years.
See McKibbin and Wilcoxen (2013), A global approach to energy and environment: the G-Cubed model, for details of AEEI.
An example layout for this CSV file is shown below for the 2 region/2 sector model:
| | 2017 | 2018 | 2019 | … | 2150 |
|---|---|---|---|---|---|
| AEEI(a01,USA) | 0 | 0 | 0 | … | 0 |
| AEEI(a02,USA) | 0 | 0 | 0 | … | 0 |
| AEEI(a01,ROW) | 0 | 0 | 0 | … | 0 |
| AEEI(a02,ROW) | 0 | 0 | 0 | … | 0 |
| AEEIC(USA) | 0 | 0 | 0 | … | 0 |
| AEEIC(ROW) | 0 | 0 | 0 | … | 0 |
The first row contains the ordered years as column labels.
Note that the first year can be before the first projection year.
The last year must be the last projection year.
The first column contains row labels. All row labels are made up of the following components, in order:
- The prefix AEEI for production by sectors and AEEIC for consumption.
- In brackets, the set combination that is affected by the autonomous energy efficiency improvements. For sectors, the set combination is a sector code followed by a region code. For consumption, the set combination is just a region code.
For example, the row labelled AEEI(a01,USA) is the AEEI projections for sector 1 for the United States.
The consumption rows must be the last rows in the file.
The sector rows must be in the SYM-defined sector order.
The rows must also be in the SYM-defined region order as you work down the file.
The data values are the percentage exogenous improvement in energy efficiency for that sector, or for consumption, in the given region for a given year. Thus, a value of 1 is a 1% improvement in energy efficiency in that year.
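The row labelling and ordering rules can be generated rather than typed by hand. The sketch below assumes that the sector rows are grouped by region, with sectors in SYM order inside each region. That nesting is an interpretation of the ordering rules above, not something the rules state outright. The function names are hypothetical. Because sector labels such as AEEI(a01,USA) contain a comma, Python's standard `csv` writer quotes them. Whether the model's reader expects quoted labels is not stated here.

```python
import csv
import io


def expected_aeei_labels(sectors, regions):
    """Build row labels: sector rows grouped by region, then consumption rows last."""
    labels = [f"AEEI({sector},{region})" for region in regions for sector in sectors]
    labels += [f"AEEIC({region})" for region in regions]
    return labels


def to_csv_line(label, values):
    """Format one data row; the csv module quotes labels that contain a comma."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow([label, *values])
    return buffer.getvalue()
```

Calling `expected_aeei_labels(["a01", "a02"], ["USA", "ROW"])` gives the four sector labels followed by the two consumption labels for the 2 region/2 sector model.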