Model definitions
Each model defines its structure using the SYM language. The SYM language was developed by Peter Wilcoxen and its usage is explained on the SYM website. The SYM model definition is a set of SYM files, each of which has a sym file extension. For each model, the SYM files define:
- the sets of regions, goods and sectors;
- the parameters;
- the variables; and
- the equations.
These SYM files need to be processed and analysed to generate the Python module that expresses all of the equations in the model and various CSV and other text file resources required to solve the model and perform simulation experiments.
Every time the SYM files are modified, the SYM processor needs to be rerun. This is important because the SYM model definition has been made modular, supporting different ways of representing variables and supporting different policy representations.
Organisation of the SYM files for a model
The SYM model definition is typically modularised, spread across a number of individual SYM files.
The SYM files for a typical model definition are shown below.
├── sym
│ │ ├── ggg-<VERSION>-<BUILD>.sym
│ │ ├── ggg-sets.sym
│ │ ├── linear
│ │ │ ├── ggg-configuration.sym
│ │ │ ├── ggg-main.sym
│ │ │ ├── *.sym
│ │ ├── log
│ │ │ ├── ggg-configuration.sym
│ │ │ ├── ggg-main.sym
│ │ │ ├── *.sym
│ │ ├── model_<VERSION>_<NUMBER>_eqnmap.csv
│ │ ├── model_<VERSION>_<NUMBER>_optmap.csv
│ │ ├── model_<VERSION>_<NUMBER>_varmap.csv
│ │ ├── model_<VERSION>_<NUMBER>_varinfo.csv
│ │ ├── model_<VERSION>_<NUMBER>_vars.csv
│ │ ├── model_<VERSION>_<NUMBER>.lis
│ │ ├── model_<VERSION>_<NUMBER>.py
Aside from the SYM files themselves, the sym folder contains a number of other files with csv, lis and py extensions. These are generated from the SYM files when the SYM processor is run.
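As a rough illustration of the naming pattern in the listing above, the following Python sketch builds the generated file names for a given model version and number. The function name generated_files is hypothetical, and the sketch assumes that <NUMBER> in the listing is the build number, as the 2R build 178 command example on this page suggests.

```python
def generated_files(version: str, number: int) -> list[str]:
    """Return the file names the SYM processor is expected to write.

    The suffixes are copied from the folder listing; nothing here runs
    the SYM processor itself.
    """
    stem = f"model_{version}_{number}"
    csv_suffixes = ["eqnmap", "optmap", "varmap", "varinfo", "vars"]
    names = [f"{stem}_{suffix}.csv" for suffix in csv_suffixes]
    names += [f"{stem}.lis", f"{stem}.py"]
    return names


if __name__ == "__main__":
    for name in generated_files("2R", 178):
        print(name)
```

A list like this can be compared against the contents of the sym folder to confirm that a processing run produced everything it should.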
The root SYM file
ggg-<VERSION>-<BUILD>.sym is the root SYM file. The SYM processor is always run against the root SYM file. By following the SYM include statements, the SYM processor uses the root SYM file to discover all of the other SYM files that together fully define a model. By commenting out include statements, the model definition can be altered, for example, to determine whether the linear or log version of a model is being used.
The set definitions file
The ggg-<VERSION>-<BUILD>-sets.sym file is where the main sets are defined. These capture key details for a model version, including the regions, sectors and goods in the model.
The linear and log forms of the model
The sym folder contains two subfolders, linear and log. Each of these contains its own configuration SYM file and its own main SYM file, along with a number of SYM file modules.
The configuration SYM file imports the associated main SYM file. The main SYM file contains most of the variable and parameter declarations and the model equations.
The configuration SYM file also determines which modules are being used in the model. Modules are components of the model that can be expressed in different ways. For example, some models have different monetary policy modules. All models have different fiscal closure modules. Most models have modules that allow for different ways of calculating tax revenue. Make sure that the configuration SYM file only imports one file for each module that is being used in the model.
Processing SYM files
Before running a model, the SYM files for that model need to be processed by the SYM processor. The SYM processor is software that parses and extracts information from the SYM model definition. It is compiled from source and installed on your virtual machine when you first create it.
The SYM processor is invoked from the command line. It produces lists of variables and parameters as well as a library of Python functions, one for each equation in the model. The output from running the SYM processor is:
- a listing of the model sets, parameters, variables and equations in the .lis file;
- an implementation of each of the model equations in Python in the .py file; and
- various CSV files that document model details in a way that is easily loaded when running the G-Cubed model.
To run the SYM processor on a model, make sure your current folder is the one that contains the SYM file that you will be processing. This is typically the sym folder of the model you are working with. Then you can run the SYM processor against the root SYM file. For example, for model 2R build 178, that SYM file is ggg-2R-178.sym. To process the model definition for the 2R 178 model build, run the following command at the terminal prompt in VS Code:
sym -python ggg-2R-178.sym model_2R_178.py
The generic formulation of this command, applicable to any model build, is:
sym -python ggg-<VERSION>-<BUILD>.sym model_<VERSION>_<BUILD>.py
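If the processing step is scripted rather than typed by hand, the generic command above could be assembled along the following lines. This is a minimal sketch: the function names are hypothetical, and it assumes that the sym executable is on the PATH and that it reports failure through a non-zero exit status.

```python
import subprocess


def sym_python_command(version: str, build: int) -> list[str]:
    """Build the argument list for processing a model's root SYM file."""
    root_file = f"ggg-{version}-{build}.sym"
    output_file = f"model_{version}_{build}.py"
    return ["sym", "-python", root_file, output_file]


def run_sym(version: str, build: int, sym_folder: str) -> None:
    """Run the SYM processor with sym_folder as the working folder.

    Assumes the sym executable is on the PATH; check=True raises
    CalledProcessError if the processor exits with a non-zero status.
    """
    subprocess.run(sym_python_command(version, build), cwd=sym_folder, check=True)
```

Passing the sym folder as the working folder mirrors the requirement that the command is run from the folder containing the root SYM file.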
To change a model’s equations, you need to edit the SYM model definition and then run the SYM processor. Changes to a model definition may also require alterations to the model database and the data used to calibrate parameters, depending on the changes that have been made.
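Because the generated Python module goes out of date whenever a SYM file is edited, a simple timestamp check can act as a reminder to rerun the SYM processor. The sketch below is one possible approach and is not part of G-Cubed. The function name is hypothetical, and the check is deliberately conservative: it looks at every SYM file under the sym folder, including files in whichever of the linear and log subfolders is not currently included.

```python
from pathlib import Path


def needs_reprocessing(sym_folder: str, generated_py: str) -> bool:
    """Return True if any .sym file is newer than the generated module."""
    folder = Path(sym_folder)
    target = folder / generated_py
    if not target.exists():
        # Nothing has been generated yet, so the processor must be run.
        return True
    target_mtime = target.stat().st_mtime
    # rglob also visits the linear and log subfolders.
    return any(p.stat().st_mtime > target_mtime for p in folder.rglob("*.sym"))
```

For the 2R build 178 example, this would be called with the path to the sym folder and the file name model_2R_178.py.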
HTML model documentation
The SYM processor can also be used to generate HTML documentation of the model. The HTML documentation sets out:
- definitions of various sets
- definitions of the model variables, including their types and units
- the model parameters
- the model equations
The various links between the model elements facilitate rapid exploration of the G-Cubed model. This documentation is key to understanding the model and to designing your own simulations. But before using the model documentation, it is helpful to understand a bit more about the SYM model definition language and what to expect in a model definition. That understanding can be developed by reviewing the SYM language.
To create the HTML documentation for a model, run the following command, again from the model’s SYM folder:
sym -html ggg-<VERSION>-<BUILD>.sym model_<VERSION>_<BUILD>.html
The resulting HTML document can be found in the sym folder.
SYM - the model definition language
Sets
A given variable can be required in a model for various combinations of sectors, regions and so on.
For example, in the teaching model, region USA, the United States, and region ROW, the rest of the world, both track their nominal GDP. Thus, the nominal GDP variable must be defined over the regions set.
Nominal GDP for the USA is named GDPN(USA) and nominal GDP for all other countries combined is named GDPN(ROW).
A set is a collection of distinct objects or elements, which are used to define variables, parameters and equations.
In larger models, for example, with many sectors and many regions, these sets can be subsetted in various ways as part of the model definition. This subsetting is evident in the documentation of the model definition.
Region and sector codes
Two types of sets are central to G-Cubed models, a set for regions and a set for sectors.
The regions set
In the teaching model the regions set contains 2 regions:
- USA, United States
- ROW, Not United States
The sectors set
In the teaching model the sectors set contains 2 sectors:
- a01, the sector that produces energy
- a02, the sector that produces all other material goods and services (but not capital for production or households)
All G-Cubed models have two additional sectors built into them:
- Y, the sector that produces capital for firms
- Z, the sector that produces capital for households
Care needs to be taken because these two capital-producing sectors are not explicitly included in the sectors set of G-Cubed models.
Variables
G-Cubed models include many different variables. The name of each variable has two components:
- A prefix, starting with a letter and then followed by zero or more letters and digits.
- An optional suffix, contained within round brackets, (), that specifies which members of the associated sets the variable relates to. For example, GDPN(USA) is nominal GDP, identified by the prefix, for the United States, identified by the suffix that contains the regions set member for the United States.
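The two-part naming rule can be expressed as a regular expression. The following Python sketch is illustrative only and is not taken from the G-Cubed code base. The names NAME_PATTERN and parse_variable_name are hypothetical, and the sketch assumes that multiple set members are separated by commas, as in CAP(a01,USA).

```python
import re

# Prefix: a letter followed by zero or more letters and digits.
# Suffix: optional, a comma-separated list inside round brackets.
NAME_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9]*)(?:\(([^()]+)\))?$")


def parse_variable_name(name: str) -> tuple[str, tuple[str, ...]]:
    """Split a variable name into its prefix and its set members."""
    match = NAME_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a valid variable name: {name!r}")
    prefix, suffix = match.groups()
    if suffix is None:
        # No brackets: the variable is not defined over any set.
        return prefix, ()
    members = tuple(part.strip() for part in suffix.split(","))
    return prefix, members
```

With this sketch, GDPN(USA) splits into the prefix GDPN and the single set member USA, while a name that starts with a digit is rejected.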
Variable Types
The variables have different types:
- end - normal endogenous variables
- ets - expected next period values of endogenous variables
- cos - costate (or jumping) variables, which depend on expectations of the future
- sta - state variables
- stl - state variables lagged by one period
- exo - exogenous variables
Only exogenous variables do not have their own equations in the SYM model definition. Their values, in all years, are determined outside of the model.
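The type codes and the rule about exogenous variables can be captured in a small lookup table. This is a sketch for illustration only. The names VARIABLE_TYPES and has_equation are hypothetical, and the descriptions are copied from the list of variable types above.

```python
VARIABLE_TYPES = {
    "end": "normal endogenous variable",
    "ets": "expected next period value of an endogenous variable",
    "cos": "costate (jumping) variable",
    "sta": "state variable",
    "stl": "state variable lagged by one period",
    "exo": "exogenous variable",
}


def has_equation(type_code: str) -> bool:
    """Only exogenous variables lack their own equation in the SYM files."""
    if type_code not in VARIABLE_TYPES:
        raise KeyError(f"unknown variable type: {type_code!r}")
    return type_code != "exo"
```

A check like this is useful when reviewing a model definition: every variable whose type is not exo should have a matching equation.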
Variable Names
Variable names consist of the main name followed by the relevant qualifiers, drawn from predefined sets and enclosed within parentheses.
Examples:
Government debt in the USA is BOND(USA), where BOND is the main variable identifier and the qualifier contained in () is USA, which is the country code for the USA.
The capital stock in sector 1 in the USA is defined as CAP(a01,USA), where CAP is the main variable identifier, the first qualifier in () is the sector code a01 and the second qualifier is the region code USA.
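To see how one declaration over sets expands into many named variables, the sketch below enumerates the qualified names for a prefix and its sets. It is illustrative only. The function name expand_names is hypothetical, and it assumes that a variable is defined for every combination of set members, which may not hold for every variable in a real model.

```python
from itertools import product

# Set members from the teaching model described above.
REGIONS = ["USA", "ROW"]
SECTORS = ["a01", "a02"]


def expand_names(prefix: str, *sets: list[str]) -> list[str]:
    """List every qualified name for a variable defined over the given sets."""
    if not sets:
        return [prefix]
    return [f"{prefix}({','.join(combo)})" for combo in product(*sets)]


if __name__ == "__main__":
    print(expand_names("GDPN", REGIONS))
    print(expand_names("CAP", SECTORS, REGIONS))
```

The order of the sets passed in determines the order of the qualifiers, so sectors come before regions for CAP.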
Variable annotations
G-Cubed uses extension attributes on variable declarations to provide additional details about how those variables are used in the model. These are:
- an attribute whose value indicates the variable type - the value can be one of:
  - exo for exogenous variables
  - end for endogenous variables
  - sta for state variables
  - stl for lead state variables
  - ets for expected-next-period endogenous variables
  - cos for costate variables
- an attribute whose value indicates the units that the variable is measured in - the units can be one of:
  - gdp for variables measured as a percent of GDP
  - usgdp for variables measured as a percent of US GDP
  - pct for variables measured as a percentage
  - del for variables measured in percentage points (e.g. a risk premium)
  - dollar for variables measured in local currency units (LCU) - e.g. a dollar for the USA
  - idx for variables that are indices with a specific base year (e.g. prices and exchange rates);
- an intertemporal_constant attribute that is used for variables where an intertemporal constant is included in the linear approximation to the equation for that variable when solving the model;
- an exclude_dest_equals_orig attribute that is used for variables that are defined over a destination region and a source region but where the source region cannot match the destination region (e.g. imports); and
- a logged attribute that is used for variables that enter the model in natural logs. Where such variables are on the left-hand side of model equations, the right-hand side of the equation needs to be in logs too. This is done for all prices and for all exchange rates. In the linear version of the model, it is not done for quantity variables. In the log version of the model it is also done for quantity variables.
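The effect of the exclude_dest_equals_orig attribute can be pictured as a filter over pairs of regions. The sketch below is an illustration, not the SYM processor's implementation. The function name region_pairs is hypothetical, and the ordering of each pair as (destination, origin) is an assumption.

```python
from itertools import product


def region_pairs(regions: list[str], exclude_dest_equals_orig: bool) -> list[tuple[str, str]]:
    """Return (destination, origin) pairs, optionally dropping matching pairs."""
    pairs = list(product(regions, regions))
    if exclude_dest_equals_orig:
        # Drop pairs where a region would trade with itself.
        pairs = [(dest, orig) for dest, orig in pairs if dest != orig]
    return pairs
```

For the two regions of the teaching model, the attribute reduces the four possible pairs to the two cross-region pairs.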
Variable units
- idx - index (100 in base year in the database)
- rate - percentage (e.g. tax rate)
- del - percentage points (e.g. shock to target inflation rate)
- gdp - normalized by local GDP (numerator and denominator both expressed in USD)
- usgdp - normalized by US GDP (numerator and denominator both expressed in USD)
- dollar - US dollars (e.g. dollar tax on carbon emissions)
- btu - quadrillion btu
- mmt - million metric tons
- gwh - gigawatt hours
- btugdp - btu normalized by local GDP valued in USD
- mmtgdp - mmt normalized by local GDP valued in USD
- gwhgdp - gigawatt hours normalized by local GDP valued in USD
- btuusgdp - btu normalized by US GDP valued in USD
- mmtusgdp - mmt normalized by US GDP valued in USD
- nomusdbillion - nominal USD billion
- realusdbillion - real USD billion
Parameters
Unlike variables, parameters remain constant throughout the projections generated by a model.
Like variables, parameters can also be defined over sets. The syntax for parameter names is the same as that for variables.
When users adjust parameters, the entire dynamics of the model will usually change. Some parameters can be safely altered by users when designing experiments. Those user-defined parameters can be altered in the user parameters file. Other parameters in the model must be calibrated using information in the model’s database files. The calibration of these parameters is done by the parameter calibration class in the model’s python folder. Care must be taken to ensure that those parameters are consistent with the model’s database and IO tables.
Equations
Like variables, equations are also defined over one or more sets.
The sets associated with an equation are explicit in the model documentation.
Model equations are divided into 4 groups:
- State variable equations
- Costate variable equations
- Equations describing the formation of expected next-period values for endogenous variables
- Endogenous variable equations
These groups of equations and the way that they are expressed and manipulated to solve the model are explained in the G-Cubed model solution documentation.