Model definitions

Each model defines its structure using the SYM language, which was developed by Peter Wilcoxen. Its usage is explained on the SYM website. The SYM model definition is a set of SYM files, each of which has a sym file extension. For each model, the SYM files define:

the sets of regions, goods and sectors;
the parameters;
the variables; and
the equations

These SYM files need to be processed and analysed to generate the Python module that expresses all of the equations in the model, along with the CSV and other text file resources required to solve the model and perform simulation experiments.

Every time the SYM files are modified, the SYM processor needs to be rerun. This matters in particular because the SYM model definition is modular, supporting different ways of representing variables and different policy representations, so a change to which files are included changes the generated outputs.

Organisation of the SYM files for a model

The SYM model definition is typically modularised, spread across a number of individual SYM files.

The SYM files for a typical model definition are shown below.

sym
├── ggg-<VERSION>-<BUILD>.sym
├── ggg-sets.sym
├── linear
│   ├── ggg-configuration.sym
│   ├── ggg-main.sym
│   └── *.sym
├── log
│   ├── ggg-configuration.sym
│   ├── ggg-main.sym
│   └── *.sym
├── model_<VERSION>_<NUMBER>_eqnmap.csv
├── model_<VERSION>_<NUMBER>_optmap.csv
├── model_<VERSION>_<NUMBER>_varmap.csv
├── model_<VERSION>_<NUMBER>_varinfo.csv
├── model_<VERSION>_<NUMBER>_vars.csv
├── model_<VERSION>_<NUMBER>.lis
└── model_<VERSION>_<NUMBER>.py

Aside from the SYM files themselves, the sym folder contains a number of other files with csv, lis and py extensions. These are generated from the SYM files when the SYM processor is run.
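Because the generated files must be refreshed whenever a SYM file changes, a simple timestamp comparison can flag outputs that are out of date. The following is a minimal sketch, not part of G-Cubed itself: the function name outputs_are_stale is hypothetical, and it assumes that file modification times are a reliable signal of which files have been edited.

```python
import os
import tempfile
from pathlib import Path


def outputs_are_stale(sym_dir, generated_module):
    """Return True if any .sym file under sym_dir is newer than the generated module.

    A missing generated module also counts as stale.
    """
    sym_dir = Path(sym_dir)
    module_path = sym_dir / generated_module
    if not module_path.exists():
        return True
    module_mtime = module_path.stat().st_mtime
    # rglob also picks up the SYM files in the linear and log subfolders.
    return any(p.stat().st_mtime > module_mtime for p in sym_dir.rglob("*.sym"))


def demo():
    """Build a throwaway sym folder and show the check before and after processing."""
    with tempfile.TemporaryDirectory() as tmp:
        sym_dir = Path(tmp)
        root_file = sym_dir / "ggg-2R-181.sym"
        root_file.write_text("")
        print("stale before processing:", outputs_are_stale(sym_dir, "model_2R_181.py"))
        module = sym_dir / "model_2R_181.py"
        module.write_text("")
        # Force the generated module to look newer than the SYM file.
        os.utime(root_file, (1000, 1000))
        os.utime(module, (2000, 2000))
        print("stale after processing:", outputs_are_stale(sym_dir, "model_2R_181.py"))


if __name__ == "__main__":
    demo()
```

If the check returns True, rerun the SYM processor before solving the model.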

The root SYM file

ggg-<VERSION>-<BUILD>.sym is the root SYM file. The SYM processor is always run against the root SYM file. By following the SYM include statements, the SYM processor uses the root SYM file to discover all of the other SYM files that together fully define a model. By commenting out include statements, the model definition can be altered, for example, to determine whether the linear or log version of a model is being used.

The set definitions file

The ggg-<VERSION>-<BUILD>-sets.sym file is where the main sets are defined. These sets capture key details of a model version, including the regions, the sectors and the goods in the model.

The linear and log forms of the model

The sym folder contains two subfolders, linear and log. Each of these contains its own configuration SYM file and its own main SYM file, along with a number of SYM file modules.

The configuration SYM file imports the associated main SYM file. The main SYM file contains most of the variable and parameter declarations and the model equations.

The configuration SYM file also determines which modules are being used in the model. Modules are components of the model that can be expressed in different ways. For example, some models have different monetary policy modules. All models have different fiscal closure modules. Most models have modules that allow for different ways of calculating tax revenue. Make sure that the configuration SYM file imports only one file for each module that is being used in the model.
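The one-file-per-module rule can be checked mechanically once the imports in a configuration file have been listed by module. The sketch below assumes that listing has already been prepared by hand. The module names, file names and the function duplicated_modules are all hypothetical and are not taken from any G-Cubed model.

```python
def duplicated_modules(imports_by_module):
    """Return, in sorted order, the modules that have more than one imported file."""
    return sorted(
        module for module, files in imports_by_module.items() if len(files) > 1
    )


# Hypothetical module and file names, for illustration only.
example_imports = {
    "monetary_policy": ["monetary-rule-a.sym"],
    "fiscal_closure": ["closure-a.sym", "closure-b.sym"],  # two alternatives imported
}

print(duplicated_modules(example_imports))  # ['fiscal_closure']
```

Any module reported by the check has competing alternatives imported at the same time, so all but one of them should be commented out.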

Processing SYM files

Before running a model, the SYM files for that model need to be processed by the SYM processor. The SYM processor is software that parses and extracts information from the SYM model definition. It is compiled from source and installed on your virtual machine when you first create it.

The SYM processor is invoked from the command line. It produces lists of variables and parameters as well as a library of Python functions, one for each equation in the model. The output from running the SYM processor is:

a listing of the model sets, parameters, variables and equations in the .lis file;
an implementation of each of the model equations in Python in the .py file; and
various CSV files that document model details in a way that is easily loaded when running the G-Cubed model.

To run the SYM processor on a model, make sure your current folder is the folder that contains the SYM file that you will be processing. This is typically the sym folder of the model you are working with. Then you can run the SYM processor against the root SYM file. For example, for model 2R build 181, the root SYM file is ggg-2R-181.sym. To process the model definition for the 2R 181 model build, run the following command at the terminal prompt in VS Code:

sym -python ggg-2R-181.sym model_2R_181.py

The generic formulation of this command, applicable to any model build, is:

sym -python ggg-<VERSION>-<BUILD>.sym model_<VERSION>_<BUILD>.py
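The same command can be assembled and run from Python, which may be convenient when several model builds need to be reprocessed. This is a minimal sketch rather than part of G-Cubed: the helper names build_sym_command and run_sym_processor are hypothetical, and it assumes the sym executable is on the PATH.

```python
import subprocess
from pathlib import Path


def build_sym_command(version, build):
    """Return the argument list for processing one model build."""
    root_file = f"ggg-{version}-{build}.sym"
    output_module = f"model_{version}_{build}.py"
    return ["sym", "-python", root_file, output_module]


def run_sym_processor(sym_dir, version, build):
    """Run the SYM processor from inside the model's sym folder."""
    # cwd mirrors the requirement that the command is run from the folder
    # that contains the root SYM file. check=True raises if sym fails.
    subprocess.run(build_sym_command(version, build), cwd=Path(sym_dir), check=True)


print(" ".join(build_sym_command("2R", "181")))
# prints: sym -python ggg-2R-181.sym model_2R_181.py
```

Calling run_sym_processor with the path to the sym folder, "2R" and "181" would then be equivalent to typing the command above in that folder.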

To change a model’s equations, you need to edit the SYM model definition and then run the SYM processor. Changes to a model definition may also require alterations to the model database and the data used to calibrate parameters, depending on the changes that have been made.

HTML model documentation

The SYM processor can also be used to generate HTML documentation of the model. The HTML documentation sets out:

definitions of various sets
definitions of the model variables, including their types and units
the model parameters
the model equations

The various links between the model elements facilitate rapid exploration of the G-Cubed model. This documentation is key to understanding the model and to designing your own simulations. But before using the model documentation, it is helpful to understand a bit more about the SYM model definition language and what to expect in a model definition. That understanding can be developed by reviewing the SYM language.

To create the HTML documentation for a model, run the following command, again from the model’s SYM folder:

sym -html ggg-<VERSION>-<BUILD>.sym model_<VERSION>_<BUILD>.html

The resulting HTML document can be found in the sym folder.

SYM - the model definition language

Sets

A given variable may be needed in a model for various combinations of set members, such as sectors and regions.

For example, in the teaching model, both regions track their nominal GDP: USA, the United States, and ROW, the rest of the world. Thus, the nominal GDP variable must be defined over the regions set.

Nominal GDP for the USA will be named GDPN(USA) and nominal GDP for all other countries combined will be named GDPN(ROW).

A set is a collection of distinct objects or elements, which are used to define variables, parameters and equations.

In larger models, for example those with many sectors and many regions, these sets can be divided into subsets in various ways as part of the model definition. These subsets are evident in the documentation of the model definition.

Region and sector codes

Two types of sets are central to G-Cubed models, a set for regions and a set for sectors.

The regions set

In the teaching model the regions set contains 2 regions:

USA, United States
ROW, Rest of the world (all regions other than the United States)

The sectors set

In the teaching model the sectors set contains 2 sectors:

a01 The sector that produces energy
a02 The sector that produces all other material goods and services (but not capital for production or households)

All G-Cubed models have two additional sectors built into them:

Y The sector that produces capital for firms
Z The sector that produces capital for households

Care needs to be taken because these two capital-producing sectors are not explicitly included in the sectors set of G-Cubed models.

Variables

G-Cubed models include many different variables. The name of each variable has two components:

A prefix, starting with a letter and then followed by zero or more letters and digits.
An optional suffix, contained within round brackets, (), that specifies which members of the associated sets the variable relates to. For example, GDPN(USA) is nominal GDP, identified by the prefix, for the United States, identified by the suffix that contains the regions set member for the United States.
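The two-part naming rule can be expressed as a regular expression. The sketch below is illustrative only: parse_variable_name is a hypothetical helper, and it assumes that multiple set members inside the brackets are separated by commas, as in the CAP(a01,USA) example later in this document.

```python
import re

# Prefix: a letter followed by zero or more letters and digits.
# Suffix: optional, non-empty, enclosed in round brackets.
_VARIABLE_NAME = re.compile(r"^([A-Za-z][A-Za-z0-9]*)(?:\(([^()]+)\))?$")


def parse_variable_name(name):
    """Split a variable name into its prefix and its list of set members."""
    match = _VARIABLE_NAME.match(name.strip())
    if match is None:
        raise ValueError(f"not a valid variable name: {name!r}")
    prefix, suffix = match.groups()
    members = [m.strip() for m in suffix.split(",")] if suffix is not None else []
    return prefix, members


print(parse_variable_name("GDPN(USA)"))  # ('GDPN', ['USA'])
```

A name without a suffix yields an empty list of set members, and a name that does not start with a letter raises a ValueError.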

Variable Types

The variables have different types.

end - normal endogenous variables
ets - expected next period values of endogenous variables
cos - costate (or jumping) variables, which depend on expectations of the future
sta - state variables
stl - state variables lagged by one period
exo - exogenous variables.

Only exogenous variables do not have their own equations in the SYM model definition. Their values, in all years, are determined outside of the model.
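The rule that only exogenous variables lack their own equations can be written as a one-line check on the type code. This is a minimal sketch using the six type codes listed above. The names VARIABLE_TYPE_CODES and has_own_equation are hypothetical and do not come from the G-Cubed code base.

```python
VARIABLE_TYPE_CODES = ("end", "ets", "cos", "sta", "stl", "exo")


def has_own_equation(type_code):
    """Return True unless the variable type is exogenous."""
    if type_code not in VARIABLE_TYPE_CODES:
        raise ValueError(f"unknown variable type: {type_code!r}")
    return type_code != "exo"


# List the type codes whose variables are determined outside of the model.
print([code for code in VARIABLE_TYPE_CODES if not has_own_equation(code)])  # ['exo']
```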

Variable Names

Variable names consist of the main name followed by the relevant qualifiers, which are drawn from predefined sets and enclosed within parentheses.

Examples:

Government debt in the USA is BOND(USA) where BOND is the main variable identifier and the qualifier contained in () is USA which is the country code for the USA.

The capital stock in sector 1 in the USA is defined as CAP(a01,USA), where CAP is the main variable identifier, the first qualifier in () is the sector code and the second qualifier is the region code USA.
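The full list of names for a variable follows from taking every combination of members of the sets it is defined over. The sketch below uses the teaching model's regions and sectors. The helper expand_over_sets is hypothetical, and treating CAP as defined over every sector and region combination is an assumption made only for illustration.

```python
from itertools import product


def expand_over_sets(prefix, *sets):
    """Return one variable name per combination of set members."""
    if not sets:
        return [prefix]
    return [f"{prefix}({','.join(members)})" for members in product(*sets)]


regions = ["USA", "ROW"]
sectors = ["a01", "a02"]

print(expand_over_sets("BOND", regions))
# ['BOND(USA)', 'BOND(ROW)']
print(expand_over_sets("CAP", sectors, regions))
# ['CAP(a01,USA)', 'CAP(a01,ROW)', 'CAP(a02,USA)', 'CAP(a02,ROW)']
```

The order of the sets passed to the function determines the order of the qualifiers, so sectors come before regions for CAP.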

Variable annotations

G-Cubed uses extension attributes on variable declarations to provide additional details about how those variables are used in the model. These are:

an attribute whose value indicates the variable type - the value can be one of:
- exo for exogenous variables
- end for endogenous variables
- sta for state variables
- stl for lead state variables
- ets for expected-next-period endogenous variables
- cos for costate variables
an attribute whose value indicates the units that the variable is measured in - the units can be one of:
- gdp for variables measured as a percent of GDP
- usgdp for variables measured as a percent of US GDP
- pct for variables measured as a percentage
- del for variables measured in percentage points (e.g. a risk premium)
- dollar for variables measured in local currency units (LCU) - e.g. a dollar for the USA
- idx for variables that are indices with a specific base year (e.g. prices and exchange rates);
an intertemporal_constant attribute that is used for variables where an intertemporal constant is included in the linear approximation to the equation for that variable when solving the model;
an exclude_dest_equals_orig attribute that is used for variables that are defined over a destination region and a source region but where the source region cannot match the destination region (e.g. imports); and
a logged attribute that is used for variables that enter the model in natural logs. Where such variables are on the left-hand side of model equations, the right-hand side of the equation needs to be in logs too. This is done for all prices and for all exchange rates. In the linear version of the model, it is not done for quantity variables. In the log version of the model, it is also done for quantity variables.
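The effect of the exclude_dest_equals_orig attribute can be illustrated by enumerating destination and source region pairs and dropping those where the two regions match. This is a sketch of the idea only, using the teaching model's two regions. The function destination_origin_pairs is hypothetical and says nothing about how the SYM processor implements the attribute.

```python
from itertools import product


def destination_origin_pairs(regions, exclude_dest_equals_orig=True):
    """Return (destination, origin) pairs, optionally dropping matching regions."""
    return [
        (dest, orig)
        for dest, orig in product(regions, repeat=2)
        if not (exclude_dest_equals_orig and dest == orig)
    ]


print(destination_origin_pairs(["USA", "ROW"]))
# [('USA', 'ROW'), ('ROW', 'USA')]
```

With the attribute switched off, the same call would also return the pairs where a region is paired with itself, giving four pairs rather than two.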

Variable units

idx - index (100 in base year in the database)
rate - percentage (e.g. tax rate)
del - percentage points (e.g. shock to target inflation rate)
gdp - normalized by local GDP (numerator and denominator both expressed in USD)
usgdp - normalized by US GDP (numerator and denominator both expressed in USD)
dollar - US dollars (e.g. dollar tax on carbon emissions)
btu - quadrillion btu
mmt - million metric tons
gwh - Gigawatt hours
btugdp - btu normalized by local GDP valued in USD
mmtgdp - mmt normalized by local GDP valued in USD
gwhgdp - Gigawatt hours normalized by local GDP valued in USD
btuusgdp - btu normalized by US GDP valued in USD
mmtusgdp - mmt normalized by US GDP valued in USD
nomusdbillion - nominal USD billion
realusdbillion - real USD billion
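Two of these units involve simple arithmetic that is easy to get wrong: the gdp unit expresses a value as a percent of GDP, with both amounts in USD, and the idx unit rescales a series so that it equals 100 in the base year. The sketch below shows both calculations. The function names and all of the numbers are hypothetical and are not drawn from any G-Cubed database.

```python
def percent_of_gdp(value_usd, gdp_usd):
    """Express a USD value as a percent of GDP, where GDP is also in USD."""
    return 100.0 * value_usd / gdp_usd


def rebase_to_index(values_by_year, base_year):
    """Rescale a series so that it equals 100 in the base year."""
    base_value = values_by_year[base_year]
    return {year: 100.0 * value / base_value for year, value in values_by_year.items()}


# Hypothetical numbers, for illustration only.
print(percent_of_gdp(50.0, 1000.0))                        # 5.0
print(rebase_to_index({2018: 80.0, 2019: 100.0}, 2018))    # {2018: 100.0, 2019: 125.0}
```

For the usgdp unit, the same first function applies with US GDP as the denominator.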

Parameters

Unlike variables, parameters remain constant throughout the projections generated by a model.

Like variables, parameters can also be defined over sets. The syntax for parameter names is the same as that for variables.

When users adjust parameters, the entire dynamics of the model will usually change. Some parameters can be safely altered by users when designing experiments. Those user-defined parameters can be altered in the user parameters file. Other parameters in the model must be calibrated using information in the model’s database files. The calibration of these parameters is done by the parameter calibration class in the model’s python folder. Care must be taken to ensure that those parameters are consistent with the model’s database and IO tables.

Equations

Like variables, equations are also defined over one or more sets.

The sets associated with an equation are explicit in the model documentation.

Model equations are divided into 4 groups:

State variable equations
Costate variable equations
Equations describing the formation of expected next-period values for endogenous variables
Endogenous variable equations

These groups of equations and the way that they are expressed and manipulated to solve the model are explained in the G-Cubed model solution documentation.