gcubed.derivations

Derived-variable subsystem.

Overview

gcubed.derivations exposes the current derived-variable API. Use Derivations when constructing definitions explicitly, registry helpers such as create_derivations when selecting built-in prefixes by name, and workflow helpers when runner or reporting code needs projection and deviation frames with derived rows appended.

class DerivationRegistryError(builtins.ValueError):

Overview

Raised when configured derived-variable definitions cannot be resolved.

This includes unknown built-in prefixes, duplicate built-in prefixes, malformed configuration shapes, factories that require arguments, and factories that do not return DerivationDefinition objects.

@dataclass(frozen=True, kw_only=True)
class DerivationDefinition:

Overview

Metadata and calculation callables for one derived variable.

A DerivationDefinition does not calculate anything by itself. It records enough information for Derivations to validate the definition against a projection context, resolve output row names, run the level and deviation callables, and attach charting-compatible metadata.

Arguments

prefix: Prefix used for all generated output rows, for example GDPRGROWTH.

sets: Ordered tuple of set names that define the output domain. These can be model-native sets such as regions, sectors, or goods, or local sets supplied through local_sets.

label: Human-readable row description written to the charting metadata.

units: Units for level projections.

deviation_units: Units for deviations between two projection sets.

levels: Callable that accepts a DerivationContext and returns a dataframe containing one row per resolved output-domain member and one column per projection year.

deviations: Callable that accepts (new_context, original_context) and returns a dataframe containing derived deviations for the same output domain and projection-year columns.

local_sets: Optional mapping of definition-local set names to ordered members. Local set names must not collide with model-native set names.

local_set_descriptions: Optional mapping from each local set name to its description.

local_set_member_descriptions: Optional mapping that gives a description for each local set member.

required_variable_prefixes: Model-variable prefixes that must exist in the projection context before this definition can be calculated.

required_derived_prefixes: Derived-variable prefixes that must be calculated before this definition. These drive dependency ordering.

domain_filter: Optional callable used to filter resolved domain tuples. It is called as (domain, set_names, sym_data) and should return True for rows that should be emitted.

custom_validation_rules: Optional tuple of callables that perform definition-specific validation against the context.

description_markdown: Optional markdown description used by definition inventories and documentation. If omitted, the docstring of the levels callable is used.

Returns

A frozen definition object. Mapping arguments are normalised into immutable mapping proxies and sequence arguments are normalised into tuples.

Exceptions

Raises ValueError if required strings are empty, tuple arguments are not tuples of strings, callables are missing, local set descriptions are incomplete, or duplicate required prefixes are supplied.

DerivationDefinition(*, prefix: str, sets: tuple[str, ...], label: str, units: str, deviation_units: str, levels: Callable[..., typing.Any], deviations: Callable[..., typing.Any], local_sets: Mapping[str, Sequence[str]] = <factory>, local_set_descriptions: Mapping[str, str] = <factory>, local_set_member_descriptions: Mapping[str, str] = <factory>, required_variable_prefixes: tuple[str, ...] = (), required_derived_prefixes: tuple[str, ...] = (), domain_filter: Callable[[tuple[str, ...], tuple[str, ...], Any], bool] | None = None, custom_validation_rules: tuple[Callable[..., None], ...] = (), description_markdown: str | None = None)
prefix: str
sets: tuple[str, ...]
label: str
units: str
deviation_units: str
levels: Callable[..., typing.Any]
deviations: Callable[..., typing.Any]
local_sets: Mapping[str, Sequence[str]]
local_set_descriptions: Mapping[str, str]
local_set_member_descriptions: Mapping[str, str]
required_variable_prefixes: tuple[str, ...] = ()
required_derived_prefixes: tuple[str, ...] = ()
domain_filter: Callable[[tuple[str, ...], tuple[str, ...], Any], bool] | None = None
custom_validation_rules: tuple[Callable[..., None], ...] = ()
description_markdown: str | None = None
def validate(self, context: Any) -> None:

Overview

Validate this definition against one model or projection context.

Arguments

context: A DerivationContext or compatible object exposing the model metadata and projection frames needed by validation.

Returns

None.

Exceptions

Raises DerivationValidationError when model-native sets, required model-variable prefixes, required derived prefixes, local sets, or custom validation rules do not match the context.

@dataclass(frozen=True, kw_only=True)
class DerivationContext:

Overview

Read-safe view of the projection state needed by derived definitions.

The context copies incoming dataframes on construction and returns cached derived rows by copy. The only intentional mutation is the internal cache used by Derivations while executing dependency-ordered definitions.

Arguments

configuration: Model configuration associated with the projections.

sym_data: SYM metadata used to resolve model sets and variable metadata.

parameters: Model parameters associated with the projections.

charting_projections: Charting-compatible projection dataframe containing metadata columns and four-digit projection-year columns.

projection_years: Tuple of projection-year labels used by derived calculations.

database_projections: Optional database-projection dataframe. Some custom definitions and satellite calculations require access to this lower-level frame.

Returns

A context object suitable for passing to definition levels, deviations, and validation callables.

DerivationContext(*, configuration: Any, sym_data: Any, parameters: Any, charting_projections: pandas.DataFrame, projection_years: tuple[typing.Any, ...], database_projections: pandas.DataFrame | None = None, _derived_level_rows: dict[str, pandas.DataFrame] = <factory>)
configuration: Any
sym_data: Any
parameters: Any
charting_projections: pandas.DataFrame
projection_years: tuple[typing.Any, ...]
database_projections: pandas.DataFrame | None = None
@classmethod
def from_projections(cls, projections: Any) -> DerivationContext:

Overview

Build a context from a G-Cubed projection-like object.

Arguments

projections: Object exposing charting_projections together with configuration, sym_data, and parameters, either as direct attributes or through a model attribute that exposes those values. The database_projections attribute is optional.

Returns

A DerivationContext with copied projection dataframes and resolved projection-year labels.

Exceptions

Raises DerivationFrameError when required projection attributes are missing or when projection frames are not pandas dataframes.

def derived_level_rows(self, prefix: str) -> pandas.DataFrame:

Overview

Return cached upstream level rows for one derived-variable prefix.

Definitions use this when their required_derived_prefixes depend on another derived variable's level rows.

Arguments

prefix: Prefix of the upstream derived variable.

Returns

A deep copy of the cached level rows for the prefix.

Exceptions

Raises KeyError if no upstream level rows have been cached for the requested prefix.
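
The copy-on-read behaviour above can be sketched as follows. This is an illustration under stated assumptions: SketchContext and cache_level_rows are hypothetical names, and nested lists stand in for the pandas dataframes that the real context caches.

```python
import copy


class SketchContext:
    """Hypothetical stand-in that caches level rows as nested lists."""

    def __init__(self) -> None:
        self._derived_level_rows = {}

    def cache_level_rows(self, prefix, rows) -> None:
        # Store a private copy so later caller-side edits cannot leak in.
        self._derived_level_rows[prefix] = copy.deepcopy(rows)

    def derived_level_rows(self, prefix):
        # A missing prefix raises KeyError from the dictionary lookup.
        return copy.deepcopy(self._derived_level_rows[prefix])


context = SketchContext()
context.cache_level_rows("UPSTREAM", [[1.0, 2.0], [3.0, 4.0]])
rows = context.derived_level_rows("UPSTREAM")
rows[0][0] = 99.0  # changes only the returned copy
```

Because each read returns a fresh deep copy, a downstream definition can modify the rows it receives without affecting what other definitions see.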

class DerivationDeviationError(builtins.ValueError):

Overview

Raised when new and original projections cannot be compared.

Examples include mismatched projection years, different charting metadata, inconsistent output domains, or a definition whose deviations callable fails.

class DerivationFrameError(builtins.ValueError):

Overview

Raised when derived values cannot be aligned to charting rows.

The error usually means that a definition returned a non-dataframe result, the wrong number of rows, or a frame with missing projection-year columns, or that the projection context lacks the expected frame attributes.

class DerivationValidationError(builtins.ValueError):

Overview

Raised when a derived-variable definition is incompatible with a context.

Validation errors are deliberately aggregated where possible so scripts see all incompatible definitions in one failure rather than discovering them one at a time.
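
A minimal sketch of the aggregation idea, assuming hypothetical names throughout (SketchValidationError, validate_all, and the two failing checks). It shows one way to collect every failure before raising once; it is not the library's validation code.

```python
class SketchValidationError(ValueError):
    """Hypothetical stand-in for DerivationValidationError."""


def validate_all(checks):
    """Run every (name, check) pair and raise one error listing all failures."""
    failures = []
    for name, check in checks:
        try:
            check()
        except ValueError as error:
            failures.append(f"{name}: {error}")
    if failures:
        raise SketchValidationError("; ".join(failures))


def missing_set():
    raise ValueError("unknown set 'bands'")


def missing_prefix():
    raise ValueError("required prefix 'XYZ' not found")


try:
    validate_all([("A", missing_set), ("B", lambda: None), ("C", missing_prefix)])
except SketchValidationError as error:
    message = str(error)
```

Here the single raised error names both failing definitions, A and C, while the passing definition B is not mentioned.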

class DerivationWorkflowError(builtins.ValueError):

Overview

Raised when a workflow-level derived-variable operation is misconfigured.

Examples include passing both derivations and derived_variables, passing an unsupported resolved object, or using projection/runner objects without the frame attributes required by reporting workflows.

class Derivations:

Overview

Ordered collection of derived-variable definitions.

The collection preserves caller order where possible, but dependency ordering can move an upstream derived variable earlier when another definition lists it in required_derived_prefixes.

Arguments

definitions: Iterable of DerivationDefinition objects to calculate.

Exceptions

Raises ValueError if more than one definition uses the same output prefix.
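
The ordering rule in the overview can be sketched with a small depth-first pass. The function name sketch_dependency_order and the requires mapping are hypothetical; the sketch assumes that definitions are identified by prefix only and that an upstream prefix must itself be part of the collection.

```python
def sketch_dependency_order(prefixes, requires):
    """Keep caller order, but place each upstream prefix before its dependants."""
    prefixes = tuple(prefixes)
    if len(set(prefixes)) != len(prefixes):
        raise ValueError("duplicate output prefixes")
    known = set(prefixes)
    ordered = []
    visiting = set()

    def place(prefix):
        if prefix in ordered:
            return
        if prefix in visiting:
            raise ValueError(f"dependency cycle involving {prefix}")
        visiting.add(prefix)
        for upstream in requires.get(prefix, ()):
            if upstream not in known:
                raise ValueError(f"{prefix} requires unknown prefix {upstream}")
            place(upstream)
        visiting.discard(prefix)
        ordered.append(prefix)

    for prefix in prefixes:
        place(prefix)
    return tuple(ordered)


# B is listed first but depends on C, so C moves ahead of it.
order = sketch_dependency_order(("B", "A", "C"), {"B": ("C",)})
```

With no dependencies the caller order is returned unchanged, which matches the "where possible" wording above.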

Derivations(definitions: Iterable[DerivationDefinition] = ())

Overview

Store definitions and validate collection-level invariants.

Arguments

definitions: Iterable of definition objects. It is consumed once and stored as a tuple.

Exceptions

Raises ValueError if duplicate prefixes are found.

definitions: tuple[DerivationDefinition, ...]

Overview

Return configured definitions in caller order.

Returns

A tuple of DerivationDefinition objects.

definition_details: pandas.DataFrame

Overview

Return a metadata inventory for configured derived definitions.

Returns

A dataframe indexed by derived-variable prefix. The columns include charting labels, units, dependency metadata, domain sets, and description_markdown.

def validate(self, context) -> None:

Overview

Validate all definitions against one model or projection context.

Arguments

context: A DerivationContext or compatible object that exposes sym_data, projection metadata, and charting rows.

Returns

None.

Exceptions

Raises DerivationValidationError when the dependency graph is invalid or a definition cannot be validated against the context.

def projections(self, projections) -> pandas.DataFrame:

Overview

Return charting projections followed by configured derived level rows.

Arguments

projections: Projection object exposing charting_projections and the model metadata required by DerivationContext.from_projections.

Returns

A dataframe containing the original charting rows followed by the configured derived-variable level rows.

Exceptions

Raises DerivationFrameError when the projection object or derived calculation frames do not satisfy the charting-frame contract.

Raises DerivationValidationError when configured definitions are not valid for the projection context.

def deviations(self, new_projections, original_projections) -> pandas.DataFrame:

Overview

Return original model-variable deviations followed by configured derived deviation rows.

Arguments

new_projections: Projection object for the scenario or later layer.

original_projections: Projection object for the baseline or previous layer.

Returns

A dataframe containing the original model-variable deviations followed by derived deviation rows calculated from the same two projection contexts.

Exceptions

Raises DerivationDeviationError when the two projection objects are not comparable or a derived deviation frame cannot be built.

Raises DerivationValidationError when either context is invalid for the configured definitions.

def append_derived_deviations(new_projections: Any, original_projections: Any, derivations: Any = None, *, derived_variables: Any = None) -> pandas.DataFrame:

Overview

Return original model-variable deviations followed by derived deviations.

Arguments

new_projections: Projection object for the scenario or later layer.

original_projections: Projection object for the baseline or previous layer.

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration.

Returns

Original model-variable deviations when no derived variables are configured, or that dataframe with derived deviation rows appended.

Exceptions

Raises DerivationWorkflowError for mutually exclusive or unsupported workflow inputs.

May raise DerivationRegistryError, DerivationDeviationError, or DerivationValidationError while resolving and calculating definitions.

def append_derived_projections(projections: Any, derivations: Any = None, *, derived_variables: Any = None) -> pandas.DataFrame:

Overview

Return charting projections followed by configured derived level rows.

Arguments

projections: Projection object exposing charting_projections and model metadata required by DerivationContext.from_projections.

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration. Pass either derivations or derived_variables, not both.

Returns

Original charting projections when no derived variables are configured, or charting projections with derived level rows appended.

Exceptions

Raises DerivationWorkflowError when inputs cannot be resolved or the projection object lacks a charting dataframe.

May raise DerivationRegistryError, DerivationFrameError, or DerivationValidationError while resolving and calculating definitions.

def available_definition_prefixes() -> tuple[str, ...]:

Overview

Return built-in derived-variable prefixes that can be configured by name.

Returns

Sorted tuple of built-in prefix strings.

Exceptions

Raises DerivationRegistryError if the built-in factory catalogue cannot be loaded.

def baseline_deviation_frames(projections: Iterable[typing.Any], derivations: Any = None, *, derived_variables: Any = None) -> tuple[pandas.DataFrame, ...]:

Overview

Return deviations for each non-baseline projection relative to the baseline.

Arguments

projections: Iterable whose first item is the baseline projection and whose later items are compared to that baseline.

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration.

Returns

An empty tuple when fewer than two projections are supplied. Otherwise, one deviation dataframe for each non-baseline projection.

Exceptions

Raises the same exceptions as append_derived_deviations.
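
The baseline pairing described above can be sketched without any model objects. The helper name baseline_pairs is hypothetical, and plain strings stand in for projection objects; each pair is ordered as (new, original) to match the argument order of append_derived_deviations.

```python
def baseline_pairs(projections):
    """Pair every later projection with the first (baseline) projection."""
    items = tuple(projections)
    if len(items) < 2:
        return ()
    baseline = items[0]
    return tuple((later, baseline) for later in items[1:])


pairs = baseline_pairs(["baseline", "layer1", "layer2"])
```

Each resulting pair would correspond to one deviation dataframe in the returned tuple.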

def builtin_definition_factories() -> Mapping[str, Callable[[], DerivationDefinition]]:

Overview

Return built-in derived-variable definition factories keyed by prefix.

Returns

Read-only mapping from built-in derived-variable prefix to no-argument factory.

Exceptions

Raises DerivationRegistryError if the built-in factory catalogue contains duplicate prefixes or a factory cannot instantiate a valid definition.

def create_definition(prefix: str) -> DerivationDefinition:

Overview

Instantiate one built-in derived-variable definition by prefix.

Arguments

prefix: Built-in derived-variable prefix.

Returns

A new DerivationDefinition object.

Exceptions

Raises DerivationRegistryError if the prefix cannot be resolved.

def create_derivations(prefixes: Iterable[str] | None = None) -> Derivations:

Overview

Instantiate a Derivations collection from built-in definition prefixes.

Arguments

prefixes: Iterable of built-in derived-variable prefixes. None creates an empty collection.

Returns

A Derivations collection containing one freshly instantiated definition for each prefix.

Exceptions

Raises DerivationRegistryError if any prefix is unknown.

Raises ValueError if the resulting collection contains duplicate prefixes.

def create_derivations_from_config(config: Any = None) -> Derivations:

Overview

Resolve experiment/script configuration into a Derivations collection.

Arguments

config: None, a prefix string, a Derivations object, a DerivationDefinition, a no-argument factory, a sequence of those entries, or a mapping containing derived_variables or derivations.

Returns

A Derivations collection. Existing Derivations objects are returned unchanged.

Exceptions

Raises DerivationRegistryError for malformed configuration or entries.

Raises ValueError if the resulting collection contains duplicate prefixes.

def definition_factory(prefix: str) -> Callable[[], DerivationDefinition]:

Overview

Return the built-in factory for one derived-variable prefix.

Arguments

prefix: Built-in derived-variable prefix.

Returns

No-argument factory that returns a DerivationDefinition.

Exceptions

Raises DerivationRegistryError if prefix is empty or unknown.

def derived_frame(definition: Any, context: Any) -> pandas.DataFrame:

Overview

Calculate one definition and return a charting-compatible dataframe.

Arguments

definition: DerivationDefinition or compatible object exposing prefix, sets, metadata, optional local sets, optional domain filter, and a levels callable.

context: DerivationContext or compatible object exposing sym_data, charting_projections, and projection-year columns.

Returns

A dataframe with generated charting row names, charting metadata columns, and aligned projection-year values.

Exceptions

Raises DerivationFrameError when context attributes are missing or level outputs cannot be aligned to the resolved domain and projection years.

Raises ValueError from domain resolution when sets or local sets are invalid.

def domain_tuples(sym_data: Any, sets: tuple[str, ...], local_sets: Mapping[str, Sequence[str]] | None = None, domain_filter: Callable[[tuple[str, ...], tuple[str, ...], Any], bool] | None = None) -> tuple[tuple[str, ...], ...]:

Overview

Resolve ordered set names into Cartesian-product domain tuples.

Arguments

sym_data: SYM metadata object used to resolve model-native sets.

sets: Ordered tuple of set names. An empty tuple represents one scalar output row with an empty domain tuple.

local_sets: Optional mapping of definition-local set names to ordered members.

domain_filter: Optional callable invoked as (domain, set_names, sym_data). Domains for which it returns False are omitted.

Returns

Tuple of domain tuples in deterministic Cartesian-product order.

Exceptions

Raises ValueError for malformed sets, local-set collisions, unknown set names, or a non-callable domain filter.
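
A minimal sketch of the Cartesian-product resolution, under the assumption that set membership is available as a plain dictionary. The function sketch_domain_tuples, the set_members argument, and the member names R1, R2, S1, and S2 are hypothetical; the real function resolves sets through sym_data and local_sets instead.

```python
from itertools import product


def sketch_domain_tuples(set_members, sets, domain_filter=None):
    """Expand ordered set names into domain tuples, optionally filtered."""
    if domain_filter is not None and not callable(domain_filter):
        raise ValueError("domain_filter must be callable")
    unknown = [name for name in sets if name not in set_members]
    if unknown:
        raise ValueError(f"unknown set names: {unknown}")
    # product() varies the last set fastest, which gives a deterministic order.
    domains = product(*(set_members[name] for name in sets))
    if domain_filter is None:
        return tuple(domains)
    # The dictionary is passed where the real API passes sym_data.
    return tuple(d for d in domains if domain_filter(d, sets, set_members))


members = {"regions": ("R1", "R2"), "sectors": ("S1", "S2")}
full = sketch_domain_tuples(members, ("regions", "sectors"))
scalar = sketch_domain_tuples(members, ())
only_r1 = sketch_domain_tuples(
    members, ("regions", "sectors"), lambda domain, names, data: domain[0] == "R1"
)
```

An empty sets tuple yields exactly one empty domain tuple, which is how a scalar output row falls out of the same code path.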

def incremental_deviation_frames(projections: Iterable[typing.Any], derivations: Any = None, *, derived_variables: Any = None) -> tuple[pandas.DataFrame, ...]:

Overview

Return deviations for each layer relative to its immediately previous layer.

The first projection is the baseline and the second projection is the first simulation layer; incremental deviations start with the third projection relative to the second.

Arguments

projections: Iterable of projection objects ordered as produced by a runner.

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration.

Returns

An empty tuple when fewer than three projections are supplied. Otherwise, one deviation dataframe for each layer after the first simulation layer.

Exceptions

Raises the same exceptions as append_derived_deviations.
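
The incremental pairing can be sketched in a few lines. The helper name incremental_pairs is hypothetical and strings stand in for projection objects; each pair is ordered as (new, original), and the baseline never appears because the first comparison is the third projection against the second.

```python
def incremental_pairs(projections):
    """Pair each layer after the first simulation layer with its predecessor."""
    items = tuple(projections)
    if len(items) < 3:
        return ()
    # items[2:] are the later layers; items[1:-1] are their predecessors.
    return tuple(zip(items[2:], items[1:-1]))


pairs = incremental_pairs(["baseline", "layer1", "layer2", "layer3"])
```

Four projections therefore give two incremental comparisons in this sketch.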

def projection_frames(projections: Iterable[typing.Any], derivations: Any = None, *, derived_variables: Any = None) -> tuple[pandas.DataFrame, ...]:

Overview

Return combined projection frames for each supplied projection object.

Arguments

projections: Iterable of projection objects.

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration.

Returns

Tuple of projection dataframes, each with derived level rows appended when configured.

Exceptions

Raises the same exceptions as append_derived_projections.

def resolve_set_members(sym_data: Any, set_name: str, local_sets: Mapping[str, Sequence[str]] | None = None) -> tuple[str, ...]:

Overview

Resolve one named set into ordered members.

The set can be model-native, usually from sym_data, or definition-local through local_sets. Local sets may extend the model domain, but they may not silently replace a model-native set.

Arguments

sym_data: SYM metadata object used to resolve model-native set members.

set_name: Name of the set to resolve.

local_sets: Optional mapping of definition-local set names to ordered members.

Returns

Ordered tuple of set members.

Exceptions

Raises ValueError if set_name is empty, if local sets are malformed, if a local set collides with a model-native set, or if no matching set can be found.
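
The collision rule can be illustrated with a short sketch. It assumes model-native sets are available as a plain dictionary named model_sets, which is a hypothetical simplification of sym_data; the function name and the example set names are also hypothetical.

```python
def sketch_resolve_set_members(model_sets, set_name, local_sets=None):
    """Resolve one set name, refusing local sets that shadow model sets."""
    if not isinstance(set_name, str) or not set_name:
        raise ValueError("set_name must be a non-empty string")
    local_sets = local_sets or {}
    collisions = sorted(set(local_sets) & set(model_sets))
    if collisions:
        raise ValueError(f"local sets collide with model sets: {collisions}")
    if set_name in model_sets:
        return tuple(model_sets[set_name])
    if set_name in local_sets:
        return tuple(local_sets[set_name])
    raise ValueError(f"unknown set: {set_name}")


model_sets = {"regions": ["R1", "R2"]}
native = sketch_resolve_set_members(model_sets, "regions")
local = sketch_resolve_set_members(model_sets, "bands", {"bands": ["low", "high"]})
```

Checking for collisions before any lookup means a shadowing local set fails loudly even when the requested name is a different set.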

def resolve_definition_prefixes(config: Any = None) -> tuple[str, ...]:

Overview

Resolve a simple experiment/script configuration to definition prefixes.

Supported shapes are:

  • None
  • one prefix string
  • a sequence of prefix strings, DerivationDefinition objects, or factories
  • a mapping whose derived_variables or derivations key contains one of the shapes above

Arguments

config: Configuration object in one of the supported shapes.

Returns

Tuple of resolved derived-variable prefixes.

Exceptions

Raises DerivationRegistryError for malformed mappings, unknown prefixes, invalid factories, or unsupported entry types.
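
The supported shapes can be sketched as a small normaliser. This is a simplified illustration, not the registry code: sketch_resolve_prefixes is a hypothetical name, it handles prefix strings only (not DerivationDefinition objects or factories), it raises a plain ValueError, and the rule that a mapping must contain exactly one of the two keys is an assumption.

```python
from collections.abc import Mapping, Sequence


def sketch_resolve_prefixes(config=None):
    """Normalise None, a string, a sequence, or a mapping into a prefix tuple."""
    if config is None:
        return ()
    if isinstance(config, str):
        return (config,)
    if isinstance(config, Mapping):
        keys = [key for key in ("derived_variables", "derivations") if key in config]
        # Assumption: exactly one of the two keys is expected.
        if len(keys) != 1:
            raise ValueError("expected one of derived_variables or derivations")
        return sketch_resolve_prefixes(config[keys[0]])
    if isinstance(config, Sequence):
        if not all(isinstance(entry, str) and entry for entry in config):
            raise ValueError("entries must be non-empty prefix strings")
        return tuple(config)
    raise ValueError(f"unsupported configuration type: {type(config).__name__}")


from_mapping = sketch_resolve_prefixes({"derived_variables": ["AAA", "BBB"]})
```

The string check comes before the sequence check because a Python string is itself a sequence and would otherwise be split into characters.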

def resolve_derivations(derivations: Any = None, *, derived_variables: Any = None) -> Any:

Overview

Resolve workflow derivation inputs into a Derivations object or None.

Arguments

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration.

Returns

None if no derived variables are configured, an existing Derivations object if supplied, or a new Derivations object resolved from the configuration.

Exceptions

Raises DerivationWorkflowError if both arguments are supplied or if a non-registry exception occurs during resolution.

Raises DerivationRegistryError for registry-level configuration failures.

def runner_projection_frames(runner: Any, derivations: Any = None, *, derived_variables: Any = None) -> tuple[pandas.DataFrame, ...]:

Overview

Return combined projection frames for a completed runner's projections.

Arguments

runner: Completed runner object exposing all_projections.

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration.

Returns

Tuple of projection frames for every runner projection.

Exceptions

Raises DerivationWorkflowError if the runner does not expose all_projections, plus any exceptions raised by projection_frames.

def validate_derivations_for_projection(projections: Any, derivations: Any = None, *, derived_variables: Any = None) -> None:

Overview

Validate configured derivations against one projection object.

Arguments

projections: Projection object used to build a DerivationContext.

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration.

Returns

None. No validation is performed when no derived variables are configured.

Exceptions

Raises DerivationWorkflowError, DerivationRegistryError, DerivationFrameError, or DerivationValidationError when configuration or projection context is invalid.

def validate_derivations_for_runner(runner: Any, derivations: Any = None, *, derived_variables: Any = None) -> None:

Overview

Validate configured derivations against each completed runner projection.

Arguments

runner: Completed runner object exposing all_projections.

derivations: Optional Derivations object or registry-supported configuration.

derived_variables: Optional alias for registry-supported configuration.

Returns

None.

Exceptions

Raises DerivationWorkflowError if the runner does not expose all_projections, plus validation exceptions from validate_derivations_for_projection.

def variable_names(prefix: str, domains: Iterable[Sequence[str]]) -> tuple[str, ...]:

Overview

Format variable row names for charting-style dataframes.

Arguments

prefix: Variable prefix.

domains: Iterable of ordered domain-member sequences.

Returns

Tuple of names in PREFIX(member,member) form. Scalar domains are formatted as PREFIX().

Exceptions

Raises ValueError if the prefix or any domain member is empty or is not a string.
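
The naming format can be sketched as follows. The function sketch_variable_names is a hypothetical stand-in written from the description above, and the member names R1, R2, and S1 are placeholders rather than real model members.

```python
def sketch_variable_names(prefix, domains):
    """Format PREFIX(member,member) names, with PREFIX() for scalar domains."""
    if not isinstance(prefix, str) or not prefix:
        raise ValueError("prefix must be a non-empty string")
    names = []
    for domain in domains:
        for member in domain:
            if not isinstance(member, str) or not member:
                raise ValueError("domain members must be non-empty strings")
        names.append(f"{prefix}({','.join(domain)})")
    return tuple(names)


names = sketch_variable_names("GDPRGROWTH", [("R1", "S1"), ("R2", "S1")])
scalar_names = sketch_variable_names("GDPRGROWTH", [()])
```

Joining an empty domain produces an empty string, so the scalar case needs no special branch in this sketch.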