API Reference

GeneralizIT Class

class generalizit.GeneralizIT(data, design_str, response, variance_tuple_dictionary=None)[source]

Bases: object

High-level API for Generalizability Theory analysis.

GeneralizIT provides a user-friendly interface for conducting Generalizability Theory (G-Theory) analyses, including variance component estimation, reliability coefficient calculation, Decision studies (D-studies), and confidence interval estimation.

This class serves as a wrapper around the core analytical engine (Design class), handling data preparation, research design interpretation, and result presentation.

Parameters:

data (pd.DataFrame) – Dataset containing facet variables and response measurements.
design_str (str) – String specifying the research design in standard notation (e.g., “p x i” for persons crossed with items).
response (str) – Column name in data containing the response/measurement values.

design

The underlying Design object that performs calculations.

Type: Design

Example

>>> import pandas as pd
>>> from generalizit import GeneralizIT
>>> data = pd.read_csv("my_data.csv")
>>> gt = GeneralizIT(data, "p x i", "score")
>>> gt.calculate_anova()
>>> gt.calculate_g_coefficients()
>>> gt.g_coefficients_summary()

anova_summary()[source]

calculate_anova()[source]

Calculate variance components using ANOVA.

This method is a wrapper for the Design.calculate_anova() method, which implements Henderson’s Method 1 to estimate variance components for each facet in the specified research design.

Returns: Results are stored in the underlying Design object.
Return type: None

Notes

This method must be called before calculating G-coefficients, confidence intervals, or D-study scenarios.

calculate_confidence_intervals(alpha=0.05, **kwargs)[source]

Calculate confidence intervals for facet level means.

This method is a wrapper for the Design.calculate_confidence_intervals() method, which computes confidence intervals for individual facets based on variance component analysis.

Parameters:

alpha (float, optional) – Significance level for confidence intervals. Default is 0.05 (producing 95% confidence intervals).
**kwargs – Optional keyword arguments passed to Design.calculate_confidence_intervals().

Returns:

Results are stored in the underlying Design object.

Return type:

None

Raises:

RuntimeError – If ANOVA table hasn’t been calculated first.

calculate_d_study(d_study_design=None, fixed_facets=None, **kwargs)[source]

Calculate G-coefficients for alternative research designs (D-Study).

This method is a wrapper for the Design.calculate_d_study() method, which examines multiple possible study designs by generating combinations of provided facet levels.

Parameters:

d_study_design (dict, optional) – Dictionary where keys are facet names and values are lists of integers representing different numbers of levels to test.
fixed_facets (list of str, optional) – List of facets to be treated as fixed.
**kwargs – Optional keyword arguments passed to Design.calculate_d_study().

Returns:

Results are stored in the underlying Design object.

Return type:

None

Raises:

RuntimeError – If ANOVA table hasn’t been calculated first.

calculate_g_coefficients(fixed_facets=None, **kwargs)[source]

Calculate generalizability coefficients.

This method is a wrapper for the Design.g_coeffs() method, which computes phi-squared (Φ²) and rho-squared (ρ²) coefficients for each potential object of measurement in the design.

Parameters:

fixed_facets (list of str, optional) – List of facets to be treated as fixed.
**kwargs – Optional keyword arguments passed to Design.g_coeffs():
- error_variance (bool): If True, prints detailed information about error variances during calculation. Default is False.
- Other parameters as documented in Design.g_coeffs().

Returns:

Results are stored in the underlying Design object.

Return type:

None

Raises:

RuntimeError – If ANOVA table hasn’t been calculated first.

confidence_intervals_summary()[source]

d_study_summary()[source]

g_coefficients_summary()[source]

g_coeffs(**kwargs)[source]

[DEPRECATED] Calculate generalizability coefficients.

This method is deprecated and will be removed in a future version. Please use calculate_g_coefficients() instead.

See calculate_g_coefficients() for parameter details.

variance_summary()[source]

Design Module

class generalizit.design.Design(data, variance_tuple_dictionary, response_col, missing_data=False)[source]

Bases: object

anova_summary()[source]: Print a summary of the ANOVA results, including string indices.

calculate_anova()[source]

Performs ANOVA-analogous calculations using Henderson's (1953) Method 1. Determines the variance components from variance coefficients and uncorrected sums of squares (T values) for each facet, interaction, and mean.

This method executes the steps necessary to estimate variance components based on Generalizability Theory. This method does not require corrected Sum of Squares or Mean Squares, and thus they are not calculated. It also does not calculate hypothesis tests or F-statistics, as these are not relevant in G-Theory.

Steps:

Calculate the T values using _calculate_T_values.
Create the variance coefficients table with _create_variance_coefficients_table.
Estimate variance components using _calculate_variance.

Note

This method does not require and does not calculate Sum of Squares or Mean Squares.
This method emphasizes variance component estimation over hypothesis testing.
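For a balanced p x i design with one observation per cell, the T-value approach described above can be sketched in plain Python. This is an illustrative special case and not the library's implementation: the function name variance_components_p_x_i and the nested-list input format are assumptions made for this example, and Design itself covers more general designs.

```python
def variance_components_p_x_i(scores):
    """Estimate variance components for a balanced p x i design.

    scores[p][i] is the single observation for person p on item i.
    Hypothetical helper for illustration only (not part of generalizit).
    """
    n_p, n_i = len(scores), len(scores[0])
    person_means = [sum(row) / n_i for row in scores]
    item_means = [sum(row[i] for row in scores) / n_p for i in range(n_i)]
    grand_mean = sum(map(sum, scores)) / (n_p * n_i)

    # Uncorrected sums of squares (T values).
    t_p = n_i * sum(m ** 2 for m in person_means)
    t_i = n_p * sum(m ** 2 for m in item_means)
    t_pi = sum(x ** 2 for row in scores for x in row)
    t_mu = n_p * n_i * grand_mean ** 2

    # Equate differences of T values to their expectations and solve.
    var_pi = (t_pi - t_p - t_i + t_mu) / ((n_p - 1) * (n_i - 1))
    var_p = ((t_p - t_mu) / (n_p - 1) - var_pi) / n_i
    var_i = ((t_i - t_mu) / (n_i - 1) - var_pi) / n_p
    return {"p": var_p, "i": var_i, "p x i": var_pi}


# Three persons scored on two items.
components = variance_components_p_x_i([[1, 2], [3, 4], [5, 9]])
```

For this small data set the sketch yields 7.0 for p and 1.5 for both i and p x i. In a balanced design these values coincide with the familiar expected-mean-square estimates.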

calculate_confidence_intervals(alpha=0.05, **kwargs)[source]

Calculate confidence intervals for means of each facet level.

This method computes confidence intervals for individual facets based on variance component analysis using the formula from Cardinet et al. (1976):

X ± z_(α/2) × √(σ²)

where σ² represents the sum of variance components adjusted by the appropriate levels coefficients.
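The interval formula above can be sketched with the standard library alone. The function name normal_ci is hypothetical, and the sketch assumes the error variance σ² has already been assembled from the variance components and levels coefficients.

```python
from math import sqrt
from statistics import NormalDist


def normal_ci(mean, error_variance, alpha=0.05):
    """Return (lower, upper) for mean ± z_(alpha/2) * sqrt(error_variance).

    Hypothetical helper for illustration only (not part of generalizit).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(error_variance)
    return mean - half_width, mean + half_width


lower, upper = normal_ci(10.0, 4.0)  # roughly (6.08, 13.92)
```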

Parameters:

alpha (float, optional) – Significance level for confidence intervals. Default is 0.05 (producing 95% confidence intervals).
**kwargs – Optional keyword arguments:
- variance_dictionary (dict): Custom variance components to use. If not provided, components from the ANOVA table are used.
- levels_df (pd.DataFrame): Custom levels coefficients table. If not provided, self.levels_coeffs is used or calculated.

Returns:

Results are stored in self.confidence_intervals

Return type:

None

self.confidence_intervals

Dictionary where keys are facet names and values are DataFrames containing confidence intervals for each level. Each DataFrame contains the columns:
- lower_bound: Lower CI boundary
- mean: Observed mean
- upper_bound: Upper CI boundary

Type: dict

Notes

- Confidence intervals are not calculated for the facet with the largest dimensionality (typically the interaction term containing all facets).
- Negative variance components are automatically set to zero with a warning.
- The method requires the ANOVA table to be calculated first unless a custom variance_dictionary is provided.
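The zero-clipping behavior mentioned in the notes can be pictured roughly as follows. This is a sketch under assumptions: the helper name clip_negative_components, the plain-dict input, and the warning text are all hypothetical, and the library's own message and data structures may differ.

```python
import warnings


def clip_negative_components(variances):
    """Return a copy of `variances` with negative estimates replaced by 0.0.

    Hypothetical helper for illustration only (not part of generalizit).
    """
    clipped = {}
    for name, value in variances.items():
        if value < 0:
            # Negative estimates can arise from sampling error; warn and clip.
            warnings.warn(
                f"Variance component '{name}' is negative ({value}); set to 0."
            )
            value = 0.0
        clipped[name] = value
    return clipped
```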

Raises: ValueError – If alpha is not between 0 and 1, ANOVA hasn’t been calculated, or if invalid parameters are provided.

calculate_d_study(d_study_design, **kwargs)[source]

Implement a D-Study to determine optimal facet levels based on G-Study variance components.

This method examines multiple possible study designs by generating all combinations of the provided facet levels. It calculates G-coefficients for each design scenario using the variance components from a previously conducted G-Study.

Parameters:

d_study_design (dict) –
Dictionary where keys are facet names and values are lists of integers representing different numbers of levels to test. For example:

{
    'person': [10],       # Only testing 10 persons
    'item': [2, 3],       # Testing either 2 or 3 items
    'rater': [2, 4, 6]    # Testing 2, 4, or 6 raters
}

This would generate 6 different study designs (1×2×3 combinations).
**kwargs – Optional additional parameters:
- fixed_facets (Optional[List[str]]): Facets to be treated as fixed instead of random.

Returns:

Results are stored in self.d_study_dict, where keys are string representations of each design scenario and values are DataFrames containing the corresponding G-coefficients.

Return type:

None

Raises:

ValueError – If d_study_design is not properly formatted or if required precalculations haven't been performed.

Notes

This method requires that variance components have been calculated via a G-Study
For each design scenario, new levels coefficients are calculated
All facet combinations in the original design must be maintained

confidence_intervals_summary()[source]: Print a summary of the confidence intervals for each facet.

d_study_summary()[source]: Print a summary of the D-Study results.

g_coeff_summary()[source]: Print a summary of the G-coefficient results.

g_coeffs(**kwargs)[source]

Calculate G-coefficients for various scenarios of fixed and random facets.

This method computes rho^2 (relative) and phi^2 (absolute) coefficients for each potential object of measurement in the design. The coefficients quantify the reliability of measurements across different facets.

Parameters:

**kwargs – Optional keyword arguments:
- variance_dictionary (dict): Custom variance components to use. If provided, values must be non-negative. If not provided, components from the ANOVA table are used.
- levels_df (pd.DataFrame): Custom levels coefficients table. If not provided, self.levels_coeffs is used or calculated.
- variance_tuple_dictionary (dict): Custom variance tuple dictionary. If not provided, self.variance_tuple_dictionary is used.
- d_study (bool): If True, returns the G-coefficients DataFrame directly instead of storing it in self.g_coeffs_table. Default is False.
- error_variance (bool): If True, prints detailed information about the error variances for Tau (τ), Delta (Δ), and delta (δ) during the calculation of phi-squared and rho-squared coefficients. Default is False.

Returns:

If d_study=True, returns the G-coefficients DataFrame directly. Otherwise, results are stored in self.g_coeffs_table and None is returned.

Return type:

pd.DataFrame or None

Raises:

ValueError – If:
- ANOVA table hasn’t been calculated and no variance_dictionary is provided
- Any variance component is negative
- Levels coefficients are invalid (non-square, negative values)
- Keys in variance_dictionary don’t match variance_tuple_dictionary
- Levels coefficients don’t match variance components

Notes

Negative variance components are automatically set to zero with a warning
The ‘mean’ component is removed from calculations
The method produces a DataFrame with rho^2 and phi^2 values for each facet
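For the simplest case, a random p x i design with persons as the object of measurement, the rho^2 and phi^2 coefficients described for g_coeffs() reduce to the textbook formulas sketched below. The function name g_coefficients_p_x_i is hypothetical, and this is not the general routine that Design applies to arbitrary designs.

```python
def g_coefficients_p_x_i(var_p, var_i, var_pi, n_i):
    """Return (rho2, phi2) for persons in a random p x i design.

    Hypothetical helper for illustration only (not part of generalizit).
    """
    # Relative error variance (delta): only the interaction contributes.
    relative_error = var_pi / n_i
    # Absolute error variance (Delta): the item main effect also contributes.
    absolute_error = var_i / n_i + var_pi / n_i
    rho2 = var_p / (var_p + relative_error)
    phi2 = var_p / (var_p + absolute_error)
    return rho2, phi2


rho2, phi2 = g_coefficients_p_x_i(var_p=7.0, var_i=1.5, var_pi=1.5, n_i=2)
# rho2 is about 0.903 and phi2 about 0.824
```

Because the absolute error variance includes every term of the relative error variance plus the item main effect, phi^2 can never exceed rho^2 in this design.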

variance_summary()[source]: Print a summary of the variance components.

Design Utilities

generalizit.design_utils.create_corollary_dictionary(design_num, design_facets)[source]

Parse a research design string and return a dictionary mapping facet types (p, i, h) to their actual values based on the design pattern.

Parameters:

design_num (int) – The design number
design_facets (list) – List of facets extracted from the design string

Returns:

Dictionary mapping facet names from the research design to base design facets. Keys are ‘p’, ‘i’, and/or ‘h’

Return type:

Dict[str, str]

Raises:

ValueError – If the design pattern is invalid or can’t be parsed

generalizit.design_utils.create_variance_tuple_dictionary(design_num, corollary_dict)[source]

Create a variance tuple dictionary for a given study design structure.

This function constructs a dictionary representing variance components for a given study design, for example Brennan Design 2 (i:p), where p and i represent facets of variation. The dictionary maps each variance component to a tuple of relevant facets.

Parameters:

design_num (int or str) – The design number (2, 4-8) or ‘crossed’ for fully crossed designs.
corollary_dict (dict) – A dictionary mapping corollary names to actual facet names. For example, {‘p’: ‘persons’, ‘i’: ‘items’}.

Returns:

A dictionary where keys are variance component names (strings) and values are tuples of the corresponding facets.

Return type:

dict

Example

>>> create_variance_tuple_dictionary(2, {'p': 'persons', 'i': 'items'})
{
    'p': ('p',),
    'i:p': ('i', 'p'),
    'mean': ()
}

generalizit.design_utils.get_facets_from_variance_tuple_dictionary(variance_tuple_dict)[source]

Extracts the facets from a variance tuple dictionary.

Parameters:: variance_tuple_dict (Dict[str, tuple]) – A dictionary where keys are variance component names (strings), and values are tuples of the corresponding facets.
Returns:: A list of unique facets extracted from the variance tuple dictionary.
Return type:: List[str]
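Collecting the unique facets from a variance tuple dictionary amounts to de-duplicating the tuple members. A minimal sketch follows; the helper name unique_facets is hypothetical, and the first-seen ordering it produces is an assumption, since the ordering of the library's result is not documented here.

```python
def unique_facets(variance_tuple_dict):
    """Return the distinct facets in first-seen order.

    Hypothetical helper for illustration only (not part of generalizit).
    """
    seen = []
    for facets in variance_tuple_dict.values():
        for facet in facets:
            if facet not in seen:
                seen.append(facet)
    return seen


facets = unique_facets(
    {"p": ("persons",), "i:p": ("items", "persons"), "mean": ()}
)
# ['persons', 'items']
```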

generalizit.design_utils.match_research_design(input_string)[source]

Matches a string input of a research design to one of 8 predefined designs.

Parameters:

input_string (str) – The research design input as a string. Valid operators are ‘x’ for crossing and ‘:’ for nesting. Some designs require parentheses to indicate grouping.

Returns:

The design number (1 to 8) that matches the input, together with the list of facets extracted from the design string.

Return type:

int or str

Raises:

ValueError – If the input string is invalid or malformed
TypeError – If the input is not a string

Examples

>>> match_research_design("persons x raters")  # Design 1
1
>>> match_research_design("items:persons")     # Design 2
2
>>> match_research_design("p x i x h")         # Design 3
3
>>> match_research_design("p x (i:h)")         # Design 4
4
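Pulling facet names out of a design string can be approximated with a regular expression. The sketch below is not the library's parser: the helper name extract_facets is hypothetical, it assumes the crossing operator ‘x’ is surrounded by whitespace, and it does not classify the design or validate parentheses.

```python
import re


def extract_facets(design):
    """Split a design string on ' x ' and ':' and return the facet names.

    Hypothetical helper for illustration only (not part of generalizit).
    """
    if not isinstance(design, str):
        raise TypeError("design must be a string")
    # Parentheses only express grouping, so drop them before splitting.
    cleaned = design.replace("(", "").replace(")", "")
    # Requiring whitespace around 'x' keeps names such as 'max' intact.
    parts = re.split(r"\s+x\s+|:", cleaned)
    return [part.strip() for part in parts if part.strip()]


extract_facets("p x (i:h)")         # ['p', 'i', 'h']
extract_facets("persons x raters")  # ['persons', 'raters']
```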

generalizit.design_utils.parse_facets(design_num, design_facets)[source]

Parses the facets of a research design and returns a dictionary of variance components.

This function combines the functionality of create_corollary_dictionary and create_variance_tuple_dictionary to generate a comprehensive dictionary of variance components for a given research design.

Parameters:

design_num (Union[int, str]) – The design number or ‘crossed’ for fully crossed designs.
design_facets (list) – List of facets extracted from the design string.

Returns:

A dictionary where keys are variance component names (strings) and values are tuples of the corresponding facets.

Return type:

Dict[str, tuple]

Raises:

ValueError – If the design pattern is invalid or can’t be parsed.

Example

>>> parse_facets(2, ['items', 'persons'])
{
    'p': ('persons',),
    'i:p': ('items', 'persons'),
    'mean': ()
}

generalizit.design_utils.validate_research_design(design_number)[source]

Validates if a design number is valid (1-8).

Parameters: design_number (Optional[int]) – The design number to validate
Returns: True if valid, False otherwise
Return type: bool

G-Theory Utilities

generalizit.g_theory_utils.adjust_for_fixed_effects(variance_tup_dict, variance_df, levels_df, fixed_facets, verbose=False)[source]

Adjust variance components for fixed facets in any design. Follows Brennan’s rule 4.3.1: for every alpha, absorb any variance component containing alpha and a fixed facet into the lower-order component.

Parameters:

variance_tup_dict (Dict[str, tuple]) – Dictionary mapping component names to tuples of facets.
variance_df (DataFrame) – DataFrame with variance component names as the index and a ‘Variance’ column.
levels_df (DataFrame) – DataFrame of levels coefficients (1/levels).
fixed_facets (Optional[List[str]]) – List of facets to fix (e.g., [‘i’]).
verbose (bool) – If True, print debug information.

Returns:

adjusted_variance_tup_dict – dict mapping adjusted component names to tuples of facets.
adjusted_variance_df – DataFrame with adjusted variance components.
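The absorption rule described for adjust_for_fixed_effects can be illustrated for fully crossed designs with plain dictionaries. This is a sketch under assumptions: the helper name absorb_fixed and its dict-based arguments are hypothetical, it uses level counts directly rather than a levels coefficients table, and nested designs are not handled.

```python
def absorb_fixed(variances, components, n_levels, fixed_facets):
    """Fold components involving fixed facets into lower-order components.

    Hypothetical helper for illustration only (not part of generalizit).
    Assumes a fully crossed design.
    """
    fixed = set(fixed_facets)
    # Components without any fixed facet survive, starting from their own value.
    adjusted = {
        name: value
        for name, value in variances.items()
        if not fixed & set(components[name])
    }
    for name, value in variances.items():
        facets = set(components[name])
        hit = facets & fixed
        remaining = facets - fixed
        if not hit or not remaining:
            # Nothing to absorb, or the component consists only of fixed facets.
            continue
        divisor = 1
        for facet in hit:
            divisor *= n_levels[facet]
        # Find the lower-order component made of the remaining random facets.
        target = next(k for k, v in components.items() if set(v) == remaining)
        adjusted[target] += value / divisor
    return adjusted


adjusted = absorb_fixed(
    variances={"p": 7.0, "i": 1.5, "p x i": 1.5},
    components={"p": ("p",), "i": ("i",), "p x i": ("p", "i")},
    n_levels={"p": 3, "i": 2},
    fixed_facets=["i"],
)
# {'p': 7.75}
```

With items fixed, the p x i interaction divided by the number of items is added to the person component, so the person variance rises from 7.0 to 7.75.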

generalizit.g_theory_utils.create_pseudo_df(d_study, variance_tup_dict)[source]

Create a pseudo DataFrame with all possible combinations of facet levels.

Parameters:

d_study (dict) – A dictionary representing the study design with facets as keys and the number of levels for each facet as values. Values can be either integers or lists of integers.
variance_tup_dict (dict) – A dictionary mapping facet names to tuples containing the component facets.

Returns:

A pseudo DataFrame with all possible combinations of facet levels.

Return type:

pd.DataFrame
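The idea of enumerating every combination of facet levels can be sketched without pandas. The helper name pseudo_rows is hypothetical. The sketch assumes integer level counts and a fully crossed design, ignores variance_tup_dict, and returns a list of dicts rather than a DataFrame.

```python
from itertools import product


def pseudo_rows(d_study):
    """Return one dict per combination of facet levels (levels numbered from 1).

    Hypothetical helper for illustration only (not part of generalizit).
    """
    facets = list(d_study)
    level_ranges = [range(1, d_study[facet] + 1) for facet in facets]
    return [dict(zip(facets, combo)) for combo in product(*level_ranges)]


rows = pseudo_rows({"p": 3, "i": 2})
# 6 rows, from {'p': 1, 'i': 1} to {'p': 3, 'i': 2}
```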