API Reference
GeneralizIT Class
- class generalizit.GeneralizIT(data, design_str, response, variance_tuple_dictionary=None)[source]
Bases: object
High-level API for Generalizability Theory analysis.
GeneralizIT provides a user-friendly interface for conducting Generalizability Theory (G-Theory) analyses, including variance component estimation, reliability coefficient calculation, Decision studies (D-studies), and confidence interval estimation.
This class serves as a wrapper around the core analytical engine (Design class), handling data preparation, research design interpretation, and result presentation.
- Parameters:
data (pd.DataFrame) – Dataset containing facet variables and response measurements.
design_str (str) – String specifying the research design in standard notation (e.g., “p x i” for persons crossed with items).
response (str) – Column name in data containing the response/measurement values.
Example
>>> import pandas as pd
>>> from generalizit import GeneralizIT
>>> data = pd.read_csv("my_data.csv")
>>> gt = GeneralizIT(data, "p x i", "score")
>>> gt.calculate_anova()
>>> gt.calculate_g_coefficients()
>>> gt.g_coefficients_summary()
- calculate_anova()[source]
Calculate variance components using ANOVA.
This method is a wrapper for the Design.calculate_anova() method, which implements Henderson’s Method 1 to estimate variance components for each facet in the specified research design.
- Returns:
Results are stored in the underlying Design object.
- Return type:
None
Notes
This method must be called before calculating G-coefficients, confidence intervals, or D-study scenarios.
- calculate_confidence_intervals(alpha=0.05, **kwargs)[source]
Calculate confidence intervals for facet level means.
This method is a wrapper for the Design.calculate_confidence_intervals() method, which computes confidence intervals for individual facets based on variance component analysis.
- Parameters:
alpha (float, optional) – Significance level for confidence intervals. Default is 0.05 (producing 95% confidence intervals).
**kwargs – Optional keyword arguments passed to Design.calculate_confidence_intervals().
- Returns:
Results are stored in the underlying Design object.
- Return type:
None
- Raises:
RuntimeError – If ANOVA table hasn’t been calculated first.
- calculate_d_study(d_study_design=None, fixed_facets=None, **kwargs)[source]
Calculate G-coefficients for alternative research designs (D-Study).
This method is a wrapper for the Design.calculate_d_study() method, which examines multiple possible study designs by generating combinations of provided facet levels.
- Parameters:
d_study_design (dict, optional) – Dictionary where keys are facet names and values are lists of integers representing different numbers of levels to test.
fixed_facets (list of str, optional) – List of facets to be treated as fixed.
**kwargs – Optional keyword arguments passed to Design.calculate_d_study().
- Returns:
Results are stored in the underlying Design object.
- Return type:
None
- Raises:
RuntimeError – If ANOVA table hasn’t been calculated first.
- calculate_g_coefficients(fixed_facets=None, **kwargs)[source]
Calculate generalizability coefficients.
This method is a wrapper for the Design.g_coeffs() method, which computes phi-squared (Φ²) and rho-squared (ρ²) coefficients for each potential object of measurement in the design.
- Parameters:
fixed_facets (list of str, optional) – List of facets to be treated as fixed.
**kwargs – Optional keyword arguments passed to Design.g_coeffs():
- error_variance (bool): If True, prints detailed information about error variances during calculation. Default is False.
- Other parameters as documented in Design.g_coeffs().
- Returns:
Results are stored in the underlying Design object.
- Return type:
None
- Raises:
RuntimeError – If ANOVA table hasn’t been calculated first.
Design Module
- class generalizit.design.Design(data, variance_tuple_dictionary, response_col, missing_data=False)[source]
Bases: object
- calculate_anova()[source]
Performs analogous ANOVA calculations using Henderson's (1953) Method 1. Determines the variance components from variance coefficients and uncorrected sums of squares (T values) for each facet, each interaction, and the mean.
This method executes the steps necessary to estimate variance components based on Generalizability Theory. This method does not require corrected Sum of Squares or Mean Squares, and thus they are not calculated. It also does not calculate hypothesis tests or F-statistics, as these are not relevant in G-Theory.
- Steps:
Calculate the T values using _calculate_T_values.
Create the variance coefficients table with _create_variance_coefficients_table.
Estimate variance components using _calculate_variance.
Note
This method does not require and does not calculate Sum of Squares or Mean Squares.
This method emphasizes variance component estimation over hypothesis testing.
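As an illustration of how T values lead to variance components, here is a minimal stand-alone sketch for a balanced, fully crossed p x i design with one observation per cell. It does not call the library: the function name `pxi_variance_components`, the input layout, and the sample scores are all hypothetical. It uses the closed-form shortcut available for balanced data (which passes through mean squares) rather than the general variance-coefficients table described above.

```python
def pxi_variance_components(scores):
    """Estimate variance components for a balanced p x i design.

    `scores` is a list of rows, one per person, each holding one score
    per item (a hypothetical layout, not the library's input format).
    """
    n_p = len(scores)
    n_i = len(scores[0])
    n_total = n_p * n_i

    # Uncorrected sums of squares (T values).
    grand_sum = sum(sum(row) for row in scores)
    t_mean = grand_sum ** 2 / n_total
    t_p = sum(sum(row) ** 2 for row in scores) / n_i
    t_i = sum(sum(col) ** 2 for col in zip(*scores)) / n_p
    t_pi = sum(x ** 2 for row in scores for x in row)

    # Balanced-design shortcut: combine T values, then divide by the
    # degrees of freedom of each effect.
    ms_p = (t_p - t_mean) / (n_p - 1)
    ms_i = (t_i - t_mean) / (n_i - 1)
    ms_pi = (t_pi - t_p - t_i + t_mean) / ((n_p - 1) * (n_i - 1))

    return {
        "p": (ms_p - ms_pi) / n_i,
        "i": (ms_i - ms_pi) / n_p,
        "p x i": ms_pi,
    }


# Three persons (rows) scored on two items (columns); made-up data.
components = pxi_variance_components([[2, 4], [4, 5], [6, 9]])
print(components)
```

For unbalanced or nested designs this shortcut no longer applies, which is presumably why the method works from a variance-coefficients table instead.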
- calculate_confidence_intervals(alpha=0.05, **kwargs)[source]
Calculate confidence intervals for means of each facet level.
This method computes confidence intervals for individual facets based on variance component analysis using the formula from Cardinet et al. (1976):
X ± z_(α/2) × √(σ²)
where σ² represents the sum of variance components adjusted by the appropriate levels coefficients.
- Parameters:
alpha (float, optional) – Significance level for confidence intervals. Default is 0.05 (producing 95% confidence intervals).
**kwargs – Optional keyword arguments:
- variance_dictionary (dict): Custom variance components to use. If not provided, components from the ANOVA table are used.
- levels_df (pd.DataFrame): Custom levels coefficients table. If not provided, self.levels_coeffs is used or calculated.
- Returns:
Results are stored in self.confidence_intervals
- Return type:
None
- self.confidence_intervals
Dictionary where keys are facet names and values are DataFrames containing confidence intervals for each level. Each DataFrame contains the columns:
- lower_bound: Lower CI boundary
- mean: Observed mean
- upper_bound: Upper CI boundary
- Type:
dict
Notes
- Confidence intervals are not calculated for the facet with the largest dimensionality (typically the interaction term containing all facets).
- Negative variance components are automatically set to zero with a warning.
- The method requires the ANOVA table to be calculated first unless a custom variance_dictionary is provided.
- Raises:
ValueError – If alpha is not between 0 and 1, ANOVA hasn’t been calculated, or if invalid parameters are provided.
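The interval formula X ± z_(α/2) × √(σ²) can be sketched with the standard library alone. This is an illustrative stand-in, not the library's implementation: the function name `normal_confidence_interval` is hypothetical, and `variance` is assumed to already be the sum of variance components adjusted by the levels coefficients.

```python
from statistics import NormalDist


def normal_confidence_interval(mean, variance, alpha=0.05):
    """Return (lower, upper) for mean ± z_(alpha/2) * sqrt(variance)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    # Assumption: a negative variance estimate is clamped to zero,
    # mirroring the behaviour described in the Notes (without the warning).
    variance = max(variance, 0.0)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * variance ** 0.5
    return mean - half_width, mean + half_width


lower, upper = normal_confidence_interval(mean=10.0, variance=4.0)
print(lower, upper)
```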
- calculate_d_study(d_study_design, **kwargs)[source]
Implement a D-Study to determine optimal facet levels based on G-Study variance components.
This method examines multiple possible study designs by generating all combinations of the provided facet levels. It calculates G-coefficients for each design scenario using the variance components from a previously conducted G-Study.
- Parameters:
d_study_design (dict) –
Dictionary where keys are facet names and values are lists of integers representing different numbers of levels to test. For example:
{
    'person': [10],       # Only testing 10 persons
    'item': [2, 3],       # Testing either 2 or 3 items
    'rater': [2, 4, 6]    # Testing 2, 4, or 6 raters
}
This would generate 6 different study designs (1×2×3 combinations).
**kwargs – Optional additional parameters:
- fixed_facets (Optional[List[str]]): Facets to be treated as fixed instead of random.
- Returns:
Results are stored in self.d_study_dict, where keys are string representations of each design scenario and values are DataFrames containing the corresponding G-coefficients.
- Return type:
None
- Raises:
ValueError – If d_study_design is not properly formatted or if required precalculations haven't been performed.
Notes
This method requires that variance components have been calculated via a G-Study
For each design scenario, new levels coefficients are calculated
All facet combinations in the original design must be maintained
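The way a d_study_design dictionary expands into individual scenarios can be sketched with itertools.product. This only illustrates the combinatorics; the helper name `enumerate_d_study_designs` is hypothetical, and the library's own scenario keys and ordering may differ.

```python
from itertools import product


def enumerate_d_study_designs(d_study_design):
    """Return one {facet: n_levels} dict per combination of levels."""
    facets = list(d_study_design)
    level_lists = [d_study_design[facet] for facet in facets]
    return [dict(zip(facets, combo)) for combo in product(*level_lists)]


scenarios = enumerate_d_study_designs(
    {"person": [10], "item": [2, 3], "rater": [2, 4, 6]}
)
print(len(scenarios))  # 6 scenarios (1 x 2 x 3)
print(scenarios[0])    # {'person': 10, 'item': 2, 'rater': 2}
```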
- g_coeffs(**kwargs)[source]
Calculate G-coefficients for various scenarios of fixed and random facets.
This method computes rho^2 (relative) and phi^2 (absolute) coefficients for each potential object of measurement in the design. The coefficients quantify the reliability of measurements across different facets.
- Parameters:
**kwargs – Optional keyword arguments:
- variance_dictionary (dict): Custom variance components to use. If provided, values must be non-negative. If not provided, components from the ANOVA table are used.
- levels_df (pd.DataFrame): Custom levels coefficients table. If not provided, self.levels_coeffs is used or calculated.
- variance_tuple_dictionary (dict): Custom variance tuple dictionary. If not provided, self.variance_tuple_dictionary is used.
- d_study (bool): If True, returns the G-coefficients DataFrame directly instead of storing it in self.g_coeffs_table. Default is False.
- error_variance (bool): If True, prints detailed information about the error variances for Tau (τ), Delta (Δ), and delta (δ) during the calculation of phi-squared and rho-squared coefficients. Default is False.
- Returns:
If d_study=True, returns the G-coefficients DataFrame directly. Otherwise, results are stored in self.g_coeffs_table and None is returned.
- Return type:
pd.DataFrame or None
- Raises:
ValueError – If:
- ANOVA table hasn’t been calculated and no variance_dictionary is provided
- Any variance component is negative
- Levels coefficients are invalid (non-square, negative values)
- Keys in variance_dictionary don’t match variance_tuple_dictionary
- Levels coefficients don’t match variance components
Notes
Negative variance components are automatically set to zero with a warning
The ‘mean’ component is removed from calculations
The method produces a DataFrame with rho^2 and phi^2 values for each facet
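For the simplest case, a p x i design with persons as the object of measurement and items random, the two coefficients reduce to the standard G-Theory ratios sketched below. This is an assumption-laden illustration rather than the library's general algorithm: the function name `pxi_g_coefficients` and the numeric inputs are hypothetical.

```python
def pxi_g_coefficients(var_p, var_i, var_pi, n_i):
    """Return (rho2, phi2) for a p x i design with persons as the object.

    var_p, var_i, var_pi are variance components; n_i is the number of
    items averaged over.
    """
    relative_error = var_pi / n_i            # delta: relative decisions
    absolute_error = (var_i + var_pi) / n_i  # Delta: absolute decisions
    rho2 = var_p / (var_p + relative_error)
    phi2 = var_p / (var_p + absolute_error)
    return rho2, phi2


# Made-up variance components, two items.
rho2, phi2 = pxi_g_coefficients(var_p=5.0, var_i=11 / 6, var_pi=0.5, n_i=2)
print(round(rho2, 3), round(phi2, 3))
```

Because the absolute error also includes the item main effect, phi^2 can never exceed rho^2 for the same inputs.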
Design Utilities
- generalizit.design_utils.create_corollary_dictionary(design_num, design_facets)[source]
Parse a research design string and return a dictionary mapping facet types (p, i, h) to their actual values based on the design pattern.
- Parameters:
design_num (int) – The design number
design_facets (list) – List of facets extracted from the design string
- Returns:
Dictionary mapping facet names from the research design to base design facets. Keys are ‘p’, ‘i’, and/or ‘h’
- Return type:
Dict[str, str]
- Raises:
ValueError – If the design pattern is invalid or can’t be parsed
- generalizit.design_utils.create_variance_tuple_dictionary(design_num, corollary_dict)[source]
Create a variance tuple dictionary for a given study design structure.
This function constructs a dictionary representing variance components for a given study design, for example Brennan Design 2 (i:p), where p and i represent facets of variation. The dictionary maps each variance component to a tuple of relevant facets.
- Parameters:
design_num (int or str) – The design number (2, 4-8) or ‘crossed’ for fully crossed designs.
corollary_dict (dict) – A dictionary mapping corollary names to actual facet names. For example, {‘p’: ‘persons’, ‘i’: ‘items’}.
- Returns:
A dictionary where keys are variance component names (strings), and values are tuples of the corresponding facets.
- Return type:
dict
Example
>>> create_variance_tuple_dictionary(2, {'p': 'persons', 'i': 'items'})
{'p': ('p',), 'i:p': ('i', 'p'), 'mean': ()}
- generalizit.design_utils.get_facets_from_variance_tuple_dictionary(variance_tuple_dict)[source]
Extracts the facets from a variance tuple dictionary.
- Parameters:
variance_tuple_dict (Dict[str, tuple]) – A dictionary where keys are variance component names (strings), and values are tuples of the corresponding facets.
- Returns:
A list of unique facets extracted from the variance tuple dictionary.
- Return type:
List[str]
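The extraction can be pictured as collecting every facet that appears in any tuple, once each. The sketch below is a hypothetical re-implementation (`unique_facets` is not a library name), and it assumes first-seen order is preserved, which the documentation above does not specify.

```python
def unique_facets(variance_tuple_dict):
    """Collect each facet once, in order of first appearance."""
    seen = []
    for facets in variance_tuple_dict.values():
        for facet in facets:
            if facet not in seen:
                seen.append(facet)
    return seen


facets = unique_facets({"p": ("p",), "i:p": ("i", "p"), "mean": ()})
print(facets)  # ['p', 'i']
```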
- generalizit.design_utils.match_research_design(input_string)[source]
Matches a string input of a research design to one of 8 predefined designs.
- Parameters:
input_string (str) – The research design input as a string. Valid operators are ‘x’ for crossing and ‘:’ for nesting. Some designs require parentheses to indicate grouping.
- Returns:
The design number (1 to 8) that matches the input, and the list of facets extracted from the design string.
- Return type:
int or str
- Raises:
ValueError – If the input string is invalid or malformed
TypeError – If the input is not a string
Examples
>>> match_research_design("persons x raters")  # Design 1
1
>>> match_research_design("items:persons")  # Design 2
2
>>> match_research_design("p x i x h")  # Design 3
3
>>> match_research_design("p x (i:h)")  # Design 4
4
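To show how facet names might be pulled out of such a design string, here is a rough sketch using a regular expression. It is not the library's parser: `extract_facet_names` is a hypothetical helper, it does not identify the design number, and it assumes no facet is literally named "x", since that token is treated as the crossing operator.

```python
import re


def extract_facet_names(design_str):
    """Return facet names in order of appearance, dropping the 'x' operator."""
    if not isinstance(design_str, str):
        raise TypeError("design string must be a str")
    # Identifiers are runs of word characters; ':' and parentheses are skipped.
    tokens = re.findall(r"[A-Za-z_]\w*", design_str)
    return [tok for tok in tokens if tok != "x"]


print(extract_facet_names("persons x raters"))  # ['persons', 'raters']
print(extract_facet_names("p x (i:h)"))         # ['p', 'i', 'h']
```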
- generalizit.design_utils.parse_facets(design_num, design_facets)[source]
Parses the facets of a research design and returns a dictionary of variance components.
This function combines the functionality of create_corollary_dictionary and create_variance_tuple_dictionary to generate a comprehensive dictionary of variance components for a given research design.
- Parameters:
design_num (Union[int, str]) – The design number or ‘crossed’ for fully crossed designs.
design_facets (list) – List of facets extracted from the design string.
- Returns:
A dictionary where keys are variance component names (strings), and values are tuples of the corresponding facets.
- Return type:
Dict[str, tuple]
- Raises:
ValueError – If the design pattern is invalid or can’t be parsed.
Example
>>> parse_facets(2, ['items', 'persons'])
{'p': ('persons',), 'i:p': ('items', 'persons'), 'mean': ()}
G-Theory Utilities
- generalizit.g_theory_utils.adjust_for_fixed_effects(variance_tup_dict, variance_df, levels_df, fixed_facets, verbose=False)[source]
Adjust variance components for fixed facets in any design. Follows Brennan’s rule 4.3.1: For every alpha, absorb any variance component with alpha and fixed facet into the lower-order component.
- Parameters:
variance_tup_dict (Dict[str, tuple]) – Dictionary mapping component names to tuples of facets.
variance_df (DataFrame) – DataFrame with index as variance component names and a ‘Variance’ column.
levels_df (DataFrame) – DataFrame of levels coefficients (1/levels).
fixed_facets (Optional[List[str]]) – List of facets to fix (e.g., [‘i’]).
verbose (bool) – If True, print debug information.
- Returns:
adjusted_variance_tup_dict (dict) – Dictionary mapping adjusted component names to tuples of facets.
adjusted_variance_df (DataFrame) – DataFrame with adjusted variance components.
- generalizit.g_theory_utils.create_pseudo_df(d_study, variance_tup_dict)[source]
Create a pseudo DataFrame with all possible combinations of facet levels.
- Parameters:
d_study (dict) – A dictionary representing the study design with facets as keys and the number of levels for each facet as values. Values can be either integers or lists of integers.
variance_tup_dict (dict) – A dictionary mapping facet names to tuples containing the component facets.
- Returns:
A pseudo DataFrame with all possible combinations of facet levels.
- Return type:
pd.DataFrame