Unbalanced and Missing Data Generalizability Analyses: A Tutorial Based on Henderson (1953) and Brennan (2001a)

Tyler J. Smith and Theresa J.B. Kline
April 8, 2025

This tutorial assumes a fairly good working knowledge of generalizability theory and the terms used in its analysis. If the terms "facet", "variance component", "generalizability", "dependability", "G-study" and "D-study" are unfamiliar, the reader is strongly advised to first review an introduction to generalizability procedures for balanced, non-missing data. There are many good choices for this, such as Briesch, Swaminathan, Welsh, and Chafouleas (2014).

While data sets with balanced facets and no missing data points are relatively straightforward to analyze for generalizability, the same is not true for data sets that do not meet these criteria. In fact, many data sets (including those used in machine learning) that researchers wish to subject to generalizability analyses are either: 1) unbalanced (having different numbers of cases in each combination/nesting of variables) and/or 2) missing data points. There have been few freely available options for analyzing such data – exceptions include the urGENOVA (Brennan, 2001b) and G-String_V (Bloch & Norman, 2012; 2023) programs. The urGENOVA program allows for both unbalanced data sets and missing data but has an esoteric user interface and does not calculate generalizability coefficients as part of the output. The G-String_V program has an intuitive user interface and produces generalizability and decision-study outputs but does not allow for missing data.

Generalizability analyses depend on variance components for their calculation. In balanced designs with no missing data, these are readily calculated from the sums of squares/mean squares generated in typical ANOVA analyses. When these criteria are not met, another approach ("analogous ANOVA") has been used (Brennan, 2001a), based on the "Method 1" approach introduced by Henderson (1953) for generating variance components in data sets that are unbalanced and/or missing data. It is mathematically simple, using cell frequency counts and squared/summed data values to calculate the variance components. The generalizability/dependability calculations can follow from there.

Henderson (1953)

The first step in this document is to "walk" the reader through the "Method 1" approach to calculating variance components introduced by Henderson (1953), the approach utilized by Brennan (2001a) in creating the urGENOVA generalizability program. Note that calculating generalizability coefficients is NOT Henderson's end goal – his goal is just to generate variance components, which indicate how the variance in a set of data is distributed among its various components (or "facets" in the vernacular of generalizability theory). It will be simplest to show how these work with an actual data set.

Table 1 below shows the data from Henderson's (1953) Table 1. Its cells contain the butterfat levels at first lactation for cows in 4 herds, inseminated by 3 different sires, over 4 years. The entries in each cell are somewhat unusual – the first number is the number of cows in that herd/sire combination for that year, and the second is the total butterfat at first lactation summed over those cows. The entry "3 – 1414" thus represents a total butterfat of 1414 summed over 3 different cows. Cows are nested within herd/sire combinations (C:HS).

Table 1. Henderson's butterfat data set "Year (A) X Herd (H) X Sire (S)"

| Herd | Sire | Year 1 | Year 2 | Year 3 | Year 4 | Total |
|------|------|--------|--------|--------|--------|-------|
| 1 | 1 | 3 – 1414 | 2 – 981 | | | 5 – 2395 |
| 1 | 2 | | 4 – 1766 | 2 – 862 | | 6 – 2628 |
| 1 | 3 | | | | 5 – 1609 | 5 – 1609 |
| 2 | 1 | 1 – 405 | 3 – 1270 | | | 4 – 1674 |
| 2 | 2 | | | 5 – 2109 | | 5 – 2109 |
| 2 | 3 | | | 4 – 1563 | 2 – 740 | 6 – 2303 |
| 3 | 1 | | 3 – 1705 | | | 3 – 1705 |
| 3 | 2 | | 4 – 2310 | 2 – 1134 | | 6 – 3444 |
| 4 | 1 | 3 – 1113 | 5 – 1951 | | | 8 – 3064 |
| 4 | 3 | | | 3 – 1291 | 6 – 2457 | 9 – 3748 |
| Total | | 7 – 2931 | 21 – 9983 | 16 – 6959 | 13 – 4806 | 57 – 24679 |

The linear model underlying this design is:

\[Y_{hijk} = \mu + A_h(\text{Year}) + H_i(\text{Herd}) + S_j(\text{Sire}) + HS_{ij}(\text{Herd} \times \text{Sire}) + \text{Error}_{hijk}\]

Assumptions: all effects other than μ are random, with means of zero and uncorrelated with one another.

Step 1: Obtain the "T" values (uncorrected sums of squares) for each facet by squaring the summed values for each level of the facet, dividing by that level's relevant sample size, and summing over levels. Henderson uses summation notation in his article; actual values, however, make the computations more concrete.

\[A \text{ (Year)} = \frac{2931^2}{7} + \frac{9983^2}{21} + \frac{6959^2}{16} + \frac{4806^2}{13} = 10{,}776{,}451\]

\[H \text{ (Herd)} = \frac{(2395+2628+1609)^2}{5+6+5} + \frac{(1674+2109+2303)^2}{4+5+6} + \frac{(1705+3444)^2}{3+6} + \frac{(3064+3748)^2}{8+9} = 10{,}893{,}666\]

\[S \text{ (Sire)} = \frac{(2395+1674+1705+3064)^2}{5+4+3+8} + \frac{(2628+2109+3444)^2}{6+5+6} + \frac{(1609+2303+3748)^2}{5+6+9} = 10{,}776{,}278\]

\[HS \text{ (Herd} \times \text{Sire)} = \frac{2395^2}{5} + \frac{2628^2}{6} + \frac{1609^2}{5} + \frac{1674^2}{4} + \frac{2109^2}{5} + \frac{2303^2}{6} + \frac{1705^2}{3} + \frac{3444^2}{6} + \frac{3064^2}{8} + \frac{3748^2}{9} = 10{,}970{,}369\]

\[\text{Mean (Henderson uses "CF" for this term)} = \frac{24679^2}{57} = 10{,}685{,}141\]

Total = 11,124,007


The Total would normally be accomplished by squaring each of the 57 individual values and summing. In this data set, that would require the individual butterfat values for EACH cow. Instead, Henderson provides only the sum of the butterfat values across the cows in each cell. For example, the cell "3 – 1414" means that 3 cows together produced 1414 units of butterfat; we DO NOT know how the 1414 is distributed among the 3 cows. If we did, we could also calculate the nested effect of cows within herd/sire combinations (C:HS), which we are not provided. Henderson does later provide a T-value for the Total (Table 6, p. 233) of 11,124,007. I tried generating this value a few different ways:

  1. Taking the overall total (24679) and dividing by 57 (= 433), then squaring this value and summing it 57 times. This gave a T-value of 10,686,873.

  2. Taking each herd value, dividing it by the number of cows in the herd, and "counting" that quotient once for each cow in the herd (57 values in all), then squaring and summing these values. This gave a T-value of 10,973,517.

  3. Taking each herd value, dividing it by the number of cows in the herd, then averaging across those quotients (= 440), squaring this value, and summing it 57 times. This gave a T-value of 11,035,200.

Clearly the actual values for each cow would need to be provided to get the actual total T value for this analysis.
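The computable T-values are easy to verify programmatically. Below is a minimal Python sketch (our own illustration, not Henderson's program; the level totals and cow counts are simply read off the margins of Table 1) that reproduces the five facet T-values:

```python
def t_value(groups):
    """Uncorrected sum of squares for a facet:
    sum of (level total)^2 / (level cow count) over the facet's levels."""
    return sum(total ** 2 / n for total, n in groups)

T_year = t_value([(2931, 7), (9983, 21), (6959, 16), (4806, 13)])  # ~ 10,776,451
T_herd = t_value([(6632, 16), (6086, 15), (5149, 9), (6812, 17)])  # ~ 10,893,666
T_sire = t_value([(8838, 20), (8181, 17), (7660, 20)])             # ~ 10,776,278
T_hs   = t_value([(2395, 5), (2628, 6), (1609, 5), (1674, 4), (2109, 5),
                  (2303, 6), (1705, 3), (3444, 6), (3064, 8), (3748, 9)])  # ~ 10,970,369
T_mean = t_value([(24679, 57)])                                    # ~ 10,685,141

for t in (T_year, T_herd, T_sire, T_hs, T_mean):
    print(round(t))
```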

Step 2: Obtain the estimated coefficients by which the variance components will be multiplied

So far, things have been straightforward. The next step is more complicated and tedious. The coefficients of μ², σ²(Year), σ²(Herd), σ²(Sire), σ²(HS), and σ²(error) are estimated from the SAMPLE SIZES that contribute to each calculated T-value. This is tedious because the data set is unbalanced and there are lots of missing data! Fortunately, Henderson provides a slow and steady summation routine that allows for the estimations. Many of the cells can be filled using the simple overall sample size of the data set (N); these are described below. The others (noted with a "?" in Table 2) have to be estimated individually. Again, Henderson provides the summation notation for calculating these terms, for those interested in seeing it.

  1. For the μ² term, ALL effects are based on the total number of data points involved in the study (denoted N).
  2. For the Total effect, all variance coefficients are also based on the total number of data points involved in the study (N).
  3. For each effect (except the Mean), the coefficient of that effect's own variance component is also N. For example, the coefficient of σ²(Year) for effect A (Year) is N.
  4. For the σ²(error) column, the coefficients equal the number of levels of the effect: Year = 4, Herd = 4, Sire = 3, and HS = 10 (there are 10 different HS combinations). For the Mean it is always 1, and for the Total it is always N.
  5. This leaves 16 "non-easy" entries that require calculation; they are numbered in Table 2.

Table 2. Henderson's T-values (uncorrected sums of squares) for each facet and coefficient sample-size bases

| Facet | T-Value | μ² | σ²(Year) | σ²(Herd) | σ²(Sire) | σ²(HS) | σ²(error) |
|---|---|---|---|---|---|---|---|
| A (Year) | 10,776,451 | N | N | ? – 1 | ? – 2 | ? – 3 | Levels of A (4) |
| H (Herd) | 10,893,666 | N | ? – 4 | N | ? – 5 | ? – 6 | Levels of H (4) |
| S (Sire) | 10,776,278 | N | ? – 7 | ? – 8 | N | ? – 9 | Levels of S (3) |
| HS (Herd × Sire) | 10,970,369 | N | ? – 10 | ? – 11 | ? – 12 | N | Combinations of HS (10) |
| Mean | 10,685,141 | N | ? – 13 | ? – 14 | ? – 15 | ? – 16 | 1 |
| Total | 11,124,007* | N | N | N | N | N | N |

*See the note above regarding where this value comes from.

Let's begin to "fill in" the 16 different "?" terms with "brute force" calculations.

  1. The \(\sigma^2(\text{Herd})\) term on Year (data are collapsed across Sires). Sample size starts by calculating the squared sum of: cows that provided data in each herd within each year, divided by the total number of cows for that year. Then sum the quotients.

\[\text{Year 1: } \frac{(3^2) + (1^2) + (0^2) + (3^2)}{7} +\]

\[\text{Year 2: } \frac{(2+4)^2 + (3^2) + ((3+4)^2) + (5^2)}{21} +\]

\[\text{Year 3: } \frac{(2^2) + ((5+4)^2) + (2^2) + (3^2)}{16} +\]

\[\text{Year 4: } \frac{(5^2) + (2^2) + (0^2) + (6^2)}{13}\]

\[= 19.51\]

  2. The \(\sigma^2(\text{Sire})\) term on Year (data are collapsed across Herds). Sample size starts by calculating the squared sum of: cows that provided data for each sire within each year, divided by the total number of cows for that year. Then sum the quotients.

\[\text{Year 1: } \frac{(3+1+0+3)^2 + (0+0+0+0)^2 + (0+0+0+0)^2}{7} +\]

\[\text{Year 2: } \frac{(2+3+3+5)^2 + (4+0+4+0)^2 + (0+0+0+0)^2}{21} +\]

\[\text{Year 3: } \frac{(0+0+0+0)^2 + (2+5+2+0)^2 + (0+4+0+3)^2}{16} +\]

\[\text{Year 4: } \frac{(0+0+0+0)^2 + (0+0+0+0)^2 + (5+2+0+6)^2}{13}\]

\[= 39.22\]

  3. The \(\sigma^2(\text{HS})\) term on Year. Sample size starts by calculating the squared sum of: cows that provided data for each of the 10 herd/sire combinations within each year, divided by the total number of cows for that year. Then sum the quotients.

\[\text{Year 1: } \frac{((3^2) + (0^2) + (0^2) + (1^2) + (0^2) + (0^2) + (0^2) + (0^2) + (3^2)+ (0^2))}{7} +\]

\[\text{Year 2: } \frac{((2^2) + (4^2) + (0^2) + (3^2) +(0^2) + (0^2) + (3^2) + (4^2) + (5^2) +(0^2))}{21} +\]

\[\text{Year 3: } \frac{((0^2) + (2^2) +(0^2) + (0^2) + (5^2) + (4^2) + (0^2) + (2^2)+ (0^2) + (3^2))}{16} +\]

\[\text{Year 4: } \frac{((0^2) + (0^2) +(5^2) + (0^2) + (0^2) + (2^2) + (0^2) + (0^2) + (0^2) + (6^2))}{13}\]

\[= 15.10\]

  4. The \(\sigma^2(\text{Year})\) term on Herd (data are collapsed across Sires). Sample size starts by calculating the squared sum of: cows that provided data in each year within each herd, divided by the total number of cows for that herd. Then sum the quotients.

\[\text{Herd 1: } \frac{((3^2) + ((2+4)^2) + (2^2) + (5^2))}{16} +\]

\[\text{Herd 2: } \frac{((1^2) + (3^2) + ((5+4)^2) + (2^2))}{15} +\]

\[\text{Herd 3: } \frac{((0^2) + ((3+4)^2) + (2^2) + (0^2))}{9} +\]

\[\text{Herd 4: } \frac{((3^2) + (5^2) + (3^2) + (6^2))}{17}\]

\[= 21.49\]

  5. The \(\sigma^2(\text{Sire})\) term on Herd (data are collapsed across Years). Sample size starts by calculating the squared sum of: cows that provided data for each sire within each herd, divided by the total number of cows for that herd. Then sum the quotients.

\[\text{Herd 1: } \frac{(3+2)^2 + (4+2)^2 + 5^2}{16} +\]

\[\text{Herd 2: } \frac{(3+1)^2 + 5^2 + (4+2)^2}{15} +\]

\[\text{Herd 3: } \frac{3^2 + (4+2)^2 + 0^2}{9} +\]

\[\text{Herd 4: } \frac{(3+5)^2 + 0^2 + (3+6)^2}{17}\]

\[= 24.04\]

  6. The \(\sigma^2(\text{HS})\) term on Herd. Sample size starts by calculating the squared sum of: cows that provided data in each herd/sire combination within each herd, divided by the total number of cows for that herd. Then sum the quotients.

\[\text{Herd 1: } \frac{((5^2) + (6^2) + (5^2))}{16} +\]

\[\text{Herd 2: } \frac{((4^2) + (5^2) + (6^2))}{15} +\]

\[\text{Herd 3: } \frac{((3^2) + (6^2) + (0^2))}{9} +\]

\[\text{Herd 4: } \frac{((8^2) + (0^2) + (9^2))}{17}\]

\[= 24.04\]

  7. The \(\sigma^2(\text{Year})\) term on Sire (data are collapsed across Herds). Sample size starts by calculating the squared sum of: cows that provided data in each year within each sire, divided by the total number of cows for that sire. Then sum the quotients.

\[\text{Sire 1: } \frac{(3+1+3)^2 + (2+3+3+5)^2 + 0^2 + 0^2}{20} +\]

\[\text{Sire 2: } \frac{0^2 + (4+4)^2 + (2+5+2)^2 + 0^2}{17} +\]

\[\text{Sire 3: } \frac{0^2 + 0^2 + (4+3)^2 + (5+2+6)^2}{20}\]

\[= 30.33\]

  8. The \(\sigma^2(\text{Herd})\) term on Sire (data are collapsed across Years). Sample size starts by calculating the squared sum of: cows that provided data in each herd within each sire, divided by the total number of cows for that sire. Then sum the quotients.

\[\text{Sire 1: } \frac{(3+2)^2 + (1+3)^2 + 3^2 + (3+5)^2}{20} +\]

\[\text{Sire 2: } \frac{(4+2)^2 + 5^2 + (4+2)^2 + 0^2}{17} +\]

\[\text{Sire 3: } \frac{5^2 + (4+2)^2 + 0^2 + (6+3)^2}{20}\]

\[= 18.51\]

  9. The \(\sigma^2(\text{HS})\) term on Sire. Sample size starts by calculating the squared sum of: cows that provided data in each herd/sire combination within each sire, divided by the total number of cows for that sire. These quotients are then summed.

Note that this is the SAME value as that for the \(\sigma^2(\text{Herd})\) effect on Sire. The difference is in how the numbers were arrived at. For the \(\sigma^2(\text{Herd})\) effect on Sire, we had to sum across the Years to get the number of cows. For the \(\sigma^2(\text{HS})\) effect we just have used the sum of cows across years (like having rolled the Years into a single column).

\[\text{Sire 1: } \frac{5^2 + 4^2 + 3^2 + 8^2}{20} +\]

\[\text{Sire 2: } \frac{6^2 + 5^2 + 6^2 + 0^2}{17} +\]

\[\text{Sire 3: } \frac{5^2 + 6^2 + 0^2 + 9^2}{20}\]

\[= 18.51\]

  10. The \(\sigma^2(\text{Year})\) term on HS. Sample size starts by calculating the squared sum of: cows that provided data in each year within each Herd/Sire combination, divided by the total number of cows for that Herd/Sire combination. Then sum the quotients.

\[\text{Herd/Sire 1: } \frac{((3^2) + (2^2) + (0^2)+(0^2))}{5} +\]

\[\text{Herd/Sire 2: } \frac{((0^2) + (4^2) + (2^2)+(0^2))}{6} +\]

\[\text{Herd/Sire 3: } \frac{((0^2) + (0^2) + (0^2)+(5^2))}{5} +\]

\[\text{Herd/Sire 4: } \frac{((1^2) + (3^2) + (0^2)+(0^2))}{4} +\]

\[\text{Herd/Sire 5: } \frac{((0^2) + (0^2) + (5^2)+(0^2))}{5} +\]

\[\text{Herd/Sire 6: } \frac{((0^2) + (0^2) + (4^2)+(2^2))}{6} +\]

\[\text{Herd/Sire 7: } \frac{((0^2) + (3^2) + (0^2)+(0^2))}{3} +\]

\[\text{Herd/Sire 8: } \frac{((0^2) + (4^2) + (2^2)+(0^2))}{6} +\]

\[\text{Herd/Sire 9: } \frac{((5^2) + (3^2) + (0^2)+(0^2))}{8} +\]

\[\text{Herd/Sire 10: } \frac{((0^2) + (0^2) + (3^2)+(6^2))}{9}\]

\[= 37.35\]

  11. The \(\sigma^2(\text{Herd})\) term on HS. Because the cows in any single Herd/Sire combination all come from one herd, each combination contributes its squared count divided by its count – that is, its count – and the quotients sum to N:

\[= \frac{5^2}{5} + \frac{6^2}{6} + \frac{5^2}{5} + \frac{4^2}{4} + \frac{5^2}{5} + \frac{6^2}{6} + \frac{3^2}{3} + \frac{6^2}{6} + \frac{8^2}{8} + \frac{9^2}{9} = 57\]

  12. The \(\sigma^2(\text{Sire})\) term on HS. Likewise, the cows in any single Herd/Sire combination were all inseminated by one sire, so each combination again contributes its own count, and the quotients sum to N:

\[= \frac{5^2}{5} + \frac{6^2}{6} + \frac{5^2}{5} + \frac{4^2}{4} + \frac{5^2}{5} + \frac{6^2}{6} + \frac{3^2}{3} + \frac{6^2}{6} + \frac{8^2}{8} + \frac{9^2}{9} = 57\]

  13. The \(\sigma^2(\text{Year})\) term on Mean. Sample size is the sum of the squared counts of cows in each year, divided by the total number of cows.

\[= \frac{((7^2) + (21^2) + (16^2) + (13^2))}{57} = 16.05\]

  14. The \(\sigma^2(\text{Herd})\) term on Mean. Sample size is the sum of the squared counts of cows in each herd, divided by the total number of cows.

\[= \frac{((16^2) + (15^2) + (9^2) + (17^2))}{57} = 14.93\]

  15. The \(\sigma^2(\text{Sire})\) term on Mean. Sample size is the sum of the squared counts of cows for each sire, divided by the total number of cows.

\[= \frac{((20^2) + (17^2) + (20^2))}{57} = 19.11\]

  16. The \(\sigma^2(\text{HS})\) term on Mean. Sample size is the sum of the squared counts of cows in each Herd/Sire combination, divided by the total number of cows.

\[= \frac{((5^2) + (6^2) + (5^2) +(4^2) + (5^2) + (6^2) + (3^2) + (6^2) + (8^2) +(9^2))}{57} = 6.19\]
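All 16 of these "?" entries follow one recipe: group the data by the levels of the facet (the row), sum the squared cow counts of each level of the effect (the column) within that group, divide by the group's total count, and sum the quotients. Below is a small, self-contained Python sketch of that recipe (the `coefficient` helper is our own illustration, not Henderson's code; only the cow counts from Table 1 are needed):

```python
from collections import defaultdict

# Cow counts from Table 1 as (herd, sire, year, n_cows); butterfat is not needed here.
counts = [
    (1, 1, 1, 3), (1, 1, 2, 2), (1, 2, 2, 4), (1, 2, 3, 2), (1, 3, 4, 5),
    (2, 1, 1, 1), (2, 1, 2, 3), (2, 2, 3, 5), (2, 3, 3, 4), (2, 3, 4, 2),
    (3, 1, 2, 3), (3, 2, 2, 4), (3, 2, 3, 2),
    (4, 1, 1, 3), (4, 1, 2, 5), (4, 3, 3, 3), (4, 3, 4, 6),
]

def coefficient(facet_key, effect_key):
    """Coefficient of sigma^2(effect) in T(facet): for each level of the facet,
    sum the squared cow counts of each effect level within it, divide by the
    facet level's total count, then sum the quotients over facet levels."""
    within = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for h, s, y, n in counts:
        f, e = facet_key(h, s, y), effect_key(h, s, y)
        within[f][e] += n
        totals[f] += n
    return sum(sum(c ** 2 for c in grp.values()) / totals[f]
               for f, grp in within.items())

year, herd = (lambda h, s, y: y), (lambda h, s, y: h)
sire, hs, mean = (lambda h, s, y: s), (lambda h, s, y: (h, s)), (lambda h, s, y: 0)

print(round(coefficient(year, herd), 2))  # "? - 1" in Table 2: 19.51
print(round(coefficient(year, sire), 2))  # "? - 2": 39.22
print(round(coefficient(hs, herd), 2))    # "? - 11": 57.0 (= N)
print(round(coefficient(mean, hs), 2))    # "? - 16": 6.19
```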

All these calculated values can now be put into Table 2 and are presented in Table 3.

Table 3. Completed Henderson's T-values (uncorrected sums of squares) and coefficient sample-size bases (contains the same data found in Table 6, p. 233, of Henderson, 1953)

| Facet | T-Value | μ² | σ²(Year) | σ²(Herd) | σ²(Sire) | σ²(HS) | σ²(error) |
|---|---|---|---|---|---|---|---|
| A (Year) | 10,776,451 | 57 | 57 | 19.51 | 39.22 | 15.10 | 4 |
| H (Herd) | 10,893,666 | 57 | 21.49 | 57 | 24.04 | 24.04 | 4 |
| S (Sire) | 10,776,278 | 57 | 30.33 | 18.51 | 57 | 18.51 | 3 |
| HS (Herd × Sire) | 10,970,369 | 57 | 37.35 | 57 | 57 | 57 | 10 |
| Mean | 10,685,141 | 57 | 16.05 | 14.93 | 19.11 | 6.19 | 1 |
| Total | 11,124,007 | 57 | 57 | 57 | 57 | 57 | 57 |

Although at this point Henderson creates a "Table 7" of equations that he says can be solved, that step is not necessary here. What is necessary is to solve the system of equations set up in Table 3: specifically, what values, multiplied by the estimated coefficients down each column, reproduce the six T-values? Those values are the variance components, labeled A, B, C, D, E, and F below.

\[ 10,776,451 = A(57) + B(57) + C(19.51) + D(39.22) + E(15.10) +F(4) \]

\[ 10,893,666 = A(57) + B(21.49) + C(57) + D(24.04) + E(24.04) +F(4) \]

\[ 10,776,278 = A(57) + B(30.33) + C(18.51) + D(57) + E(18.51) +F(3) \]

\[ 10,970,369 = A(57) + B(37.35) + C(57) + D(57) + E(57) +F(10) \]

\[ 10,685,141 = A(57) + B(16.05) + C(14.93) + D(19.11) + E(6.19) +F(1) \]

\[ 11,124,007 = A(57) + B(57) + C(57) + D(57) + E(57) +F(57) \]

Using the data in Table 3, solve these equations simultaneously via matrix procedures for a system of linear equations. Alternatively, they can be solved by regressing the 6 T-values on the 6 columns of coefficients in a direct regression (all predictors entered simultaneously, with an intercept included in the model). This produces the following output (from SPSS, although other programs will give the same results). Note that no "Mean" variance component is calculated: μ² is a constant with no variance, so its column is absorbed by the intercept and eliminated from the analysis. Since there are 6 unknowns and 6 equations, the data "fit perfectly" (in fact, Excel WILL NOT run the regression because the number of predictors equals the number of observations). This does not matter, however, as we are interested solely in the unstandardized B-values associated with each effect, which are the variance components. We now have the variance components (see Table 4) for Henderson's data.
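For readers who prefer code to SPSS, here is a minimal sketch using numpy's linear solver on the Table 3 system. Because Table 3's coefficients are rounded to two decimals, the results match Table 4 only approximately:

```python
import numpy as np

# Rows: A (Year), H (Herd), S (Sire), HS, Mean, Total (from Table 3).
# Columns: mu^2, sigma^2(Year), sigma^2(Herd), sigma^2(Sire), sigma^2(HS), sigma^2(error).
X = np.array([
    [57, 57.00, 19.51, 39.22, 15.10,  4],
    [57, 21.49, 57.00, 24.04, 24.04,  4],
    [57, 30.33, 18.51, 57.00, 18.51,  3],
    [57, 37.35, 57.00, 57.00, 57.00, 10],
    [57, 16.05, 14.93, 19.11,  6.19,  1],
    [57, 57.00, 57.00, 57.00, 57.00, 57],
])
T = np.array([10776451, 10893666, 10776278, 10970369, 10685141, 11124007])

A, B, C, D, E, F = np.linalg.solve(X, T)
# B..F ~ 763 (Year), 4531 (Herd), 1587 (Sire), -164 (HS), 2950 (error).
# A ~ 185,491 is the mu^2 term; A * 57 reproduces the SPSS Constant.
print(B, C, D, E, F)
```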

Table 4a. SPSS output of Henderson's data

| Model | B (Unstandardized) | Std. Error | Beta (Standardized) |
|---|---|---|---|
| (Constant) | 10572974.498 | .000 | |
| Year | 763.154 | .000 | .084 |
| Herd | 4531.304 | .000 | .615 |
| Sire | 1587.278 | .000 | .174 |
| HS | −164.329 | .000 | −.023 |
| Error | 2949.830 | .000 | .402 |

Table 4b. Variance Components from Henderson 1953 data

| Facet | Variance Component | Proportion of Variance |
|---|---|---|
| A (Year) | 763 | .08 |
| H (Herd) | 4531 | .46 |
| S (Sire) | 1587 | .16 |
| HS (Herd × Sire) | −164* | — |
| Error | 2950 | .30 |
| Total | 9831 | 1.00 |

*The negative HS component is treated as zero in the proportions, so the total is 763 + 4531 + 1587 + 2950 = 9831.

We can see from the results that most of the variance in butterfat (46%) is due to the Herd facet, with the Error term accounting for the next-largest share (30%). Less of the variance is due to Sire (16%) and almost none to Year (8%); the Herd × Sire component is slightly negative and is treated as zero. As noted earlier, Henderson stops at this point.

Before leaving Henderson, a few notes about his protocol are in order.

  1. It is obvious now why this approach allows for missing data and unbalanced designs when estimating variance components. While tedious, the calculation of the T-values and the coefficient sample sizes does not require balanced, complete data sets or distributional assumptions such as normality. This flexibility is a marvelous feature, insofar as naturally occurring data sets often fall short of these restrictive assumptions.
  2. ALL effects are assumed to be random. If an effect (such as Year) IS actually fixed, then the other estimates are biased. This assumption continues into Brennan's (2001a) work, and thus into the urGENOVA and G-String_V programs.
  3. The variance components are not used to create generalizability coefficients in Henderson's article. They can be, but the generalizability literature did not begin to appear until after the article was published. Brennan (2001a) took up the calculations to extend generalizability's reach to data sets that are nested, unbalanced, and/or missing data.

Brennan's (2001a) approach (he calls it "analogous ANOVA") is based on Henderson's calculation of T-values. With unbalanced or missing data, "The primary theoretical problem presented by most unbalanced designs is that there are many possible estimators of random effects variance components and no unambiguously clear basis for determining which estimators are best. Among the practical problems with unbalanced designs are that some estimation methods require distributional form assumptions, which are often difficult to justify in generalizability analyses." (Manual for urGENOVA, Brennan, 2001b, p. 1). Thus, he advises treating the variance components generated by urGENOVA, and generalizability estimates built from them, as strictly "descriptive" of the specific data set, not inferential. The urGENOVA program does not allow any computations for D-studies, nor are standard errors and confidence intervals of the variance components appropriate except for balanced designs under assumptions of normality. A shortcoming of the urGENOVA program is that IT DOES NOT produce generalizability or dependability coefficients; these have to be calculated by hand by the end user.

The G-String_V program is more restrictive in its input than urGENOVA: it does NOT allow for missing data. This simplifies the calculations (as we will see in a moment). The program's authors do not adhere to Brennan's (2001b) cautions regarding D-studies and the standard errors and confidence intervals of the variance components. In fact, in the case of unbalanced but complete data, it is relatively straightforward to estimate D-study generalizability coefficients using harmonic-mean estimates of the unbalanced effects. However, these should be taken as estimates only, given the issues raised by Brennan. The G-String_V program DOES calculate generalizability and dependability coefficients for the facet of differentiation specified by the user.

Narayanan et al. (2010) and Brennan (2001a)

With this as a backdrop, the next step in this tutorial is to use a data set from Narayanan et al. (2010) (Table 5) to calculate the variance components using Henderson's Method 1. Note that Narayanan et al. use a completely different approach to calculating generalizability in their paper, based on variances rather than on the variance components that are the hallmark of traditional approaches. These data are unbalanced, but no data points are missing. There are 3 different doctors (d), rated by 16 patients (p); patients are nested in doctors, and this facet is unbalanced (8 patients rate Doctor A, 5 rate Doctor B, and 3 rate Doctor C). All patients use the same 5 items (i) to rate their doctor. The model then is i X (p:d). The dependent variable is the rating, on a scale of 1–5, for each item.

Table 5. Narayanan et al. (2010) data set (i X p:d)

| Doctor | Patient | item1 | item2 | item3 | item4 | item5 |
|---|---|---|---|---|---|---|
| A | 1 | 4 | 4 | 4 | 4 | 4 |
| A | 2 | 4 | 4 | 3 | 3 | 4 |
| A | 3 | 3 | 4 | 4 | 3 | 4 |
| A | 4 | 3 | 3 | 3 | 3 | 3 |
| A | 5 | 3 | 3 | 4 | 4 | 3 |
| A | 6 | 4 | 4 | 3 | 3 | 3 |
| A | 7 | 3 | 3 | 3 | 3 | 3 |
| A | 8 | 4 | 4 | 3 | 3 | 3 |
| B | 9 | 4 | 4 | 4 | 4 | 3 |
| B | 10 | 4 | 4 | 4 | 4 | 4 |
| B | 11 | 4 | 4 | 4 | 4 | 4 |
| B | 12 | 4 | 3 | 3 | 3 | 3 |
| B | 13 | 4 | 4 | 3 | 3 | 3 |
| C | 14 | 4 | 4 | 3 | 3 | 3 |
| C | 15 | 3 | 3 | 3 | 3 | 3 |
| C | 16 | 4 | 4 | 4 | 4 | 3 |

Step 1: Calculating the T-values (uncorrected sums of squares) for each facet

We are going to need a number of sums and their squares for the next analyses (Table 6):

Table 6. Narayanan et al. (2010) data set with added rows and columns

| Doctor | Patient | item1 | item2 | item3 | item4 | item5 | Sum across patient ratings | Squared sum | Squared sum / #ratings |
|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 4 | 4 | 4 | 4 | 4 | 20 | 400 | 80 |
| A | 2 | 4 | 4 | 3 | 3 | 4 | 18 | 324 | 64.8 |
| A | 3 | 3 | 4 | 4 | 3 | 4 | 18 | 324 | 64.8 |
| A | 4 | 3 | 3 | 3 | 3 | 3 | 15 | 225 | 45 |
| A | 5 | 3 | 3 | 4 | 4 | 3 | 17 | 289 | 57.8 |
| A | 6 | 4 | 4 | 3 | 3 | 3 | 17 | 289 | 57.8 |
| A | 7 | 3 | 3 | 3 | 3 | 3 | 15 | 225 | 45 |
| A | 8 | 4 | 4 | 3 | 3 | 3 | 17 | 289 | 57.8 |
| B | 9 | 4 | 4 | 4 | 4 | 3 | 19 | 361 | 72.2 |
| B | 10 | 4 | 4 | 4 | 4 | 4 | 20 | 400 | 80 |
| B | 11 | 4 | 4 | 4 | 4 | 4 | 20 | 400 | 80 |
| B | 12 | 4 | 3 | 3 | 3 | 3 | 16 | 256 | 51.2 |
| B | 13 | 4 | 4 | 3 | 3 | 3 | 17 | 289 | 57.8 |
| C | 14 | 4 | 4 | 3 | 3 | 3 | 17 | 289 | 57.8 |
| C | 15 | 3 | 3 | 3 | 3 | 3 | 15 | 225 | 45 |
| C | 16 | 4 | 4 | 4 | 4 | 3 | 19 | 361 | 72.2 |
| Item rating sums | | 59 | 59 | 55 | 54 | 53 | | | |
| Item rating sums squared | | 3481 | 3481 | 3025 | 2916 | 2809 | | | |
| Item sums squared / #ratings per item | | 217.562 | 217.562 | 189.062 | 182.25 | 175.562 | | | |

  1. Doctor (d)

Sum all ratings across the items for each patient; sum the patient totals for each doctor; square these sums; divide each by the number of ratings within that doctor; add the quotients.

\[\text{Doctor A} = (20+18+18+15+17+17+15+17) = 137; \frac{137^2}{40} = \frac{18769}{40} = 469.225\]

\[\text{Doctor B} = (19+20+20+16+17) = 92; \frac{92^2}{25} = \frac{8464}{25} = 338.560\]

\[\text{Doctor C} = (17+15+19) = 51; \frac{51^2}{15} = \frac{2601}{15} = 173.400\]

Sum across all 3 quotients = \(469.225 + 338.560 + 173.400 = 981.185\)

  2. Patient:Doctor (p:d)

Sum all ratings across each item for each Patient; Square these sums; divide by the number of rating counts within each patient; sum across these quotients.

\[\text{p:d 1: } \frac{20^2}{5} = \frac{400}{5} = 80\]

\[\text{p:d 2: } \frac{18^2}{5} = \frac{324}{5} = 64.8\]

[continuing for all patients]

\[\text{p:d 15: } \frac{15^2}{5} = \frac{225}{5} = 45\]

\[\text{p:d 16: } \frac{19^2}{5} = \frac{361}{5} = 72.2\]

Sum across all 16 quotients = 989.200

  3. Item (i)

Sum all ratings down each item; Square these sums; divide by the number of rating counts within each item; sum across these quotients.

\[\text{Item 1: } \frac{59^2}{16} = \frac{3481}{16} = 217.5625\]

\[\text{Item 2: } \frac{59^2}{16} = \frac{3481}{16} = 217.5625\]

\[\text{Item 3: } \frac{55^2}{16} = \frac{3025}{16} = 189.0625\]

\[\text{Item 4: } \frac{54^2}{16} = \frac{2916}{16} = 182.250\]

\[\text{Item 5: } \frac{53^2}{16} = \frac{2809}{16} = 175.5625\]

Sum across all 5 quotients = \((217.5625+217.5625+189.0625+182.250+175.5625) = 982.000\)

  4. Doctor X Item (di)

Sum down each set of items for each doctor (there will be 15 combinations); square these sums; divide each sum by the number of ratings that go into each of the 15 combinations; sum the quotients.

\[\text{di1: } \frac{(4+4+3+3+3+4+3+4)^2}{8} = \frac{28^2}{8} = \frac{784}{8} = 98.000\]

\[\text{di2: } \frac{(4+4+4+3+3+4+3+4)^2}{8} = \frac{29^2}{8} = \frac{841}{8} = 105.125\]

[continuing for all doctor-item combinations]

\[\text{di14: } \frac{(3+3+4)^2}{3} = \frac{10^2}{3} = \frac{100}{3} = 33.333\]

\[\text{di15: } \frac{(3+3+3)^2}{3} = \frac{9^2}{3} = \frac{81}{3} = 27.000\]

Sum across all quotients = 983.808

  5. Patient X Item : Doctor (pi:d) = TOTAL

Note that this is the total uncorrected sums of squares. Each individual rating is first squared. Then these are summed across all ratings.

\[4^2 + 4^2 + 4^2 + 4^2 + 4^2 + 4^2 + 4^2 + 3^2 + 3^2 + 4^2 + ... + 3^2 + 3^2 + 3^2 + 3^2 + 3^2 + 4^2 + 4^2 + 4^2 + 4^2 + 3^2\]

Summing across all values = 1000.000

  6. Mean

Sum FIRST across all values; square this sum; divide by the total number of ratings that went into the sum.

\[\frac{(4 + 4 + 4 + 4 + 4 + ... + 3 + 3 + 3 + 3 + 3 + 4 + 4 + 4 + 4 + 3)^2}{80} = \frac{280^2}{80} = \frac{78400}{80} = 980.000\]

We can now put our facet T-values into Table 7 below.

Table 7. Facets and T-values for the Narayanan et al. (2010) data set

| Facet | T-Value |
|---|---|
| D (doctor) | 981.185 |
| P:D (patient nested in doctor) | 989.2 |
| I (item) | 982 |
| DI (doctor × item) | 983.808 |
| PI:D (patient × item nested in doctor; equivalent to Henderson's "total") | 1000 |
| Mean | 980 |
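These six T-values can be verified with a few lines of Python (a sketch of our own, not Narayanan et al.'s method), computing each facet's uncorrected sum of squares directly from Table 5:

```python
import numpy as np

# Ratings from Table 5: rows = 16 patients, columns = 5 items.
ratings = np.array([
    [4,4,4,4,4],[4,4,3,3,4],[3,4,4,3,4],[3,3,3,3,3],
    [3,3,4,4,3],[4,4,3,3,3],[3,3,3,3,3],[4,4,3,3,3],
    [4,4,4,4,3],[4,4,4,4,4],[4,4,4,4,4],[4,3,3,3,3],
    [4,4,3,3,3],[4,4,3,3,3],[3,3,3,3,3],[4,4,4,4,3],
], dtype=float)
doctor = np.array([0]*8 + [1]*5 + [2]*3)  # doctors A, B, C

def t_value(groups):
    """Uncorrected sum of squares: sum of (group total)^2 / (group size)."""
    return sum(g.sum() ** 2 / g.size for g in groups)

print(t_value(ratings[doctor == d] for d in range(3)))   # D:    981.185
print(t_value(ratings[p] for p in range(16)))            # P:D:  989.2
print(t_value(ratings[:, i] for i in range(5)))          # I:    982.0
print(t_value(ratings[doctor == d, i]
              for d in range(3) for i in range(5)))      # DI:   983.808
print((ratings ** 2).sum())                              # PI:D: 1000.0
print(ratings.sum() ** 2 / ratings.size)                 # Mean: 980.0
```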

Step 2: Calculating variance coefficients for each facet

As we know from the Henderson example, the next part is tedious. We have to get the sample sizes for the variance coefficients. Like Henderson, we will first put in the “easy” ones.

  1. For the μ² term, ALL coefficients are based on the total number of cases involved in the study (denoted N; 80 for this data set).
  2. For the PI:D facet (the Total, which includes the highest-level term plus error), all coefficients are also based on the total number of cases involved in the study (N).
  3. For each facet (except the Mean), the coefficient of the facet's own variance component is also N. For example, the coefficient of σ²(d) for facet D is N.
  4. For the σ²(pi:d) column, the coefficients equal the number of levels of the facet: D = 3, P:D = 16, I = 5, and DI = 15 (there are 15 different DI combinations). For the Mean it is always 1, and for the highest-order effect (PI:D, also known as the total by Henderson) it is always N.

Brennan (2001a, equations 7.2–7.6, p. 219) discusses how each of the cell entries is generated. These equations are difficult to "unpack", so I will use Henderson's "brute force" approach, which takes every count into consideration when calculating the terms.

When the data set is not missing any data points, Brennan uses a notation system for the coefficients in the other cells that need to be calculated. These are mostly simple; for example, n(p) is the number of levels of the p facet. In this design, n(d) = 3, n(p) = 16, and n(i) = 5. However, n(r) is a unique notation. It is based on the number of counts (individual ratings) within each DI combination (recall there are 15 of them): square these values, sum them, and divide by the number of counts in the entire data set.

\[ \frac{(8^2) + (8^2) + (8^2) + (8^2) + (8^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (3^2) + (3^2) + (3^2) + (3^2) + (3^2)}{80} = 6.125 \]

The results of Brennan's notation for this design are shown in Table 8. However, these shortcut entries only work when no data points are missing. When there are missing data, we need to go back to the Henderson approach.

Table 8. Facets and Variance Coefficients for Narayanan et al. (2010) data set with Brennan (2001a) Notation

| Facet | σ²(d) | σ²(p:d) | σ²(i) | σ²(di) | σ²(pi:d) | μ² |
|---|---|---|---|---|---|---|
| D | N | n(i)×n(d) | n(p) | n(p) | n(d) | N |
| P:D | N | N | n(p) | n(p) | n(p) | N |
| I | n(r)×n(i) | n(i) | N | n(r)×n(i) | n(i) | N |
| DI | N | n(i)×n(d) | N | N | n(i)×n(d) | N |
| PI:D (total for Henderson) | N | N | N | N | N | N |
| Mean | n(r)×n(i) | n(i) | n(p) | n(r) | 1 | N |

Henderson's approach to completing the "non-simple" cells in the coefficient matrix follows. It allows for missing data in the calculations. There are 16 unique values (numbered 1–16 in Table 9) to calculate.

Table 9. Facets and Variance Coefficients for Narayanan et al. (2010) data set with needed values 1-16 noted

| Facet | σ²(d) | σ²(p:d) | σ²(i) | σ²(di) | σ²(pi:d) | μ² |
|---|---|---|---|---|---|---|
| D | N | ? – 1 | ? – 2 | ? – 3 | n(d) | N |
| P:D | ? – 4 | N | ? – 5 | ? – 6 | n(p) | N |
| I | ? – 7 | ? – 8 | N | ? – 9 | n(i) | N |
| DI | ? – 10 | ? – 11 | ? – 12 | N | n(i)×n(d) | N |
| PI:D (total for Henderson) | N | N | N | N | N | N |
| Mean | ? – 13 | ? – 14 | ? – 15 | ? – 16 | 1 | N |

  1. The σ²(p:d) term on D (data are collapsed across Items). Sample size starts by calculating the squared sum of counts of each p:d term within each D, divided by the total number of counts for that D. Then sum the quotients.

\[\text{D1: } \frac{(5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2)}{40} = \frac{200}{40} = 5 +\]

\[\text{D2: } \frac{(5^2) + (5^2) + (5^2) + (5^2) + (5^2)}{25} = \frac{125}{25} = 5 +\]

\[\text{D3: } \frac{(5^2) + (5^2) + (5^2)}{15} = \frac{75}{15} = 5\]

\[= 15\]

  2. The σ²(i) term on D (data are collapsed across P:D). Sample size starts by calculating the squared sum of counts of each i term within each D, divided by the total number of counts for that D. Then sum the quotients.

\[\text{D1: } \frac{(8^2) + (8^2) + (8^2) + (8^2) + (8^2)}{40} = \frac{320}{40} = 8 +\]

\[\text{D2: } \frac{(5^2) + (5^2) + (5^2) + (5^2) + (5^2)}{25} = \frac{125}{25} = 5 +\]

\[\text{D3: } \frac{(3^2) + (3^2) + (3^2) + (3^2) + (3^2)}{15} = \frac{45}{15} = 3\]

\[= 16\]

  3. The σ²(di) term on D (data are collapsed across P:D). Sample size starts by calculating the squared sum of counts of each di term within each D, divided by the total number of counts for that D. Then sum the quotients.

\[\text{D1: } \frac{(8^2) + (8^2) + (8^2) + (8^2) + (8^2)}{40} = \frac{320}{40} = 8 +\]

\[\text{D2: } \frac{(5^2) + (5^2) + (5^2) + (5^2) + (5^2)}{25} = \frac{125}{25} = 5 +\]

\[\text{D3: } \frac{(3^2) + (3^2) + (3^2) + (3^2) + (3^2)}{15} = \frac{45}{15} = 3\]

\[= 16\]

  4. The σ²(d) term on P:D (data are collapsed across Items). Sample size starts by calculating the squared sum of counts of each d term within each P:D, divided by the total number of counts for that P:D. Then sum the quotients. Each of the 16 P:D cells contains 5 ratings, all from a single doctor:

\[\text{PD 1 through PD 16 (each): } \frac{5^2}{5} = 5\]

\[16 \times 5 = 80\]

  5. The σ²(i) term on P:D. Sample size starts by calculating the squared sum of counts of each i term within each P:D, divided by the total number of counts for that P:D. Then sum the quotients. Each P:D cell contains exactly one rating per item:

\[\text{PD 1 through PD 16 (each): } \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1\]

\[16 \times 1 = 16\]

  6. The σ²(di) term on P:D. Sample size starts by calculating the squared sum of counts of each di term within each P:D, divided by the total number of counts for that P:D. Then sum the quotients. Each P:D cell contains exactly one rating per doctor × item combination:

\[\text{PD 1 through PD 16 (each): } \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1\]

\[16 \times 1 = 16\]

  7. The σ²(d) term on I. Sample size starts by calculating the squared sum of counts of each d term within each I, divided by the total number of counts for that I. Then sum the quotients. Each item receives 8 ratings from Doctor A's patients, 5 from Doctor B's, and 3 from Doctor C's:

\[\text{I 1 through I 5 (each): } \frac{(8^2) + (5^2) + (3^2)}{16} = \frac{98}{16} = 6.125\]

\[5 \times 6.125 = 30.625\]

  8. The σ²(p:d) term on I. Sample size starts by calculating the squared sum of counts of each p:d term within each I, divided by the total number of counts for that I. Then sum the quotients. Each item receives exactly one rating from each of the 16 patients:

\[\text{I 1 through I 5 (each): } \frac{16 \times (1^2)}{16} = 1\]

\[5 \times 1 = 5\]

  9. The σ²(di) term on I. Sample size starts by calculating the squared sum of counts of each di term within each I, divided by the total number of counts for that I. Then sum the quotients (the counts are the same as for the σ²(d) term on I):

\[\text{I 1 through I 5 (each): } \frac{(8^2) + (5^2) + (3^2)}{16} = 6.125\]

\[5 \times 6.125 = 30.625\]

  10. The σ²(d) term on DI. Sample size starts by calculating the squared sum of counts of each d term within each DI, divided by the total number of counts for that DI. Then sum the quotients. Each DI cell contains ratings from a single doctor (8, 5, or 3 of them):

\[\text{DI 1–5 (each): } \frac{8^2}{8} = 8; \quad \text{DI 6–10 (each): } \frac{5^2}{5} = 5; \quad \text{DI 11–15 (each): } \frac{3^2}{3} = 3\]

\[(5 \times 8) + (5 \times 5) + (5 \times 3) = 80\]

  11. The σ²(p:d) term on DI. Sample size starts by calculating the squared sum of counts of each p:d term within each DI, divided by the total number of counts for that DI. Then sum the quotients. Each DI cell contains exactly one rating per patient:

\[\text{DI 1–5 (each): } \frac{8 \times (1^2)}{8} = 1; \quad \text{DI 6–10 (each): } \frac{5 \times (1^2)}{5} = 1; \quad \text{DI 11–15 (each): } \frac{3 \times (1^2)}{3} = 1\]

\[15 \times 1 = 15\]

  12. The σ²(i) term on DI. Sample size starts by calculating the squared sum of counts of each i term within each DI, divided by the total number of counts for that DI. Then sum the quotients. Each DI cell contains ratings of a single item:

\[\text{DI 1–5 (each): } \frac{8^2}{8} = 8; \quad \text{DI 6–10 (each): } \frac{5^2}{5} = 5; \quad \text{DI 11–15 (each): } \frac{3^2}{3} = 3\]

\[(5 \times 8) + (5 \times 5) + (5 \times 3) = 80\]

  13. The σ²(d) term on Mean. Sample size is the sum of the squared counts of each d term, divided by the total number of counts (N).

\[\text{Mean: } \frac{(40^2) + (25^2) + (15^2)}{80} = \frac{1600 + 625 + 225}{80} = \frac{2450}{80} = 30.625\]

  14. The σ²(p:d) term on Mean. Sample size is the sum of the squared counts of each p:d term, divided by the total number of counts (N).

\[ \text{Mean: } \frac{(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)+(5^2)}{80}\] \[ =\frac{25 \times 16}{80} = \frac{400}{80} = 5 \]

  15. The σ²(i) term on Mean. Sample size is the sum of the squared counts of each i term, divided by the total number of counts (N).

\[\text{Mean: } \frac{(16^2) + (16^2) + (16^2) + (16^2) + (16^2)}{80} = \frac{256 \times 5}{80} = \frac{1280}{80} = 16\]

  16. The σ²(di) term on Mean. Sample size is the sum of the squared counts of each di term, divided by the total number of counts (N).

\[\text{Mean: } \frac{(8^2) + (5^2)+(3^2)+(8^2)+(5^2)+(3^2)+(8^2)+(5^2)+(3^2)+(8^2)+(5^2)+(3^2)+( 8^2)+(5^2)+(3^2)}{80}\]

\[= \frac{64 \times 5 + 25 \times 5 + 9 \times 5}{80}\]

\[= \frac{320 + 125 + 45}{80}\]

\[= \frac{490}{80} = 6.125\]

We insert the calculated values into their respective cells (Table 10) and bring the T-values down to complete the table.

Table 10. Facet and Variance Coefficients for Narayanan et al. (2010) data set

| Facet | σ²(d) | σ²(p:d) | σ²(i) | σ²(di) | σ²(pi:d) | μ² | T-Value |
|---|---|---|---|---|---|---|---|
| D | 80 | 15 | 16 | 16 | 3 | 80 | 981.185 |
| P:D | 80 | 80 | 16 | 16 | 16 | 80 | 989.2 |
| I | 30.625 | 5 | 80 | 30.625 | 5 | 80 | 982 |
| DI | 80 | 15 | 80 | 80 | 15 | 80 | 983.808 |
| PI:D (total for Henderson) | 80 | 80 | 80 | 80 | 80 | 80 | 1000 |
| Mean | 30.625 | 5 | 16 | 6.125 | 1 | 80 | 980 |

Using the data in Table 10 (via matrix procedures, or by regressing the T-values on the coefficients), the variance components for the facets are calculated. Below is the output from SPSS. Because μ² is a constant and has no variance, it is eliminated from the regression analysis; however, the Constant (978.972) divided by the sample size (80) is 12.237, which is the μ² term that matrix procedures would have returned.

| Model | B (Unstandardized) | Std. Error | Beta (Standardized) |
|---|---|---|---|
| (Constant) | 978.972 | .000 | |
| D | .002 | .000 | .008 |
| P_D | .092 | .000 | .442 |
| I | .028 | .000 | .128 |
| DI | −.016 | .000 | −.071 |
| PI_D_E | .157 | .000 | .625 |
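Mirroring the Henderson example, a small numpy sketch (our own, not SPSS syntax) solves the Table 10 system directly and reproduces these coefficients:

```python
import numpy as np

# Rows: D, P:D, I, DI, PI:D (total), Mean (from Table 10).
# Columns: sigma^2(d), sigma^2(p:d), sigma^2(i), sigma^2(di), sigma^2(pi:d), mu^2.
X = np.array([
    [80,     15, 16, 16,     3,  80],
    [80,     80, 16, 16,    16,  80],
    [30.625,  5, 80, 30.625, 5,  80],
    [80,     15, 80, 80,    15,  80],
    [80,     80, 80, 80,    80,  80],
    [30.625,  5, 16, 6.125,  1,  80],
])
T = np.array([981.185, 989.2, 982.0, 983.808, 1000.0, 980.0])

vc = np.linalg.solve(X, T)
for name, v in zip(["d", "p:d", "i", "di", "pi:d", "mu^2"], vc):
    print(f"{name:>5}: {v: .3f}")
# Expected (to 3 decimals): .002, .092, .028, -.016, .157, and mu^2 = 12.237.
```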

Examining the variance components of the facets and their relative contributions to variance (Table 11), we can see that most of the variance (56%) in this data set is due to the PI:D facet, which is confounded with error. This is problematic from an interpretive standpoint, indicating that it is not possible to really understand the variance in the data set. Another 33% of the variance is due to the P:D facet – within-doctor patient ratings. Examining the raw data, we see that the doctors are rated similarly by all patients (there is very little variance in the data set to work with).

Table 11. Variance Components from Narayanan et al. (2010) data

| Facet | Variance Component | Proportion of Variance |
|---|---|---|
| D | .002 | .007 |
| P:D | .092 | .33 |
| I | .028 | .10 |
| DI | −.016** | 0 |
| PI:D | .157 | .563 |

** Negative variance component; treated as zero (see the note following Table 13).

Calculating Generalizability and Dependability (G and D) Coefficients

Using the variance component (VC) values, generalizability (G) and dependability (D) coefficients can be calculated. Most likely we would want to know "Are the patients reliable in making their ratings of the doctors?" That is, "patient" would be the facet of differentiation. We might also want to know how reliable the items are, so we can also use item as a facet of differentiation. We can also specify doctor as a facet of differentiation, but looking at the data, there is so little variance on this facet that the results are not likely to be reliable.

To calculate G and D, we need the variance components for each facet, as well as the denominators for the facets that serve in the error term (the facet(s) of generalization) when calculating G and D for each facet of differentiation. Note that these denominators differ depending on which facet is the facet of differentiation and which facet(s) are being generalized across.

The calculations for the denominators are extensions of Brennan's (2001a, p. 232) equation 7.28. The calculation modifies equation 7.28 for each unique value within the facet of differentiation and returns the harmonic mean of those values (see Appendix A for a description of the use of the harmonic mean) to give the appropriate level coefficient. The general formula for determining the levels used in G-theory calculations is as follows:

Let \(C(g, v)\) be the count of observations for level \(v\) of the variance term in question within level \(g\) of the facet of differentiation, with \(v \in \{1, \dots, V\}\).

For each g, the level is determined by the squared sum of counts over the sum of squared counts: \[ L_g = \frac{\left(\sum_{v=1}^{V}C\left(g, v\right)\right)^{2}}{\sum_{v=1}^{V}C\left(g, v\right)^{2}} \]

When the data set is balanced and complete (i.e., no missing terms) this expression simplifies to \(L_g = V\), which is equal to the count of unique combinations of facets in the variance term in question for a given facet of differentiation.

It should be straightforward for the reader to see why this is the case. Let's imagine briefly that for this design [items x (patients:doctors)] we had a balanced nesting of 5 items, 3 doctors, and 5 patients per doctor (instead of 8, 5, and 3 patients, respectively). If the facet of differentiation were 'doctors' and we were trying to determine the appropriate level for 'patients:doctors', the calculation would be as follows: \[ L_g = \frac{\left(\sum_{v=1}^{5}5\right)^{2}}{\sum_{v=1}^{5}5^{2}} = \frac{\left(5 \times 5\right)^{2}}{5 \times 5^{2}} = 5 \]

Here each C(g,v) = 5 because there are 5 items for each combination of patient:doctor, and since there are 5 patients in the balanced example v ∈ [1, 5]. Thus, in the balanced case where there are 5 patients per doctor, it makes sense that we must correct the variance(patient:doctor) by a factor of 5 when we are considering doctors as the facet of differentiation!

Notice that thus far we have only calculated the level, \(L_g\), for a single level of the facet of differentiation, \(g\). When the data set is balanced, or we are dealing with an unnested variable in an unbalanced design (e.g., "items" is unnested in our current [items x (patients:doctors)] design), each unique \(g\) will have the same level. For example, consider the unbalanced data set in question: if we take 'items' as the facet of differentiation and calculate the levels for 'p:d', it should be clear that each of the five items, \(g\), has 16 unique combinations. However, this is not the case for facets involved in the unbalanced nesting or in missing data. We must therefore extend our definition of \(L\) to account for all \(g \in G\) by taking the harmonic mean of the \(L_g\): \[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{\sum_{v=1}^{V}C\left(g, v\right)^{2}}{\left(\sum_{v=1}^{V}C\left(g, v\right)\right)^{2}}} \] The harmonic mean is necessary because we are accounting for the ratio between the counts of the variance term in question and the chosen facet of differentiation. When data are balanced and complete this expression is unnecessary, as \(L_{g_i} = L_{g_k}\) for all \(g_i, g_k \in G\), and thus \(L = L_g\). This does not hold for unbalanced or missing data, so we must correct the variance by \(L\), the harmonic mean of all \(L_g\).
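This level calculation is compact in code. The sketch below (our own helper, `level`, illustrating the formula above rather than reproducing Brennan's notation) computes \(L\) as the harmonic mean of the per-\(g\) levels:

```python
from statistics import harmonic_mean

def level(counts_per_g):
    """counts_per_g[g] is the list of counts C(g, v) over levels v of the
    variance term, for each level g of the facet of differentiation.
    Returns L: the harmonic mean over g of (sum of counts)^2 / (sum of squared counts)."""
    L_g = [sum(c) ** 2 / sum(x * x for x in c) for c in counts_per_g]
    return harmonic_mean(L_g)

# sigma^2(p:d) with D as the facet of differentiation: Doctor A has 8 patients
# with 5 ratings each, Doctor B has 5, Doctor C has 3 (Table 12 entry: 4.557).
print(round(level([[5] * 8, [5] * 5, [5] * 3]), 3))     # 4.557

# sigma^2(i x (p:d)) with D as the facet of differentiation (Table 12 entry: 22.785).
print(round(level([[1] * 40, [1] * 25, [1] * 15]), 3))  # 22.785
```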

Table 12 shows the denominators to be used in our unbalanced data set with the complete worked examples following.

Table 12. Denominators for each effect when the facet of differentiation changes when calculating G and D

| Variance Term | Denominator, Facet of Differentiation = P:D | Denominator, Facet of Differentiation = I | Denominator, Facet of Differentiation = D |
|---|---|---|---|
| σ²(d) | 1 | 2.612 | 1 |
| σ²(p:d) | 1 | 16 | 4.557 |
| σ²(i) | 5 | 1 | 5 |
| σ²(i x d) | 5 | 2.612 | 5 |
| σ²(i x (p:d)) | 5 | 16 | 22.785 |

  1. Level of the σ²(d) term on facet of differentiation P:D. Since d is included in P:D, there is a single unique D for each P:D. L = 1

  2. Level of the σ²(p:d) term on facet of differentiation P:D. Again, p:d is included in P:D, so there is a single unique P:D for each P:D. L = 1

  3. Level of the σ²(i) term on facet of differentiation P:D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(p:d, i) = 1, v = 5 : 1 count of item for each unique combination of p:d and i with i ∈ [1, 5] for each p:d.

\[ L = Lg = \frac{\left(\sum_{v=1}^{5}1\right)^{2}}{\sum_{v=1}^{5}1^{2}} = 5 \]

  1. Level of the σ²(i x d) term on facet of differentiation P:D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(p:d, i x d) = 1, v = 5: Again there is 1 count of item x doctor for each unique combination of p:d and i x d (notice that d occurs in both) with i x d ∈ [1, 5] for each p:d.

\[ L = Lg = \frac{\left(\sum_{v=1}^{5}1\right)^{2}}{\sum_{v=1}^{5}1^{2}} = 5 \]

  1. Level of the σ²(i x (p:d)) term on facet of differentiation P:D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(p:d, i x (p:d)) = 1, v = 5: Again there is 1 count of i x (p:d)) for each unique combination of p:d and i x (p:d)) (notice both d and p occurs in both) with i x (p:d)) ∈ [1, 5] for each p:d.

\[ L = Lg = \frac{\left(\sum_{v=1}^{5}1\right)^{2}}{\sum_{v=1}^{5}1^{2}} = 5 \]

  1. Level of the σ²(d) term on facet of differentiation I. The sum of counts squared and sum of squared counts can be determined as follows:

    C(i, d) = \(\left|p_{v}\right|\) , for p = [8, 5, 3] with v ∈ [1, 3]; g ∈ [1, 5]. Since i is not involved with the unbalanced nesting, all levels of i, i ∈ [1, 5], will be the same.

\[ L = L1-5 = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(8+5+3\right)^{2}}{(8^{2}+5^{2}+3^{2})} = \frac{256}{98} = 2.612 \]

  1. Level of the σ²(p:d) term on facet of differentiation I. The sum of counts squared and sum of squared counts can be determined as follows:

    C(i, p:d) = 1, v ∈ [1, 16]. For each unique i, there will be 16 unique counts of p:d.

\[ L = L1-5 = \frac{\left(\sum_{v=1}^{16}1\right)^{2}}{\sum_{v=1}^{163}1^{2}} = 16 \]

  1. Level of the σ²(i) term on facet of differentiation I. Since i is included in I, there is a single unique i for each I. L = 1

  2. Level of the σ²(i x d) term on facet of differentiation I. The sum of counts squared and sum of squared counts can be determined as follows:

    C(i, i x d) = \(\left|p_{v}\right|\) , for p = [8, 5, 3] with v ∈ [1, 3]; g ∈ [1, 5]. Since i is not involved with the unbalanced nesting, all levels of i, i ∈ [1, 5], will be the same.

\[ L = L1-5 = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(8+5+3\right)^{2}}{(8^{2}+5^{2}+3^{2})} = \frac{256}{98} = 2.612 \]

  1. Level of the σ²(i x (p:d)) term on facet of differentiation I. The sum of counts squared and sum of squared counts can be determined as follows:

    C(i, I x (p:d)) = 1, v ∈ [1, 16]. For each unique i, there will be 16 unique counts of i x(p:d).

\[ L = L1-5 = \frac{\left(\sum_{v=1}^{16}1\right)^{2}}{\sum_{v=1}^{163}1^{2}} = 16 \]

  1. Level of the σ²(d) term on facet of differentiation D. Since d is included in D, there is a single unique D for each D.

\[ L = 1 \]

  1. Level of the σ²(p:d) term on facet of differentiation D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(d, p:d) = 5, for [g =A, v ∈ [1, 8]; g=B, v ∈ [1, 5]; g=C, v ∈ [1, 3]]. Thus we have different levels of patients:doctors for each unique doctor.

\[ L_A = \frac{\left(\sum_{v=1}^{8}5\right)^{2}}{\sum_{v=1}^{8}5^{2}} = 8 \]

\[ L_B = \frac{\left(\sum_{v=1}^{5}5\right)^{2}}{\sum_{v=1}^{5}5^{2}} = 5 \]

\[ L_C = \frac{\left(\sum_{v=1}^{3}5\right)^{2}}{\sum_{v=1}^{3}5^{2}} = 3 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{3}{\frac{1}{8}+\frac{1}{5}+\frac{1}{3}} = 4.557 \]

  1. Level of the σ²(i) term on facet of differentiation D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(d, i) = \(\left|g\right|\) , for g=[8, 5, 3] and v = 5. There are g counts of each item [1, 5]. For example, there are 8 counts of items [1, 5] for doctor, d, A because there are 8 instances of 8 (by marginalizing over patients).

\[ L_A = \frac{\left(\sum_{v=1}^{5}8\right)^{2}}{\sum_{v=1}^{5}8^{2}} = 5 \]

\[ L_B = \frac{\left(\sum_{v=1}^{5}5\right)^{2}}{\sum_{v=1}^{5}5^{2}} = 5 \]

\[ L_C = \frac{\left(\sum_{v=1}^{5}3\right)^{2}}{\sum_{v=1}^{5}3^{2}} = 5 \]

\[ L = L_A = L_B = L_C = 5 \]

  1. Level of the σ²(i x d) term on facet of differentiation D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(d, i x d) = \(\left|p\right|\) , for p=[8, 5, 3] and v = 5. There are p counts of each item, v, [1, 5]. This is identical to the uncrossed items.

\[ L_A = \frac{\left(\sum_{v=1}^{5}8\right)^{2}}{\sum_{v=1}^{5}8^{2}} = 5 \]

\[ L_B = \frac{\left(\sum_{v=1}^{5}5\right)^{2}}{\sum_{v=1}^{5}5^{2}} = 5 \]

\[ L_C = \frac{\left(\sum_{v=1}^{5}3\right)^{2}}{\sum_{v=1}^{5}3^{2}} = 5 \]

\[ L = L_A = L_B = L_C = 5 \]

  1. Level of the σ²(i x (p:d)) term on facet of differentiation D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(d, i x (p:d)) = 1, for v=[pg*5] with p=[8, 5, 3]. For each doctor, there is a single unique count of items crossed with the nested patient:doctor. Thus, to obtain the level for each unique facet of differentiation, g, we must sum over v, which is patients * items or p*5 (since patients is unbalanced under doctors and items is constant 5).

\[ L_A = \frac{\left(\sum_{v=1}^{8 \times 5}1\right)^{2}}{\sum_{v=1}^{8 \times 5}1^{2}} = \frac{1600}{40} = 40 \]

\[ L_B = \frac{\left(\sum_{v=1}^{5 \times 5}1\right)^{2}}{\sum_{v=1}^{5 \times 5}1^{2}} = \frac{625}{25} = 25 \]

\[ L_C = \frac{\left(\sum_{v=1}^{3 \times 5}1\right)^{2}}{\sum_{v=1}^{3 \times 5}1^{2}} = \frac{225}{15} = 15 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{3}{\frac{1}{40}+\frac{1}{25}+\frac{1}{15}} = 22.785 \]

In Table 13, each of the generalizability coefficients (G and D; also referred to as \(E\rho^2\) and \(\Phi\)) is calculated as the ratio true variance / (true variance + error variance). "True variance" is signified by the term tau (\(\tau\)) (in general, it equals the variance of the facet of differentiation; more below), and the denominator represents \(\tau\) plus error variance. The error variances for the generalizability and dependability coefficients are referred to as \(\delta\) and \(\Delta\), respectively. Notice that for the D coefficients, more terms enter the denominator, which results in lower values for D than for G. The error variances are composed as indicated in the notes after Table 13.

For nested facets of differentiation, such as patients nested in doctors, it is imperative that the researcher understand the true variance they seek to measure. According to Brennan's (2001a) discussion of group means (section 5.3, p. 157), the generalizability coefficient (G) of the facet of differentiation (p:d) can assess the generalizability of the patient ratings for "a randomly selected doctor", giving the formula \(G(p{:}d) = \sigma^2(p{:}d)\,/\,[\sigma^2(p{:}d) + \sigma^2(pi{:}d)/L_{pi:d}]\), where \(\tau\) represents the variance component of p:d (0.092) and the error variance is 0.157/5. If, however, we want the generalizability coefficient that represents patient ratings over "all doctors", then \(\tau\) needs to include both the p:d and d variance components (\(\tau = \sigma^2(p{:}d) + \sigma^2(d)\) = 0.092 + 0.002). We will assume, in this example exercise, that we want the latter interpretation of generalizability for patient ratings.

Table 13. Calculating G and D Coefficients for Narayanan et al. (2010) data

Generalizability Coefficient (G): \(E\rho^2 = \tau/(\tau + \delta)\)

Facet of Differentiation = P:D
\(\tau = \text{VC(P:D)} + \text{VC(D)} = 0.092 + 0.002 = 0.094\)
\(\delta = \text{VC(PI:D)}/\text{Level}_{P:D}(\text{PI:D}) = 0.157/5 = 0.031\)
\(E\rho^2 = 0.094/(0.094 + 0.031) = \mathbf{0.752}\)

Facet of Differentiation = I
\(\tau = \text{VC(I)} = 0.028\)
\(\delta = \text{VC(IxD)}/\text{Level}_I(\text{IxD}) + \text{VC(PI:D)}/\text{Level}_I(\text{PI:D}) = 0/2.61 + 0.157/16 = 0.010\)
\(E\rho^2 = 0.028/(0.028 + 0.010) = \mathbf{0.737}\)

Facet of Differentiation = D
\(\tau = \text{VC(D)} = 0.002\)
\(\delta = \text{VC(P:D)}/\text{Level}_D(\text{P:D}) + \text{VC(IxD)}/\text{Level}_D(\text{IxD}) + \text{VC(PI:D)}/\text{Level}_D(\text{PI:D}) = 0.092/4.557 + 0/5 + 0.157/22.785 = 0.027\)
\(E\rho^2 = 0.002/(0.002 + 0.027) = \mathbf{0.069}\)

Dependability Coefficient (D): \(\Phi = \tau/(\tau + \Delta)\)

Facet of Differentiation = P:D
\(\Delta = (0.028/5) + (0/5) + (0.157/5) = 0.037\)
\(\Phi = 0.094/(0.094 + 0.037) = \mathbf{0.718}\)

Facet of Differentiation = I
\(\Delta = (0.002/2.61) + (0.092/16) + (0/2.61) + (0.157/16) = 0.016\)
\(\Phi = 0.028/(0.028 + 0.016) = \mathbf{0.636}\)

Facet of Differentiation = D
\(\Delta = (0.028/5) + (0.092/4.557) + (0/5) + (0.157/22.785) = 0.033\)
\(\Phi = 0.002/(0.002 + 0.033) = \mathbf{0.057}\)

Notes:

* \(\delta\) = Respective Error Variance for G = sum of all VCs (except the VC of differentiation) that contain the facet of differentiation and at least one facet of generalization, each divided by the level (effective sample size) of the respective facets of generalization

** VC = Respective Variance Component for facet of differentiation

*** \(\Delta\) = Respective Error Variance for D = sum of all VCs (except the VC of differentiation), each divided by the level (effective sample size) of the respective facets of generalization

**** Because the DI variance component term is negative (-0.016), we will substitute “0” for it when calculating the error variances for G and D, as suggested by Brennan (2001a).
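The arithmetic in Table 13 reduces to a few lines of Python. A minimal sketch, assuming the variance components and levels computed above (the helper name is illustrative):

    # Variance components from the G-study above; the negative DI component
    # is floored at zero, per note **** above.
    vc = {"d": 0.002, "p:d": 0.092, "i": 0.028, "di": 0.0, "pi:d": 0.157}

    def coefficient(tau, error_terms):
        # tau / (tau + error), error = sum of VC/level over the given terms
        error = sum(vc[term] / level for term, level in error_terms)
        return tau / (tau + error)

    tau_pd = vc["p:d"] + vc["d"]  # 0.094: p:d plus d, per the discussion above
    print(coefficient(tau_pd, [("pi:d", 5)]))                       # G ~ 0.75
    print(coefficient(tau_pd, [("i", 5), ("di", 5), ("pi:d", 5)]))  # D ~ 0.72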

It is worth pointing out, now that we have finally obtained our generalizability coefficient (G) for the P:D effect (0.752), that it is substantially different from that reported in the Narayanan et al. (2010) paper (0.91 for “unaggregated G” and 0.80 for “aggregated G”, p. 371). This is due to the completely different approach used there. We will continue to follow the more traditional approach introduced by Henderson and made popular by Brennan (2001a).

Decision Study (D-Study) Coefficients

Decision Studies utilize the variance components calculated from Generalizability Studies and manipulate the levels of facets to show how generalizability coefficients may be improved. For example, in this study we saw that the generalizability coefficient for p:d was below 0.80, and a researcher may wonder how the coefficient would change if there were more items, holding the variance components constant. With balanced and complete studies this is straightforward, as the user can simply input a dictionary of potential study combinations. For the items x (patients:doctors) design this may look like:

D_Study_Dict: {items: [6, 7, 8], patients: [5], doctors: [3]}

Such a D-Study would calculate G Coefficients for the balanced scenarios where there are 3 unique doctors with 5 unique patients per doctor, each answering 6, 7, or 8 items per study respectively. Working out the 6-item scenario, we would have the following table of pseudo counts:

Table 14: Pseudo Counts Table for Balanced D-Study (6 items, 3 doctors, 5 patients per doctor)

Doctor Patient Item
A 1 1
A 1 2
A 1 3
A 1 4
A 1 5
A 1 6

… (table con’t)

Doctor Patient Item
A 5 1
A 5 2
A 5 3
A 5 4
A 5 5
A 5 6

… (table con’t)

Doctor Patient Item
B 10 1
B 10 2
B 10 3
B 10 4
B 10 5
B 10 6

… (table con’t)

Doctor Patient Item
C 15 1
C 15 2
C 15 3
C 15 4
C 15 5
C 15 6
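Pseudo-count tables like Table 14 can be generated mechanically. A minimal Python sketch for the 6-item scenario (patient numbering follows the table above):

    from itertools import product

    # one row per doctor/patient/item; patients are numbered 1-15 so that
    # each patient remains nested under a single doctor
    rows = [(doc, 5 * d + p, i)
            for (d, doc), p, i in product(enumerate("ABC"), range(1, 6), range(1, 7))]
    print(len(rows))           # 90 rows = 3 doctors x 5 patients x 6 items
    print(rows[0], rows[-1])   # ('A', 1, 1) ('C', 15, 6)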

We would then have new levels with which to evaluate p:d (Table 15):

Table 15. Modified Levels for the D-Study {items: 6, patients: 5, doctors: 3}

Variance Term P:D Facet of Differentiation Denominator for G and D
σ²(d) 1
σ²(p:d) 1
σ²(i) 6
σ²(i x d) 6
σ²(i x (p:d)) 6

With the adjusted generalizability coefficients (Table 16):

Table 16. G Coefficients for the D-Study {items: 6, patients: 5, doctors: 3}

Generalizability Coefficient Facet of Differentiation = P:D
Generalizability (G) \(E\rho^2 = \tau/(\tau + \delta)\)

\(\tau = \text{VC(P:D)} + \text{VC(D)}\)
\(\tau = 0.092 + 0.002 = 0.094\)

\(\delta = \text{VC(PI:D)}/\text{Level}_{P:D}(\text{PI:D})\)
\(\delta = 0.157/6 = 0.026\)

\(E\rho^2 = 0.094/(0.094 + 0.026)\)
\(E\rho^2 = \mathbf{0.78}\)
Dependability (D) \(\Phi = \tau/(\tau + \Delta)\)

\(\Delta = (0.028/6) + (0/6) + (0.157/6) = 0.031\)

\(\Phi = 0.094/(0.094 + 0.031)\)
\(\Phi = \mathbf{0.752}\)
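Because only the levels change in a balanced D-Study, sweeping the number of items is a one-loop exercise. A sketch under the same assumptions (G-study variance components held fixed; 3 doctors with 5 patients each):

    vc_d, vc_pd, vc_i, vc_di, vc_pid = 0.002, 0.092, 0.028, 0.0, 0.157
    tau = vc_pd + vc_d  # 0.094

    for n_items in (5, 6, 7, 8):
        g   = tau / (tau + vc_pid / n_items)
        phi = tau / (tau + (vc_i + vc_di + vc_pid) / n_items)
        print(n_items, round(g, 3), round(phi, 3))
    # n_items = 6 reproduces Table 16: G ~ 0.78, D ~ 0.75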

We see in this example that the G and D coefficients for P:D improve with the addition of another item (and assuming that there are 5 patients rating 3 doctors), indicating that for the same conditions this study would have been more generalizable had a greater number of items been included.

However, D-Studies need not be balanced. Let’s turn our attention to the generalizability coefficients associated with the “items” facet (which indicates the effect of questionnaire items in delineating doctor quality as rated by patients) if some other aspect of the study were modified. In the G-Study, recall, we had 5 items and the following unbalanced nesting of patients under doctors: {A: 8, B: 5, C: 3}. A researcher may have 16 patients but wonder: if the number of patients nested under doctors A, B, and C were 6, 5, and 5 respectively, how would this affect the generalizability of “items”? For demonstration, let’s perform a D-Study with the following characteristics and associated pseudo counts table (Table 17):

D-Study: {items: 5, {A: 6, B: 5, C: 5}}

Table 17. Pseudo Counts for D-Study: {items: 5, {A: 6, B: 5, C: 5}}

Doctor Patient Item
A 1 1
A 1 2
A 1 3
A 1 4
A 1 5

… (table con’t)

Doctor Patient Item
A 6 1
A 6 2
A 6 3
A 6 4
A 6 5

… (table con’t)

Doctor Patient Item
B 11 1
B 11 2
B 11 3
B 11 4
B 11 5

… (table con’t)

Doctor Patient Item
C 16 1
C 16 2
C 16 3
C 16 4
C 16 5

Using the method to calculate levels presented in Table 12, these new pseudo counts would result in the following levels (Table 18) for facet of differentiation, item.

Table 18. Modified Levels for the D-Study: {items: 5, {A: 6, B: 5, C: 5}}

Variance Term I Facet of Differentiation Denominator for G and D
σ²(d) 2.98
σ²(p:d) 16
σ²(i) 1
σ²(i x d) 2.98
σ²(i x (p:d)) 16

σ²(p:d) and σ²(i x (p:d)) remain the same because there are still 16 unique combinations of p and d, despite being distributed differently. However, σ²(d) and σ²(i x d) change because of this distribution. An example calculation for σ²(d) is shown below (σ²(i x d) is identical):

C(i, d) = |\(p_v\)|, for p = [6, 5, 5] with v ∈ [1, 3]; g ∈ [1, 5]. Since i is not involved with the unbalanced nesting, all levels of i, i ∈ [1, 5], will be the same.

\[ L = L_{1} = \cdots = L_{5} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(6+5+5\right)^{2}}{(6^{2}+5^{2}+5^{2})} = \frac{256}{86} = 2.98 \]

As a small aside, the reader should notice that because the data are nearly balanced (6, 5, 5), the level of σ²(d) on facet of differentiation, i, is nearly equivalent to the balanced case (L = 3 for 3 doctors).
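The aside is easy to verify numerically:

    counts = [6, 5, 5]
    print(sum(counts) ** 2 / sum(c * c for c in counts))  # 2.976..., vs. 3.0 if balanced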

Table 19. G Coefficients for the D-Study: {items: 5, {A: 6, B: 5, C: 5}}

Coefficient Facet of Differentiation = I
Generalizability (G) \(E\rho^2 = \tau/(\tau + \delta)\)

\(\tau = \text{VC(I)} = 0.028\)

\(\delta = \text{VC(IxD)}/\text{Level}_I(\text{IxD}) + \text{VC(PI:D)}/\text{Level}_I(\text{PI:D})\)
\(\delta = 0/2.98 + 0.157/16 = 0.010\)

\(E\rho^2 = 0.028/(0.028+0.010) = \mathbf{0.737}\)
Dependability (D) \(\Phi = \tau/(\tau + \Delta)\)

\(\Delta = (0.002/2.98) + (0.092/16) + (0/2.98) + (0.157/16)\)
\(\Delta = 0.016\)

\(\Phi = 0.028/(0.028 + 0.016) = \mathbf{0.636}\)

Our generalizability coefficients, G and D, remain the same as in the prior example! This is a direct result of the fact that the D and DI variances are so low that they hardly affect the calculation (in the first example or this one); thus, for any reasonable distribution of the 16 p:d, these coefficients will remain approximately the same. Of course, the numbers of P and D may be increased to reduce these variances and improve the generalizability coefficients, but those exercises are left for the reader.

Additional notes Regarding urGENOVA and G-String_V

The urGENOVA and G-String_V programs also calculate the Estimated Mean Squares and, from those, the estimated Sums of Squares. However, neither of these is needed for calculating G and D coefficients. In addition, as Brennan (2001a) notes, because the Estimated Mean Square equations are complicated, expressions for estimators of the variance components in terms of mean squares are also complicated. It is simpler to express the estimators of the variance components with respect to T-terms.

Another Sample Data Set

Another data set (the Narayanan et al., 2010 data with changed values for Doctor B (all ratings for all items increased by 1 point) and Doctor C (all ratings for all items lowered by 1 point)) was generated (Table 20). This, not surprisingly, dramatically increases the variance on the “Doctor” facet and the generalizability coefficients based on it.

Table 20. Modified Narayanan data set for more variance on the Doctor variable

Doctor Patient item1 item2 item3 item4 item5
A 1 4 4 4 4 4
A 2 4 4 3 3 4
A 3 3 4 4 3 4
A 4 3 3 3 3 3
A 5 3 3 4 4 3
A 6 4 4 3 3 3
A 7 3 3 3 3 3
A 8 4 4 3 3 3
B 9 5 5 5 5 4
B 10 5 5 5 5 5
B 11 5 5 5 5 5
B 12 5 4 4 4 4
B 13 5 5 4 4 4
C 14 3 3 2 2 2
C 15 2 2 2 2 2
C 16 3 3 3 3 2

The resulting Table 21 shows that all the T-values change (as they would be expected to do), but the “count”-based values do not, since they depend only on the pattern of observations, not on the rating values themselves.

Table 21. Facet and Variance Coefficients for modified Narayanan data set

Facet σ²(d) σ²(p:d) σ²(i) σ²(di) σ²(pi:d) μ² T-Value
D 80 15 16 16 3 80 1103.18
P:D 80 80 16 16 16 80 1111.2
I 30.625 5 80 30.625 5 80 1053.25
DI 80 15 80 80 15 80 1105.81
PI:D (total for Henderson) 80 80 80 80 80 80 1122
Mean 30.625 5 16 6.125 1 80 1051.25

The resulting Table 22 of variance components demonstrates that while the D variance component is dramatically changed, the others remain exactly the same.

Table 22. Variance Components from modified Narayanan et al. (2010) data

Facet Variance Component Proportion of Variance
D 1.03 0.788
P:D 0.092 0.07
I 0.028 0.021
DI -0.016 0
PI:D 0.157 0.12

This is a much more easily interpretable set of variance components: Most of the variance is due to different ratings of the doctors (79%). That is, the doctors are performing differently as rated by their patients. The P:D effect is 7%, indicating that the patients are very consistent (not much variance is ascribed to them) in rating the different doctors. The items are also internally consistent, with only 2% of the variance in the data set ascribed to using different items. The resulting generalizability coefficients for the P:D and D facets go up dramatically (Table 23):

Table 23. G and D Coefficients for modified Narayanan et al. (2010) data

Generalizability Coefficient (G): \(E\rho^2 = \tau/(\tau + \delta)\)

Facet of Differentiation = P:D
\(\tau = \text{VC(P:D)} + \text{VC(D)} = 0.092 + 1.030 = 1.122\)
\(\delta = \text{VC(PI:D)}/\text{Level}_{P:D}(\text{PI:D}) = 0.157/5 = 0.031\)
\(E\rho^2 = 1.122/(1.122 + 0.031) = \mathbf{0.973}\)

Facet of Differentiation = I
\(\tau = \text{VC(I)} = 0.028\)
\(\delta = \text{VC(IxD)}/\text{Level}_I(\text{IxD}) + \text{VC(PI:D)}/\text{Level}_I(\text{PI:D}) = 0/2.61 + 0.157/16 = 0.010\)
\(E\rho^2 = 0.028/(0.028 + 0.010) = \mathbf{0.737}\)

Facet of Differentiation = D
\(\tau = \text{VC(D)} = 1.030\)
\(\delta = \text{VC(P:D)}/\text{Level}_D(\text{P:D}) + \text{VC(IxD)}/\text{Level}_D(\text{IxD}) + \text{VC(PI:D)}/\text{Level}_D(\text{PI:D}) = 0.092/4.557 + 0/5 + 0.157/22.785 = 0.027\)
\(E\rho^2 = 1.030/(1.030 + 0.027) = \mathbf{0.974}\)

Dependability Coefficient (D): \(\Phi = \tau/(\tau + \Delta)\)

Facet of Differentiation = P:D
\(\Delta = (0.028/5) + (0/5) + (0.157/5) = 0.037\)
\(\Phi = 1.122/(1.122 + 0.037) = \mathbf{0.968}\)

Facet of Differentiation = I
\(\Delta = (1.030/2.61) + (0.092/16) + (0/2.61) + (0.157/16) = 0.410\)
\(\Phi = 0.028/(0.028 + 0.410) = \mathbf{0.064}\)

Facet of Differentiation = D
\(\Delta = (0.028/5) + (0.092/4.557) + (0/5) + (0.157/22.785) = 0.033\)
\(\Phi = 1.030/(1.030 + 0.033) = \mathbf{0.969}\)

Missing Data

Up to this point, there have been no missing data points in the calculations. However, many data sets do have missing data (in fact Henderson’s 1953 data set is mostly missing data). The procedure for calculating the variance components and the denominators for calculating G and D values is presented next. Note that NO Decision-Study follow-ups can be done with missing data sets. We will use Narayanan et al.’s original (2010) data set again to go through the example, with the following change: there is 1 data point missing for each “doctor” (marked “missing” in Table 24). Note that the “counts” of ratings that go into the denominator of the “Squared sums” now VARY rather than remaining constant across the facet.

Table 24. Narayanan et al. (2010) data set with 3 missing data points.

Doctor Patient item1 item2 item3 item4 item5 Sum across Patient Ratings Squared sum across Patient Ratings Squared sum across patient ratings/#ratings each completed
A 1 missing 4 4 4 4 16 256 64
A 2 4 4 3 3 4 18 324 64.8
A 3 3 4 4 3 4 18 324 64.8
A 4 3 3 3 3 3 15 225 45
A 5 3 3 4 4 3 17 289 57.8
A 6 4 4 3 3 3 17 289 57.8
A 7 3 3 3 3 3 15 225 45
A 8 4 4 3 3 3 17 289 57.8
B 9 4 4 4 4 3 19 361 72.2
B 10 4 missing 4 4 4 16 256 64
B 11 4 4 4 4 4 20 400 80
B 12 4 3 3 3 3 16 256 51.2
B 13 4 4 3 3 3 17 289 57.8
C 14 4 4 3 3 3 17 289 57.8
C 15 3 3 missing 3 3 12 144 36
C 16 4 4 4 4 3 19 361 72.2
Item Rating Sums 55 55 52 54 53
Item Rating sums squared 3025 3025 2704 2916 2809
Item sums squared/#ratings for each item 201.667 201.667 180.267 182.250 175.563

Next we calculate the T-Values for each facet:

Calculating T-Values for Each Facet for Missing Data

  1. Doctor (D)

Sum all ratings across each item for each Patient; Sum the individual patient ratings for each doctor; square these sums; divide by the number of rating counts within each doctor; add the quotients.

\[\text{Doctor A} = \frac{(16+18+18+15+17+17+15+17)^2}{39} = 453.564\]

\[\text{Doctor B} = \frac{(19+16+20+16+17)^2}{24} = 322.667\]

\[\text{Doctor C} = \frac{(17+12+19)^2}{14} = 164.571\]

\[\text{Sum across all 3 quotients} = 940.802\]

  2. Patient:Doctor (P:D)

Sum all ratings across each item for each Patient; Square these sums; divide by the number of rating counts within each patient; sum across these quotients.

\[\text{p:d 1} = \frac{16^2}{4} = \frac{256}{4} = 64\]

\[\text{p:d 2} = \frac{18^2}{5} = \frac{324}{5} = 64.8\]

\[\ldots\]

\[\text{p:d 15} = \frac{12^2}{4} = \frac{144}{4} = 36\]

\[\text{p:d 16} = \frac{19^2}{5} = \frac{361}{5} = 72.2\]

\[\text{Sum across all 16 quotients} = 948.2\]

  3. Item (I)

Sum all ratings down each item; Square these sums; divide by the number of rating counts within each item; sum across these quotients.

\[\text{Item 1} = \frac{55^2}{15} = 201.667\]

\[\text{Item 2} = \frac{55^2}{15} = 201.667\]

\[\text{Item 3} = \frac{52^2}{15} = 180.267\]

\[\text{Item 4} = \frac{54^2}{16} = 182.250\]

\[\text{Item 5} = \frac{53^2}{16} = 175.563\]

\[\text{Sum across all 5 quotients} = 941.413\]

  4. Doctor X Item (DI)

Sum down each set of items for each doctor (there will be 15 combinations); square these sums; divide each sum by the number of ratings that go into each of the 15 combinations; sum the quotients.

\[\text{di1} = (4+3+3+3+4+3+4) = 24; \frac{24^2}{7} = \frac{576}{7} = 82.286\]

\[\text{di2} = (4+4+4+3+3+4+3+4) = 29; \frac{29^2}{8} = \frac{841}{8} = 105.125\]

\[\ldots\]

\[\text{di6} = (4+4+4+4+4) = 20; \frac{20^2}{5} = \frac{400}{5} = 80.000\]

\[\text{di7} = (4+4+3+4) = 15; \frac{15^2}{4} = \frac{225}{4} = 56.250\]

\[\ldots\]

\[\text{di14} = (3+3+4) = 10; \frac{10^2}{3} = \frac{100}{3} = 33.333\]

\[\text{di15} = (3+3+3) = 9; \frac{9^2}{3} = \frac{81}{3} = 27.000\]

\[\text{Sum across all quotients} = 943.311\]

  5. Patient X Item:Doctor (PI:D)

Note that this is the total uncorrected sums of squares. Each individual rating is first squared. Then these are summed across all ratings.

Going across each row:

\[0^2 + 4^2 + 4^2 + 4^2 + 4^2 + 4^2 + 4^2 + 4^2 + 4^2 + 3^2 + \ldots + 3^2 + 3^2 + 0^2 + 3^2 + 3^2 + 4^2 + 4^2 + 4^2 + 4^2 + 3^2\]

\[\text{Summing across all values} = 959.00\]

  6. Mean

Sum FIRST across all values; square this sum; divide by the total number of ratings that went into the sum.

\[\frac{(0 + 4 + 4 + 4 + 4 + 4 + 4 + 4 + 4 + 3 + \ldots + 3 + 3 + 0 + 3 + 3 + 4 + 4 + 4 + 4 + 3)^2}{77} = \frac{269^2}{77} = 939.753\]

We can now put our Facet T-values into Table 25.

Table 25. Facets and T-values for the Narayanan data set

Facet T-Value
D (doctor) 940.802
P:D (patient nested in doctor) 948.2
I (item) 941.413
DI (doctor X item) 943.311
PI:D (patient X item nested in doctor) 959
Mean 939.753
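All six T-values follow one recipe: sum, over the cells defined by the facet, of (cell sum)² divided by the cell count. They can therefore be computed mechanically from a long-format table in which missing cells are simply absent. A minimal sketch using pandas (the column names are illustrative):

    import pandas as pd

    # Table 24 in long format; None marks the three missing cells, which are
    # dropped so that each cell count reflects the data actually present.
    raw = {
        ("A", 1): [None, 4, 4, 4, 4], ("A", 2): [4, 4, 3, 3, 4],
        ("A", 3): [3, 4, 4, 3, 4],    ("A", 4): [3, 3, 3, 3, 3],
        ("A", 5): [3, 3, 4, 4, 3],    ("A", 6): [4, 4, 3, 3, 3],
        ("A", 7): [3, 3, 3, 3, 3],    ("A", 8): [4, 4, 3, 3, 3],
        ("B", 9): [4, 4, 4, 4, 3],    ("B", 10): [4, None, 4, 4, 4],
        ("B", 11): [4, 4, 4, 4, 4],   ("B", 12): [4, 3, 3, 3, 3],
        ("B", 13): [4, 4, 3, 3, 3],   ("C", 14): [4, 4, 3, 3, 3],
        ("C", 15): [3, 3, None, 3, 3], ("C", 16): [4, 4, 4, 4, 3],
    }
    df = pd.DataFrame(
        [(d, p, i + 1, r)
         for (d, p), ratings in raw.items()
         for i, r in enumerate(ratings) if r is not None],
        columns=["doctor", "patient", "item", "rating"],
    )

    def t_value(df, facets):
        # sum over cells of (cell sum)^2 / cell count for the given facets
        if not facets:  # the grand mean: one cell containing all the data
            return df["rating"].sum() ** 2 / len(df)
        g = df.groupby(facets)["rating"]
        return (g.sum() ** 2 / g.count()).sum()

    print(t_value(df, ["doctor"]))             # 940.802 (D)
    print(t_value(df, ["doctor", "patient"]))  # 948.200 (P:D)
    print(t_value(df, ["item"]))               # 941.413 (I)
    print(t_value(df, ["doctor", "item"]))     # 943.311 (DI)
    print((df["rating"] ** 2).sum())           # 959     (PI:D)
    print(t_value(df, []))                     # 939.753 (Mean)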

As always, obtaining the “counts” that go into the analysis of the variance components is tedious. It becomes even more tedious when there are missing values. Let’s begin….

As before, we first put in the “easy” ones.

  1. For the μ² term, ALL coefficients are based on the total number of data points involved in the study (denoted N; 77 for this data set).
  2. For the PI:D (Total which includes the highest level term plus error) facet, all coefficients are also based on the total number of data points involved in the study (N).
  3. For each facet (except the Mean), the coefficient for that variance is also the total number of data points involved in the study (N). For example, the coefficient for σ²(d) for effect D is N.
  4. For the σ²(pi:d) term column, the sample sizes are equal to the number of levels of that facet (D = 3, P:D = 16, I = 5, DI = 15 (there are 15 different DI combinations)); for the Mean it is always 1, and for the highest-order effect (PI:D, referred to by Henderson as the total) it is always N.

This leaves 16 “?” values to “fill in” (Table 26).

Table 26. “Counts” for use in calculating the Variance Components with Narayanan et al. (2010) with missing data

effect counts D counts P:D counts I counts DI counts PI:D counts μ² T-Values
D 77 ? ? ? 3 77 940.802
P:D ? 77 ? ? 16 77 948.200
I ? ? 77 ? 5 77 941.413
DI ? ? ? 77 15 77 943.311
PI:D,e 77 77 77 77 77 77 959.000
Mean ? ? ? ? 1 77 939.753
  1. The σ²(p:d) term on D (data are collapsed across Items). Sample size is found by summing the squared counts of each p:d term within each D and dividing by the total number of counts for that D, then summing the quotients.

\[ \text{D1}: \frac{(4^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2)}{39} = \frac{191}{39} = 4.897 \]

\[ \text{D2}: \frac{(5^2) + (4^2) + (5^2) + (5^2) + (5^2)}{24} = \frac{116}{24} = 4.833 \]

\[ \text{D3}: \frac{(5^2) + (4^2) + (5^2)}{14} = \frac{66}{14} = 4.714 \]

\[ \text{Sum} = 4.897 + 4.833 + 4.714 = 14.444 \]

  2. The σ²(i) term on D (data are collapsed across P:D). Sample size is found by summing the squared counts of each i term within each D and dividing by the total number of counts for that D, then summing the quotients.

\[ \text{D}_1: \frac{(7^2) + (8^2) + (8^2) + (8^2) + (8^2)}{39} = \frac{305}{39} = 7.820 \]

\[ \text{D}_2: \frac{(5^2) + (4^2) + (5^2) + (5^2) + (5^2)}{24} = \frac{116}{24} = 4.833 \]

\[ \text{D}_3: \frac{(3^2) + (3^2) + (2^2) + (3^2) + (3^2)}{14} = \frac{40}{14} = 2.857 \]

\[ \text{Sum} = 7.820 + 4.833 + 2.857 = 15.510 \]

  3. The σ²(di) term on D (data are collapsed across P:D). Sample size is found by summing the squared counts of each di term within each D and dividing by the total number of counts for that D, then summing the quotients.

\[ \text{D}_1: \frac{(7^2) + (8^2) + (8^2) + (8^2) + (8^2)}{39} = \frac{305}{39} = 7.820 \]

\[ \text{D}_2: \frac{(5^2) + (4^2) + (5^2) + (5^2) + (5^2)}{24} = \frac{116}{24} = 4.833 \]

\[ \text{D}_3: \frac{(3^2) + (3^2) + (2^2) + (3^2) + (3^2)}{14} = \frac{40}{14} = 2.857 \]

\[ \text{Sum} = 7.820 + 4.833 + 2.857 = 15.510 \]

  4. The σ²(d) term on P:D (data are collapsed across Items). Sample size is found by summing the squared counts of each d term within each P:D and dividing by the total number of counts for that P:D, then summing the quotients.

\[ \text{PD}_1 = \frac{(4^2)}{4} = 4 \]

\[ \text{PD}_2 = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_3 = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_4 = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_5 = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_6 = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_7 = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_8 = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_9 = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_{10} = \frac{(4^2)}{4} = 4 \]

\[ \text{PD}_{11} = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_{12} = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_{13} = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_{14} = \frac{(5^2)}{5} = 5 \]

\[ \text{PD}_{15} = \frac{(4^2)}{4} = 4 \]

\[ \text{PD}_{16} = \frac{(5^2)}{5} = 5 \]

\[ \text{Sum} = 77 \]

  5. The σ²(i) term on P:D. Sample size is found by summing the squared counts of each i term within each P:D and dividing by the total number of counts for that P:D, then summing the quotients.

\[ \text{PD}_1 = \frac{(1^2) + (1^2) + (1^2) + (1^2)}{4} = 1 \]

\[ \text{PD}_2 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_3 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_4 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_5 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_6 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_7 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_8 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_9 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{10} = \frac{(1^2) + (1^2) + (1^2) + (1^2)}{4} = 1 \]

\[ \text{PD}_{11} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{12} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{13} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{14} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{15} = \frac{(1^2) + (1^2) + (1^2) + (1^2)}{4} = 1 \]

\[ \text{PD}_{16} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{Sum} = 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 = 16 \]

  6. The σ²(di) term on P:D. Sample size is found by summing the squared counts of each di term within each P:D and dividing by the total number of counts for that P:D, then summing the quotients.

\[ \text{PD}_1 = \frac{(1^2) + (1^2) + (1^2) + (1^2)}{4} = 1 \]

\[ \text{PD}_2 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_3 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_4 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_5 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_6 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_7 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_8 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_9 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{10} = \frac{(1^2) + (1^2) + (1^2) + (1^2)}{4} = 1 \]

\[ \text{PD}_{11} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{12} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{13} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{14} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{PD}_{15} = \frac{(1^2) + (1^2) + (1^2) + (1^2)}{4} = 1 \]

\[ \text{PD}_{16} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{Sum} = 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 = 16 \]

  7. The σ²(d) term on I. Sample size is found by summing the squared counts of each d term within each I and dividing by the total number of counts for that I, then summing the quotients.

\[ \text{I}_1 = \frac{(7^2) + (5^2) + (3^2)}{15} = \frac{83}{15} = 5.533 \]

\[ \text{I}_2 = \frac{(8^2) + (4^2) + (3^2)}{15} = \frac{89}{15} = 5.933 \]

\[ \text{I}_3 = \frac{(8^2) + (5^2) + (2^2)}{15} = \frac{93}{15} = 6.200 \]

\[ \text{I}_4 = \frac{(8^2) + (5^2) + (3^2)}{16} = \frac{98}{16} = 6.125 \]

\[ \text{I}_5 = \frac{(8^2) + (5^2) + (3^2)}{16} = \frac{98}{16} = 6.125 \]

\[ \text{Sum} = 5.533 + 5.933 + 6.200 + 6.125 + 6.125 = 29.916 \]

  8. The σ²(p:d) term on I. Sample size is found by summing the squared counts of each p:d term within each I and dividing by the total number of counts for that I, then summing the quotients.

\[ \text{I}_1 = \frac{\sum_{v=1}^{15}(1^2)}{15} = \frac{15}{15} = 1 \]

\[ \text{I}_2 = \frac{\sum_{v=1}^{15}(1^2)}{15} = \frac{15}{15} = 1 \]

\[ \text{I}_3 = \frac{\sum_{v=1}^{15}(1^2)}{15} = \frac{15}{15} = 1 \]

\[ \text{I}_4 = \frac{\sum_{v=1}^{16}(1^2)}{16} = \frac{16}{16} = 1 \]

\[ \text{I}_5 = \frac{\sum_{v=1}^{16}(1^2)}{16} = \frac{16}{16} = 1 \]

\[ \text{Sum} = 1 + 1 + 1 + 1 + 1 = 5 \]

  9. The σ²(di) term on I. Sample size is found by summing the squared counts of each di term within each I and dividing by the total number of counts for that I, then summing the quotients.

\[ \text{I}_1 = \frac{(7^2) + (5^2) + (3^2)}{15} = \frac{83}{15} = 5.533 \]

\[ \text{I}_2 = \frac{(8^2) + (4^2) + (3^2)}{15} = \frac{89}{15} = 5.933 \]

\[ \text{I}_3 = \frac{(8^2) + (5^2) + (2^2)}{15} = \frac{93}{15} = 6.200 \]

\[ \text{I}_4 = \frac{(8^2) + (5^2) + (3^2)}{16} = \frac{98}{16} = 6.125 \]

\[ \text{I}_5 = \frac{(8^2) + (5^2) + (3^2)}{16} = \frac{98}{16} = 6.125 \]

\[ \text{Sum} = 5.533 + 5.933 + 6.200 + 6.125 + 6.125 = 29.916 \]

  10. The σ²(d) term on DI. Sample size is found by summing the squared counts of each d term within each DI and dividing by the total number of counts for that DI, then summing the quotients.

\[ \text{DI}_1 = \frac{7^2}{7} = 7 \]

\[ \text{DI}_2 = \frac{8^2}{8} = 8 \]

\[ \text{DI}_3 = \frac{8^2}{8} = 8 \]

\[ \text{DI}_4 = \frac{8^2}{8} = 8 \]

\[ \text{DI}_5 = \frac{8^2}{8} = 8 \]

\[ \text{DI}_6 = \frac{5^2}{5} = 5 \]

\[ \text{DI}_7 = \frac{4^2}{4} = 4 \]

\[ \text{DI}_8 = \frac{5^2}{5} = 5 \]

\[ \text{DI}_9 = \frac{5^2}{5} = 5 \]

\[ \text{DI}_{10} = \frac{5^2}{5} = 5 \]

\[ \text{DI}_{11} = \frac{3^2}{3} = 3 \]

\[ \text{DI}_{12} = \frac{3^2}{3} = 3 \]

\[ \text{DI}_{13} = \frac{2^2}{2} = 2 \]

\[ \text{DI}_{14} = \frac{3^2}{3} = 3 \]

\[ \text{DI}_{15} = \frac{3^2}{3} = 3 \]

\[ \text{Sum} = 7 + 8 + 8 + 8 + 8 + 5 + 4 + 5 + 5 + 5 + 3 + 3 + 2 + 3 + 3 = 77 \]

  11. The σ²(p:d) term on DI. Sample size is found by summing the squared counts of each p:d term within each DI and dividing by the total number of counts for that DI, then summing the quotients.

\[ \text{DI}_1 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{7} = 1 \]

\[ \text{DI}_2 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{8} = 1 \]

\[ \text{DI}_3 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{8} = 1 \]

\[ \text{DI}_4 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{8} = 1 \]

\[ \text{DI}_5 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{8} = 1 \]

\[ \text{DI}_6 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{DI}_7 = \frac{(1^2) + (1^2) + (1^2) + (1^2)}{4} = 1 \]

\[ \text{DI}_8 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{DI}_9 = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{DI}_{10} = \frac{(1^2) + (1^2) + (1^2) + (1^2) + (1^2)}{5} = 1 \]

\[ \text{DI}_{11} = \frac{(1^2) + (1^2) + (1^2)}{3} = 1 \]

\[ \text{DI}_{12} = \frac{(1^2) + (1^2) + (1^2)}{3} = 1 \]

\[ \text{DI}_{13} = \frac{(1^2) + (1^2)}{2} = 1 \]

\[ \text{DI}_{14} = \frac{(1^2) + (1^2) + (1^2)}{3} = 1 \]

\[ \text{DI}_{15} = \frac{(1^2) + (1^2) + (1^2)}{3} = 1 \]

\[ \text{Sum} = 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 = 15 \]

  12. The σ²(i) term on DI. Sample size is found by summing the squared counts of each i term within each DI and dividing by the total number of counts for that DI, then summing the quotients.

\[ \text{DI}_1 = \frac{7^2}{7} = 7 \]

\[ \text{DI}_2 = \frac{8^2}{8} = 8 \]

\[ \text{DI}_3 = \frac{8^2}{8} = 8 \]

\[ \text{DI}_4 = \frac{8^2}{8} = 8 \]

\[ \text{DI}_5 = \frac{8^2}{8} = 8 \]

\[ \text{DI}_6 = \frac{5^2}{5} = 5 \]

\[ \text{DI}_7 = \frac{4^2}{4} = 4 \]

\[ \text{DI}_8 = \frac{5^2}{5} = 5 \]

\[ \text{DI}_9 = \frac{5^2}{5} = 5 \]

\[ \text{DI}_{10} = \frac{5^2}{5} = 5 \]

\[ \text{DI}_{11} = \frac{3^2}{3} = 3 \]

\[ \text{DI}_{12} = \frac{3^2}{3} = 3 \]

\[ \text{DI}_{13} = \frac{2^2}{2} = 2 \]

\[ \text{DI}_{14} = \frac{3^2}{3} = 3 \]

\[ \text{DI}_{15} = \frac{3^2}{3} = 3 \]

\[ \text{Sum} = 7 + 8 + 8 + 8 + 8 + 5 + 4 + 5 + 5 + 5 + 3 + 3 + 2 + 3 + 3 = 77 \]

  13. The σ²(d) term on the Mean. Sample size is found by summing the squared counts of each d term across the whole data set and dividing by the total number of counts (N).

\[ \text{Mean} = \frac{(39^2) + (24^2) + (14^2)}{77} = \frac{1521 + 576 + 196}{77} = \frac{2293}{77} = 29.779 \]

  14. The σ²(p:d) term on the Mean. Sample size is found by summing the squared counts of each p:d term across the whole data set and dividing by the total number of counts (N).

\[ \text{Mean} = \frac{(4^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (4^2) + (5^2) + (5^2) + (5^2) + (5^2) + (5^2) + (4^2) + (5^2)}{77} \]

\[ = \frac{(16 \times 3) + (25 \times 13)}{77} \]

\[ = \frac{48 + 325}{77} \]

\[ = \frac{373}{77} = 4.844 \]

  15. The σ²(i) term on the Mean. Sample size is found by summing the squared counts of each i term across the whole data set and dividing by the total number of counts (N).

\[ \text{Mean} = \frac{(15^2) + (15^2) + (15^2) + (16^2) + (16^2)}{77} = \frac{(225 \times 3) + (256 \times 2)}{77} = \frac{675 + 512}{77} = \frac{1187}{77} = 15.416 \]

  16. The σ²(di) term on the Mean. Sample size is found by summing the squared counts of each di term across the whole data set and dividing by the total number of counts (N).

\[ \text{Mean} = \frac{(7^2) + (5^2) + (3^2) + (8^2) + (4^2) + (3^2) + (8^2) + (5^2) + (2^2) + (8^2) + (5^2) + (3^2) + (8^2) + (5^2) + (3^2)}{77} \]

\[ = \frac{(64 \times 4) + (25 \times 4) + (9 \times 4) + 49 + 16 + 4}{77} \]

\[ = \frac{256 + 100 + 36 + 69}{77} \]

\[ = \frac{461}{77} = 5.987 \]
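Each of the sixteen counts above follows the same recipe as well: group the data by the row effect, then within each group sum the squared cell counts of the column effect and divide by the group’s total count. A sketch reusing the long-format df built in the T-value sketch above (the helper name is ours):

    def count_value(df, row_facets, col_facets):
        # for each level of row_facets: (sum of squared col-facet cell counts)
        # divided by that level's total count; then sum the quotients
        if not row_facets:  # the Mean row: one group containing all the data
            cell_counts = df.groupby(col_facets).size()
            return (cell_counts ** 2).sum() / len(df)
        total = 0.0
        for _, group in df.groupby(row_facets):
            cell_counts = group.groupby(col_facets).size()
            total += (cell_counts ** 2).sum() / len(group)
        return total

    print(count_value(df, ["doctor"], ["doctor", "patient"]))  # 14.444 (step 1)
    print(count_value(df, ["item"], ["doctor"]))               # 29.916 (step 7)
    print(count_value(df, [], ["doctor", "item"]))             # 5.987  (step 16)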

We can now insert the calculated values into our “?” cells (Table 27) and, using matrix equations or regression, solve for the variance components.

Table 27. Completed “Counts” and Variance Components for Narayanan (2010) with missing data

effect counts D counts P:D counts I counts DI counts PI:D counts μ² T-Values Variance Component
D 77 14.444 15.510 15.510 3 77 940.802 0.001
P:D 77 77 16 16 16 77 948.200 0.083
I 29.916 5 77 29.916 5 77 941.413 0.021
DI 77 15 77 77 15 77 943.311 -0.014
PI:D,e 77 77 77 77 77 77 959.000 0.170
Mean 29.779 4.844 15.416 5.987 1 77 939.753 12.1936
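Given the completed counts matrix and T-values, the variance components in Table 27 are the solution of a 6 × 6 linear system (one equation per effect; the unknowns are the five variance components plus μ², which is why the Mean row’s last entry is 12.1936). A minimal numpy sketch:

    import numpy as np

    # Rows: D, P:D, I, DI, PI:D, Mean. Columns: sigma^2(d), sigma^2(p:d),
    # sigma^2(i), sigma^2(di), sigma^2(pi:d), mu^2 -- the counts of Table 27.
    A = np.array([
        [77,     14.444, 15.510, 15.510,  3, 77],
        [77,     77,     16,     16,     16, 77],
        [29.916,  5,     77,     29.916,  5, 77],
        [77,     15,     77,     77,     15, 77],
        [77,     77,     77,     77,     77, 77],
        [29.779,  4.844, 15.416,  5.987,  1, 77],
    ])
    t = np.array([940.802, 948.200, 941.413, 943.311, 959.000, 939.753])

    print(np.linalg.solve(A, t))
    # ~ [0.001, 0.083, 0.021, -0.014, 0.170, 12.194]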

We now need the denominators to use when calculating G and D coefficients. They are calculated using the same approach as that used for unbalanced designs with the results in Table 28.

  1. Level of the σ²(d) term on facet of differentiation P:D. Since d is included in P:D, there is a single unique D for each P:D. L = 1

  2. Level of the σ²(p:d) term on facet of differentiation P:D. Again p:d is included in P:D, there is a single unique P:D for each P:D. L = 1

  3. Level of the σ²(i) term on facet of differentiation P:D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(p:d, i) = 1, v = \(\left|items_{g}\right|\): 1 count of item for each unique combination of p:d; however, since there is missing data, the number of items for each combination varies (for this data set, either 4 or 5). Thus two potential calculations exist:

\[ L_{g} = \frac{\left(\sum_{v=1}^{5}1\right)^{2}}{\sum_{v=1}^{5}1^{2}} = 5 \]

\[ L_{g'} = \frac{\left(\sum_{v=1}^{4}1\right)^{2}}{\sum_{v=1}^{4}1^{2}} = 4 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{16}{\frac{1}{4}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{4}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{4}+\frac{1}{5}} = 4.78 \]

  4. Level of the σ²(i x d) term on facet of differentiation P:D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(p:d, ixd) = 1, v = \(\left|items_{g}\right|\): again, 1 count of item x d for each unique combination of p:d; however, since there is missing data, the number of items for each combination varies (for this data set, either 4 or 5). Thus two potential calculations exist:

\[ L_{g} = \frac{\left(\sum_{v=1}^{5}1\right)^{2}}{\sum_{v=1}^{5}1^{2}} = 5 \]

\[ L_{g'} = \frac{\left(\sum_{v=1}^{4}1\right)^{2}}{\sum_{v=1}^{4}1^{2}} = 4 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{16}{\frac{1}{4}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{4}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{4}+\frac{1}{5}} = 4.78 \]

  5. Level of the σ²(i x (p:d)) term on facet of differentiation P:D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(p:d, pi:d) = 1, v ∈ [1, \(\left|items_{g}\right|\)]: similarly, 1 count of items x (p:d) for each unique combination of p:d; however, since there is missing data, the number of items for each combination varies (for this data set, either 4 or 5). Thus two potential calculations exist:

\[ L_{g} = \frac{\left(\sum_{v=1}^{5}1\right)^{2}}{\sum_{v=1}^{5}1^{2}} = 5 \]

\[ L_{g'} = \frac{\left(\sum_{v=1}^{4}1\right)^{2}}{\sum_{v=1}^{4}1^{2}} = 4 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{16}{\frac{1}{4}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{4}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{5}+\frac{1}{4}+\frac{1}{5}} = 4.78 \]

  6. Level of the σ²(d) term on facet of differentiation I. The sum of counts squared and sum of squared counts can be determined as follows:

    C(i, d) = \(\left|p_{v}\right|\), for p = [8, 5, 3*] with v ∈ [1, 3]; g ∈ [1, 5], with * indicating a missing data point. Because of the missing data, the counts (and hence the levels) now differ across the items i ∈ [1, 5].

\[ L_{1} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(7+5+3\right)^{2}}{(7^{2}+5^{2}+3^{2})} = \frac{225}{83} = 2.71 \]

\[ L_{2} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(8+4+3\right)^{2}}{(8^{2}+4^{2}+3^{2})} = \frac{225}{89} = 2.53 \]

\[ L_{3} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(8+5+2\right)^{2}}{(8^{2}+5^{2}+2^{2})} = \frac{225}{93} = 2.42 \]

\[ L_{4} = L_{5} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(8+5+3\right)^{2}}{(8^{2}+5^{2}+3^{2})} = \frac{256}{98} = 2.612 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{5}{\frac{1}{2.71}+\frac{1}{2.53}+\frac{1}{2.42}+\frac{1}{2.61}+\frac{1}{2.61}} = 2.57 \]

  7. Level of the σ²(p:d) term on facet of differentiation I. The sum of counts squared and sum of squared counts can be determined as follows:

    C(i, p:d) = 1, for v ∈ [1, \(n_{i}\)], where \(n_{i}\) is the total number of counts for that i. For each unique i, there will be 1 count for each rating that i received.

\[ L_{1} = L_{2} = L_{3} = \frac{\left(\sum_{v=1}^{15}1\right)^{2}}{\sum_{v=1}^{15}1^{2}} = \frac{225}{15} = 15 \]

\[ L_{4} = L_{5} = \frac{\left(\sum_{v=1}^{16}1\right)^{2}}{\sum_{v=1}^{16}1^{2}} = \frac{256}{16} = 16 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{5}{\frac{1}{15}+\frac{1}{15}+\frac{1}{15}+\frac{1}{16}+\frac{1}{16}} = 15.38 \]

  8. Level of the σ²(i) term on facet of differentiation I. Since i is included in I, there is a single unique i for each I. L = 1

  9. Level of the σ²(i x d) term on facet of differentiation I. The sum of counts squared and sum of squared counts can be determined as follows:

    C(i, d) = \(\left|p_{v}\right|\), for p = [8, 5, 3*] with v ∈ [1, 3]; g ∈ [1, 5], with * indicating a missing data point, similar to 6. Because of the missing data, the counts (and hence the levels) differ across the items i ∈ [1, 5].

\[ L_{1} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(7+5+3\right)^{2}}{(7^{2}+5^{2}+3^{2})} = \frac{225}{83} = 2.71 \]

\[ L_{2} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(8+4+3\right)^{2}}{(8^{2}+4^{2}+3^{2})} = \frac{225}{89} = 2.53 \]

\[ L_{3} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(8+5+2\right)^{2}}{(8^{2}+5^{2}+2^{2})} = \frac{225}{93} = 2.42 \]

\[ L_{4} = L_{5} = \frac{\left(\sum_{v=1}^{3}p_{v}\right)^{2}}{\sum_{v=1}^{3}p_{v}^{2}} = \frac{\left(8+5+3\right)^{2}}{(8^{2}+5^{2}+3^{2})} = \frac{256}{98} = 2.612 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{5}{\frac{1}{2.71}+\frac{1}{2.53}+\frac{1}{2.42}+\frac{1}{2.61}+\frac{1}{2.61}} = 2.57 \]

  10. Level of the σ²(i x (p:d)) term on facet of differentiation I. The sum of counts squared and sum of squared counts can be determined as follows:

    C(i, pi:d) = 1, for v ∈ [1, \(n_{i}\)]. Similar to 7: for each unique i, there will be 1 count for each rating that i received.

\[ L_{1} = L_{2} = L_{3} = \frac{\left(\sum_{v=1}^{15}1\right)^{2}}{\sum_{v=1}^{15}1^{2}} = \frac{225}{15} = 15 \]

\[ L_{4} = L_{5} = \frac{\left(\sum_{v=1}^{16}1\right)^{2}}{\sum_{v=1}^{16}1^{2}} = \frac{256}{16} = 16 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{5}{\frac{1}{15}+\frac{1}{15}+\frac{1}{15}+\frac{1}{16}+\frac{1}{16}} = 15.38 \]

  11. Level of the σ²(d) term on facet of differentiation D. Since d is included in D, there is a single unique d for each D.

\[ L = 1 \]

  12. Level of the σ²(p:d) term on facet of differentiation D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(d, p:d) = \(i_{v}\), where \(i_{v}\) is the number of items answered by patient v, for [g = A, v ∈ [1, 8]; g = B, v ∈ [1, 5]; g = C, v ∈ [1, 3]]. Thus we have different levels of patients:doctors for each unique doctor.

\[ L_A = \frac{\left(\sum_{v=1}^{8}i_{v}\right)^{2}}{\sum_{v=1}^{8}i_{v}^{2}} = \frac{\left(4+5+5+5+5+5+5+5\right)^{2}}{(4^{2}+5^{2}+5^{2}+5^{2}+5^{2}+5^{2}+5^{2}+5^{2})} = 7.96 \]

\[ L_B = \frac{\left(\sum_{v=1}^{5}i_{v}\right)^{2}}{\sum_{v=1}^{5}i_{v}^{2}} = \frac{\left(4+5+5+5+5\right)^{2}}{(4^{2}+5^{2}+5^{2}+5^{2}+5^{2})} = 4.97 \]

\[ L_C = \frac{\left(\sum_{v=1}^{3}i_{v}\right)^{2}}{\sum_{v=1}^{3}i_{v}^{2}} = \frac{\left(4+5+5\right)^{2}}{(4^{2}+5^{2}+5^{2})} = 2.97 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{3}{\frac{1}{7.96}+\frac{1}{4.97}+\frac{1}{2.97}} = 4.52 \]

  13. Level of the σ²(i) term on facet of differentiation D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(d, i) = \(\left|g_{v}\right|\), for g = [8, 5, 3] and v = 5. There are g counts of each item [1, 5]. For example, there are 7 counts of item 1 for doctor A because 7 patients returned surveys including item 1 for doctor A.

\[ L_A = \frac{\left(\sum_{v=1}^{5}g_{v} \right)^{2}}{\sum_{v=1}^{5}g_{v} ^{2}} = \frac{\left(7+8+8+8+8\right)^{2}}{(7^{2}+8^{2}+8^{2}+8^{2}+8^{2})} = 4.99 \]

\[ L_B = \frac{\left(\sum_{v=1}^{5}g_{v}\right)^{2}}{\sum_{v=1}^{5}g_{v}^{2}} = \frac{\left(5+4+5+5+5\right)^{2}}{(5^{2}+4^{2}+5^{2}+5^{2}+5^{2})} = 4.97 \]

\[ L_C = \frac{\left(\sum_{v=1}^{5}g_{v}\right)^{2}}{\sum_{v=1}^{5}g_{v}^{2}} = \frac{\left(3+3+2+3+3\right)^{2}}{(3^{2}+3^{2}+2^{2}+3^{2}+3^{2})} = 4.9 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{3}{\frac{1}{4.99}+\frac{1}{4.97}+\frac{1}{4.90}} = 4.95 \]

  14. Level of the σ²(i x d) term on facet of differentiation D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(d, ixd) = \(\left|g_{v}\right|\) , for g=[8, 5, 3] and v = 5. There are g counts of each item [1, 5]. This is identical to the uncrossed items, C(d, i).

\[ L_A = \frac{\left(\sum_{v=1}^{5}g_{v} \right)^{2}}{\sum_{v=1}^{5}g_{v} ^{2}} = \frac{\left(7+8+8+8+8\right)^{2}}{(7^{2}+8^{2}+8^{2}+8^{2}+8^{2})} = 4.99 \]

\[ L_B = \frac{\left(\sum_{v=1}^{5}g_{v}\right)^{2}}{\sum_{v=1}^{5}g_{v}^{2}} = \frac{\left(5+4+5+5+5\right)^{2}}{(5^{2}+4^{2}+5^{2}+5^{2}+5^{2})} = 4.97 \]

\[ L_C = \frac{\left(\sum_{v=1}^{5}g_{v}\right)^{2}}{\sum_{v=1}^{5}g_{v}^{2}} = \frac{\left(3+3+2+3+3\right)^{2}}{(3^{2}+3^{2}+2^{2}+3^{2}+3^{2})} = 4.9 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{3}{\frac{1}{4.99}+\frac{1}{4.97}+\frac{1}{4.90}} = 4.95 \]

  15. Level of the σ²(i x (p:d)) term on facet of differentiation D. The sum of counts squared and sum of squared counts can be determined as follows:

    C(d, i x (p:d)) = 1, for v ∈ [1, \(n_{g}\)] with g = [A, B, C], where \(n_{g}\) is the total number of ratings for doctor g (39, 24, 14). For each doctor, there is a single unique count of items crossed with the nested patient:doctor. Thus, to obtain the level for each unique facet of differentiation, g, we sum over v, the total count of ratings within each doctor.

\[ L_A = \frac{\left(\sum_{v=1}^{39}1\right)^{2}}{\sum_{v=1}^{39}1^{2}} = \frac{1521}{39} = 39 \]

\[ L_B = \frac{\left(\sum_{v=1}^{24}1\right)^{2}}{\sum_{v=1}^{24}1^{2}} = \frac{576}{24} = 24 \]

\[ L_C = \frac{\left(\sum_{v=1}^{14}1\right)^{2}}{\sum_{v=1}^{14}1^{2}} = \frac{196}{14} = 14 \]

\[ L = \frac{\left|G\right|}{\sum_{g=1}^{G}\frac{1}{L_{g}}} = \frac{3}{\frac{1}{39}+\frac{1}{24}+\frac{1}{14}} = 21.62 \]

Table 28. Denominators for each effect when the facet of differentiation changes when calculating G and D

Variance Term P:D Facet of Differentiation I Facet of Differentiation D Facet of Differentiation
σ²(d) 1.000 2.573 1.000
σ²(p:d) 1.000 15.385 4.520
σ²(i) 4.776 1.000 4.951
σ²(i × d) 4.776 2.573 4.951
σ²(i × (p:d)) 4.776 15.385 21.624
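The harmonic-mean pattern holds here as well; for example, the 4.776 level of σ²(i) on P:D combines the thirteen patients who answered 5 items with the three who answered 4:

    per_patient = [5] * 13 + [4] * 3
    print(len(per_patient) / sum(1 / n for n in per_patient))  # 4.776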

These values can now be used to calculate the G and D coefficients. We will complete the one for P:D (Table 29). Note that the values for G and D are slightly lower (0.700 and 0.675, respectively) than those with no missing data (0.752 and 0.718, respectively), which is to be expected.

Table 29. Calculating G and D Coefficients for P:D Narayanan et al. (2010) missing data

Generalizability Coefficient Facet of Differentiation = P:D
Generalizability (G) \(E\rho^2 = \tau/(\tau + \delta)\)

\(\tau = \text{VC(P:D)} + \text{VC(D)}\)
\(\tau = 0.083 + 0.001 = 0.084\)

\(\delta = \text{VC(PI:D)}/\text{Level}_{P:D}(\text{PI:D})\)
\(\delta = 0.170/4.776 = 0.036\)

\(E\rho^2 = 0.084/(0.084+0.036)\)
\(E\rho^2 = \mathbf{0.700}\)
Dependability (D) \(\Phi = \tau/(\tau + \Delta)\)

\(\Delta = (0.021/4.776) + (0/4.776) + (0.170/4.776) = 0.040\)

\(\Phi = 0.084/(0.084 + 0.040)\)
\(\Phi = \mathbf{0.675}\)

Limitations of this Procedure

  1. All effects are assumed to be random. This is a very difficult issue to get around when data are unbalanced/have missing data points.
  2. Sample sizes are uncorrelated with effects.
  3. All effects (except the grand mean) are uncorrelated with each other, have means of zero, and have finite variances.

While the only design that has been examined in great detail in this tutorial is the \(i \times (p:d)\) design, all other designs (crossed, nested, etc. – perhaps using many more terms) are handled by the same machinery, with tests passing for many different synthetic data sets (see the GeneralizIT GitHub repository). The only limiting factor for the automated calculation of generalizability coefficients for any study design is the data analyst’s ability to properly specify the linear random-effects relationships present in a given study. For example, in the \(i \times (p:d)\) design:

\[X = \mu + \nu_{i} + \nu_{d} + \nu_{p:d} + \nu_{id} + \nu_{pi:d}, \qquad \sigma^{2}(X) = \sigma_{i}^{2} + \sigma_{d}^{2} + \sigma_{p:d}^{2} + \sigma_{id}^{2} + \sigma_{pi:d}^{2}\]

where each \(\nu\) is a random effect with mean zero, so that the total variance decomposes into the five variance components.

Conclusion

A generalizability program that: 1) allows for missing data; 2) allows for unbalanced data; 3) provides both generalizability and dependability coefficients; and 4) provides D-Study values for designs with no missing data is a very useful addition to the toolbox of researchers. There is currently no freely available program that does all of these things.

References

Bloch, R. & Norman, G. (2023). G String V User Manual. Hamilton, Ontario, Canada.

Bloch, R. & Norman, G. (2012). Generalizability theory for the perplexed: A practical introduction and guide: AMEE Guide No. 68. Medical Teacher, 34 (11), 960-992. DOI: 10.3109/0142159X.2012.703791

Brennan, R. L. (2001a). Generalizability Theory. New York: Springer.

Brennan, R. L. (2001b). Manual for urGENOVA (Version 2.1) (Iowa Testing Programs Occasional Paper Number 49). Iowa City, IA: Iowa Testing Programs, University of Iowa.

Briesch, A.M., Swaminathan, H., Welsh, M., & Chafouleas, S.M. (2014). Generalizability theory: A practical guide to study design, implementation, and interpretation. Journal of School Psychology, 52(1), 13-35.

Henderson, C.R. (1953). Estimation of variance and covariance components. Biometrics, 9(2), 226-252.

Narayanan, A., Greco, M., & Campbell, J.L. (2010). Generalisability in unbalanced, uncrossed and fully nested studies. Medical Education, 44(4), 367-387.

Water Quality Division (2010). Procedures to Implement the Texas Surface-Water-Quality Standards: TCEQ RG–194, p. 81.

Appendix A: Harmonic Mean Discussion

The harmonic mean is an average. It is calculated by:

  1. Taking the sum of the reciprocals of each value in a data series
  2. Dividing the sum by the number of values in the data series (that is the averaging part)
  3. Taking the reciprocal of that number

Equivalently: dividing the number of values in the data series by the sum of the reciprocals of each value.
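Both routes give the same number. A quick check in Python, using the speeds from the example that follows:

    from statistics import harmonic_mean

    speeds = [60, 20]  # km/hr out and back, as in the example below
    print(len(speeds) / sum(1 / s for s in speeds))  # 30.0
    print(harmonic_mean(speeds))                     # 30.0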

Harmonic means are most useful when you are dealing with a set of values based on rates and ratios (i.e., working with reciprocal relationships). Ratios (such as speed (km/hr); Price/Earnings ratios; and Nested variables #items/form, #patients/doctor) are such instances. Harmonic means are thus useful in unbalanced data designs where there are non-equivalent numbers of items nested within another factor (e.g., i:h or p:h). Clearly, these are ratios.

A simple example differentiates the arithmetic and harmonic means: What is the average speed of a vehicle that drives out 60 km at 60 km/hr and back the same distance at 20 km/hr?

The arithmetic mean would be (60+20)/2 = 80/2 = 40 km/hr. However, this is NOT the average speed travelled over a fixed distance (i.e., “there and back”). The problem is that after one hour on the way back, the driver would have covered only 1/3 of the distance, since they are going 20 km/hr. To cover the whole distance back, they would have to drive 3 hours. We CAN mentally work this through and take the weighted arithmetic mean to get at the correct average speed: (60+20+20+20)/4 = 120/4 = 30 km/hr is the average speed for the fixed distance when travelling 60 km/hr out and 20 km/hr back.

While this worked with a simple example, a much more elegant way to do this with many numbers that don’t evenly work out for weighting, is to take the reciprocal of the values (1/60 and 1/20), then average them:

\[\frac{1/60 + 1/20}{2} = \frac{0.016667 + 0.05}{2} = \frac{0.066667}{2} = 0.033333\]

Then to “get back” to our original units of km/hr, we take the reciprocal of this value:

\[\frac{1}{0.033333} = 29.99999 \approx 30\]

We could also skip a step and when we get to 0.066667/2, simply take the reciprocal of it:

\[\frac{2}{0.0666667} = 29.999999 \approx 30\]

In looking at the reciprocal values, you can see that the larger the value to start with, the smaller its reciprocal value. By taking the average of the reciprocals, you are equally weighting each value before averaging them. This dampens the effect of larger values in the data set that is exerted with the arithmetic mean.

For example, consider Brennan’s (2001a) data set (Table 7.5, p. 224) with 3 h’s (let’s refer to them as Forms), where the Forms contain 2, 4, and 2 items respectively. So, in effect, we have three ratio values: 2 items/1 form, 4 items/1 form, 2 items/1 form.

The arithmetic mean of these values is: \[\frac{2+4+2}{3} = \frac{8}{3} = 2.667\]

The harmonic mean of these values is: \[\frac{3}{(1/2 + 1/4 + 1/2)} = \frac{3}{(0.5+0.25+0.5)} = \frac{3}{1.25} = 2.4\]

Why would we want the arithmetic versus the harmonic means?

The arithmetic mean (2.667) tells us that there are, on average, 2.667 items for each form in this data set. There are 8 persons (p) who take all the items in all the forms. If we take 2.667 and multiply it by the number of persons (8), we get 21.336 “counts” estimated for each form. If we multiply this value by the number of forms (3), we get 64 – the number of counts in the entire data set FOR PERSONS. In addition, each person (person 1 has 2.667 items for Form 1, 2.667 items for Form 2, and 2.667 items for Form 3) contributes 8 counts (2.667 × 3). This symmetry is useful to know, and it works when there are no missing data.

The harmonic mean “weights” the forms equally so that the effect of forms “assumes” they all have the same number of items in them (are pseudo-balanced). For example, Form 2 has 4 items in it; if we weight Forms 1 and 3 by doubling their number of items, then we can calculate the arithmetic average of:

\[ \frac{Form_{h1}(2+2) + Form_{h2}(4) + Form_{h3}(2+2)}{5 \text{ values in the data set}} = \frac{12}{5} = 2.4 \]

This equally weights the contribution of each of the forms when using the term in the denominators of the Generalizability equations.