Effect Size Transformation

Image credit: David Silverman/Getty Images

We are currently working on a meta-analysis (link). The effect size of main interest is the one for the interaction between two or three conditions. Unfortunately, many older papers do not report effect sizes, so we had to compute them from the information included in the papers.

In this post, we (1) discuss how to obtain the effect size of the interaction between two conditions and (2) propose a transformation from partial eta squared to Cohen’s d, which finally lets us calculate Hedges’ g.

First, we simulated a dataset for a 2 X 2 within-subject design:

# load the packages used throughout this post
library(tidyverse)

# simulate data
set.seed(42)
nSubj <- 20

# compound-symmetric correlation matrix for the four cells
C <- matrix(0.5, nrow = 4, ncol = 4)
diag(C) <- 1

# simulate data for the four cells of the 2 X 2 design (SD = 4 per cell)
simu_dv <- mvtnorm::rmvnorm(nSubj, mean = c(10,12,14,16), sigma = C*4^2)
colnames(simu_dv) <- c("A1_B1", "A2_B1", "A1_B2", "A2_B2")

cor(simu_dv)
##           A1_B1     A2_B1     A1_B2     A2_B2
## A1_B1 1.0000000 0.5642990 0.6528123 0.3355439
## A2_B1 0.5642990 1.0000000 0.6959328 0.4346924
## A1_B2 0.6528123 0.6959328 1.0000000 0.5132476
## A2_B2 0.3355439 0.4346924 0.5132476 1.0000000
simu_long <- simu_dv %>% 
  as_tibble() %>% 
  mutate(Subject = 1:nSubj) %>% 
  pivot_longer(contains("_"),
               names_to = c("IV1", "IV2"),
               names_sep = "_",
               names_transform = list(
                 IV1 = ~ readr::parse_factor(.x, levels = c("A1", "A2")),
                 IV2 = ~ readr::parse_factor(.x, levels = c("B1", "B2"))),
               values_to = "DV")

head(simu_long)
## # A tibble: 6 x 4
##   Subject IV1   IV2      DV
##     <int> <fct> <fct> <dbl>
## 1       1 A1    B1     15.5
## 2       1 A2    B1     12.0
## 3       1 A1    B2     16.6
## 4       1 A2    B2     19.4
## 5       2 A1    B1     12.6
## 6       2 A2    B1     13.2

The dataset has twenty participants. Each participant completed all four combinations of IV1 (2: A1 vs. A2) and IV2 (2: B1 vs. B2), giving eighty data points in total.
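As a quick sanity check on the long format (purely illustrative):

# 20 subjects x 2 (IV1) x 2 (IV2) = 80 rows, 20 observations per cell
nrow(simu_long)
table(simu_long$IV1, simu_long$IV2)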

Analyzing Data: ANOVA

To analyze data from a 2 X 2 within-subject design, we usually use ANOVA. Here we used the ‘afex’ package (Singmann et al., 2021):

library(afex)
ANOVAresults <- aov_4(DV ~ IV1*IV2 + (IV1*IV2|Subject), data = simu_long)
nice(ANOVAresults, es = NULL, sig_symbols = rep("", 4))
## Anova Table (Type 3 tests)
## 
## Response: DV
##    Effect    df  MSE     F p.value
## 1     IV1 1, 19 9.35  8.53    .009
## 2     IV2 1, 19 8.51 35.78   <.001
## 3 IV1:IV2 1, 19 8.27  5.93    .025

Effect size: partial eta-squared.

One of the most common measures of effect size in psychology is partial eta squared. Its formula is relatively simple:

\[ \begin{aligned} \eta_{p}^2 &= \frac{\mathrm{SS}_\mathrm{effect}}{\mathrm{SS}_\mathrm{effect}+\mathrm{SS}_\mathrm{error}}\\ \end{aligned} \]

Most of the time, psychological papers do not report SS values, and we do not have access to the raw data either. Fortunately, we can calculate partial eta squared from the F-value and the degrees of freedom (Lakens, 2013).

\[ \begin{aligned} \eta_{p}^2 &= \frac{F \cdot \mathrm{df}_\mathrm{num}}{F \cdot \mathrm{df}_\mathrm{num}+\mathrm{df}_\mathrm{den}}\\ \mathrm{df}_\mathrm{num} &= (k_\mathrm{IV1}-1)(k_\mathrm{IV2}-1)\\ \mathrm{df}_\mathrm{den} &= (N-1)(k_\mathrm{IV1}-1)(k_\mathrm{IV2}-1)\\ \end{aligned} \]

where F is the F-value, k_IV1 and k_IV2 are the numbers of levels of IV1 (i.e., 2) and IV2 (i.e., 2), and N is the number of participants (i.e., 20). The expression for df_den applies to a fully within-subject design, which is why the error df in our ANOVA table is (20 − 1) × 1 × 1 = 19.
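In R this is a one-liner; a small reusable helper (the name pes_from_F is ours, purely for illustration) might look like:

# partial eta squared from an F value and its numerator/denominator dfs
pes_from_F <- function(Fval, df_num, df_den) {
  Fval * df_num / (Fval * df_num + df_den)
}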

Following this equation, the partial eta squared of the interaction between IV1 and IV2 is:

n2p = 5.93*1/(5.93*1+19)
n2p
## [1] 0.237866

which matches the partial eta squared that afex reports directly:

nice(ANOVAresults, es = 'pes', sig_symbols = rep("", 4))
## Anova Table (Type 3 tests)
## 
## Response: DV
##    Effect    df  MSE     F  pes p.value
## 1     IV1 1, 19 9.35  8.53 .310    .009
## 2     IV2 1, 19 8.51 35.78 .653   <.001
## 3 IV1:IV2 1, 19 8.27  5.93 .238    .025

Effect size transformation

To conduct a meta-analysis, it is necessary to combine effect sizes from different studies that often used different metrics. To do so, we can standardize the effect sizes by transforming them into Hedges’ g (Hedges & Olkin, 1985).

Can we convert partial eta-squared to eta-squared?

First, we could try to transform partial eta squared into eta squared, then transform eta squared into Cohen’s d, and finally transform Cohen’s d into Hedges’ g using the ‘esc’ R package (Lüdecke, 2019).

We know the following (Pearson, 1911):

\[ \begin{aligned} \eta_{p}^2 &= \eta^2 \end{aligned} \]

Note that this equality holds only for simple designs (e.g., a one-way ANOVA). In our case, we are interested in transforming the effect size for the IV1 X IV2 interaction. The formula for eta squared is (e.g., Cohen, 1973, 1988; Fisher, 1925, 1973):

\[ \begin{aligned} \eta^2 &= \frac{\mathrm{SS}_\mathrm{effect}}{\mathrm{SS}_\mathrm{total}}\\ \end{aligned} \]

Alternatively, Kennedy (1970) proposed:

\[ \begin{aligned} \eta^2 &= \frac{\mathrm{n}_\mathrm{1}F}{\mathrm{n}_\mathrm{1}F + \mathrm{n}_\mathrm{2}}\\ \end{aligned} \]

where n1 and n2 stand for the numerator (effect) and denominator (error) degrees of freedom, respectively.

This is equivalent to:

\[ \begin{aligned} \eta^2 &= \frac{\mathrm{SS}_\mathrm{effect}}{\mathrm{SS}_\mathrm{effect}+\mathrm{SS}_\mathrm{error}}\\ \end{aligned} \]

However, Cohen (1973) pointed out that this formula represents partial eta squared, not the actual eta squared.

Taken together, we do not see a way to convert partial eta squared to eta squared given the information usually provided in a paper.
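As a side note, when the raw data are available, both indices can be computed directly. A minimal sketch with the ‘effectsize’ package (assuming a version that accepts afex objects) is:

# compare partial and non-partial eta squared for the simulated ANOVA
effectsize::eta_squared(ANOVAresults, partial = TRUE)   # should match the 'pes' column above
effectsize::eta_squared(ANOVAresults, partial = FALSE)  # eta squared (SS_effect / SS_total)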

Now, we are going to consider an alternative approach.

Relationship between partial eta-squared and Cohen’s d

Let’s start from the following relationship between partial eta squared and Cohen’s d:

\[ \begin{aligned} \eta_{p}^2 &= \frac{d^2 N}{d^2 N + N - 1}\\ \end{aligned} \]

After rearranging (see: Haiyang’s Blog), we can solve for Cohen’s d.
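The rearrangement itself takes only a few steps:

\[ \begin{aligned} \eta_{p}^2 (d^2 N + N - 1) &= d^2 N \\ d^2 N (1 - \eta_{p}^2) &= \eta_{p}^2 (N - 1) \\ d^2 &= \frac{N - 1}{N} \cdot \frac{\eta_{p}^2}{1 - \eta_{p}^2} \\ \end{aligned} \]

Taking the square root yields: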

\[ \begin{aligned} d = \sqrt{\frac{N - 1}{N} \cdot \frac{\eta_{p}^2}{1-\eta_{p}^2}}\\ \end{aligned} \]

We now plug in the values obtained from the ANOVA on the simulated dataset.

nice(ANOVAresults, es = 'pes', sig_symbols = rep("", 4))
## Anova Table (Type 3 tests)
## 
## Response: DV
##    Effect    df  MSE     F  pes p.value
## 1     IV1 1, 19 9.35  8.53 .310    .009
## 2     IV2 1, 19 8.51 35.78 .653   <.001
## 3 IV1:IV2 1, 19 8.27  5.93 .238    .025
CohenD <- sqrt((nSubj-1)/nSubj*.238/(1-.238))
CohenD 
## [1] 0.5447193

Cohen’s d is equal to 0.5447193. Let’s check if this calculation is correct.

\[ \begin{aligned} t &= d \cdot \sqrt{N} \\ \end{aligned} \]

tvalue <- CohenD *sqrt(nSubj)
tvalue
## [1] 2.436059

\[ \begin{aligned} \ F &= \ t^2\\ \end{aligned} \]

Fvalue <- tvalue*tvalue
Fvalue
## [1] 5.934383
nice(ANOVAresults, es = 'pes', sig_symbols = rep("", 4))
## Anova Table (Type 3 tests)
## 
## Response: DV
##    Effect    df  MSE     F  pes p.value
## 1     IV1 1, 19 9.35  8.53 .310    .009
## 2     IV2 1, 19 8.51 35.78 .653   <.001
## 3 IV1:IV2 1, 19 8.27  5.93 .238    .025

The F-value matches the one in the ANOVA table (up to rounding)!

Converting Cohen’s d to Hedges’ g.

To convert d to Hedges’ g we use a correction factor called J. Hedges and Olkin (1985) give the exact formula for J, but in common practice researchers use an approximation (Borenstein et al., 2009, p. 27):

\[ \begin{aligned} J = 1-\frac{3}{4(N-1)-1} \\ \end{aligned} \]

The formula to calculate the Hedges’ g is:

\[ \begin{aligned} g = J \cdot d \\ \end{aligned} \]

Note. An alternative correction is discussed by McGrath & Meyer (2006):

\[ \begin{aligned} g = d \cdot \frac{N - 3}{N-2.25} \cdot \sqrt{\frac{N - 2}{N}} \\ \end{aligned} \]
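For reference, here is the corresponding hand calculation for our interaction effect (a sketch; the variable names are ours, and the ‘esc’ package used below applies its own small-sample correction, so the values can differ slightly):

# approximate correction factor J, with df = N - 1 = 19
J <- 1 - 3/(4*(nSubj - 1) - 1)
g_manual <- J * CohenD
# alternative correction from McGrath & Meyer (2006)
g_alt <- CohenD * ((nSubj - 3)/(nSubj - 2.25)) * sqrt((nSubj - 2)/nSubj)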

In our work, we used the ‘esc’ package to transform Cohen’s d to Hedges’ g.

library('esc')
## 
## Attaching package: 'esc'
## The following objects are masked from 'package:effectsize':
## 
##     cohens_d, cohens_f, eta_squared, hedges_g
HedgesG <- hedges_g(CohenD, totaln = 20)
HedgesG 
## [1] 0.521703

Confidence intervals (CIs) for Hedges’ g

Hedges and Olkin (2014, p. 86) provided a formula for estimating the CI of an effect size:

\[ \begin{aligned} \sigma(g) = \sqrt{\frac{N_{1} + N_{2}}{N_{1} N_{2}} + \frac{g^2}{2(N_{1} + N_{2})}}\\ \end{aligned} \]

so the 95% CI for Hedges’ g is [g − 1.96 × σ(g), g + 1.96 × σ(g)].
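A direct translation of this between-subjects formula into R (the function name is ours, shown only for reference) would be:

# SE of Hedges' g for a between-subjects comparison of two groups of size n1 and n2
se_g_between <- function(g, n1, n2) {
  sqrt((n1 + n2)/(n1 * n2) + g^2/(2*(n1 + n2)))
}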

However, this formula refers to a between-subject design. In our case, we want to calculate the CIs for a within-subject design.

Here we use part of the ‘effectsize’ package source code to calculate the CIs (many thanks, Mattan).

The CIs are constructed using the non-centrality parameter (ncp) method (see Howell, 2011). The idea is to find two ncps: one giving a t-distribution whose alpha/2 percentile is the sample t, and one whose 1 − alpha/2 percentile is the sample t. These ncps are then transformed back to Cohen’s d (as d and t are transformations of one another).

# Let's set-up the non-central-parameter (ncp) function
.get_ncp_t <- function(t, df_error, conf.level = 0.95) {
    alpha <- 1 - conf.level
    probs <- c(alpha / 2, 1 - alpha / 2)
    
    ncp <- suppressWarnings(optim(
      par = 1.1 * rep(t, 2),
      fn = function(x) {
        p <- pt(q = t, df = df_error, ncp = x)
        
        abs(max(p) - probs[2]) +
          abs(min(p) - probs[1])
      },
      control = list(abstol = 1e-09)
    ))
    t_ncp <- unname(sort(ncp$par))
    
    return(t_ncp)
}  

# We already know the following:
t <- tvalue
n <- nSubj
df <- n - 1
ci <- 0.95
hn <- 1 / (n - 1)

# Now we can calculate the CIs 
ts <- .get_ncp_t(t,  df, ci)

CI_low <- ts[1] * sqrt(hn)
CI_high <- ts[2] * sqrt(hn)

CohenD 
## [1] 0.5447193
CI_low
## [1] 0.06948177
CI_high
## [1] 1.035564
hedges_g(c(CohenD, CI_low, CI_high), totaln = nSubj)
## [1] 0.52170303 0.06654592 0.99180783

To sum up, Cohen’s d = 0.545, 95% CI [0.07, 1.04], and Hedges’ g = 0.522, 95% CI [0.07, 0.99].

Note. For a within-subject design, the CIs are computed as if the data came from a one-sample design. This is not a perfect solution (see Fitts, 2020), but it was the only one available at the time of writing this post.

Standard Error (SE) for Hedges’ g

Borenstein et al. (2009, pp. 27-30) provided a formula for estimating the SE of Hedges’ g:

\[ \begin{aligned} V_{d} &= \left(\frac{1}{n} + \frac{d^2}{2n}\right) 2(1-r) \\ J &= 1-\frac{3}{4(N-1)-1} \\ V_{g} &= J^2 \cdot V_{d} \\ SE_{g} &= \sqrt{V_{g}} \\ \end{aligned} \]

For r we can use the following formula (Borenstein et al., 2009, p. 48), which is based on converting Cohen’s d back to a correlation:

\[ \begin{aligned} r = \frac{d}{\sqrt{d^2+a}} \\ \end{aligned} \]

The correction factor a in our case (i.e., a within-subject design) is a constant:

\[ \begin{aligned} a = 4 \\ \end{aligned} \]
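With a = 4 this conversion is easy to do by hand, and it should agree with d_to_r() from ‘effectsize’ (which assumes equal group sizes by default); a one-line sketch:

# manual d-to-r conversion with a = 4
r_manual <- CohenD / sqrt(CohenD^2 + 4)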

# the 'effectsize' package provides d_to_r(), which implements the equation above
r <- d_to_r(CohenD)
# approximate correction factor J
J <- 1 - 3/(4*(nSubj-1)-1)
n <- 20 # n is the number of pairs 
# variance of d (note: 2*r is used here in place of the 2*(1-r) term above)
vd <- (1/n)+((CohenD^2)/(2*n))*(2*(r))
vg <- vd*J^2
SEG <- sqrt(vg)
SEG
## [1] 0.2228745

Let’s check whether our calculation is correct. For simplicity, we will use the effect size of the main effect of IV1.

DT <- aggregate(x = simu_long[c("DV")],
                by = simu_long[c("IV1", "Subject")],
                FUN = mean)

t.test(DT$DV ~ DT$IV1, paired = TRUE)
## 
##  Paired t-test
## 
## data:  DT$DV by DT$IV1
## t = -2.9199, df = 19, p-value = 0.008785
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.4270899 -0.5653032
## sample estimates:
## mean of the differences 
##               -1.996197
A1 <- DT[DT$IV1 == "A1", ]
A2 <- DT[DT$IV1 == "A2", ]

# Cohen's d: Borenstein et al., 2009 (eq. 4.26, p. 29)
diff <- mean(A1$DV - A2$DV)
sd_diff <- sd(A1$DV - A2$DV)
CohensD <- diff/sd_diff

# correlation between pairs of observations
cor.test(A1$DV, A2$DV)  
## 
##  Pearson's product-moment correlation
## 
## data:  A1$DV and A2$DV
## t = 4.0656, df = 18, p-value = 0.0007256
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.3594056 0.8684959
## sample estimates:
##       cor 
## 0.6918858
r <- 0.6918858
# variance of d: Borenstein et al., 2009 (eq. 4.26, p. 29); here (1 - r) is plugged in directly
vd <- (1/n)+((CohenD^2)/(2*n))*(2*(1-0.6918858))
vg <- vd*J^2
SEG <- sqrt(vg)
SEG
## [1] 0.2242605

Broadly speaking, we get similar values!

Example

Let’s consider another example. We found a paper reporting a two-way interaction between ConditionA (2: level1 vs. level2) and ConditionB (2: level1 vs. level2) on accuracy that was significant for Caucasian participants (n = 21, F(1,20) = 18.436, MS = .056, p < .001) but not for Asian participants (n = 21, F(1,20) = .073, MS = .001, p > .10).

nSubjC <- 21 # Number of Caucasian participants
nSubjA <- 21 # Number of Asian participants

# Partial eta squared from the F-values and dfs
n2pA <- .073*1/(.073*1+20)
n2pC <- 18.436*1/(18.436*1+20) 

# Cohen's d: Asian participants (note: uses the rounded partial eta squared of .004)
CohenD_A <- sqrt((nSubjA -1)/nSubjA *0.004/(1-0.004))
CohenD_A
## [1] 0.06184515
# Cohen's d: Caucasian participants
CohenD_C <- sqrt((nSubjC-1)/nSubjC*n2pC/(1-n2pC))
CohenD_C
## [1] 0.9369657
#----------------------------------------------
# Confidence Intervals
#----------------------------------------------  

# Function
.get_ncp_t <- function(t, df_error, conf.level = 0.95) {
    alpha <- 1 - conf.level
    probs <- c(alpha / 2, 1 - alpha / 2)
    
    ncp <- suppressWarnings(optim(
      par = 1.1 * rep(t, 2),
      fn = function(x) {
        p <- pt(q = t, df = df_error, ncp = x)
        
        abs(max(p) - probs[2]) +
          abs(min(p) - probs[1])
      },
      control = list(abstol = 1e-09)
    ))
    t_ncp <- unname(sort(ncp$par))
    
    return(t_ncp)
}  

# g & CIs for Asian participants:

# Basic quantities
n <- nSubjA
df <- n - 1
ci <- 0.95
hn <- 1 / (n - 1)
t <- sqrt(.073)  # t = sqrt(F)
    
ts <- .get_ncp_t(t,  df, ci)

CI_lowA <- ts[1] * sqrt(hn)
CI_highA <- ts[2] * sqrt(hn)
CI_lowA
## [1] -0.3789895
CI_highA
## [1] 0.4983223
gA <- hedges_g(c(CohenD_A, CI_lowA,CI_highA), totaln = nSubjA)
gA
## [1]  0.05937135 -0.36382993  0.47838939
# g & CIs for Caucasian participants:
n <- nSubjC
t <- sqrt(18.436)  # t = sqrt(F); df, ci, and hn are unchanged since nSubjC = nSubjA
    
ts <- .get_ncp_t(t,  df, ci)

CI_lowC <- ts[1] * sqrt(hn)
CI_highC <- ts[2] * sqrt(hn)
CI_lowC
## [1] 0.4232165
CI_highC
## [1] 1.480449
gC<- hedges_g(c(CohenD_C,CI_lowC,CI_highC), totaln = nSubjC)
gC
## [1] 0.8994871 0.4062878 1.4212309
#----------------------------------------------
# Standard Error
#----------------------------------------------  
r <- d_to_r(CohenD_A)
# correction factor J (note: uses nSubj = 20 from the simulated example above)
J <- 1 - 3/(4*(nSubj-1)-1)
n <- 21 # n is the number of pairs 
# variance of d (as before, 2*r is used in place of 2*(1-r))
vd <- (1/n)+((CohenD_A^2)/(2*n))*(2*(r)) 
vg <- vd*J^2
SEG_A <- sqrt(vg)
SEG_A
## [1] 0.2095016
r <- d_to_r(CohenD_C)
# correction factor J (again using nSubj = 20)
J <- 1 - 3/(4*(nSubj-1)-1)
n <- 21 # n is the number of pairs 
# variance of d
vd <- (1/n)+((CohenD_C^2)/(2*n))*(2*(r)) 
vg <- vd*J^2
SEG_C <- sqrt(vg)
SEG_C
## [1] 0.2454189

The Hedges’ g for the interaction between ConditionA and ConditionB was 0.059 (95% CI [-.36, .48], SE = .21) for Asian participants and 0.899 (95% CI [.41, 1.42], SE = .25) for Caucasian participants.

Function

To make things more accessible, Haiyang developed two functions for converting between partial eta squared and Cohen’s d. 

# load the functions
# source: https://gist.github.com/HaiyangJin/3334e4d6588cbfe36b69c1bf2540c2ea
# set up example values (or source the gist and use its examples)
nSubj <- 20
Eta2_partial <- 0.384532
Cohens_d <- 0.770416
# convert partial eta squared to Cohen's d
pes2d(Eta2_partial, nSubj) 
## [1] 0.770416
# convert Cohen's d to partial eta square
d2pes(Cohens_d, nSubj) 
## [1] 0.384532
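In case the gist is unavailable, minimal versions based on the formulas in this post might look like the following (the actual gist functions may differ in argument names or input checks):

# partial eta squared -> Cohen's d (within-subject formula from above)
pes2d <- function(pes, N) sqrt((N - 1)/N * pes/(1 - pes))
# Cohen's d -> partial eta squared (the inverse relationship)
d2pes <- function(d, N) d^2 * N/(d^2 * N + N - 1)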

Some words at the end

We cannot guarantee that these conversions are applicable in all situations, but at least they work well for cases similar to the simulated data.

Additionally, we would really appreciate it if you could let us know about any errors in this post.

References

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. Padstow: John Wiley & Sons.

Cohen, J. (1973). Eta-squared and partial eta-squared in fixed factor ANOVA designs. Educational and Psychological Measurement, 33(1), 107-112.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Fitts, D. A. (2020). Commentary on “A review of effect sizes and their confidence intervals, Part I: The Cohen’s d family”: The degrees of freedom for paired samples designs. The Quantitative Methods for Psychology, 16, 1-35.

Fisher, R.A. (1925). Statistical methods for research workers. Edinburgh: Oliver & Boyd.

Fisher, R. A. (1973). Statistical methods for research workers (14th ed.). New York: Hafner.

Hedges, L., & Olkin, I. (1985). Statistical Methods for Meta-analysis. San Diego, CA: Academic Press.

Hedges, L. V., & Olkin, I. (2014). Statistical methods for meta-analysis. Orlando: Academic Press.

Howell, D. C. (2011). Confidence intervals on effect size. Retrieved from http://www.uvm.edu/~dhowell/methods8/Supplements/Confidence%20Intervals%20on%20Effect%20Size.pdf

Kennedy, J. J. (1970). The eta coefficient in complex ANOVA designs. Educational and Psychological Measurement, 30(4), 885-889.

Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. doi: 10.3389/fpsyg.2013.00863

McGrath, R. E., & Meyer, G. J. (2006). When effect sizes disagree: the case of r and d. Psychological Methods, 11(4), 386-401. doi:10.1037/1082-989X.11.4.386

Pearson, K. (1911). On a correction needful in the case of the correlation ratio. Biometrika, 8, 254–256.

Packages:

Ben-Shachar, M., Lüdecke, D., & Makowski, D. (2020). effectsize: Estimation of effect size indices and standardized parameters. Journal of Open Source Software, 5(56), 2815. doi: 10.21105/joss.02815

Wickham et al., (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686

Lüdecke, D. (2019). esc: Effect Size Computation for Meta Analysis (Version 0.5.1). doi: 10.5281/zenodo.1249218

Singmann, H., Bolker, B., Westfall, J., Aust, F., & Ben-Shachar, M. S. (2021). afex: Analysis of Factorial Experiments. R package version 0.28-1. https://CRAN.R-project.org/package=afex

Tobiasz Trawinski
Lecturer in Psychology
