Here we present a protocol for decomposing the variance in reading comprehension into the unique and common effects of language and decoding.
The Simple View of Reading is a popular model of reading that claims that reading is the product of decoding and language, with each component uniquely predicting reading comprehension. Although researchers have argued whether the sum rather than the product of the components is the better predictor, no researchers have partitioned the variance explained to examine the extent to which the components share variance in predicting reading. To decompose the variance, we first subtract the R² for the language-only model from the R² for the full model to obtain the unique R² for decoding. Second, we subtract the R² for the decoding-only model from the R² for the full model to obtain the unique R² for language. Third, to obtain the common variance explained by language and decoding, we subtract the sum of the two unique R² values from the R² for the full model. The method is demonstrated in a regression approach with data from students in grades 1 (n = 372), 6 (n = 309), and 10 (n = 122) using an observed measure of language (receptive vocabulary), decoding (timed word reading), and reading comprehension (standardized test). Results reveal a relatively large amount of variance in reading comprehension explained in grade 1 by the common variance in decoding and language. By grade 10, however, it is the unique effect of language and the common effect of language and decoding that explain the majority of variance in reading comprehension. Results are discussed in the context of an expanded version of the Simple View of Reading that considers unique and shared effects of language and decoding in predicting reading comprehension.
The Simple View of Reading1 (SVR) continues as a popular model of reading because of its simplicity (reading [R] is the product of decoding [D] and language [L]) and because SVR tends to explain, on average, approximately 60% of the variance in reading comprehension2. SVR predicts that correlations between D and R will decline over time and that correlations between L and R will increase over time. Studies generally support this prediction3,4,5. There are disagreements, however, about the functional form of SVR, with additive models (D + L = R) explaining significantly more variance in reading comprehension than product models (D × L = R)6,7,8, and a combination of sum and product [R = D + L + (D × L)] explaining the largest amount of variance in reading comprehension3,9.
Recently the SVR model has expanded beyond regressions based on observed variables to latent variable modeling using confirmatory factor analysis and structural equation modeling. D is typically measured with untimed or timed reading of real words and/or nonwords, and R is usually measured by a standardized reading test that includes literary and informational passages followed by multiple-choice questions. L is typically measured by tests of expressive and receptive vocabulary and, especially in the primary grades, by measures of expressive and receptive syntax and listening comprehension. Most longitudinal studies report that L is unidimensional10,11,12,13. However, another longitudinal study14 reports a two-factor structure for L in the primary grades and a unidimensional structure in grades 4 and 8. Recent cross-sectional studies report that a bifactor model best fits the data and predicts R15,16,17,18. For example, Foorman et al.16 compared unidimensional, three-factor, four-factor, and bifactor models of SVR in data from students in grades 4-10 and found that a bifactor model fit best and explained 72% to 99% of the variance in R. A general L factor explained variance in all seven grades, whereas vocabulary and syntax each uniquely explained variance in only one grade. Although the D factor was moderately correlated with L and R in all grades (0.40-0.60 and 0.47-0.74, respectively), it was not uniquely correlated with R in the presence of the general L factor.
Even though latent variable modeling has expanded SVR by shedding light on the dimensionality of L and the unique role that L plays in predicting R beyond the primary grades, no studies of SVR except one by Foorman et al.19 have partitioned the variance in reading comprehension into what is due uniquely to D and L and what is shared in common. This is a big omission in the literature. Conceptually it makes sense that D and L would share variance in predicting written language because word recognition entails the linguistic skills of phonology, semantics, and discourse at the sentence and text levels20. Similarly, linguistic comprehension must be connected to orthographic representations of phonemes, morphemes, words, sentences, and discourse if text is to be understood21. Multiplying D by L does not yield the knowledge shared by these components. Only decomposition of the variance into what is unique and what is shared by D and L in predicting R will reveal the integrated knowledge crucial to the success of educational interventions.
The one study by Foorman et al.19 that decomposed the variance of reading comprehension into what is unique and what is shared in common by D and L employed a latent variable modeling approach. The following protocol demonstrates the technique with data from students in grades 1, 7, and 10 based on single observed variables for D (timed decoding), L (receptive vocabulary), and R (standardized reading comprehension test) to make the decomposition process easy to understand. The data represent a subset of the data from Foorman et al.19.
Note: The steps below describe decomposing total variance in a dependent variable (Y) into unique variance, common variance, and unexplained variance components based on two selected independent variables (called X1 and X2 for this example) using software with a graphical user interface and data management software (see Table of Materials).
1. Reading Data into Software with a Graphical User Interface
2. Estimate the Variance Explained in the Dependent Variable (Y)
3. Computing the Unique, Common, and Unexplained Variance Components
4. Plot the UX1R², UX2R², CX1X2R², and e values
Note: Values in cells D2, E2, F2, and G2 are plotted.
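The protocol itself is carried out in a graphical statistics package and a spreadsheet (see Table of Materials). As a hedged illustration only, the same arithmetic can be sketched in Python with the standard library. The function names (`pearson`, `decompose_r2`), the dictionary keys, and the synthetic data below are assumptions made for this example and do not come from the protocol. For two predictors, the full-model R² is computed here from the pairwise correlations, which matches the R² of an ordinary least squares regression with an intercept.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / sqrt(s_aa * s_bb)


def decompose_r2(y, x1, x2):
    """Split the variance in y into unique, common, and unexplained parts."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)

    # R² for each single-predictor model is the squared correlation.
    r2_x1 = r_y1 ** 2
    r2_x2 = r_y2 ** 2
    # R² for the two-predictor model, written in terms of correlations.
    r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

    unique_x1 = r2_full - r2_x2          # full model minus X2-only model
    unique_x2 = r2_full - r2_x1          # full model minus X1-only model
    common = r2_full - (unique_x1 + unique_x2)
    unexplained = 1 - r2_full            # the "e" component
    return {
        "unique_x1": unique_x1,
        "unique_x2": unique_x2,
        "common": common,
        "unexplained": unexplained,
    }


# Synthetic, made-up data: two correlated predictors and one outcome.
random.seed(0)
n = 300
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
y = [0.4 * a + 0.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

parts = decompose_r2(y, x1, x2)
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
```

The four returned values correspond to the quantities plotted in step 4 and sum to 1 by construction, which is a quick check that the subtraction steps were done in the right order.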
The objective of this study was to investigate the contributions of unique and common variance of language (L) and decoding (D) to predicting reading comprehension (R) in grades 1, 7, and 10 in Florida, a state whose demographics are representative of the nation as a whole. There were two hypotheses regarding predictions of the variance explained in reading comprehension. First, after the primary grades, the unique contribution of D will significantly decrease, and the unique contribution...
There are three critical steps in the protocol for decomposing the variance in R into unique and common variance due to L and D. First, subtract the R² for the L-only model from the R² for the full model to obtain the unique R² for D. Second, subtract the R² for the D-only model from the R² for the full model to obtain the unique R² for L. Third, to obtain the common variance explained by L and D, subtract the sum of the two unique R² values from the R² for the full model.
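Written as equations, the three steps look as follows. The symbols are notation assumed for this illustration rather than taken from the protocol: R²_{D,L} is the full-model R², R²_D and R²_L are the single-predictor R² values, U denotes a unique component, C the common component, and e the unexplained variance.

```latex
\begin{align}
U_D &= R^2_{D,L} - R^2_{L} \\
U_L &= R^2_{D,L} - R^2_{D} \\
C_{D,L} &= R^2_{D,L} - (U_D + U_L) = R^2_{D} + R^2_{L} - R^2_{D,L} \\
e &= 1 - R^2_{D,L}
\end{align}
```

The second form of the common component follows by substituting the first two lines into the third, and the four components sum to 1.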
The authors declare that they have no competing financial interests.
The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through a subaward to Florida State University from Grant R305F100005 to the Educational Testing Service as part of the Reading for Understanding Initiative. The opinions expressed are those of the authors and do not represent views of the Institute, the U.S. Department of Education, the Educational Testing Service, or Florida State University.
Name | Company | Catalog Number | Comments
IBM SPSS Statistics Software | IBM | |
Microsoft Office Excel | Microsoft | |
Copyright © 2025 MyJoVE Corporation. All rights reserved