Work in Progress: Examples are a work in progress. Please exercise caution when using code examples, as they may not be fully verified.
Overview
Residualized change scores quantify within-subject change while controlling for baseline levels by regressing follow-up values on initial values and extracting residuals that represent deviations from expected change. Unlike simple difference scores, this approach isolates true change from regression-to-the-mean effects. This tutorial analyzes height measurements from ABCD youth across two annual assessments, generating residualized change scores that capture individual deviations from expected growth and testing whether handedness predicts variability in height change beyond what baseline values explain.
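The two-step procedure described above can be written out as follows. The symbols ($Y$, $\beta$, $\gamma$, $\delta$, $\mathrm{RC}$) are notation introduced here for illustration and do not come from the original tutorial.

```latex
% Step 1: regress follow-up height on baseline height
Y_{i,\mathrm{Year\,1}} = \beta_0 + \beta_1 \, Y_{i,\mathrm{Baseline}} + \varepsilon_i

% Step 2: the residualized change score is the observed follow-up value
% minus the value predicted from baseline
\mathrm{RC}_i = Y_{i,\mathrm{Year\,1}} - \left( \hat{\beta}_0 + \hat{\beta}_1 \, Y_{i,\mathrm{Baseline}} \right)

% Step 3: regress the residualized change score on the predictor of interest,
% with site indicators as covariates
\mathrm{RC}_i = \gamma_0 + \gamma_1 \, \mathrm{Handedness}_i + \sum_{s} \delta_s \, \mathrm{Site}_{is} + u_i
```

By construction, the least-squares residuals $\mathrm{RC}_i$ have mean zero and are uncorrelated with baseline height in the sample.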
When to Use:
Choose this when you have two timepoints and need to control for baseline levels while examining associations with follow-up outcomes.
Key Advantage:
Residualized change separates true change from regression-to-the-mean by regressing follow-up on baseline and analyzing the residuals.
What You'll Learn:
How to fit the baseline-adjusted model, extract residualized change scores, test predictors against those residuals, and visualize distributions.
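To see why residualized change behaves differently from a simple difference score, here is a minimal sketch on simulated data (not ABCD data). All variable names and the data-generating values are assumptions chosen for illustration: each person has a stable "true" height, both measurements add independent noise, and everyone grows by about the same amount.

```r
# Simulated illustration only; none of these values come from ABCD.
set.seed(123)
n <- 500
true_height <- rnorm(n, mean = 55, sd = 3)        # stable component
baseline    <- true_height + rnorm(n, sd = 1)     # baseline = truth + measurement noise
follow_up   <- true_height + 2 + rnorm(n, sd = 1) # constant growth + measurement noise

# Simple difference score
difference_score <- follow_up - baseline

# Residualized change: residuals from regressing follow-up on baseline
fit <- lm(follow_up ~ baseline)
residualized_change <- residuals(fit)

# The difference score is negatively correlated with baseline
# (a regression-to-the-mean artifact of measurement noise) ...
cor_difference <- cor(baseline, difference_score)

# ... while least-squares residuals are uncorrelated with baseline by construction
cor_residualized <- cor(baseline, residualized_change)

print(c(difference = cor_difference, residualized = cor_residualized))
```

In this setup the difference score inherits a negative correlation with baseline purely from measurement noise, whereas the residualized score does not.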
Data Access
Data Download
ABCD data can be accessed through the DEAP platform or the NBDC Data Access Platform (LASSO), which provide user-friendly interfaces for creating custom datasets with point-and-click variable selection. For detailed instructions on accessing and downloading ABCD data, see the DEAP documentation.
Loading Data with NBDCtools
Once you have downloaded ABCD data files, the NBDCtools package provides efficient tools for loading and preparing your data for analysis. The package handles common data management tasks including:
Automatic data joining - Merges variables from multiple tables automatically
Built-in transformations - Converts categorical variables to factors, handles missing data codes, and adds variable labels
Event filtering - Easily selects specific assessment waves
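The modeling code below works on a wide data frame called df_wide, with one row per participant and separate columns for baseline and Year 1 height. As a rough sketch of how such a frame might be built from long-format data, the example below reshapes a small toy dataset with tidyr. The toy values and the long-format column names (participant_id, session, height) are hypothetical and are not ABCD variable names.

```r
library(dplyr)
library(tidyr)

# Toy long-format data: two rows (sessions) per participant. Values are made up.
df_long <- data.frame(
  participant_id = rep(c("P1", "P2", "P3"), each = 2),
  session        = rep(c("Baseline", "Year_1"), times = 3),
  height         = c(55.0, 57.5, 52.0, 54.0, 58.5, 60.0),
  handedness     = rep(c("Right-handed", "Left-handed", "Right-handed"), each = 2),
  site           = rep(c("1", "2", "1"), each = 2)
)

# One row per participant, with Height_Baseline and Height_Year_1 columns
df_wide <- df_long %>%
  pivot_wider(
    id_cols      = c(participant_id, handedness, site),
    names_from   = session,
    values_from  = height,
    names_prefix = "Height_"
  ) %>%
  mutate(
    # Right-handed first so it serves as the reference level in lm()
    handedness = factor(handedness, levels = c("Right-handed", "Left-handed")),
    site       = factor(site)
  )

print(df_wide)
```

Setting the factor levels explicitly controls which group lm() treats as the reference category.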
### Model
library(dplyr)  # pipes and mutate()
library(broom)  # tidy()
library(gt)     # table formatting

# df_wide is assumed to come from the data preparation step: one row per
# participant with Height_Baseline, Height_Year_1, handedness, and site.

# 1. Predict follow-up (Year_1) height from Baseline height.
#    na.exclude pads the residuals with NA for incomplete rows, so the
#    residual vector stays aligned with the rows of df_wide.
baseline_model <- lm(Height_Year_1 ~ Height_Baseline, data = df_wide,
                     na.action = na.exclude)

# 2. Store the residualized change score (portion not explained by baseline)
df_wide <- df_wide %>%
  mutate(residualized_change = residuals(baseline_model))

# 3. Regress the residualized change scores on handedness, adjusting for site
model <- lm(residualized_change ~ handedness + site, data = df_wide)

# 4. Extract and tidy the model summary
tidy_model <- broom::tidy(model)

# 5. Format into a gt table
model_summary <- tidy_model %>%
  gt() %>%
  tab_header(title = "Regression Summary Table") %>%
  fmt_number(
    columns = c(estimate, std.error, statistic, p.value),
    decimals = 3
  ) %>%
  cols_label(
    term = "Predictor",
    estimate = "Estimate",
    std.error = "Std. Error",
    statistic = "t-Statistic",
    p.value = "p-Value"
  )
model_summary

# 6. Save as standalone HTML
gt::gtsave(
  data = model_summary,
  filename = "model_summary.html",
  inline_css = FALSE
)
Regression Summary Table

| Predictor | Estimate | Std. Error | t-Statistic | p-Value |
|---|---|---|---|---|
| (Intercept) | −0.252 | 0.096 | −2.641 | 0.008 |
| handednessLeft-handed | −0.055 | 0.062 | −0.893 | 0.372 |
| site2 | 0.277 | 0.124 | 2.241 | 0.025 |
| site3 | 0.242 | 0.122 | 1.987 | 0.047 |
| site4 | 0.250 | 0.117 | 2.136 | 0.033 |
| site5 | 0.178 | 0.136 | 1.303 | 0.193 |
| site6 | 0.023 | 0.121 | 0.194 | 0.846 |
| site7 | 0.107 | 0.139 | 0.765 | 0.444 |
| site8 | 0.232 | 0.138 | 1.685 | 0.092 |
| site9 | 0.395 | 0.131 | 3.005 | 0.003 |
| site10 | −0.011 | 0.117 | −0.097 | 0.922 |
| site11 | 0.374 | 0.132 | 2.834 | 0.005 |
| site12 | 0.151 | 0.124 | 1.226 | 0.220 |
| site13 | 0.319 | 0.117 | 2.727 | 0.006 |
| site14 | 0.337 | 0.122 | 2.761 | 0.006 |
| site15 | 0.890 | 0.131 | 6.815 | 0.000 |
| site16 | 0.352 | 0.111 | 3.176 | 0.001 |
| site17 | 0.255 | 0.123 | 2.073 | 0.038 |
| site18 | 0.116 | 0.135 | 0.863 | 0.388 |
| site19 | 0.430 | 0.125 | 3.442 | 0.001 |
| site20 | 0.095 | 0.119 | 0.801 | 0.423 |
| site21 | 0.426 | 0.124 | 3.422 | 0.001 |
| site22 | 0.453 | 0.378 | 1.198 | 0.231 |
Interpretation
Handedness does not significantly predict residualized height change. Compared to right-handed participants (the reference group), left-handed participants showed a small, non-significant difference in residualized height change (b = −0.055, SE = 0.062, p = 0.372). The summary table contains no mixed-handed term, so this model provides no estimate for that group. These results suggest that handedness does not meaningfully contribute to variability in height change over the one-year period.
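A p-value alone does not show how precisely the handedness effect is estimated. As a rough sketch, the snippet below computes an approximate 95% confidence interval from the left-handed estimate and standard error reported in the regression summary table (−0.055 and 0.062). It assumes a normal approximation, because the table does not report residual degrees of freedom; with the fitted model object available, confint(model) would give the exact t-based interval.

```r
# Values copied from the regression summary table (handednessLeft-handed row)
estimate  <- -0.055
std_error <- 0.062

# Approximate 95% CI using the normal quantile (assumption: large-sample approximation)
ci <- estimate + c(-1, 1) * qnorm(0.975) * std_error
round(ci, 3)
# → -0.177  0.067
```

The interval spans zero and is narrow relative to the site contrasts in the same table, which is consistent with a small or absent handedness effect.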
Visualization
library(dplyr)    # filter(), slice_sample()
library(ggplot2)  # plotting

# Select a random subset for visualization (up to 250 participants).
# set.seed() makes the subset reproducible; slice_sample() replaces the
# superseded sample_n() and does not error if fewer than 250 rows remain.
set.seed(123)
df_subset <- df_wide %>%
  filter(!is.na(residualized_change), !is.na(handedness)) %>%
  slice_sample(n = 250)

# Create a violin plot to visualize residualized change scores by handedness
violin_plot <- ggplot(df_subset, aes(x = handedness, y = residualized_change, fill = handedness)) +
  geom_violin(trim = FALSE, alpha = 0.7) +  # Violin plot without trimming the tails
  geom_jitter(position = position_jitter(width = 0.2, height = 0),
              size = 1.2, alpha = 0.5) +    # Jitter horizontally only, so y values stay exact
  scale_fill_brewer(palette = "Set2") +     # Use a color palette from RColorBrewer
  labs(
    title = "Residualized Change in Height by Handedness",
    x = "Handedness",
    y = "Height Residuals"
  ) +
  theme_minimal() +                         # Apply a minimal theme for a clean look
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),  # Rotate x-axis labels for readability
    legend.position = "none"                            # The legend duplicates the x-axis
  )
print(violin_plot)

# Save as a high-resolution PNG file
ggsave(filename = "visualization.png",
       plot = violin_plot,
       width = 10,   # Width, in the units given below
       height = 5,   # Height, in the units given below
       units = "in", # Units ("in", "cm", or "mm")
       dpi = 300)    # Resolution (300 is suitable for print)
Interpretation
The violin plot displays the distribution of residualized height-change scores after baseline adjustment. Each group centers near zero and shows comparable spread, consistent with the regression result that the residuals do not differ systematically by handedness. The overlaid jittered points make individual outliers easy to spot. None deviate meaningfully from the main mass, so the null finding is unlikely to be driven by a handful of unusual observations. Taken together, the figure and model output indicate that once baseline stature is controlled, subsequent growth shows no detectable association with handedness.
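The visual impression of "centered near zero with comparable spread" can be checked numerically with a per-group summary. The sketch below uses simulated residuals as a stand-in for df_wide, so the group proportions and values are assumptions for illustration only. With the real data, the same group_by() and summarise() calls would be applied to df_wide.

```r
library(dplyr)

# Simulated stand-in for df_wide; values are not from ABCD.
set.seed(2024)
sim <- data.frame(
  handedness = factor(
    sample(c("Right-handed", "Left-handed"), size = 300, replace = TRUE,
           prob = c(0.85, 0.15)),
    levels = c("Right-handed", "Left-handed")
  ),
  residualized_change = rnorm(300, mean = 0, sd = 1)
)

# Sample size, center, and spread of the residualized scores in each group
group_summary <- sim %>%
  group_by(handedness) %>%
  summarise(
    n          = n(),
    mean_resid = mean(residualized_change),
    sd_resid   = sd(residualized_change),
    .groups    = "drop"
  )

print(group_summary)
```

Similar means and standard deviations across groups would support the reading of the violin plot. A large gap in either would warrant a closer look.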
Discussion
Residualized change scores allowed us to focus on deviations in growth that could not be explained by baseline height alone. After regressing Year 1 height on baseline and taking the residuals, we fit a linear model that included handedness along with site indicators. The resulting coefficients showed that left-handed youth did not depart systematically from right-handed youth in residualized growth once initial stature was accounted for (b = −0.055, p = 0.372). The site-level contrasts were a different matter: 12 of the 21 site indicators differed from the reference site at p < 0.05 (for example, site15: b = 0.890, SE = 0.131), so site accounts for some of the residual variation even though handedness does not.
The null finding for handedness is informative: when the baseline covariate absorbs most predictable variability, the remaining change reflects idiosyncratic influences, unmodeled factors such as site, or measurement noise. The analysis therefore demonstrates how residualized scores can guard against spurious associations that sometimes arise in raw change-score models. Researchers interested in other predictors can drop them into the same framework, interpret effects on the adjusted outcome, and still benefit from the familiar lm tooling for diagnostics and assumption checks.
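As a minimal sketch of the diagnostics mentioned above, the example below fits the two-step model on simulated data and runs a few standard checks from base R. The data-generating values and the 3-standard-deviation outlier threshold are assumptions for illustration. With real data, the same calls would be applied to the fitted model object.

```r
# Simulated stand-in for df_wide; values are not from ABCD.
set.seed(7)
n <- 400
sim <- data.frame(
  Height_Baseline = rnorm(n, mean = 55, sd = 3),
  handedness = factor(
    sample(c("Right-handed", "Left-handed"), size = n, replace = TRUE,
           prob = c(0.85, 0.15)),
    levels = c("Right-handed", "Left-handed")
  )
)
sim$Height_Year_1 <- 4 + 0.97 * sim$Height_Baseline + rnorm(n, sd = 1.5)

# Two-step residualized change model
baseline_model <- lm(Height_Year_1 ~ Height_Baseline, data = sim)
sim$residualized_change <- residuals(baseline_model)
model <- lm(residualized_change ~ handedness, data = sim)

# Diagnostic plots: residuals vs. fitted (which = 1) and normal Q-Q (which = 2)
op <- par(mfrow = c(1, 2))
plot(model, which = 1:2)
par(op)

# Flag observations with large standardized residuals (threshold is a convention)
std_resid <- rstandard(model)
n_flagged <- sum(abs(std_resid) > 3)
n_flagged

# t-based confidence intervals for the coefficients
coef_ci <- confint(model)
coef_ci
```

The plots help assess constant variance and approximate normality of the residuals, and the flagged count gives a quick sense of whether a few extreme observations could be influencing the fit.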
Additional Resources
R Documentation: lm and residuals
Official R documentation for the lm() function and residuals() method, essential for computing residualized change scores that control for baseline values.
Classic methodology paper discussing problems with change score analysis and advantages of residualized change scores for controlling baseline differences (Lord, 1967). Note: access may require institutional or paid subscription.