| Title: | Continuous Norming |
|---|---|
| Description: | Generates continuous test norms in psychometrics and biometrics, and analyzing model fit. The package offers both distribution-free modeling using Taylor polynomials and parametric modeling using the beta-binomial and the 'Sinh-Arcsinh' distribution. Originally developed for achievement tests, it is applicable to a wide range of mental, physical, or other test scores dependent on continuous or discrete explanatory variables. The package provides several advantages: It minimizes deviations from representativeness in subsamples, interpolates between discrete levels of explanatory variables, and significantly reduces the required sample size compared to conventional norming per age group. cNORM enables graphical and analytical evaluation of model fit, accommodates a wide range of scales including those with negative and descending values, and as well supports conventional norming. It generates norm tables including confidence intervals. Methods for addressing representativeness issues are available through Iterative Proportional Fitting. Based on Lenhard et al. (2016) <doi:10.1177/1073191116656437>, Lenhard et al. (2019) <doi:10.1371/journal.pone.0222279>, Lenhard and Lenhard (2021) <doi:10.1177/0013164420928457> and Gary et al. (2023) <doi:10.1007/s00181-023-02456-0>. |
| Authors: | Alexandra Lenhard [aut] (ORCID: <https://orcid.org/0000-0001-8680-4381>), Wolfgang Lenhard [cre, aut] (ORCID: <https://orcid.org/0000-0002-8184-6889>), Sebastian Gary [aut], WPS Publisher [fnd] (https://www.wpspublish.com/) |
| Maintainer: | Wolfgang Lenhard <[email protected]> |
| License: | AGPL-3 |
| Version: | 3.6.0 |
| Built: | 2026-05-26 15:04:20 UTC |
| Source: | https://github.com/wlenhard/cnorm |
Selects polynomial degrees for the two components of a beta-binomial model
(/ when mode = 2 or / when
mode = 1) by minimizing BIC.
autoselect.betabinomial( age, score, n = NULL, weights = NULL, mode = 2, max_alpha = 4, max_beta = 4, min_alpha = 1, min_beta = 1, control = NULL, scale = "T", parallel = TRUE, n_cores = NULL, plot = TRUE, verbose = TRUE )autoselect.betabinomial( age, score, n = NULL, weights = NULL, mode = 2, max_alpha = 4, max_beta = 4, min_alpha = 1, min_beta = 1, control = NULL, scale = "T", parallel = TRUE, n_cores = NULL, plot = TRUE, verbose = TRUE )
age, score
|
Numeric vectors of predictor and response values. |
n |
Maximum score. Defaults to |
weights |
Optional numeric vector of weights. |
mode |
1 for |
max_alpha, max_beta
|
Maximum polynomial degrees. Default 4. |
min_alpha, min_beta
|
Minimum polynomial degrees. Default 1. |
control |
Optional control list passed to |
scale |
Norm scale (default |
parallel |
Logical; attempt parallel execution. Default |
n_cores |
Number of cores. Defaults to all logical cores. |
plot |
Logical; plot the selected model. Default |
verbose |
Logical; print progress. Default |
Parallel execution is attempted by default. If the workers cannot access the cNORM namespace, the function transparently falls back to sequential execution.
The selected fitted model
Selects the polynomial degrees for the four components of a
cnorm.shash model (, , ,
) by minimizing BIC over a full grid.
autoselect.shash( age, score, weights = NULL, max_mu = 4, max_sigma = 3, max_epsilon = 3, max_delta = 1, min_mu = 1, min_sigma = 1, min_epsilon = 1, min_delta = 0, delta = 1, control = NULL, scale = "T", parallel = TRUE, n_cores = NULL, plot = TRUE, verbose = TRUE )autoselect.shash( age, score, weights = NULL, max_mu = 4, max_sigma = 3, max_epsilon = 3, max_delta = 1, min_mu = 1, min_sigma = 1, min_epsilon = 1, min_delta = 0, delta = 1, control = NULL, scale = "T", parallel = TRUE, n_cores = NULL, plot = TRUE, verbose = TRUE )
age, score
|
Numeric vectors of predictor and response values. |
weights |
Optional numeric vector of observation weights. |
max_mu, max_sigma, max_epsilon, max_delta
|
Maximum polynomial degrees for
the four shash parameters. Defaults: |
min_mu, min_sigma, min_epsilon
|
Minimum polynomial degrees (default 1). |
min_delta |
Minimum degree for |
delta |
Value of |
control |
Optional control list passed to |
scale |
Norm scale (default |
parallel |
Logical; attempt parallel execution. Default |
n_cores |
Number of cores. Defaults to all logical cores. |
plot |
Logical; plot the selected model. Default |
verbose |
Logical; print progress. Default |
A degree of 0 for is interpreted as a **fixed**
(passed to cnorm.shash as delta_degree = NULL)
with value taken from the delta argument. This allows the grid
search to choose between a fixed and a varying tail-weight model.
Parallel execution is attempted by default. If the workers cannot access the cNORM namespace, the function transparently falls back to sequential execution.
The selected fitted cnormShash model with an additional
element $selection containing:
evaluated: data frame of every combination tried, sorted by BIC.
selected: list with the chosen degrees and BIC.
cnorm.shash, autoselect.betabinomial
## Not run: m <- autoselect.shash(elfe$group, elfe$raw) m$selection$evaluated summary(m, age = elfe$group, score = elfe$raw) ## End(Not run)## Not run: m <- autoselect.shash(elfe$group, elfe$raw) m$selection$evaluated summary(m, age = elfe$group, score = elfe$raw) ## End(Not run)
Computes Taylor polynomial regression models by evaluating a series of models with increasing predictors. It aims to find a consistent model that effectively captures the variance in the data. It draws on the regsubsets function from the leaps package and builds up to 20 models for each number of predictors, evaluates these models regarding model consistency and selects consistent model with the highest R^2. This automatic model selection should usually be accompanied with visual inspection of the percentile plots and assessment of fit statistics. Set R^2 or number of terms manually to retrieve a more parsimonious model, if desired.
bestModel( data, raw = NULL, R2 = NULL, k = NULL, t = NULL, predictors = NULL, terms = 0, weights = NULL, force.in = NULL, plot = TRUE, extensive = TRUE, subsampling = FALSE )bestModel( data, raw = NULL, R2 = NULL, k = NULL, t = NULL, predictors = NULL, terms = 0, weights = NULL, force.in = NULL, plot = TRUE, extensive = TRUE, subsampling = FALSE )
data |
Preprocessed dataset with 'raw' scores, powers, interactions, and usually an explanatory variable (like age). |
raw |
Name of the raw score variable (default: 'raw'). |
R2 |
Adjusted R^2 stopping criterion for model building. |
k |
Power constant influencing model complexity (default: 4, max: 6). |
t |
Age power parameter. If unset, defaults to 'k'. |
predictors |
List of predictors or regression formula for model selection. Overrides 'k' and can include additional variables. |
terms |
Desired number of terms in the model. |
weights |
Optional case weights. If set to FALSE, default weights (if any) are ignored. |
force.in |
Variables forcibly included in the regression. |
plot |
If TRUE (default), displays a percentile plot of the model and information about the regression object. FALSE turns off plotting and report. |
extensive |
If TRUE (default), screen models for consistency and - if possible, exclude inconsistent ones |
subsampling |
(deprecated) If TRUE (default is FALSE), use 10-fold subsampled coefficient averaging in 'bestModel'. |
The functions rankBySlidingWindow, rankByGroup, bestModel,
computePowers and prepareData are usually not called directly, but accessed
through other functions like cnorm.
Additional functions like plotSubset(model) and cnorm.cv can aid in model evaluation.
The model. Further exploration can be done using plotSubset(model) and plotPercentiles(data, model).
plotSubset, plotPercentiles, plotPercentileSeries, checkConsistency
Other model:
checkConsistency(),
cnorm.cv(),
derive(),
modelSummary(),
print.cnorm(),
printSubset(),
rangeCheck(),
regressionFunction(),
summary.cnorm()
# Example with sample data ## Not run: # It is not recommended to directly use this function. Rather use 'cnorm' instead. normData <- prepareData(elfe) model <- bestModel(normData) plotSubset(model) plotPercentiles(buildCnormObject(normData, model)) # Specifying variables explicitly preselectedModel <- bestModel(normData, predictors = c("L1", "L3", "L1A3", "A2", "A3")) print(regressionFunction(preselectedModel)) ## End(Not run)# Example with sample data ## Not run: # It is not recommended to directly use this function. Rather use 'cnorm' instead. normData <- prepareData(elfe) model <- bestModel(normData) plotSubset(model) plotPercentiles(buildCnormObject(normData, model)) # Specifying variables explicitly preselectedModel <- bestModel(normData, predictors = c("L1", "L3", "L1A3", "A2", "A3")) print(regressionFunction(preselectedModel)) ## End(Not run)
This function calculates the (a) and (b) parameters of a beta binomial
distribution, along with the mean (m), variance (var) based on the input vector 'x'
and the maximum number 'n'.
betaCoefficients(x, n = NULL)betaCoefficients(x, n = NULL)
x |
A numeric vector of non-negative integers representing observed counts. |
n |
The maximum number or the maximum possible value of 'x'. If not specified, uses max(x) instead. |
The beta-binomial distribution is a discrete probability distribution that models the number of successes in a fixed number of trials, where the probability of success varies from trial to trial. This variability in success probability is modeled by a beta distribution. Such a calculation is particularly relevant in scenarios where there is heterogeneity in success probabilities across trials, which is common in real-world situations, as for example the number of correct solutions in a psychometric test, where the test has a fixed number of items.
A numeric vector containing the calculated parameters in the following order: alpha (a), beta (b), mean (m), standard deviation (sd), and the maximum number (n).
Helper function to build a cnorm object from a data object and a model object from the bestModel function for compatibility reasons.
buildCnormObject(data, model)buildCnormObject(data, model)
data |
A data object from 'prepareData', or from 'rankByGroup' and 'computePower' |
model |
Object obtained from the bestModel function |
A cnorm object
## Not run: data <- prepareData(elfe) model <- bestModel(data, k = 4) model.cnorm <- buildCnormObject(data, model) ## End(Not run)## Not run: data <- prepareData(elfe) model <- bestModel(data, k = 4) model.cnorm <- buildCnormObject(data, model) ## End(Not run)
Build regression function for bestModel
buildFunction(raw, k, t, age)buildFunction(raw, k, t, age)
raw |
name of the raw score variable |
k |
the power degree for location |
t |
the power degree for age |
age |
use age |
regression function
The function is an inline for searching zeros in the inverse regression function. It collapses the regression function at a specific age and simplifies the coefficients.
calcPolyInL(raw, age, model)calcPolyInL(raw, age, model)
raw |
The raw value (subtracted from the intercept) |
age |
The age |
model |
The cNORM regression model |
The coefficients
The function is an inline for searching zeros in the inverse regression function. It collapses the regression function at a specific age and simplifies the coefficients. Optimized version of the prior 'calcPolyInLBase'
calcPolyInLBase2(raw, age, coeff, k)calcPolyInLBase2(raw, age, coeff, k)
raw |
The raw value (subtracted from the intercept) |
age |
The age |
coeff |
The cNORM regression model coefficients |
k |
The cNORM regression model power parameter |
The coefficients
By the courtesy of the Center of Disease Control (CDC), cNORM includes human growth data for children and adolescents age 2 to 25 that can be used to model trajectories of the body mass index and to estimate percentiles for clinical definitions of under- and overweight. The data stems from the NHANES surveys in the US and was published in 2012 as public domain. The data was cleaned by removing missing values and it includes the following variables from or based on the original dataset.
CDCCDC
A data frame with 45053 rows and 7 variables:
continuous age in years, based on the month variable
age group; chronological age in years at the time of examination
chronological age in month at the time of examination
sex of the participant, 1 = male, 2 = female
height of the participants in cm
weight of the participants in kg
the body mass index, computed by (weight in kg)/(height in m)^2
A data frame with 45035 rows and 7 columns
Center for Disease Control and Prevention (2012). National Health and Nutrition Examination Survey: Questionnaires, datasets and related documentation. U.S. Department of Health and Human Services (original source not available anymore).
While abilities increase and decline over age, within one age group, the norm scores always have to show a monotonic increase or decrease with increasing raw scores. Violations of this assumption are an indication for problems in modeling the relationship between raw and norm scores. There are several reasons, why this might occur:
Vertical extrapolation: Choosing extreme norm scores, e. g. values -3 <= x and x >= 3 In order to model these extreme values, a large sample dataset is necessary.
Horizontal extrapolation: Taylor polynomials converge in a certain radius. Using the model values outside the original dataset may lead to inconsistent results.
The data cannot be modeled with Taylor polynomials, or you need another power parameter (k) or R2 for the model.
checkConsistency( model, minAge = NULL, maxAge = NULL, minNorm = NULL, maxNorm = NULL, minRaw = NULL, maxRaw = NULL, stepAge = NULL, stepNorm = 1, warn = FALSE, silent = FALSE )checkConsistency( model, minAge = NULL, maxAge = NULL, minNorm = NULL, maxNorm = NULL, minRaw = NULL, maxRaw = NULL, stepAge = NULL, stepNorm = 1, warn = FALSE, silent = FALSE )
model |
The model from the bestModel function or a cnorm object |
minAge |
Age to start with checking |
maxAge |
Upper end of the age check |
minNorm |
Lower end of the norm value range |
maxNorm |
Upper end of the norm value range |
minRaw |
clipping parameter for the lower bound of raw scores |
maxRaw |
clipping parameter for the upper bound of raw scores |
stepAge |
Stepping parameter for the age check. values indicate higher precision / closer checks |
stepNorm |
Stepping parameter for the norm table check within age with lower scores indicating a higher precision. The choice depends of the norm scale used. With T scores a stepping parameter of 1 is suitable |
warn |
If set to TRUE, already minor violations of the model assumptions are displayed (default = FALSE) |
silent |
turn off messages |
In general, extrapolation (point 1 and 2) can carefully be done to a certain degree outside the original sample, but it should in general be handled with caution. Please note that at extreme values, the models most likely become independent and it is thus recommended to restrict the norm score range to the relevant range of abilities, e.g. +/- 2.5 SD via the minNorm and maxNorm parameter.
Boolean, indicating model violations (TRUE) or no problems (FALSE)
Other model:
bestModel(),
cnorm.cv(),
derive(),
modelSummary(),
print.cnorm(),
printSubset(),
rangeCheck(),
regressionFunction(),
summary.cnorm()
## Not run: model <- cnorm(raw = elfe$raw, group = elfe$group, plot = FALSE) modelViolations <- checkConsistency(model, minNorm = 25, maxNorm = 75) plotDerivative(model, minNorm = 25, maxNorm = 75) ## End(Not run)## Not run: model <- cnorm(raw = elfe$raw, group = elfe$group, plot = FALSE) modelViolations <- checkConsistency(model, minNorm = 25, maxNorm = 75) plotDerivative(model, minNorm = 25, maxNorm = 75) ## End(Not run)
Check, if NA or values <= 0 occur and issue warning
checkWeights(weights)checkWeights(weights)
weights |
Raking weights |
Conducts continuous norming in one step and returns an object including ranked raw data and the continuous norming model. Please consult the function description of 'rankByGroup', 'rankBySlidingWindow' and 'bestModel' for specifics of the steps in the data preparation and modeling process. In addition to the raw scores, either provide
a numeric vector for the grouping information (group)
a numeric age vector and the width of the sliding window (age, width)
for the ranking of the raw scores. You can adjust the grade of smoothing of the regression model by setting the k and terms parameter. In general, increasing k to more than 4 and the number of terms lead to a higher fit, while lower values lead to more smoothing. The power parameter for the age trajectory can be specified independently by 't'. If both parameters are missing, cnorm uses k = 5 and t = 3 by default.
cnorm( raw = NULL, group = NULL, age = NULL, width = NA, weights = NULL, scale = "T", method = 4, descend = FALSE, k = NULL, t = NULL, terms = 0, R2 = NULL, plot = TRUE, extensive = TRUE, subsampling = FALSE )cnorm( raw = NULL, group = NULL, age = NULL, width = NA, weights = NULL, scale = "T", method = 4, descend = FALSE, k = NULL, t = NULL, terms = 0, R2 = NULL, plot = TRUE, extensive = TRUE, subsampling = FALSE )
raw |
Numeric vector of raw scores |
group |
Numeric vector of grouping variable, e.g. grade. If no group or age variable is provided, conventional norming is applied. |
age |
Numeric vector with chronological age. If used without 'group', please additionally specify 'width'. |
width |
Size of the sliding window in case an age vector is used. |
weights |
Optional numeric vector of case weights for post-stratification. |
scale |
Type of norm scale, either "T" (default), "IQ", "z" or "percentile" (= no transformation); a numeric vector with mean and SD can also be provided, e.g. 'c(10, 3)' for Wechsler scale index points. |
method |
Ranking method in case of ties; an integer index from 1 (Blom 1958) through 7 (Yu & Huang 2001). Default is 4 (Rankit). |
descend |
If TRUE, inverts the ranking order so that higher raw scores receive lower norm scores (e.g. for error scores). |
k |
Power degree for the location dimension (max 6). |
t |
Power degree for the age dimension (max 6). |
terms |
If > 0, fix the model to this number of terms. |
R2 |
Stopping criterion (adjusted R-squared) for model selection. |
plot |
If TRUE (default), display percentile plot and report. |
extensive |
If TRUE (default), screen models for monotonic consistency. |
subsampling |
(deprecated) If TRUE (default is FALSE), use 10-fold subsampled coefficient averaging in 'bestModel'. |
cnorm object including the ranked raw data and the regression model.
Gary, S. & Lenhard, W. (2021). In norming we trust. Diagnostica.
Gary, S., Lenhard, W. & Lenhard, A. (2021). Modelling Norm Scores with the cNORM Package in R. Psych, 3(3), 501-521. https://doi.org/10.3390/psych3030033
Gary, S., Lenhard, W., Lenhard, A., & Herzberg, D. (2023). A tutorial on automatic post-stratification and weighting in conventional and regression-based norming of psychometric tests. Behavior Research Methods. https://doi.org/10.3758/s13428-023-02207-0
Gary, S., Lenhard, A., Lenhard, W., & Herzberg, D. S. (2023). Reducing the Bias of Norm Scores in Non-Representative Samples: Weighting as an Adjunct to Continuous Norming Methods. Assessment, 30(8), 2491–2509. https://doi.org/10.1177/10731911231153832
Lenhard, A., Lenhard, W., Suggate, S. & Segerer, R. (2016). A continuous solution to the norming problem. Assessment, Online first, 1-14. doi:10.1177/1073191116656437
Lenhard, A., Lenhard, W., Gary, S. (2018). Continuous Norming (cNORM). The Comprehensive R Network, Package cNORM, available: https://CRAN.R-project.org/package=cNORM
Lenhard, A., Lenhard, W., Gary, S. (2019). Continuous norming of psychometric tests: A simulation study of parametric and semi-parametric approaches. PLoS ONE, 14(9), e0222279. doi:10.1371/journal.pone.0222279
Lenhard, W., & Lenhard, A. (2020). Improvement of Norm Score Quality via Regression-Based Continuous Norming. Educational and Psychological Measurement(Online First), 1-33. https://doi.org/10.1177/0013164420928457
rankByGroup, rankBySlidingWindow, computePowers, bestModel
## Not run: # Conventional norming cnorm(raw = elfe$raw) # Continuous norming via group m1 <- cnorm(raw = elfe$raw, group = elfe$group) # Continuous norming via continuous age + sliding window m2 <- cnorm(raw = ppvt$raw, age = ppvt$age, width = 1) # Norm tables with confidence intervals normTable(c(2.125, 2.375, 2.625), m1, CI = .9, reliability = .95) rawTable (c(2.125, 2.375, 2.625), m1, CI = .9, reliability = .95) ## End(Not run)## Not run: # Conventional norming cnorm(raw = elfe$raw) # Continuous norming via group m1 <- cnorm(raw = elfe$raw, group = elfe$group) # Continuous norming via continuous age + sliding window m2 <- cnorm(raw = ppvt$raw, age = ppvt$age, width = 1) # Norm tables with confidence intervals normTable(c(2.125, 2.375, 2.625), m1, CI = .9, reliability = .95) rawTable (c(2.125, 2.375, 2.625), m1, CI = .9, reliability = .95) ## End(Not run)
This function fits a beta-binomial regression model where both the and
parameters of the beta-binomial distribution are modeled as polynomial functions
of the predictor variable (typically age). Setting mode to 1 fits a beta-binomial
model on the basis of and , setting it to 2 (default) fits a beta-binomial
model directly on the basis of and .
cnorm.betabinomial( age, score, n = NULL, weights = NULL, mode = 2, alpha = 3, beta = 3, control = NULL, scale = "T", plot = T )cnorm.betabinomial( age, score, n = NULL, weights = NULL, mode = 2, alpha = 3, beta = 3, control = NULL, scale = "T", plot = T )
age |
A numeric vector of predictor values (e.g., age). |
score |
A numeric vector of response values. |
n |
The maximum score (number of trials in the beta-binomial distribution). If NULL, max(score) is used. |
weights |
A numeric vector of weights for each observation. Default is NULL (equal weights). |
mode |
Integer specifying the mode of the model. Default is 2 (direct modelling of |
alpha |
Integer specifying the degree of the polynomial for the alpha model.
Default is 3. If mode is set to 1, this parameter is used to specify the degree
of the polynomial for the |
beta |
Integer specifying the degree of the polynomial for the beta model. Default is 3.
If mode is set to 1, this parameter is used to specify the degree of the polynomial
for the |
control |
A list of control parameters to be passed to the 'optim' function. If NULL, default values are used, namely control = list(reltol = 1e-8, maxit = 1000) for mode 1 and control = list(factr = 1e-8, maxit = 1000) for mode 2. |
scale |
Type of norm scale, either "T" (default), "IQ", "z" or a double vector with the mean and standard deviation. |
plot |
Logical indicating whether to plot the model. Default is TRUE. |
The function standardizes the input variables, fits polynomial models for both the alpha and beta parameters, and uses maximum likelihood estimation to find the optimal parameters. The optimization is performed using the L-BFGS-B method.
A list of class "cnormBetaBinomial" or "cnormBetaBinomial2". In case of mode 2 containing:
alpha_est |
Estimated coefficients for the alpha model |
beta_est |
Estimated coefficients for the beta model |
se |
Standard errors of the estimated coefficients |
alpha_degree |
Degree of the polynomial for the alpha model |
beta_degree |
Degree of the polynomial for the beta model |
result |
Full result from the optimization procedure |
## Not run: # Fit a beta-binomial regression model to the PPVT data model <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228) summary(model) # Use weights for post-stratification marginals <- data.frame(var = c("sex", "sex", "migration", "migration"), level = c(1,2,0,1), prop = c(0.51, 0.49, 0.65, 0.35)) weights <- computeWeights(ppvt, marginals) model <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228, weights = weights) ## End(Not run)## Not run: # Fit a beta-binomial regression model to the PPVT data model <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228) summary(model) # Use weights for post-stratification marginals <- data.frame(var = c("sex", "sex", "migration", "migration"), level = c(1,2,0,1), prop = c(0.51, 0.49, 0.65, 0.35)) weights <- computeWeights(ppvt, marginals) model <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228, weights = weights) ## End(Not run)
This function fits a beta-binomial regression model where both the alpha and beta
parameters of the beta-binomial distribution are modeled as polynomial functions
of the predictor variable (typically age). While 'cnorm-betabinomial' fits a beta-binomial model
on the basis of and , this function fits a beta-binomial model directly on the basis
of and .
cnorm.betabinomial2( age, score, n = NULL, weights = NULL, alpha_degree = 3, beta_degree = 3, control = NULL, scale = "T", plot = TRUE )cnorm.betabinomial2( age, score, n = NULL, weights = NULL, alpha_degree = 3, beta_degree = 3, control = NULL, scale = "T", plot = TRUE )
age |
A numeric vector of predictor values (e.g., age). |
score |
A numeric vector of response values. |
n |
The maximum score (number of trials in the beta-binomial distribution). If NULL, max(score) is used. |
weights |
A numeric vector of weights for each observation. Default is NULL (equal weights). |
alpha_degree |
Integer specifying the degree of the polynomial for the alpha model. Default is 3. |
beta_degree |
Integer specifying the degree of the polynomial for the beta model. Default is 3. |
control |
A list of control parameters to be passed to the 'optim' function. If NULL, default values are used. |
scale |
Type of norm scale, either "T" (default), "IQ", "z" or a double vector with the mean and standard deviation. |
plot |
Logical indicating whether to plot the model. Default is TRUE. |
The function standardizes the input variables, fits polynomial models for both the alpha and beta parameters, and uses maximum likelihood estimation to find the optimal parameters. The optimization is performed using the L-BFGS-B method.
A list of class "cnormBetaBinomial2" containing:
alpha_est |
Estimated coefficients for the alpha model |
beta_est |
Estimated coefficients for the beta model |
se |
Standard errors of the estimated coefficients |
alpha_degree |
Degree of the polynomial for the alpha model |
beta_degree |
Degree of the polynomial for the beta model |
result |
Full result from the optimization procedure |
Assists in determining the optimal number of terms for the regression model using repeated Monte Carlo cross-validation. It leverages an 80-20 split between training and validation data, with stratification by norm group or random sample in case of using sliding window ranking.
cnorm.cv( data, formula = NULL, repetitions = 5, norms = TRUE, min = 1, max = 12, cv = "full", pCutoff = NULL, width = NA, raw = NULL, group = NULL, age = NULL, weights = NULL )cnorm.cv( data, formula = NULL, repetitions = 5, norms = TRUE, min = 1, max = 12, cv = "full", pCutoff = NULL, width = NA, raw = NULL, group = NULL, age = NULL, weights = NULL )
data |
Data frame of norm sample or a cnorm object. Should have ranking, powers, and interaction of L and A. |
formula |
Formula from an existing regression model; min/max functions ignored. If using a cnorm object, this is automatically fetched. |
repetitions |
Number of repetitions for cross-validation. |
norms |
If TRUE, computes norm score crossfit and R^2. Note: Computationally intensive. |
min |
Start with a minimum number of terms (default = 1). |
max |
Maximum terms in model, up to (k + 1) * (t + 1) - 1. |
cv |
"full" (default) splits data into training/validation, then ranks. Otherwise, expects a pre-ranked dataset. |
pCutoff |
Checks stratification for unbalanced data. Performs a t-test per group. Default set to 0.2 to minimize beta error. |
width |
If provided, ranking done via 'rankBySlidingWindow'. Otherwise, by group. |
raw |
Name of the raw score variable. |
group |
Name of the grouping variable. |
age |
Name of the age variable. |
weights |
Name of the weighting parameter. |
Successive models, with an increasing number of terms, are evaluated, and the RMSE for raw scores plotted. This encompasses the training, validation, and entire dataset. If 'norms' is set to TRUE (default), the function will also calculate the mean norm score reliability and crossfit measures. Note that due to the computational requirements of norm score calculations, execution can be slow, especially with numerous repetitions or terms.
When 'cv' is set to "full" (default), both test and validation datasets are ranked separately, providing comprehensive cross-validation. For a more streamlined validation process focused only on modeling, a pre-ranked dataset can be used. The output comprises RMSE for raw score models, norm score R^2, delta R^2, crossfit, and the norm score SE according to Oosterhuis, van der Ark, & Sijtsma (2016).
This function is not yet prepared for the 'extensive' search strategy, introduced in version 3.3, but instead relies on the first model per number of terms, without consistency check.
For assessing overfitting:
A CROSSFIT > 1 suggests overfitting, < 1 suggests potential underfitting, and values around 1 are optimal, given a low raw score RMSE and high norm score validation R^2.
Suggestions for ideal model selection:
Visual inspection of percentiles with 'plotPercentiles' or 'plotPercentileSeries'.
Pair visual inspection with repeated cross-validation (e.g., 10 repetitions).
Aim for low raw score RMSE and high norm score R^2, avoiding terms with significant overfit (e.g., crossfit > 1.1).
Table with results per term number: RMSE for raw scores, R^2 for norm scores, and crossfit measure.
Oosterhuis, H. E. M., van der Ark, L. A., & Sijtsma, K. (2016). Sample Size Requirements for Traditional and Regression-Based Norms. Assessment, 23(2), 191–202. https://doi.org/10.1177/1073191115580638
Other model:
bestModel(),
checkConsistency(),
derive(),
modelSummary(),
print.cnorm(),
printSubset(),
rangeCheck(),
regressionFunction(),
summary.cnorm()
## Not run: # Example: Plot cross-validation RMSE by number of terms (up to 9) with three repetitions. result <- cnorm(raw = elfe$raw, group = elfe$group) cnorm.cv(result$data, min = 2, max = 9, repetitions = 3) # Using a cnorm object examines the predefined formula. cnorm.cv(result, repetitions = 1) ## End(Not run)## Not run: # Example: Plot cross-validation RMSE by number of terms (up to 9) with three repetitions. result <- cnorm(raw = elfe$raw, group = elfe$group) cnorm.cv(result$data, min = 2, max = 9, repetitions = 3) # Using a cnorm object examines the predefined formula. cnorm.cv(result, repetitions = 1) ## End(Not run)
Launcher for the graphical user interface of cNORM for distribution free continuous norming
cNORM.GUI(launch.browser = TRUE)cNORM.GUI(launch.browser = TRUE)
launch.browser |
Default TRUE; automatically open browser for GUI |
## Not run: # Launch graphical user interface cNORM.GUI() ## End(Not run)## Not run: # Launch graphical user interface cNORM.GUI() ## End(Not run)
Launch the cNORM Parametric Modeling Shiny Application
cNORM.GUI2(launch.browser = TRUE)cNORM.GUI2(launch.browser = TRUE)
launch.browser |
Logical, whether to launch browser automatically (default: TRUE) |
## Not run: # Launch graphical user interface cNORM.GUI2() ## End(Not run)## Not run: # Launch graphical user interface cNORM.GUI2() ## End(Not run)
This function fits a Sinh-Arcsinh (shash; Jones & Pewsey, 2009) regression model for continuous norm score modeling, where the distribution parameters vary smoothly as polynomial functions of age or other predictors. The shash distribution is well-suited for psychometric data as it can flexibly model skewness and tail weight independently, making it ideal for handling floor effects, ceiling effects, and varying degrees of individual differences across age groups. In a simulation study (Lenhard et al, 2019), the shash model demonstrated superior performance compared to other parametric approaches from the Box Cox family of functions. In contrast to Box Cox, Sinh-Arcsinh can model distributions including zero and negativ values.
cnorm.shash( age, score, weights = NULL, mu_degree = 3, sigma_degree = 2, epsilon_degree = 2, delta_degree = 1, delta = 1, control = NULL, scale = "T", plot = TRUE )cnorm.shash( age, score, weights = NULL, mu_degree = 3, sigma_degree = 2, epsilon_degree = 2, delta_degree = 1, delta = 1, control = NULL, scale = "T", plot = TRUE )
age |
A numeric vector of predictor values (typically age, but can be any continuous predictor). |
score |
A numeric vector of response values (raw test scores). Must be the same length as age. The value range is unresticted and it can include zeros and negative values. |
weights |
An optional numeric vector of weights for each observation. Useful for incorporating sampling weights. If NULL (default), all observations are weighted equally. |
mu_degree |
Integer specifying the degree of the polynomial for modeling the location parameter mu(age). Default is 3. Higher degrees allow more flexible modeling of how the central tendency changes with age, but may lead to overfitting with small samples. Common choices:
|
sigma_degree |
Integer specifying the degree of the polynomial for modeling the scale parameter sigma(age). Default is 2. This controls how the variability (spread) of scores changes with age. Lower degrees are often sufficient as variability typically changes more smoothly than location. |
epsilon_degree |
Integer specifying the degree of the polynomial for modeling the skewness parameter epsilon(age). Default is 2. This controls how the asymmetry of the distribution changes with age. |
delta_degree |
Integer specifying the plynomial for modelling the tail weight parameter delte(age). Default is 1. The tail weight can be fixed as well in case of numerical instability. In that case, set 'delta_degree' to NULL and specify a value for delta instead. Recommendation: Keep delta_degree low to avoid overfitting. |
delta |
Fixed tail weight parameter (must be > 0). Default is 1. This parameter controls the heaviness of the distribution tails and is kept constant across all ages in this implementation. It is only used, if 'delta_degree' is set to NULL. Common values:
|
control |
An optional list of control parameters passed to the
Increase |
scale |
Character string or numeric vector specifying the type of norm scale for output. This affects the scaling of derived norm scores but does not influence model fitting:
|
plot |
Logical indicating whether to automatically display a diagnostic plot of the fitted model. Default is TRUE. |
This implementation uses the Jones & Pewsey (2009) parameterization of the Sinh-Arcsinh distribution. Parameters are estimated using maximum likelihood via the L-BFGS-B algorithm. In case, optimization fails, try reducing model complexity by reducing polynomial degrees or fixing the delta parameter.
The shash distribution is defined by the transformation:
where Y is a standard normal variable, Y ~ N(0,1).
This transformation provides:
mu: Location parameter (similar to mean)
sigma: Scale parameter (similar to standard deviation)
epsilon: Skewness parameter (epsilon = 0 for symmetry)
delta: Tail weight parameter (delta = 1 for normal-like tails)
Choose polynomial degrees based on:
Sample size (higher degrees need more data)
Theoretical expectations about developmental trajectories
Model comparison criteria (AIC, BIC)
Visual inspection of fitted curves
For most applications, mu_degree = 3, sigma_degree = 2, epsilon_degree = 2, delta_degree = 1 provides a good balance of flexibility and parsimony.
An object of class "cnormShash" containing the fitted model results. This is a list with components:
mu_est |
Numeric vector of estimated coefficients for the location parameter mu(age). The first coefficient is the intercept, subsequent coefficients correspond to polynomial terms. |
sigma_est |
Numeric vector of estimated coefficients for the scale parameter log(sigma(age)). Note: These are coefficients for log(sigma) to ensure sigma > 0. |
epsilon_est |
Numeric vector of estimated coefficients for the skewness parameter epsilon(age). |
delta |
The fixed tail weight parameter value used in fitting. |
delta_est |
Numeric vector of estimated coefficients for the tail weight parameter delta(age) - in case, a degree has been set. |
se |
Numeric vector of standard errors for all estimated coefficients (if Hessian computation succeeds). |
mu_degree, sigma_degree, epsilon_degree
|
The polynomial degrees used for each parameter. |
result |
Complete output from the |
The function requires the input data to have sufficient variability. Very small datasets or datasets with little age spread may cause convergence problems.
Polynomial models can exhibit edge effects at the boundaries of the age range. Predictions outside the observed age range should be made cautiously.
If convergence fails, try: (1) reducing polynomial degrees, (2) adjusting the delta parameter, (3) providing custom control parameters, or (4) checking for data quality issues.
The tail weight parameter delta is fixed across ages by default. For applications where tail behavior changes substantially with age, consider setting the delta_degree paramerer to 1 or 2.
Wolfgang Lenhard
Jones, M. C., & Pewsey, A. (2009). Sinh-arcsinh distributions. *Biometrika*, 96(4), 761-780.
Lenhard, A., Lenhard, W., Gary, S. (2019). Continuous norming of psychometric tests: A simulation study of parametric and semi-parametric approaches. *PLoS ONE*, 14(9), e0222279. https://doi.org/10.1371/journal.pone.0222279
plot for plotting fitted models,
predict for generating predictions,
cnorm.betabinomial2 for discrete beta-binomial alternative
## Not run: # Basic usage with default settings model <- cnorm.shash(age = children$age, score = children$raw_score) # Custom polynomial degrees for complex developmental pattern model_complex <- cnorm.shash( age = adolescents$age, score = adolescents$vocabulary_score, mu_degree = 4, # Complex mean trajectory sigma_degree = 3, # Changing variability pattern epsilon_degree = 2, # Skewness shifts delta_degree = NULL, # set to NULL to activate fixed delta delta = 1.3 # Slightly heavy tails ) # With sampling weights model_weighted <- cnorm.shash( age = survey$age, score = survey$score, weights = survey$sample_weight ) # Custom optimization control for difficult convergence model_robust <- cnorm.shash( age = mixed$age, score = mixed$score, control = list(factr = 1e-6, maxit = 2000), delta = 1.5 ) # Compare model fit compare(model, model_complex) ## End(Not run)## Not run: # Basic usage with default settings model <- cnorm.shash(age = children$age, score = children$raw_score) # Custom polynomial degrees for complex developmental pattern model_complex <- cnorm.shash( age = adolescents$age, score = adolescents$vocabulary_score, mu_degree = 4, # Complex mean trajectory sigma_degree = 3, # Changing variability pattern epsilon_degree = 2, # Skewness shifts delta_degree = NULL, # set to NULL to activate fixed delta delta = 1.3 # Slightly heavy tails ) # With sampling weights model_weighted <- cnorm.shash( age = survey$age, score = survey$score, weights = survey$sample_weight ) # Custom optimization control for difficult convergence model_robust <- cnorm.shash( age = mixed$age, score = mixed$score, control = list(factr = 1e-6, maxit = 2000), delta = 1.5 ) # Compare model fit compare(model, model_complex) ## End(Not run)
This function creates a visualization comparing two norm models by displaying their percentile curves. The first model is shown with solid lines, the second with dashed lines. If age and score vectors are provided, manifest percentiles are displayed as dots. The function works with regular cnorm models, beta-binomial models, and shash models, allowing comparison between different model types.
compare( model1, model2, percentiles = c(0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975), age = NULL, score = NULL, weights = NULL, title = NULL, subtitle = NULL )compare( model1, model2, percentiles = c(0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975), age = NULL, score = NULL, weights = NULL, title = NULL, subtitle = NULL )
model1 |
First model object (distribution free, beta-binomial, or shash) |
model2 |
Second model object (distribution free, beta-binomial, or shash) |
percentiles |
Vector with percentile scores, ranging from 0 to 1 (exclusive) |
age |
Optional vector with manifest age or group values |
score |
Optional vector with manifest raw score values |
weights |
Optional vector with manifest weights |
title |
Custom title for plot (optional) |
subtitle |
Custom subtitle for plot (optional) |
A ggplot object showing the comparison of both models
Other plot:
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw(),
plotSubset()
## Not run: # Compare different types of models model1 <- cnorm(group = elfe$group, raw = elfe$raw) model2 <- cnorm.betabinomial(elfe$group, elfe$raw) model3 <- cnorm.shash(elfe$group, elfe$raw) # Compare traditional cnorm with shash compare(model1, model3, age = elfe$group, score = elfe$raw) # Compare beta-binomial with shash compare(model2, model3, age = elfe$group, score = elfe$raw) ## End(Not run)## Not run: # Compare different types of models model1 <- cnorm(group = elfe$group, raw = elfe$raw) model2 <- cnorm.betabinomial(elfe$group, elfe$raw) model3 <- cnorm.shash(elfe$group, elfe$raw) # Compare traditional cnorm with shash compare(model1, model3, age = elfe$group, score = elfe$raw) # Compare beta-binomial with shash compare(model2, model3, age = elfe$group, score = elfe$raw) ## End(Not run)
The function computes powers of the norm variable e. g. T scores (location, L),
an explanatory variable, e. g. age or grade of a data frame (age, A) and the
interactions of both (L X A). The k variable indicates the degree up to which
powers and interactions are build. These predictors can be used later on in the
bestModel function to model the norm sample. Higher values of k
allow for modeling the norm sample closer, but might lead to over-fit. In general
k = 3 or k = 4 (default) is sufficient to model human performance data. For example,
k = 2 results in the variables L1, L2, A1, A2, and their interactions L1A1, L2A1, L1A2
and L2A2 (but k = 2 is usually not sufficient for the modeling). Please note, that
you do not need to use a normal rank transformed scale like T or IQ; you can
use the percentiles for the 'normValue' as well.
computePowers(data, k = 5, norm = NULL, age = NULL, t = 3, silent = FALSE)computePowers(data, k = 5, norm = NULL, age = NULL, t = 3, silent = FALSE)
data |
data.frame with the norm data |
k |
degree of the location polynomial (1..6) |
norm |
name of the norm variable in the data.frame (T scores, IQ, percentiles, ...). If 'NULL', the '"normValue"' attribute of 'data' is used. |
age |
explanatory variable (e.g. age or grade). May be a column name, a numeric vector of 'nrow(data)', 'FALSE' to disable age handling, or 'NULL' to fall back to the '"age"' attribute of 'data'. |
t |
age power parameter (1..6). If 'NULL', falls back to 'k'. |
silent |
set to TRUE to suppress messages |
The functions rankBySlidingWindow, rankByGroup, bestModel,
computePowers and prepareData are usually not called directly, but accessed
through other functions like cnorm.
data.frame with the powers and interactions of location and explanatory variable / age
bestModel
Other prepare:
prepareData(),
rankByGroup(),
rankBySlidingWindow()
## Not run: # Dataset with grade levels as grouping data.elfe <- rankByGroup(elfe) data.elfe <- computePowers(data.elfe) # Dataset with continuous age variable and k = 5 data.ppvt <- rankByGroup(ppvt) data.ppvt <- computePowers(data.ppvt, age = "age", k = 5) ## End(Not run)## Not run: # Dataset with grade levels as grouping data.elfe <- rankByGroup(elfe) data.elfe <- computePowers(data.elfe) # Dataset with continuous age variable and k = 5 data.ppvt <- rankByGroup(ppvt) data.ppvt <- computePowers(data.ppvt, age = "age", k = 5) ## End(Not run)
Computes and standardizes weights via raking to compensate for non-stratified samples. It is based on the implementation in the survey R package. It reduces data collection #' biases in the norm data by the means of post stratification, thus reducing the effect of unbalanced data in percentile estimation and norm data modeling.
computeWeights(data, population.margins, standardized = TRUE)computeWeights(data, population.margins, standardized = TRUE)
data |
data.frame with norm sample data. |
population.margins |
A data.frame including three columns, specifying the variable name in the original dataset used for data stratification, the factor level of the variable and the according population share. Please ensure, the original data does not include factor levels, not present in the population.margins. Additionally, summing up the shares of the different levels of a variable should result in a value near 1.0. The first column must specify the name of the stratification variable, the second the level and the third the proportion |
standardized |
If TRUE (default), the raking weights are scaled to weights/min(weights) |
This function computes standardized raking weights to overcome biases in norm samples. It generates weights, by drawing on the information of population shares (e. g. for sex, ethnic group, region ...) and subsequently reduces the influence of over-represented groups or increases underrepresented cases. The returned weights are either raw or standardized and scaled to be larger than 0.
Raking in general has a number of advantages over post stratification and it additionally allows cNORM to draw on larger datasets, since less cases have to be removed during stratification. To use this function, additionally to the data, a data frame with stratification variables has to be specified. The data frame should include a row with (a) the variable name, (b) the level of the variable and (c) the according population proportion.
a vector with the standardized weights
## Not run: # cNORM features a dataset on vocabulary development (ppvt) # that includes variables like sex or migration. In order # to weight the data, we have to specify the population shares. # According to census, the population includes 52% boys # (factor level 1 in the ppvt dataset) and 70% / 30% of persons # without / with a a history of migration (= 0 / 1 in the dataset). # First we set up the popolation margins with all shares of the # different levels: margins <- data.frame(variables = c("sex", "sex", "migration", "migration"), levels = c(1, 2, 0, 1), share = c(.52, .48, .7, .3)) head(margins) # Now we use the population margins to generate weights # through raking weights <- computeWeights(ppvt, margins) # There are as many different weights as combinations of # factor levels, thus only four in this specific case unique(weights) # To include the weights in the cNORM modelling, we have # to pass them as weights. They are then used to set up # weighted quantiles and as weights in the regession. model <- cnorm(raw = ppvt$raw, group=ppvt$group, weights = weights) ## End(Not run)## Not run: # cNORM features a dataset on vocabulary development (ppvt) # that includes variables like sex or migration. In order # to weight the data, we have to specify the population shares. # According to census, the population includes 52% boys # (factor level 1 in the ppvt dataset) and 70% / 30% of persons # without / with a a history of migration (= 0 / 1 in the dataset). # First we set up the popolation margins with all shares of the # different levels: margins <- data.frame(variables = c("sex", "sex", "migration", "migration"), levels = c(1, 2, 0, 1), share = c(.52, .48, .7, .3)) head(margins) # Now we use the population margins to generate weights # through raking weights <- computeWeights(ppvt, margins) # There are as many different weights as combinations of # factor levels, thus only four in this specific case unique(weights) # To include the weights in the cNORM modelling, we have # to pass them as weights. They are then used to set up # weighted quantiles and as weights in the regession. model <- cnorm(raw = ppvt$raw, group=ppvt$group, weights = weights) ## End(Not run)
In order to check model assumptions, a table of the first order derivative of the model coefficients is created.
derivationTable(A, model, minNorm = NULL, maxNorm = NULL, step = 0.1)derivationTable(A, model, minNorm = NULL, maxNorm = NULL, step = 0.1)
A |
the age |
model |
The regression model or a cnorm object |
minNorm |
The lower bound of the norm value range |
maxNorm |
The upper bound of the norm value range |
step |
Stepping parameter with lower values indicating higher precision |
data.frame with norm scores and the predicted scores based on the derived regression function
plotDerivative, derive
Other predict:
getNormCurve(),
normTable(),
predict.cnormBetaBinomial(),
predict.cnormBetaBinomial2(),
predict.cnormShash(),
predictNorm(),
predictRaw(),
rawTable()
## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) # retrieve function for time point 6 d <- derivationTable(6, cnorm.elfe, step = 0.5) ## End(Not run)## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) # retrieve function for time point 6 d <- derivationTable(6, cnorm.elfe, step = 0.5) ## End(Not run)
Calculates the derivative of the location / norm value from the regression model with the first derivative as the default. This is useful for finding violations of model assumptions and problematic distribution features as f. e. bottom and ceiling effects, non-progressive norm scores within an age group or in general #' intersecting percentile curves.
derive(model, order = 1)derive(model, order = 1)
model |
The regression model or a cnorm object |
order |
The degree of the derivate, default: 1 |
The derived coefficients
Other model:
bestModel(),
checkConsistency(),
cnorm.cv(),
modelSummary(),
print.cnorm(),
printSubset(),
rangeCheck(),
regressionFunction(),
summary.cnorm()
## Not run: m <- cnorm(raw = elfe$raw, group = elfe$group) derivedCoefficients <- derive(m, order=1) ## End(Not run)## Not run: m <- cnorm(raw = elfe$raw, group = elfe$group) derivedCoefficients <- derive(m, order=1) ## End(Not run)
This function provides diagnostic information for a fitted beta-binomial model from the cnorm.betabinomial function. It returns various metrics related to model convergence, fit, and complexity. In case, age and raw scores are provided, the function as well computes R2, rmse and bias for the norm scores based on the manifest and predicted norm scores.
diagnostics.betabinomial(model, age = NULL, score = NULL, weights = NULL)diagnostics.betabinomial(model, age = NULL, score = NULL, weights = NULL)
model |
An object of class "cnormBetaBinomial", typically the result of a call to cnorm.betabinomial(). |
age |
An optional vector with age values |
score |
An optional vector with raw values |
weights |
An optional vector with weights |
The AIC and BIC are calculated as: AIC = 2k - 2ln(L) BIC = ln(n)k - 2ln(L) where k is the number of parameters, L is the maximum likelihood, and n is the number of observations.
A list containing the following diagnostic information:
converged: Logical indicating whether the optimization algorithm converged.
n_evaluations: Number of function evaluations performed during optimization.
n_gradient: Number of gradient evaluations performed during optimization.
final_value: Final value of the objective function (negative log-likelihood).
message: Any message returned by the optimization algorithm.
AIC: Akaike Information Criterion.
BIC: Bayesian Information Criterion.
max_gradient: Maximum absolute gradient at the solution (if available).
## Not run: # Fit a beta-binomial model model <- cnorm.betabinomial(ppvt$age, ppvt$raw) # Get diagnostic information diag_info <- diagnostics.betabinomial(model) # Print the diagnostic information print(diag_info) # Summary the diagnostic information summary(diag_info) # Check if the model converged if(diag_info$converged) { cat("Model converged successfully.\n") } else { cat("Warning: Model did not converge.\n") } # Compare AIC and BIC cat("AIC:", diag_info$AIC, "\n") cat("BIC:", diag_info$BIC, "\n") ## End(Not run)## Not run: # Fit a beta-binomial model model <- cnorm.betabinomial(ppvt$age, ppvt$raw) # Get diagnostic information diag_info <- diagnostics.betabinomial(model) # Print the diagnostic information print(diag_info) # Summary the diagnostic information summary(diag_info) # Check if the model converged if(diag_info$converged) { cat("Model converged successfully.\n") } else { cat("Warning: Model did not converge.\n") } # Compare AIC and BIC cat("AIC:", diag_info$AIC, "\n") cat("BIC:", diag_info$BIC, "\n") ## End(Not run)
A dataset containing the raw data of 1400 students from grade 2 to 5 in the sentence comprehension test from ELFE 1-6 (Lenhard & Schneider, 2006). In this test, students are presented lists of sentences with one gap. The student has to fill in the correct solution by selecting from a list of 5 alternatives per sentence. The alternatives include verbs, adjectives, nouns, pronouns and conjunctives. Each item stems from the same word type. The text is speeded, with a time cutoff of 180 seconds. The variables are as follows:
elfeelfe
A data frame with 1400 rows and 3 variables:
ID of the student
grade level, with x.5 indicating the end of the school year and x.0 indicating the middle of the school year
the raw score of the student, spanning values from 0 to 28
A data frame with 1400 rows and 3 columns
https://www.psychometrica.de/elfe2.html
Lenhard, W. & Schneider, W.(2006). Ein Leseverstaendnistest fuer Erst- bis Sechstklaesser. Goettingen/Germany: Hogrefe.
## Not run: # prepare data, retrieve model and plot percentiles model <- cnorm(group = elfe$group, raw = elfe$raw) ## End(Not run)## Not run: # prepare data, retrieve model and plot percentiles model <- cnorm(group = elfe$group, raw = elfe$raw) ## End(Not run)
Splits a continuous variable into n groups (either equidistant or
with equal number of observations) and returns the group mean for every
observation.
getGroups(x, n = NULL, equidistant = FALSE)getGroups(x, n = NULL, equidistant = FALSE)
x |
Numeric vector to be split. |
n |
Desired number of groups. If |
equidistant |
If |
Numeric vector of the same length as x with the group mean
assigned to each observation.
## Not run: x <- rnorm(1000, mean = 50, sd = 10) m <- getGroups(x, n = 10) ## End(Not run)## Not run: x <- rnorm(1000, mean = 50, sd = 10) m <- getGroups(x, n = 10) ## End(Not run)
As with this continuous norming regression approach, raw scores are modeled as a function of age and norm score (location), getNormCurve is a straightforward approach to show the raw score development over age, while keeping the norm value constant. This way, e. g. academic performance or intelligence development of a specific ability is shown.
getNormCurve( norm, model, minAge = NULL, maxAge = NULL, step = 0.1, minRaw = NULL, maxRaw = NULL )getNormCurve( norm, model, minAge = NULL, maxAge = NULL, step = 0.1, minRaw = NULL, maxRaw = NULL )
norm |
The specific norm score, e. g. T value |
model |
The model from the regression modeling obtained with the cnorm function |
minAge |
Age to start from |
maxAge |
Age to stop at |
step |
Stepping parameter for the precision when retrieving of the values, lower values indicate higher precision (default 0.1). |
minRaw |
lower bound of the range of raw scores (default = 0) |
maxRaw |
upper bound of raw scores |
data.frame of the variables raw, age and norm
Other predict:
derivationTable(),
normTable(),
predict.cnormBetaBinomial(),
predict.cnormBetaBinomial2(),
predict.cnormShash(),
predictNorm(),
predictRaw(),
rawTable()
## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) getNormCurve(35, cnorm.elfe) ## End(Not run)## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) getNormCurve(35, cnorm.elfe) ## End(Not run)
Calculates the standard error (SE) or root mean square error (RMSE) of the norm scores In case of large datasets, both results should be almost identical
getNormScoreSE(model, type = 2)getNormScoreSE(model, type = 2)
model |
a cnorm object |
type |
either '1' for the standard error senso Oosterhuis et al. (2016) or '2' for the RMSE (default) |
The standard error (SE) of the norm scores sensu Oosterhuis et al. (2016) or the RMSE
Oosterhuis, H. E. M., van der Ark, L. A., & Sijtsma, K. (2016). Sample Size Requirements for Traditional and Regression-Based Norms. Assessment, 23(2), 191–202. https://doi.org/10.1177/1073191115580638
Prints the results and regression function of a cnorm model
modelSummary(object, ...)modelSummary(object, ...)
object |
A regression model or cnorm object |
... |
additional parameters |
A report on the regression function, weights, R2 and RMSE
Other model:
bestModel(),
checkConsistency(),
cnorm.cv(),
derive(),
print.cnorm(),
printSubset(),
rangeCheck(),
regressionFunction(),
summary.cnorm()
This function generates a norm table for a specific age based on the regression model by assigning raw scores to norm scores. Please specify the range of norm scores, you want to cover. A T value of 25 corresponds to a percentile of .6. As a consequence, specifying a range of T = 25 to T = 75 would cover 98.4 the population. Please be careful when extrapolating vertically (at the lower and upper end of the age specific distribution). Depending on the size of your standardization sample, extreme values with T < 20 or T > 80 might lead to inconsistent results. In case a confidence coefficient (CI, default .9) and the reliability is specified, confidence intervals are computed for the true score estimates, including a correction for regression to the mean (Eid & Schmidt, 2012, p. 272).
normTable( A, model, minNorm = NULL, maxNorm = NULL, minRaw = NULL, maxRaw = NULL, step = NULL, monotonuous = TRUE, CI = 0.9, reliability = NULL, pretty = TRUE )normTable( A, model, minNorm = NULL, maxNorm = NULL, minRaw = NULL, maxRaw = NULL, step = NULL, monotonuous = TRUE, CI = 0.9, reliability = NULL, pretty = TRUE )
A |
the age as single value or a vector of age values |
model |
The regression model from the cnorm function |
minNorm |
The lower bound of the norm score range |
maxNorm |
The upper bound of the norm score range |
minRaw |
clipping parameter for the lower bound of raw scores |
maxRaw |
clipping parameter for the upper bound of raw scores |
step |
Stepping parameter with lower values indicating higher precision |
monotonuous |
corrects for decreasing norm scores in case of model inconsistencies (default) |
CI |
confidence coefficient, ranging from 0 to 1, default .9 |
reliability |
coefficient, ranging between 0 to 1 |
pretty |
Format table by collapsing intervals and rounding to meaningful precision |
either data.frame with norm scores, predicted raw scores and percentiles in case of simple A value or a list of norm tables if vector of A values was provided
Eid, M. & Schmidt, K. (2012). Testtheorie und Testkonstruktion. Hogrefe.
rawTable
Other predict:
derivationTable(),
getNormCurve(),
predict.cnormBetaBinomial(),
predict.cnormBetaBinomial2(),
predict.cnormShash(),
predictNorm(),
predictRaw(),
rawTable()
## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) # create single norm table norms <- normTable(3.5, cnorm.elfe, minNorm = 25, maxNorm = 75, step = 0.5) # create list of norm tables norms <- normTable(c(2.5, 3.5, 4.5), cnorm.elfe, minNorm = 25, maxNorm = 75, step = 1, minRaw = 0, maxRaw = 26 ) # conventional norming, set age to arbitrary value model <- cnorm(raw = elfe$raw) normTable(0, model) ## End(Not run)## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) # create single norm table norms <- normTable(3.5, cnorm.elfe, minNorm = 25, maxNorm = 75, step = 0.5) # create list of norm tables norms <- normTable(c(2.5, 3.5, 4.5), cnorm.elfe, minNorm = 25, maxNorm = 75, step = 1, minRaw = 0, maxRaw = 26 ) # conventional norming, set age to arbitrary value model <- cnorm(raw = elfe$raw) normTable(0, model) ## End(Not run)
This function generates a norm table for a specific ages based on the beta binomial regression model. In case a confidence coefficient (CI, default .9) and the reliability is specified, confidence intervals are computed for the true score estimates, including a correction for regression to the mean (Eid & Schmidt, 2012, p. 272).
normTable.betabinomial( model, ages, n = NULL, m = NULL, range = 3, CI = 0.9, reliability = NULL )normTable.betabinomial( model, ages, n = NULL, m = NULL, range = 3, CI = 0.9, reliability = NULL )
model |
The model, which was fitted using the 'optimized.model' function. |
ages |
A numeric vector of age points at which to make predictions. |
n |
The number of items resp. the maximum score. |
m |
An optional stop criterion in table generation. Positive integer lower than n. |
range |
The range of the norm scores in standard deviations. Default is 3. Thus, scores in the range of +/- 3 standard deviations are considered. |
CI |
confidence coefficient, ranging from 0 to 1, default .9 |
reliability |
coefficient, ranging between 0 to 1 |
A list of data frames with columns: x, Px, Pcum, Percentile, z, norm score and possibly confidence interval
Generates norm tables for specific ages based on a fitted SinH-ArcSinH (shash) regression model. Computes probabilities, percentiles, z-scores, and norm scores for a specified range of raw scores. Optionally includes confidence intervals when reliability is provided.
normTable.shash( model, ages, start = NULL, end = NULL, step = 1, CI = 0.9, reliability = NULL )normTable.shash( model, ages, start = NULL, end = NULL, step = 1, CI = 0.9, reliability = NULL )
model |
Fitted shash model object of class "cnormShash" |
ages |
Numeric vector of age points for norm table generation |
start |
Minimum raw score value for the norm table |
end |
Maximum raw score value for the norm table |
step |
Step size between consecutive raw scores (default: 1) |
CI |
Confidence coefficient (0-1, default: 0.9) for confidence intervals |
reliability |
Reliability coefficient (0-1) for true score confidence intervals |
For continuous shash distributions, probability densities are computed and converted to cumulative probabilities and percentiles. When reliability is specified, confidence intervals include correction for regression to the mean.
List of data frames (one per age) containing:
x |
Raw scores |
Px |
Probability density values |
Pcum |
Cumulative probabilities |
Percentile |
Percentile ranks (0-100) |
z |
Standardized z-scores |
norm |
Norm scores in specified scale |
lowerCI, upperCI
|
Confidence intervals (if reliability provided) |
lowerCI_PR, upperCI_PR
|
CI as percentile ranks (if reliability provided) |
## Not run: # Basic norm table model <- cnorm.shash(age, score) tables <- normTable.shash(model, ages = c(7, 8, 9), start = 0, end = 50) # With confidence intervals and finer granularity tables_ci <- normTable.shash(model, ages = c(8, 9), start = 10, end = 40, step = 0.5, CI = 0.95, reliability = 0.85) ## End(Not run)## Not run: # Basic norm table model <- cnorm.shash(age, score) tables <- normTable.shash(model, ages = c(7, 8, 9), start = 0, end = 50) # With confidence intervals and finer granularity tables_ci <- normTable.shash(model, ages = c(8, 9), start = 10, end = 40, step = 0.5, CI = 0.95, reliability = 0.85) ## End(Not run)
S3 function for plotting cnorm objects
## S3 method for class 'cnorm' plot(x, y, ...)## S3 method for class 'cnorm' plot(x, y, ...)
x |
the cnorm object |
y |
the type of plot as a string, can be one of 'raw' (1), 'norm' (2), 'curves' (3), 'percentiles' (4), 'series' (5), 'subset' (6), or 'derivative' (7), either as a string or the according index |
... |
additional parameters for the specific plotting function |
Other plot:
compare(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw(),
plotSubset()
This function creates a visualization of a fitted cnormBetaBinomial model, including the original data points manifest percentiles and specified percentile lines. Note that the beta-binomial model aims at discrete raw scores. We decided to display continuous percentile lines nonetheless on order to maintain visual comparability with other modelling techniques. If you prefer discretization, set the "discrete" parameter to TRUE.
## S3 method for class 'cnormBetaBinomial' plot(x, ...)## S3 method for class 'cnormBetaBinomial' plot(x, ...)
x |
A fitted model object of class "cnormBetaBinomial" or "cnormBetaBinomial2". |
... |
Additional arguments passed to the plot method.
|
A ggplot object.
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw(),
plotSubset()
This function creates a visualization of a fitted cnormBetaBinomial model, including the original data points manifest percentiles and specified percentile lines.
## S3 method for class 'cnormBetaBinomial2' plot(x, ...)## S3 method for class 'cnormBetaBinomial2' plot(x, ...)
x |
A fitted model object of class "cnormBetaBinomial" or "cnormBetaBinomial2". |
... |
Additional arguments passed to the plot method.
|
A ggplot object.
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw(),
plotSubset()
Plot SinH-ArcSinH Model with Data and Percentile Lines
## S3 method for class 'cnormShash' plot(x, ...)## S3 method for class 'cnormShash' plot(x, ...)
x |
A fitted model object of class "cnormShash" |
... |
Additional arguments including age, score, weights, percentiles, points |
A ggplot object
General convenience plotting function
plotCnorm(x, y, ...)plotCnorm(x, y, ...)
x |
a cnorm object |
y |
the type of plot as a string, can be one of 'raw' (1), 'norm' (2), 'curves' (3), 'percentiles' (4), 'series' (5), 'subset' (6), or 'derivative' (7), either as a string or the according index |
... |
additional parameters for the specific plotting function |
This function plots density curves based on the regression model against the raw scores. It supports both traditional continuous norming models and beta-binomial models. The function allows for customization of the plot range and groups to be displayed.
plotDensity( model, minRaw = NULL, maxRaw = NULL, minNorm = NULL, maxNorm = NULL, group = NULL )plotDensity( model, minRaw = NULL, maxRaw = NULL, minNorm = NULL, maxNorm = NULL, group = NULL )
model |
The model from the bestModel function, a cnorm object, a cnormBetaBinomial, a cnormBetaBinomial2 or cnormShash object. |
minRaw |
Lower bound of the raw score. If NULL, it's automatically determined based on the model type. |
maxRaw |
Upper bound of the raw score. If NULL, it's automatically determined based on the model type. |
minNorm |
Lower bound of the norm score. If NULL, it's automatically determined based on the model type. |
maxNorm |
Upper bound of the norm score. If NULL, it's automatically determined based on the model type. |
group |
Numeric vector specifying the age groups to plot. If NULL, groups are automatically selected. |
The function generates density curves for specified age groups, allowing for easy comparison of score distributions across different ages.
For beta-binomial models, the density is based on the probability mass function, while for traditional models, it uses a normal distribution based on the norm scores.
A ggplot object representing the density functions.
Please check for inconsistent curves, especially those showing implausible shapes such as violations of biuniqueness in the cnorm models.
plotNormCurves, plotPercentiles
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw(),
plotSubset()
## Not run: # For traditional continuous norming model result <- cnorm(raw = elfe$raw, group = elfe$group) plotDensity(result, group = c(2, 4, 6)) # For beta-binomial model bb_model <- cnorm.betabinomial(age = ppvt$age, score = ppvt$raw, n = 228) plotDensity(bb_model) ## End(Not run)## Not run: # For traditional continuous norming model result <- cnorm(raw = elfe$raw, group = elfe$group) plotDensity(result, group = c(2, 4, 6)) # For beta-binomial model bb_model <- cnorm.betabinomial(age = ppvt$age, score = ppvt$raw, n = 228) plotDensity(bb_model) ## End(Not run)
This function plots the scores obtained via the first order derivative of the regression model in dependence of the norm score.
plotDerivative( model, minAge = NULL, maxAge = NULL, minNorm = NULL, maxNorm = NULL, stepAge = NULL, stepNorm = NULL, order = 1 )plotDerivative( model, minAge = NULL, maxAge = NULL, minNorm = NULL, maxNorm = NULL, stepAge = NULL, stepNorm = NULL, order = 1 )
model |
The model from the bestModel function, a cnorm object. |
minAge |
Minimum age to start checking. If NULL, it's automatically determined from the model. |
maxAge |
Maximum age for checking. If NULL, it's automatically determined from the model. |
minNorm |
Lower end of the norm score range. If NULL, it's automatically determined from the model. |
maxNorm |
Upper end of the norm score range. If NULL, it's automatically determined from the model. |
stepAge |
Stepping parameter for the age check, usually 1 or 0.1; lower values indicate higher precision. |
stepNorm |
Stepping parameter for norm scores. |
order |
Degree of the derivative (default = 1). |
The results indicate the progression of the norm scores within each age group. The regression-based modeling approach relies on the assumption of a linear progression of the norm scores. Negative scores in the first order derivative indicate a violation of this assumption. Scores near zero are typical for bottom and ceiling effects in the raw data.
The regression models usually converge within the range of the original values. In case of vertical and horizontal extrapolation, with increasing distance to the original data, the risk of assumption violation increases as well.
A ggplot object representing the derivative of the regression function.
This function is currently incompatible with reversed raw score scales ('descent' option).
checkConsistency, bestModel, derive
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw(),
plotSubset()
## Not run: # For traditional continuous norming model result <- cnorm(raw = elfe$raw, group = elfe$group) plotDerivative(result, minAge=2, maxAge=5, stepAge=.2, minNorm=25, maxNorm=75, stepNorm=1) ## End(Not run)## Not run: # For traditional continuous norming model result <- cnorm(raw = elfe$raw, group = elfe$group) plotDerivative(result, minAge=2, maxAge=5, stepAge=.2, minNorm=25, maxNorm=75, stepNorm=1) ## End(Not run)
This function plots the manifest norm score against the fitted norm score from the inverse regression model per group. This helps to inspect the precision of the modeling process. The scores should not deviate too far from the regression line. Applicable for Taylor polynomial, beta-binomial, and shash models.
plotNorm( model, age = NULL, score = NULL, width = NULL, weights = NULL, group = FALSE, minNorm = NULL, maxNorm = NULL, type = 0 )plotNorm( model, age = NULL, score = NULL, width = NULL, weights = NULL, group = FALSE, minNorm = NULL, maxNorm = NULL, type = 0 )
model |
The regression model, usually from the 'cnorm', 'cnorm.betabinomial', or 'cnorm.shash' function |
age |
In case of parametric models, please provide the age vector |
score |
In case of parametric models, please provide the score vector |
width |
In case of parametric models, please provide the width for the sliding window. If null, the function tries to determine a sensible setting. |
weights |
Vector or variable name in the dataset with weights for each individual case. If NULL, no weights are used. |
group |
On optional grouping variable, use empty string for no group, the variable name for Taylor polynomial models or a vector with the groups for parametric models |
minNorm |
lower bound of fitted norm scores |
maxNorm |
upper bound of fitted norm scores |
type |
Type of display: 0 = plot manifest against fitted values, 1 = plot manifest against difference values |
A ggplot object representing the norm scores plot.
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw(),
plotSubset()
## Not run: # Load example data set, compute model and plot results # Taylor polynomial model model <- cnorm(raw = elfe$raw, group = elfe$group) plot(model, "norm") # Beta binomial models; maximum number of items in elfe is n = 28 model.bb <- cnorm.betabinomial(elfe$group, elfe$raw, n = 28) plotNorm(model.bb, age = elfe$group, score = elfe$raw) ## End(Not run)## Not run: # Load example data set, compute model and plot results # Taylor polynomial model model <- cnorm(raw = elfe$raw, group = elfe$group) plot(model, "norm") # Beta binomial models; maximum number of items in elfe is n = 28 model.bb <- cnorm.betabinomial(elfe$group, elfe$raw, n = 28) plotNorm(model.bb, age = elfe$group, score = elfe$raw) ## End(Not run)
This function plots the norm curves based on the regression model. It supports both Taylor polynomial models, beta-binomial models, and shash models.
plotNormCurves( model, normList = NULL, minAge = NULL, maxAge = NULL, step = 0.1, minRaw = NULL, maxRaw = NULL )plotNormCurves( model, normList = NULL, minAge = NULL, maxAge = NULL, step = 0.1, minRaw = NULL, maxRaw = NULL )
model |
The model from the bestModel function, a cnorm object, or a cnormBetaBinomial / cnormBetaBinomial2 / cnormShash object. |
normList |
Vector with norm scores to display. If NULL, default values are used. |
minAge |
Age to start with checking. If NULL, it's automatically determined from the model. |
maxAge |
Upper end of the age check. If NULL, it's automatically determined from the model. |
step |
Stepping parameter for the age check, usually 1 or 0.1; lower scores indicate higher precision. |
minRaw |
Lower end of the raw score range, used for clipping implausible results. If NULL, it's automatically determined from the model. |
maxRaw |
Upper end of the raw score range, used for clipping implausible results. If NULL, it's automatically determined from the model. |
Please check the function for inconsistent curves: The different curves should not intersect. Violations of this assumption are a strong indication of violations of model assumptions in modeling the relationship between raw and norm scores.
Common reasons for inconsistencies include: 1. Vertical extrapolation: Choosing extreme norm scores (e.g., scores <= -3 or >= 3). 2. Horizontal extrapolation: Using the model scores outside the original dataset. 3. The data cannot be modeled with the current approach, or you need another power parameter (k) or R2 for the model.
A ggplot object representing the norm curves.
checkConsistency, plotDerivative, plotPercentiles
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw(),
plotSubset()
## Not run: # For Taylor continuous norming model m <- cnorm(raw = ppvt$raw, group = ppvt$group) plotNormCurves(m, minAge=2, maxAge=5) # For beta-binomial model bb_model <- cnorm.betabinomial(age = ppvt$age, score = ppvt$raw, n = 228) plotNormCurves(bb_model) ## End(Not run)## Not run: # For Taylor continuous norming model m <- cnorm(raw = ppvt$raw, group = ppvt$group) plotNormCurves(m, minAge=2, maxAge=5) # For beta-binomial model bb_model <- cnorm.betabinomial(age = ppvt$age, score = ppvt$raw, n = 228) plotNormCurves(bb_model) ## End(Not run)
The function plots the norm curves based on the regression model against the actual percentiles from the raw data. As in 'plotNormCurves', please check for inconsistent curves, especially intersections. Violations of this assumption are a strong indication for problems in modeling the relationship between raw and norm scores. In general, extrapolation (point 1 and 2) can carefully be done to a certain degree outside the original sample, but it should in general be handled with caution. The original percentiles are displayed as distinct points in the according color, the model based projection of percentiles are drawn as lines. Please note, that the estimation of the percentiles of the raw data is done with the quantile function with the default settings. In case, you get 'jagged' or disorganized percentile curve, try to reduce the 'k' and/or 't' parameter in modeling.
plotPercentiles( model, minRaw = NULL, maxRaw = NULL, minAge = NULL, maxAge = NULL, raw = NULL, group = NULL, percentiles = c(0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975), scale = NULL, title = NULL, subtitle = NULL, points = FALSE )plotPercentiles( model, minRaw = NULL, maxRaw = NULL, minAge = NULL, maxAge = NULL, raw = NULL, group = NULL, percentiles = c(0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975), scale = NULL, title = NULL, subtitle = NULL, points = FALSE )
model |
The Taylor polynomial regression model object from the cNORM |
minRaw |
Lower bound of the raw score (default = 0) |
maxRaw |
Upper bound of the raw score |
minAge |
Variable to restrict the lower bound of the plot to a specific age |
maxAge |
Variable to restrict the upper bound of the plot to a specific age |
raw |
The name of the raw variable |
group |
The name of the grouping variable; the distinct groups are automatically determined |
percentiles |
Vector with percentile scores, ranging from 0 to 1 (exclusive) |
scale |
The norm scale, either 'T', 'IQ', 'z', 'percentile' or self defined with a double vector with the mean and standard deviation, f. e. c(10, 3) for Wechsler scale index points; if NULL, scale information from the data preparation is used (default) |
title |
custom title for plot |
subtitle |
custom title for plot |
points |
Logical indicating whether to plot the data points. Default is TRUE. |
plotNormCurves, plotPercentileSeries
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotRaw(),
plotSubset()
## Not run: # Load example data set, compute model and plot results result <- cnorm(raw = elfe$raw, group = elfe$group) plotPercentiles(result) ## End(Not run)## Not run: # Load example data set, compute model and plot results result <- cnorm(raw = elfe$raw, group = elfe$group) plotPercentiles(result) ## End(Not run)
This functions makes use of 'plotPercentiles' to generate a series of plots with different number of predictors. It draws on the information provided by the model object to determine the bounds of the modeling (age and standard score range). It can be used as an additional model check to determine the best fitting model. Please have a look at the ' plotPercentiles' function for further information.
plotPercentileSeries( model, start = 1, end = NULL, group = NULL, percentiles = c(0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975), filename = NULL )plotPercentileSeries( model, start = 1, end = NULL, group = NULL, percentiles = c(0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975), filename = NULL )
model |
The Taylor polynomial regression model object from the cNORM |
start |
Number of predictors to start with |
end |
Number of predictors to end with |
group |
The name of the grouping variable; the distinct groups are automatically determined |
percentiles |
Vector with percentile scores, ranging from 0 to 1 (exclusive) |
filename |
Prefix of the filename. If specified, the plots are saves as png files in the directory of the workspace, instead of displaying them |
the complete list of plots
plotPercentiles
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentiles(),
plotRaw(),
plotSubset()
## Not run: # Load example data set, compute model and plot results result <- cnorm(raw = elfe$raw, group = elfe$group) plotPercentileSeries(result, start=4, end=6) ## End(Not run)## Not run: # Load example data set, compute model and plot results result <- cnorm(raw = elfe$raw, group = elfe$group) plotPercentileSeries(result, start=4, end=6) ## End(Not run)
The function plots the raw data against the fitted scores from the regression model per group. This helps to inspect the precision of the modeling process. The scores should not deviate too far from regression line.
plotRaw(model, group = FALSE, type = 0)plotRaw(model, group = FALSE, type = 0)
model |
The regression model from the 'cnorm' function |
group |
Should the fit be displayed by group? |
type |
Type of display: 0 = plot manifest against fitted values, 1 = plot manifest against difference values |
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotSubset()
## Not run: # Compute model with example dataset and plot results result <- cnorm(raw = elfe$raw, group = elfe$group) plotRaw(result) ## End(Not run)## Not run: # Compute model with example dataset and plot results result <- cnorm(raw = elfe$raw, group = elfe$group) plotRaw(result) ## End(Not run)
This function plots various information criteria and model fit statistics against the number of predictors or adjusted R-squared, depending on the type of plot selected. It helps in model selection by visualizing different aspects of model performance. Models, which did not pass the initial consistency check are depicted with an empty circle.
plotSubset(model, type = 0)plotSubset(model, type = 0)
model |
The regression model from the bestModel function or a cnorm object. |
type |
Integer specifying the type of plot to generate:
|
The function generates different plots to help in model selection:
- For types 1 and 2 (Mallow's Cp and BIC), look for the "elbow" in the curve where the information criterion begins to drop. This often indicates a good balance between model fit and complexity. - For type 0 (Adjusted R2), higher values indicate better fit, but be cautious of overfitting with values approaching 1. - For types 3 and 4 (RMSE and RSS), lower values indicate better fit. - For type 5 (F-test), higher values suggest significant improvement with added predictors. - For type 6 (p-values), values below the significance level (typically 0.05) suggest significant improvement with added predictors.
A ggplot object representing the selected information criterion plot.
It's important to balance statistical measures with practical considerations and
to visually inspect the model fit using functions like plotPercentiles.
bestModel, plotPercentiles, printSubset
Other plot:
compare(),
plot.cnorm(),
plot.cnormBetaBinomial(),
plot.cnormBetaBinomial2(),
plotDensity(),
plotDerivative(),
plotNorm(),
plotNormCurves(),
plotPercentileSeries(),
plotPercentiles(),
plotRaw()
## Not run: # Compute model with example data and plot information function cnorm.model <- cnorm(raw = elfe$raw, group = elfe$group) plotSubset(cnorm.model) # Plot BIC against adjusted R-squared plotSubset(cnorm.model, type = 2) # Plot RMSE against number of predictors plotSubset(cnorm.model, type = 3) ## End(Not run)## Not run: # Compute model with example data and plot information function cnorm.model <- cnorm(raw = elfe$raw, group = elfe$group) plotSubset(cnorm.model) # Plot BIC against adjusted R-squared plotSubset(cnorm.model, type = 2) # Plot RMSE against number of predictors plotSubset(cnorm.model, type = 3) ## End(Not run)
A dataset based on an unstratified sample of PPVT4 data (German adaption). The PPVT4 consists of blocks of items with 12 items each. Each item consists of 4 pictures. The test taker is given a word orally and he or she has to point out the picture matching the oral word. Bottom and ceiling blocks of items are determined according to age and performance. For instance, when a student knows less than 4 word from a block of 12 items, the testing stops. The sample is not identical with the norm sample and includes doublets of cases in order to align the sample size per age group. It is primarily intended for running the cNORM analyses with regard to modeling and stratification.
ppvtppvt
A data frame with 4542 rows and 6 variables:
the chronological age of the child
the sex of the test taker, 1=male, 2=female
migration status of the family, 0=no, 1=yes
factor specifying the region, the data were collected; grouped into south, north, east and west
the raw score of the student, spanning values from 0 to 228
age group of the child, determined by the getGroups()-function with 12 equidistant age groups
A data frame with 5600 rows and 9 columns
https://www.psychometrica.de/ppvt4.html
Lenhard, A., Lenhard, W., Segerer, R. & Suggate, S. (2015). Peabody Picture Vocabulary Test - Revision IV (Deutsche Adaption). Frankfurt a. M./Germany: Pearson Assessment.
## Not run: # Example with continuous age variable, ranked with sliding window model.ppvt.sliding <- cnorm(age=ppvt$age, raw=ppvt$raw, width=1) # Example with age groups; you might first want to experiment with # the granularity of the groups via the 'getGroups()' function model.ppvt.group <- cnorm(group=ppvt$group, raw=ppvt$raw) # with predefined groups model.ppvt.group <- cnorm(group=getGroups(ppvt$age, n=15, equidistant = T), raw=ppvt$raw) # groups built 'on the fly' # plot information function plot(model.ppvt.group, "subset") # check model consistency checkConsistency(model.ppvt.group) # plot percentiles plot(model.ppvt.group, "percentiles") ## End(Not run)## Not run: # Example with continuous age variable, ranked with sliding window model.ppvt.sliding <- cnorm(age=ppvt$age, raw=ppvt$raw, width=1) # Example with age groups; you might first want to experiment with # the granularity of the groups via the 'getGroups()' function model.ppvt.group <- cnorm(group=ppvt$group, raw=ppvt$raw) # with predefined groups model.ppvt.group <- cnorm(group=getGroups(ppvt$age, n=15, equidistant = T), raw=ppvt$raw) # groups built 'on the fly' # plot information function plot(model.ppvt.group, "subset") # check model consistency checkConsistency(model.ppvt.group) # plot percentiles plot(model.ppvt.group, "percentiles") ## End(Not run)
This function calculates norm scores based on raw scores, age, and a fitted cnormBetaBinomial model.
## S3 method for class 'cnormBetaBinomial' predict(object, ...)## S3 method for class 'cnormBetaBinomial' predict(object, ...)
object |
A fitted model object of class 'cnormBetaBinomial' or 'cnormBetaBinomial2'. |
... |
Additional arguments passed to the prediction method:
|
The function first predicts the alpha and beta parameters of the beta-binomial distribution for each age using the provided model. It then calculates the cumulative probability for each raw score given these parameters. Finally, it converts these probabilities to the norm scale specified in the model.
A numeric vector of norm scores.
Other predict:
derivationTable(),
getNormCurve(),
normTable(),
predict.cnormBetaBinomial2(),
predict.cnormShash(),
predictNorm(),
predictRaw(),
rawTable()
## Not run: # Assuming you have a fitted model named 'bb_model': model <- cnorm.betabinomial(ppvt$age, ppvt$raw) raw <- c(100, 121, 97, 180) ages <- c(7, 8, 9, 10) norm_scores <- predict(model, ages, raw) ## End(Not run)## Not run: # Assuming you have a fitted model named 'bb_model': model <- cnorm.betabinomial(ppvt$age, ppvt$raw) raw <- c(100, 121, 97, 180) ages <- c(7, 8, 9, 10) norm_scores <- predict(model, ages, raw) ## End(Not run)
This function calculates norm scores based on raw scores, age, and a fitted cnormBetaBinomial model.
## S3 method for class 'cnormBetaBinomial2' predict(object, ...)## S3 method for class 'cnormBetaBinomial2' predict(object, ...)
object |
A fitted model object of class 'cnormBetaBinomial' or 'cnormBetaBinomial2'. |
... |
Additional arguments passed to the prediction method:
|
The function first predicts the alpha and beta parameters of the beta-binomial distribution for each age using the provided model. It then calculates the cumulative probability for each raw score given these parameters. Finally, it converts these probabilities to the norm scale specified in the model.
A numeric vector of norm scores.
Other predict:
derivationTable(),
getNormCurve(),
normTable(),
predict.cnormBetaBinomial(),
predict.cnormShash(),
predictNorm(),
predictRaw(),
rawTable()
## Not run: # Assuming you have a fitted model named 'bb_model': model <- cnorm.betabinomial(ppvt$age, ppvt$raw) raw <- c(100, 121, 97, 180) ages <- c(7, 8, 9, 10) norm_scores <- predict(model, ages, raw) ## End(Not run)## Not run: # Assuming you have a fitted model named 'bb_model': model <- cnorm.betabinomial(ppvt$age, ppvt$raw) raw <- c(100, 121, 97, 180) ages <- c(7, 8, 9, 10) norm_scores <- predict(model, ages, raw) ## End(Not run)
This function calculates norm scores based on raw scores, age, and a fitted cnormShash model.
## S3 method for class 'cnormShash' predict(object, ...)## S3 method for class 'cnormShash' predict(object, ...)
object |
A fitted model object of class 'cnormShash'. |
... |
Additional arguments passed to the prediction method:
|
The function predicts the SinH-ArcSinH (shash) distribution parameters (mu, sigma, epsilon, delta) for each age using the provided model. It then calculates the cumulative probability for each raw score given these parameters using the continuous shash distribution. Finally, it converts these probabilities to the norm scale specified in the model.
A numeric vector of norm scores.
Other predict:
derivationTable(),
getNormCurve(),
normTable(),
predict.cnormBetaBinomial(),
predict.cnormBetaBinomial2(),
predictNorm(),
predictRaw(),
rawTable()
## Not run: # Assuming you have a fitted model named 'shash_model': model <- cnorm.shash(children$age, children$score) raw_scores <- c(25.5, 30.2, 18.7, 45.3) ages <- c(7, 8, 9, 10) norm_scores <- predict(model, ages, raw_scores) ## End(Not run)## Not run: # Assuming you have a fitted model named 'shash_model': model <- cnorm.shash(children$age, children$score) raw_scores <- c(25.5, 30.2, 18.7, 45.3) ages <- c(7, 8, 9, 10) norm_scores <- predict(model, ages, raw_scores) ## End(Not run)
This functions numerically determines the norm score for raw scores depending on the level of the explanatory variable A, e. g. norm scores for raw scores at given ages.
predictNorm( raw, A, model, minNorm = NULL, maxNorm = NULL, force = TRUE, silent = FALSE )predictNorm( raw, A, model, minNorm = NULL, maxNorm = NULL, force = TRUE, silent = FALSE )
raw |
The raw value, either single numeric or numeric vector |
A |
the explanatory variable (e. g. age), either single numeric or numeric vector |
model |
The regression model or a cnorm object |
minNorm |
The lower bound of the norm score range |
maxNorm |
The upper bound of the norm score range |
force |
Try to resolve missing norm scores in case of inconsistent models |
silent |
set to TRUE to suppress messages |
The predicted norm score for a raw score, either single value or vector
Other predict:
derivationTable(),
getNormCurve(),
normTable(),
predict.cnormBetaBinomial(),
predict.cnormBetaBinomial2(),
predict.cnormShash(),
predictRaw(),
rawTable()
## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) # return norm value for raw value 21 for grade 2, month 9 specificNormValue <- predictNorm(raw = 21, A = 2.75, cnorm.elfe) ## End(Not run)## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) # return norm value for raw value 21 for grade 2, month 9 specificNormValue <- predictNorm(raw = 21, A = 2.75, cnorm.elfe) ## End(Not run)
Most elementary function to predict raw score based on Location (L, T score), Age (grouping variable) and the coefficients from a regression model.
predictRaw(norm, age, coefficients, minRaw = -Inf, maxRaw = Inf)predictRaw(norm, age, coefficients, minRaw = -Inf, maxRaw = Inf)
norm |
The norm score, e. g. a specific T score or a vector of scores |
age |
The age value or a vector of scores |
coefficients |
The a cnorm object or the coefficients from the regression model |
minRaw |
Minimum score for the results; can be used for clipping unrealistic outcomes, usually set to the lower bound of the range of values of the test (default: 0) |
maxRaw |
Maximum score for the results; can be used for clipping unrealistic outcomes usually set to the upper bound of the range of values of the test |
the predicted raw score or a data.frame of scores in case, lists of norm scores or age is used
Other predict:
derivationTable(),
getNormCurve(),
normTable(),
predict.cnormBetaBinomial(),
predict.cnormBetaBinomial2(),
predict.cnormShash(),
predictNorm(),
rawTable()
## Not run: # Prediction of single scores model <- cnorm(raw = elfe$raw, group = elfe$group) predictRaw(35, 3.5, model) ## End(Not run)## Not run: # Prediction of single scores model <- cnorm(raw = elfe$raw, group = elfe$group) predictRaw(35, 3.5, model) ## End(Not run)
This is a convenience method to either load the inbuilt sample dataset, or to provide a data frame with the variables "raw" (for the raw scores) and "group" The function ranks the data within groups, computes norm values, powers of the norm scores and interactions. Afterwards, you can use these preprocessed data to determine the best fitting model.
prepareData( data = NULL, group = "group", raw = "raw", age = "group", k = 4, t = NULL, width = NA, weights = NULL, scale = "T", descend = FALSE, silent = FALSE )prepareData( data = NULL, group = "group", raw = "raw", age = "group", k = 4, t = NULL, width = NA, weights = NULL, scale = "T", descend = FALSE, silent = FALSE )
data |
data.frame with a grouping variable named 'group' and a raw score variable named 'raw'. |
group |
grouping variable in the data, e. g. age groups, grades ... Setting group = FALSE deactivates modeling in dependence of age. Use this in case you do want conventional norm tables. |
raw |
the raw scores |
age |
the continuous explanatory variable; by default set to "group" |
k |
The power parameter, default = 4 |
t |
the age power parameter (default NULL). If not set, cNORM automatically uses k. The age power parameter can be used to specify the k to produce rectangular matrices and specify the course of scores per independently from k |
width |
if a width is provided, the function switches to rankBySlidingWindow to determine the observed raw scores, otherwise, ranking is done by group (default) |
weights |
Vector or variable name in the dataset with weights for each individual case. It can be used to compensate for moderate imbalances due to insufficient norm data stratification. Weights should be numerical and positive. Please use the 'computeWeights' function for this purpose. |
scale |
type of norm scale, either T (default), IQ, z or percentile (= no transformation); a double vector with the mean and standard deviation can as well, be provided f. e. c(10, 3) for Wechsler scale index point |
descend |
ranking order (default descent = FALSE): inverses the ranking order with higher raw scores getting lower norm scores; relevant for example when norming error scores, where lower scores mean higher performance |
silent |
set to TRUE to suppress messages |
The functions rankBySlidingWindow, rankByGroup, bestModel,
computePowers and prepareData are usually not called directly, but accessed
through other functions like cnorm.
data frame including the norm scores, powers and interactions of the norm score and grouping variable
Other prepare:
computePowers(),
rankByGroup(),
rankBySlidingWindow()
## Not run: # conducts ranking and computation of powers and interactions with the 'elfe' dataset data.elfe <- prepareData(elfe) # use vectors instead of data frame data.elfe <- prepareData(raw=elfe$raw, group=elfe$group) # variable names can be specified as well, here with the BMI data included in the package data.bmi <- prepareData(CDC, group = "group", raw = "bmi", age = "age") ## End(Not run) # modeling with only one group with the 'elfe' dataset as an example # this results in conventional norming data.elfe2 <- prepareData(data = elfe, group = FALSE) m <- bestModel(data.elfe2)## Not run: # conducts ranking and computation of powers and interactions with the 'elfe' dataset data.elfe <- prepareData(elfe) # use vectors instead of data frame data.elfe <- prepareData(raw=elfe$raw, group=elfe$group) # variable names can be specified as well, here with the BMI data included in the package data.bmi <- prepareData(CDC, group = "group", raw = "bmi", age = "age") ## End(Not run) # modeling with only one group with the 'elfe' dataset as an example # this results in conventional norming data.elfe2 <- prepareData(data = elfe, group = FALSE) m <- bestModel(data.elfe2)
After conducting the model fitting procedure on the data set, the best fitting model has to be chosen. The print function shows the R2 and other information on the different best fitting models with increasing number of predictors.
## S3 method for class 'cnorm' print(x, ...)## S3 method for class 'cnorm' print(x, ...)
x |
The model from the 'bestModel' function or a cnorm object |
... |
additional parameters |
A table with information criteria
Other model:
bestModel(),
checkConsistency(),
cnorm.cv(),
derive(),
modelSummary(),
printSubset(),
rangeCheck(),
regressionFunction(),
summary.cnorm()
Print method for SinH-ArcSinH objects
## S3 method for class 'cnormShash' print(x, ...)## S3 method for class 'cnormShash' print(x, ...)
x |
A cnormShash object |
... |
Additional arguments |
Displays R^2 and other metrics for models with varying predictors, aiding in choosing the best-fitting model after model fitting.
printSubset(x, ...)printSubset(x, ...)
x |
Model output from 'bestModel' or a cnorm object. |
... |
Additional parameters. |
Table with model information criteria.
Other model:
bestModel(),
checkConsistency(),
cnorm.cv(),
derive(),
modelSummary(),
print.cnorm(),
rangeCheck(),
regressionFunction(),
summary.cnorm()
## Not run: # Using cnorm object from sample data result <- cnorm(raw = elfe$raw, group = elfe$group) printSubset(result) ## End(Not run)## Not run: # Using cnorm object from sample data result <- cnorm(raw = elfe$raw, group = elfe$group) printSubset(result) ## End(Not run)
Regression model only work in a specific range and extrapolation horizontally (outside the original range) or vertically (extreme norm scores) might lead to inconsistent results. The function generates a message, indicating extrapolation and the range of the original data.
rangeCheck( object, minAge = NULL, maxAge = NULL, minNorm = NULL, maxNorm = NULL, digits = 3, ... )rangeCheck( object, minAge = NULL, maxAge = NULL, minNorm = NULL, maxNorm = NULL, digits = 3, ... )
object |
The regression model or a cnorm object |
minAge |
The lower age bound |
maxAge |
The upper age bound |
minNorm |
The lower norm value bound |
maxNorm |
The upper norm value bound |
digits |
The precision for rounding the norm and age data |
... |
additional parameters |
the report
Other model:
bestModel(),
checkConsistency(),
cnorm.cv(),
derive(),
modelSummary(),
print.cnorm(),
printSubset(),
regressionFunction(),
summary.cnorm()
## Not run: m <- cnorm(raw = elfe$raw, group = elfe$group) rangeCheck(m) ## End(Not run)## Not run: m <- cnorm(raw = elfe$raw, group = elfe$group) rangeCheck(m) ## End(Not run)
This is the initial step, usually done in all kinds of test norming projects, after the scale is constructed and the norm sample is established. First, the data is grouped according to a grouping variable and afterwards, the percentile for each raw value is retrieved. The percentile can be used for the modeling procedure, but in case, the samples to not deviate too much from normality, T, IQ or z scores can be computed via a normal rank procedure based on the inverse cumulative normal distribution. In case of bindings, we use the medium rank and there are different methods for estimating the percentiles (default RankIt).
rankByGroup( data = NULL, group = "group", raw = "raw", weights = NULL, method = 4, scale = "T", descend = FALSE, descriptives = TRUE, na.rm = TRUE, silent = FALSE )rankByGroup( data = NULL, group = "group", raw = "raw", weights = NULL, method = 4, scale = "T", descend = FALSE, descriptives = TRUE, na.rm = TRUE, silent = FALSE )
data |
data.frame with norm sample data. If no data.frame is provided, the raw score and group vectors are directly used |
group |
name of the grouping variable (default 'group') or numeric vector, e. g. grade, setting group to FALSE cancels grouping (data is treated as one group) |
raw |
name of the raw value variable (default 'raw') or numeric vector |
weights |
Vector or variable name in the dataset with weights for each individual case. It can be used to compensate for moderate imbalances due to insufficient norm data stratification. Weights should be numerical and positive. Please use the 'computeWeights' function for this purpose. |
method |
Ranking method in case of bindings, please provide an index, choosing from the following methods: 1 = Blom (1958), 2 = Tukey (1949), 3 = Van der Warden (1952), 4 = Rankit (default), 5 = Levenbach (1953), 6 = Filliben (1975), 7 = Yu & Huang (2001) |
scale |
type of norm scale, either T (default), IQ, z or percentile (= no transformation); a double vector with the mean and standard deviation can as well, be provided f. e. c(10, 3) for Wechsler scale index points |
descend |
ranking order (default descent = FALSE): inverses the ranking order with higher raw scores getting lower norm scores; relevant for example when norming error scores, where lower scores mean higher performance |
descriptives |
If set to TRUE (default), information in n, mean, median and standard deviation per group is added to each observation |
na.rm |
remove values, where the percentiles could not be estimated, most likely happens in the context of weighting |
silent |
set to TRUE to suppress messages |
the dataset with the percentiles and norm scales per group
So far the inclusion of a binary covariate is experimental and far from optimized. The according variable name has to be specified in the ranking procedure and the modeling includes this in the further process. At the moment, during ranking the data are split into the according cells group x covariate, which leads to small sample sizes. Please take care to have enough cases in each combination. Additionally, covariates can lead to unstable modeling solutions. The question, if it is really reasonable to include covariates when norming a test is a decision beyond the pure data modeling. Please use with care or alternatively split the dataset into the two groups beforehand and model them separately.
The functions rankBySlidingWindow, rankByGroup, bestModel,
computePowers and prepareData are usually not called directly, but accessed
through other functions like cnorm.
rankBySlidingWindow, computePowers, computeWeights, weighted.rank
Other prepare:
computePowers(),
prepareData(),
rankBySlidingWindow()
## Not run: # Transformation with default parameters: RankIt and converting to T scores data.elfe <- rankByGroup(elfe, group = "group") # using a data frame with vector names data.elfe2 <- rankByGroup(raw=elfe$raw, group=elfe$group) # use vectors for raw score and group # Transformation into Wechsler scores with Yu & Huang (2001) ranking procedure data.elfe <- rankByGroup(raw = elfe$raw, group = elfe$group, method = 7, scale = c(10, 3)) # cNORM can as well be used for conventional norming, in case no group is given d <- rankByGroup(raw = elfe$raw) d <- computePowers(d) m <- bestModel(d) rawTable(0, m) # please use an arbitrary value for age when generating the tables ## End(Not run)## Not run: # Transformation with default parameters: RankIt and converting to T scores data.elfe <- rankByGroup(elfe, group = "group") # using a data frame with vector names data.elfe2 <- rankByGroup(raw=elfe$raw, group=elfe$group) # use vectors for raw score and group # Transformation into Wechsler scores with Yu & Huang (2001) ranking procedure data.elfe <- rankByGroup(raw = elfe$raw, group = elfe$group, method = 7, scale = c(10, 3)) # cNORM can as well be used for conventional norming, in case no group is given d <- rankByGroup(raw = elfe$raw) d <- computePowers(d) m <- bestModel(d) rawTable(0, m) # please use an arbitrary value for age when generating the tables ## End(Not run)
The function retrieves all individuals in the predefined age range (x +/- width/2) around each case and ranks that individual based on this individually drawn sample. This function can be directly used with a continuous age variable in order to avoid grouping. When collecting data on the basis of a continuous age variable, cases located far from the mean age of the group receive distorted percentiles when building discrete groups and generating percentiles with the traditional approach. The distortion increases with distance from the group mean and this effect can be avoided by the sliding window. Nonetheless, please ensure, that the optional grouping variable in fact represents the correct mean age of the respective age groups, as this variable is later on used for displaying the manifest data in the percentile plots.
rankBySlidingWindow( data = NULL, age = "age", raw = "raw", weights = NULL, width, method = 4, scale = "T", descend = FALSE, descriptives = TRUE, nGroup = 0, group = NA, na.rm = TRUE, silent = FALSE )rankBySlidingWindow( data = NULL, age = "age", raw = "raw", weights = NULL, width, method = 4, scale = "T", descend = FALSE, descriptives = TRUE, nGroup = 0, group = NA, na.rm = TRUE, silent = FALSE )
data |
data.frame with norm sample data |
age |
the continuous age variable. Setting 'age' to FALSE inhibits computation of powers of age and the interactions |
raw |
name of the raw value variable (default 'raw') |
weights |
Vector or variable name in the dataset with weights for each individual case. It can be used to compensate for moderate imbalances due to insufficient norm data stratification. Weights should be numerical and positive. It can be resource intense when applied to the sliding window. Please use the 'computeWeights' function for this purpose. |
width |
the width of the sliding window |
method |
Ranking method in case of bindings, please provide an index, choosing from the following methods: 1 = Blom (1958), 2 = Tukey (1949), 3 = Van der Warden (1952), 4 = Rankit (default), 5 = Levenbach (1953), 6 = Filliben (1975), 7 = Yu & Huang (2001) |
scale |
type of norm scale, either T (default), IQ, z or percentile (= no transformation); a double vector with the mean and standard deviation can as well, be provided f. e. c(10, 3) for Wechsler scale index points |
descend |
ranking order (default descent = FALSE): inverses the ranking order with higher raw scores getting lower norm scores; relevant for example when norming error scores, where lower scores mean higher performance |
descriptives |
If set to TRUE (default), information in n, mean, median and standard deviation per group is added to each observation |
nGroup |
If set to a positive value, a grouping variable is created with the desired number of equi distant groups, named by the group mean age of each group. It creates the column 'group' in the data.frame and in case, there is already one with that name, overwrites it. |
group |
Optional parameter for providing the name of the grouping variable (if present; overwritten if ngroups is used) |
na.rm |
remove values, where the percentiles could not be estimated, most likely happens in the context of weighting |
silent |
set to TRUE to suppress messages |
In case of bindings, the function uses the medium rank and applies the algorithms
already described in the rankByGroup function. At the upper and lower end of the
data sample, the sliding stops and the sample is drawn from the interval min + width and
max - width, respectively.
the dataset with the individual percentiles and norm scores
So far the inclusion of a binary covariate is experimental and far from optimized. The according variable name has to be specified in the ranking procedure and the modeling includes this in the further process. At the moment, during ranking the data are split into the according degrees of the covariate and the ranking is done separately. This may lead to small sample sizes. Please take care to have enough cases in each combination. Additionally, covariates can lead to unstable modeling solutions. The question, if it is really reasonable to include covariates when norming a test is a decision beyond the pure data modeling. Please use with care or alternatively split the dataset into the two groups beforehand and model them separately.
The functions rankBySlidingWindow, rankByGroup, bestModel,
computePowers and prepareData are usually not called directly, but accessed
through other functions like cnorm.
rankByGroup, computePowers, computeWeights, weighted.rank, weighted.quantile
Other prepare:
computePowers(),
prepareData(),
rankByGroup()
## Not run: # Transformation using a sliding window data.elfe2 <- rankBySlidingWindow(elfe, raw = "raw", age = "group", width = 0.5) # Comparing this to the traditional approach should give us exactly the same # values, since the sample dataset only has a grouping variable for age data.elfe <- rankByGroup(elfe, group = "group") mean(data.elfe$normValue - data.elfe2$normValue) ## End(Not run)## Not run: # Transformation using a sliding window data.elfe2 <- rankBySlidingWindow(elfe, raw = "raw", age = "group", width = 0.5) # Comparing this to the traditional approach should give us exactly the same # values, since the sample dataset only has a grouping variable for age data.elfe <- rankByGroup(elfe, group = "group") mean(data.elfe$normValue - data.elfe2$normValue) ## End(Not run)
This function is comparable to 'normTable', despite it reverses the assignment: A table with raw scores and the according norm scores for a specific age based on the regression model is generated. This way, the inverse function of the regression model is solved numerically with brute force. Please specify the range of raw values, you want to cover. With higher precision and smaller stepping, this function becomes computational intensive. In case a confidence coefficient (CI, default .9) and the reliability is specified, confidence intervals are computed for the true score estimates, including a correction for regression to the mean (Eid & Schmidt, 2012, p. 272).
rawTable( A, model, minRaw = NULL, maxRaw = NULL, minNorm = NULL, maxNorm = NULL, step = 1, monotonuous = TRUE, CI = 0.9, reliability = NULL, pretty = TRUE )rawTable( A, model, minRaw = NULL, maxRaw = NULL, minNorm = NULL, maxNorm = NULL, step = 1, monotonuous = TRUE, CI = 0.9, reliability = NULL, pretty = TRUE )
A |
the age, either single value or vector with age values |
model |
The regression model or a cnorm object |
minRaw |
The lower bound of the raw score range |
maxRaw |
The upper bound of the raw score range |
minNorm |
Clipping parameter for the lower bound of norm scores (default 25) |
maxNorm |
Clipping parameter for the upper bound of norm scores (default 25) |
step |
Stepping parameter for the raw scores (default 1) |
monotonuous |
corrects for decreasing norm scores in case of model inconsistencies (default) |
CI |
confidence coefficient, ranging from 0 to 1, default .9 |
reliability |
coefficient, ranging between 0 to 1 |
pretty |
Format table by collapsing intervals and rounding to meaningful precision |
either data.frame with raw scores and the predicted norm scores in case of simple A value or a list of norm tables if vector of A values was provided
Eid, M. & Schmidt, K. (2012). Testtheorie und Testkonstruktion. Hogrefe.
normTable
Other predict:
derivationTable(),
getNormCurve(),
normTable(),
predict.cnormBetaBinomial(),
predict.cnormBetaBinomial2(),
predict.cnormShash(),
predictNorm(),
predictRaw()
## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) # generate a norm table for the raw value range from 0 to 28 for the time point month 7 of grade 3 table <- rawTable(3 + 7 / 12, cnorm.elfe, minRaw = 0, maxRaw = 28) # generate several raw tables table <- rawTable(c(2.5, 3.5, 4.5), cnorm.elfe, minRaw = 0, maxRaw = 28) # additionally compute confidence intervals table <- rawTable(c(2.5, 3.5, 4.5), cnorm.elfe, minRaw = 0, maxRaw = 28, CI = .9, reliability = .94) # conventional norming, set age to arbitrary value model <- cnorm(raw = elfe$raw) rawTable(0, model) ## End(Not run)## Not run: # Generate cnorm object from example data cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group) # generate a norm table for the raw value range from 0 to 28 for the time point month 7 of grade 3 table <- rawTable(3 + 7 / 12, cnorm.elfe, minRaw = 0, maxRaw = 28) # generate several raw tables table <- rawTable(c(2.5, 3.5, 4.5), cnorm.elfe, minRaw = 0, maxRaw = 28) # additionally compute confidence intervals table <- rawTable(c(2.5, 3.5, 4.5), cnorm.elfe, minRaw = 0, maxRaw = 28, CI = .9, reliability = .94) # conventional norming, set age to arbitrary value model <- cnorm(raw = elfe$raw) rawTable(0, model) ## End(Not run)
The method builds the regression function for the regression model, including the beta weights. It can be used to predict the raw scores based on age and location.
regressionFunction(model, raw = NULL, digits = NULL)regressionFunction(model, raw = NULL, digits = NULL)
model |
The regression model from the bestModel function or a cnorm object |
raw |
The name of the raw value variable (default 'raw') |
digits |
Number of digits for formatting the coefficients |
The regression formula as a string
Other model:
bestModel(),
checkConsistency(),
cnorm.cv(),
derive(),
modelSummary(),
print.cnorm(),
printSubset(),
rangeCheck(),
summary.cnorm()
## Not run: result <- cnorm(raw = elfe$raw, group = elfe$group) regressionFunction(result) ## End(Not run)## Not run: result <- cnorm(raw = elfe$raw, group = elfe$group) regressionFunction(result) ## End(Not run)
Density, distribution function, quantile function and random generation
for the Sinh-Arcsinh distribution with location parameter mu,
scale parameter sigma, skewness parameter epsilon, and
tail weight parameter delta.
dshash(x, mu = 0, sigma = 1, epsilon = 0, delta = 1, log = FALSE) pshash( q, mu = 0, sigma = 1, epsilon = 0, delta = 1, lower.tail = TRUE, log.p = FALSE ) qshash( p, mu = 0, sigma = 1, epsilon = 0, delta = 1, lower.tail = TRUE, log.p = FALSE ) rshash(n, mu = 0, sigma = 1, epsilon = 0, delta = 1)dshash(x, mu = 0, sigma = 1, epsilon = 0, delta = 1, log = FALSE) pshash( q, mu = 0, sigma = 1, epsilon = 0, delta = 1, lower.tail = TRUE, log.p = FALSE ) qshash( p, mu = 0, sigma = 1, epsilon = 0, delta = 1, lower.tail = TRUE, log.p = FALSE ) rshash(n, mu = 0, sigma = 1, epsilon = 0, delta = 1)
x, q
|
vector of quantiles |
mu |
location parameter (default: 0) |
sigma |
scale parameter (must be > 0, default: 1) |
epsilon |
skewness parameter (default: 0, symmetric distribution) |
delta |
tail weight parameter (must be > 0, default: 1 for normal-like tails) |
log, log.p
|
logical; if TRUE, probabilities p are given as log(p) |
lower.tail |
logical; if TRUE (default), probabilities are P[X <= x] otherwise, P[X > x] |
p |
vector of probabilities |
n |
number of observations. If |
The Sinh-Arcsinh distribution (Jones & Pewsey, 2009) is defined by the transformation:
where is a standard normal variable.
The four parameters control:
mu: Location (similar to mean)
sigma: Scale (similar to standard deviation)
epsilon: Skewness (epsilon = 0 gives symmetry)
delta: Tail weight (delta = 1 gives normal-like tails,
delta > 1 gives heavier tails, delta < 1 gives lighter tails)
dshash gives the density, pshash gives the distribution
function, qshash gives the quantile function, and rshash
generates random deviates.
The length of the result is determined by n for rshash, and
is the maximum of the lengths of the numerical arguments for the other functions.
Jones, M. C., & Pewsey, A. (2009). Sinh-arcsinh distributions. Biometrika, 96(4), 761-780. doi:10.1093/biomet/asp053
Normal for the normal distribution.
## Not run: # Generate random samples x <- rshash(1000, mu = 0, sigma = 1, epsilon = 0.5, delta = 1.2) # Density plot(density(x)) curve(dshash(x, mu = 0, sigma = 1, epsilon = 0.5, delta = 1.2), add = TRUE, col = "red") # Cumulative probability pshash(0, mu = 0, sigma = 1, epsilon = 0.5, delta = 1.2) # Quantiles qshash(c(0.025, 0.5, 0.975), mu = 0, sigma = 1, epsilon = 0.5, delta = 1.2) # Compare with normal distribution (epsilon = 0, delta = 1) par(mfrow = c(2, 2)) x_vals <- seq(-4, 4, length.out = 200) plot(x_vals, dshash(x_vals), type = "l", main = "Symmetric (like normal)") plot(x_vals, dshash(x_vals, epsilon = 1), type = "l", main = "Right skewed") plot(x_vals, dshash(x_vals, delta = 2), type = "l", main = "Heavy tails") plot(x_vals, dshash(x_vals, delta = 0.5), type = "l", main = "Light tails") ## End(Not run)## Not run: # Generate random samples x <- rshash(1000, mu = 0, sigma = 1, epsilon = 0.5, delta = 1.2) # Density plot(density(x)) curve(dshash(x, mu = 0, sigma = 1, epsilon = 0.5, delta = 1.2), add = TRUE, col = "red") # Cumulative probability pshash(0, mu = 0, sigma = 1, epsilon = 0.5, delta = 1.2) # Quantiles qshash(c(0.025, 0.5, 0.975), mu = 0, sigma = 1, epsilon = 0.5, delta = 1.2) # Compare with normal distribution (epsilon = 0, delta = 1) par(mfrow = c(2, 2)) x_vals <- seq(-4, 4, length.out = 200) plot(x_vals, dshash(x_vals), type = "l", main = "Symmetric (like normal)") plot(x_vals, dshash(x_vals, epsilon = 1), type = "l", main = "Right skewed") plot(x_vals, dshash(x_vals, delta = 2), type = "l", main = "Heavy tails") plot(x_vals, dshash(x_vals, delta = 0.5), type = "l", main = "Light tails") ## End(Not run)
Simulate mean per age
simMean(age)simMean(age)
age |
the age variable |
return predicted means
## Not run: x <- simMean(a) ## End(Not run)## Not run: x <- simMean(a) ## End(Not run)
Simulate sd per age
simSD(age)simSD(age)
age |
the age variable |
return predicted sd
## Not run: x <- simSD(a) ## End(Not run)## Not run: x <- simSD(a) ## End(Not run)
For testing purposes only: The function simulates raw test scores based on a virtual Rasch based test with n results per age group, an evenly distributed age variable, items.n test items with a simulated difficulty and standard deviation. The development trajectories over age group are modeled by a curve linear function of age, with at first fast progression, which slows down over age, and a slightly increasing standard deviation in order to model a scissor effects. The item difficulties can be accessed via $theta and the raw data via $data of the returned object.
simulateRasch( data = NULL, n = 100, minAge = 1, maxAge = 7, items.n = 21, items.m = 0, items.sd = 1, Theta = "random", width = 1 )simulateRasch( data = NULL, n = 100, minAge = 1, maxAge = 7, items.n = 21, items.m = 0, items.sd = 1, Theta = "random", width = 1 )
data |
data.frame from previous simulations for recomputation (overrides n, minAge, maxAge) |
n |
The sample size per age group |
minAge |
The minimum age (default 1) |
maxAge |
The maximum age (default 7) |
items.n |
The number of items of the test |
items.m |
The mean difficulty of the items |
items.sd |
The standard deviation of the item difficulty |
Theta |
irt scales difficulty parameters, either "random" for drawing a random sample, "even" for evenly distributed or a set of predefined values, which then overrides the item.n parameters |
width |
The width of the window size for the continuous age per group; +- 1/2 width around group center on items.m and item.sd; if set to FALSE, the distribution is not drawn randomly but normally nonetheless |
a list containing the simulated data and thetas
the data.frame with only age, group and raw
the complete simulated data with item level results
the difficulty of the items
## Not run: # simulate data for a rather easy test (m = -1.0) sim <- simulateRasch(n=150, minAge=1, maxAge=7, items.n = 30, items.m = -1.0, items.sd = 1, Theta = "random", width = 1.0) # Show item difficulties mean(sim$theta) sd(sim$theta) hist(sim$theta) # Plot raw scores boxplot(raw~group, data=sim$data) # Model data data <- prepareData(sim$data, age="age") model <- bestModel(data, k = 4) printSubset(model) plotSubset(model, type=0) ## End(Not run)## Not run: # simulate data for a rather easy test (m = -1.0) sim <- simulateRasch(n=150, minAge=1, maxAge=7, items.n = 30, items.m = -1.0, items.sd = 1, Theta = "random", width = 1.0) # Show item difficulties mean(sim$theta) sd(sim$theta) hist(sim$theta) # Plot raw scores boxplot(raw~group, data=sim$data) # Model data data <- prepareData(sim$data, age="age") model <- bestModel(data, k = 4) printSubset(model) plotSubset(model, type=0) ## End(Not run)
This function standardizes a numeric vector by subtracting the mean and dividing by the standard deviation. The resulting vector will have a mean of 0 and a standard deviation of 1.
standardize(x)standardize(x)
x |
A numeric vector to be standardized. |
A numeric vector of the same length as x, containing the standardized values.
## Not run: data <- c(1, 2, 3, 4, 5) standardized_data <- standardize(data) print(standardized_data) ## End(Not run)## Not run: data <- c(1, 2, 3, 4, 5) standardized_data <- standardize(data) print(standardized_data) ## End(Not run)
Function for standardizing raking weights Raking weights get divided by the smallest weight. Thereby, all weights become larger or equal to 1 without changing the ratio of the weights to each other.
standardizeRakingWeights(weights)standardizeRakingWeights(weights)
weights |
Raking weights computed by computeWeights() |
the standardized weights
Performs k-fold resampling to estimate averaged coefficients for linear regression. The coefficients are averaged across k different subsets of the data to provide more stable estimates. For small samples (n < 100), returns a standard linear model instead.
subsample_lm(text, data, weights, k = 10)subsample_lm(text, data, weights, k = 10)
text |
A character string or formula specifying the model to be fitted |
data |
A data frame containing the variables in the model |
weights |
Optional numeric vector of weights. If NULL, unweighted regression is performed |
k |
Integer specifying the number of resampling folds (default = 10) |
The function splits the data into k subsets, fits a linear model on k-1 subsets, and stores the coefficients. This process is repeated k times, and the final coefficients are averaged across all iterations to provide more stable estimates.
An object of class 'lm' with averaged coefficients from k-fold resampling. For small samples, returns a standard lm object.
S3 method for printing the results and regression function of a cnorm model
## S3 method for class 'cnorm' summary(object, ...)## S3 method for class 'cnorm' summary(object, ...)
object |
A regression model or cnorm object |
... |
additional parameters |
A report on the regression function, weights, R2 and RMSE
Other model:
bestModel(),
checkConsistency(),
cnorm.cv(),
derive(),
modelSummary(),
print.cnorm(),
printSubset(),
rangeCheck(),
regressionFunction()
This function provides a summary of a fitted beta-binomial continuous norming model, including model fit statistics, convergence information, and parameter estimates.
## S3 method for class 'cnormBetaBinomial' summary(object, ...)## S3 method for class 'cnormBetaBinomial' summary(object, ...)
object |
An object of class "cnormBetaBinomial" or "cnormBetaBinomial2", typically
the result of a call to |
... |
Additional arguments passed to the summary method:
|
The summary includes:
Basic model information (type, number of observations, number of parameters)
Model fit statistics (log-likelihood, AIC, BIC)
R-squared, RMSE, and bias (if age and raw scores are provided) in comparison to manifest norm scores
Convergence information
Parameter estimates with standard errors, z-values, and p-values
Invisibly returns a list containing detailed diagnostic information about the model. The function primarily produces printed output summarizing the model.
cnorm.betabinomial, diagnostics.betabinomial
## Not run: model <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228) summary(model) # Including R-squared, RMSE, and bias in the summary: summary(model, age = ppvt$age, score = ppvt$raw) ## End(Not run)## Not run: model <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228) summary(model) # Including R-squared, RMSE, and bias in the summary: summary(model, age = ppvt$age, score = ppvt$raw) ## End(Not run)
This function provides a summary of a fitted beta-binomial continuous norming model, including model fit statistics, convergence information, and parameter estimates.
## S3 method for class 'cnormBetaBinomial2' summary(object, ...)## S3 method for class 'cnormBetaBinomial2' summary(object, ...)
object |
An object of class "cnormBetaBinomial" or "cnormBetaBinomial2", typically
the result of a call to |
... |
Additional arguments passed to the summary method:
|
The summary includes:
Basic model information (type, number of observations, number of parameters)
Model fit statistics (log-likelihood, AIC, BIC)
R-squared, RMSE, and bias (if age and raw scores are provided) in comparison to manifest norm scores
Convergence information
Parameter estimates with standard errors, z-values, and p-values
Invisibly returns a list containing detailed diagnostic information about the model. The function primarily produces printed output summarizing the model.
cnorm.betabinomial, diagnostics.betabinomial
## Not run: model <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228) summary(model) # Including R-squared, RMSE, and bias in the summary: summary(model, age = ppvt$age, raw = ppvt$raw) ## End(Not run)## Not run: model <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228) summary(model) # Including R-squared, RMSE, and bias in the summary: summary(model, age = ppvt$age, raw = ppvt$raw) ## End(Not run)
This function provides a summary of a fitted SinH-ArcSinH (shash) continuous norming model, including model fit statistics, convergence information, and parameter estimates.
## S3 method for class 'cnormShash' summary(object, ...)## S3 method for class 'cnormShash' summary(object, ...)
object |
An object of class "cnormShash", typically the result of a call to
|
... |
Additional arguments passed to the summary method:
|
The summary includes:
Basic model information (polynomial degrees, delta parameter, number of observations)
Model fit statistics (log-likelihood, AIC, BIC)
R-squared, RMSE, and bias (if age and raw scores are provided)
Convergence information
Parameter estimates with standard errors, z-values, and p-values
Invisibly returns a list containing detailed diagnostic information about the model. The function primarily produces printed output summarizing the model.
cnorm.shash, diagnostics.shash
## Not run: model <- cnorm.shash(children$age, children$score) summary(model) # Including R-squared, RMSE, and bias in the summary: summary(model, age = children$age, score = children$score) ## End(Not run)## Not run: model <- cnorm.shash(children$age, children$score) summary(model) # Including R-squared, RMSE, and bias in the summary: summary(model, age = children$age, score = children$score) ## End(Not run)
Conducts distribution free continuous norming and aims to find a fitting model. Raw data are modelled as a Taylor polynomial of powers of age and location and their interactions. In addition to the raw scores, either provide a numeric vector for the grouping information (group) for the ranking of the raw scores. You can adjust the grade of smoothing of the regression model by setting the k, t and terms parameter. In general, increasing k and t leads to a higher fit, while lower values lead to more smoothing. If both parameters are missing, taylorSwift uses k = 5 and t = 3 by default.
taylorSwift( raw = NULL, group = NULL, age = NULL, width = NA, weights = NULL, scale = "T", method = 4, descend = FALSE, k = NULL, t = NULL, terms = 0, R2 = NULL, plot = TRUE, extensive = TRUE, subsampling = FALSE )taylorSwift( raw = NULL, group = NULL, age = NULL, width = NA, weights = NULL, scale = "T", method = 4, descend = FALSE, k = NULL, t = NULL, terms = 0, R2 = NULL, plot = TRUE, extensive = TRUE, subsampling = FALSE )
raw |
Numeric vector of raw scores |
group |
Numeric vector of grouping variable, e. g. grade. If no group or age variable is provided, conventional norming is applied |
age |
Numeric vector with chronological age, please additionally specify width of window |
width |
Size of the sliding window in case an age vector is used |
weights |
Vector or variable name in the dataset with weights for each individual case. It can be used to compensate for moderate imbalances due to insufficient norm data stratification. Weights should be numerical and positive. |
scale |
type of norm scale, either T (default), IQ, z or percentile (= no transformation); a double vector with the mean and standard deviation can as well, be provided f. e. c(10, 3) for Wechsler scale index points |
method |
Ranking method in case of bindings, please provide an index, choosing from the following methods: 1 = Blom (1958), 2 = Tukey (1949), 3 = Van der Warden (1952), 4 = Rankit (default), 5 = Levenbach (1953), 6 = Filliben (1975), 7 = Yu & Huang (2001) |
descend |
ranking order (default descent = FALSE): inverses the ranking order with higher raw scores getting lower norm scores; relevant for example when norming error scores, where lower scores mean higher performance |
k |
The power constant. Higher values result in more detailed approximations but have the danger of over-fit (max = 6). If not set, it uses t and if both parameters are NULL, k is set to 5. |
t |
The age power parameter (max = 6). If not set, it uses k and if both parameters are NULL, k is set to 3, since age trajectories are most often well captured by cubic polynomials. |
terms |
Selection criterion for model building. The best fitting model with this number of terms is used |
R2 |
Adjusted R square as a stopping criterion for the model building (default R2 = 0.99) |
plot |
Default TRUE; plots the regression model and prints report |
extensive |
If TRUE, screen models for consistency and - if possible, exclude inconsistent ones |
subsampling |
If TRUE (default), model coefficients are calculated using 10-folds and averaged across the folds. |
cnorm object including the ranked raw data and the regression model
Gary, S. & Lenhard, W. (2021). In norming we trust. Diagnostica.
Gary, S., Lenhard, W. & Lenhard, A. (2021). Modelling Norm Scores with the cNORM Package in R. Psych, 3(3), 501-521. https://doi.org/10.3390/psych3030033
Lenhard, A., Lenhard, W., Suggate, S. & Segerer, R. (2016). A continuous solution to the norming problem. Assessment, Online first, 1-14. doi:10.1177/1073191116656437
Lenhard, A., Lenhard, W., Gary, S. (2018). Continuous Norming (cNORM). The Comprehensive R Network, Package cNORM, available: https://CRAN.R-project.org/package=cNORM
Lenhard, A., Lenhard, W., Gary, S. (2019). Continuous norming of psychometric tests: A simulation study of parametric and semi-parametric approaches. PLoS ONE, 14(9), e0222279. doi:10.1371/journal.pone.0222279
Lenhard, W., & Lenhard, A. (2020). Improvement of Norm Score Quality via Regression-Based Continuous Norming. Educational and Psychological Measurement(Online First), 1-33. https://doi.org/10.1177/0013164420928457
rankByGroup, rankBySlidingWindow, computePowers, bestModel
## Not run: # Using this function with the example dataset 'ppvt' # You can use the 'getGroups()' function to set up grouping variable in case, # you have a continuous age variable. model <- taylorSwift(raw = ppvt$raw, group = ppvt$group) # return norm tables including 90% confidence intervals for a # test with a reliability of r = .85; table are set to mean of quartal # in grade 3 (children completed 2 years of schooling) normTable(c(5, 15), model, CI = .90, reliability = .95) # ... or instead of raw scores for norm scores, the other way round rawTable(c(8, 12), model, CI = .90, reliability = .95) ## End(Not run)## Not run: # Using this function with the example dataset 'ppvt' # You can use the 'getGroups()' function to set up grouping variable in case, # you have a continuous age variable. model <- taylorSwift(raw = ppvt$raw, group = ppvt$group) # return norm tables including 90% confidence intervals for a # test with a reliability of r = .85; table are set to mean of quartal # in grade 3 (children completed 2 years of schooling) normTable(c(5, 15), model, CI = .90, reliability = .95) # ... or instead of raw scores for norm scores, the other way round rawTable(c(8, 12), model, CI = .90, reliability = .95) ## End(Not run)
Computes weighted quantiles (code from Andrey Akinshin (2023) "Weighted quantile estimators" arXiv:2304.07265 [stat.ME] Code made available via the CC BY-NC-SA 4.0 license) on the basis of either the weighted Harrell-Davis quantile estimator or an adaption of the type 7 quantile estimator of the generic quantile function in the base package. Please provide a vector with raw values, the probabilities for the quantiles and an additional vector with the weight of each observation. In case the weight vector is NULL, a normal quantile estimation is done. The vectors may not include NAs and the weights should be positive non-zero values. Please draw on the computeWeights() function for retrieving weights in post stratification.
weighted.quantile(x, probs, weights = NULL, type = "Harrell-Davis")weighted.quantile(x, probs, weights = NULL, type = "Harrell-Davis")
x |
A numerical vector |
probs |
Numerical vector of quantiles |
weights |
A numerical vector with weights; should have the same length as x |
type |
Type of estimator, can either be "inflation", "Harrell-Davis" using a beta function to approximate the weighted percentiles (Harrell & Davis, 1982) or "Type7" (default; Hyndman & Fan, 1996), an adaption of the generic quantile function in R, including weighting. The inflation procedure is essentially a numerical, non-parametric solution that gives the same results as Harrel-Davis. It requires less ressources with small datasets and always finds a solution (e. g. 1000 cases with weights between 1 and 10). If it becomes too resource intense, it switches to Harrell-Davis automatically. Harrel-Davis and Type7 code is based on the work of Akinshin (2023). |
the weighted quantiles
Harrell, F.E. & Davis, C.E. (1982). A new distribution-free quantile estimator. Biometrika, 69(3), 635-640.
Hyndman, R. J. & Fan, Y. (1996). Sample quantiles in statistical packages, American Statistician 50, 361–365.
Akinshin, A. (2023). Weighted quantile estimators arXiv:2304.07265 [stat.ME]
weighted.quantile.inflation, weighted.quantile.harrell.davis, weighted.quantile.type7
Computes weighted quantiles; code from Andrey Akinshin (2023) "Weighted quantile estimators" arXiv:2304.07265 [stat.ME] Code made available via the CC BY-NC-SA 4.0 license
weighted.quantile.harrell.davis(x, probs, weights = NULL)weighted.quantile.harrell.davis(x, probs, weights = NULL)
x |
A numerical vector |
probs |
Numerical vector of quantiles |
weights |
A numerical vector with weights; should have the same length as x. If no weights are provided (NULL), it falls back to the base quantile function, type 7 |
the quantiles
Applies weighted ranking numerically by inflating cases according to weight. This function will be resource intensive, if inflated cases get too high and in this cases, it switches to the parametric Harrell-Davis estimator.
weighted.quantile.inflation( x, probs, weights = NULL, degree = 3, cutoff = 1e+07 )weighted.quantile.inflation( x, probs, weights = NULL, degree = 3, cutoff = 1e+07 )
x |
A numerical vector |
probs |
Numerical vector of quantiles |
weights |
A numerical vector with weights; should have the same length as x. |
degree |
power parameter for case inflation (default = 3, equaling factor 1000) If no weights are provided (NULL), it falls back to the base quantile function, type 7 |
cutoff |
stop criterion for the sum of standardized weights to switch to Harrell-Davis, default = 1000000 |
the quantiles
Computes weighted quantiles; code from Andrey Akinshin (2023) "Weighted quantile estimators" arXiv:2304.07265 [stat.ME] Code made available via the CC BY-NC-SA 4.0 license
weighted.quantile.type7(x, probs, weights = NULL)weighted.quantile.type7(x, probs, weights = NULL)
x |
A numerical vector |
probs |
Numerical vector of quantiles |
weights |
A numerical vector with weights; should have the same length as x. If no weights are provided (NULL), it falls back to the base quantile function, type 7 |
the quantiles
Conducts weighted ranking on the basis of sums of weights per unique raw score. Please provide a vector with raw values and an additional vector with the weight of each observation. In case the weight vector is NULL, a normal ranking is done. The vectors may not include NAs and the weights should be positive non-zero values.
weighted.rank(x, weights = NULL)weighted.rank(x, weights = NULL)
x |
A numerical vector |
weights |
A numerical vector with weights; should have the same length as x |
the weighted absolute ranks