NEWS

irtsim 0.1.2.9000

irtsim 0.1.2 (2026-05-05)

New choosing-item-parameters vignette (vignette("choosing-item-parameters")): a deeper reference for the three item-parameter specification workflows introduced in the getting-started vignette. (1) Import from a prior mirt fit (with a worked slope-intercept-to-IRT conversion example) or from a CSV / Excel parameter table. (2) Domain-typical preset values for cognitive ability, personality, clinical, and achievement assessments, with cited reference ranges. (3) Hypothesized / content-based specification, with explicit translation from item-review judgements to distribution arguments. The vignette builds live (eval=TRUE); no new exported helpers in this release.
New irtsim getting-started vignette (vignette("irtsim")) walking a stranger from "I need to plan an IRT study" through to recommended_n(). The vignette builds live (eval=TRUE) so it cannot drift from the package API. Three item-parameter specification paths are demonstrated: by hand, via irt_params_2pl(), and from a prior mirt calibration.
irt_design() now aborts with an informative error if n_factors != 1. Multidimensional IRT support is planned for v0.4.0; until then, the parameter is retained on the design for forward compatibility. Previously, n_factors > 1 was silently accepted and produced a cryptic internal mirt error downstream. The new abort fires up front and points users to the planned support.
recommended_n() gains an aggregate parameter ("max" / "mean" / "median" / "none", default "max"). Behavior change: the default return is now an integer scalar rather than a per-item data frame. With aggregate = "max" it is the smallest sample size that meets the requested threshold for every item and parameter; a details attribute carries the per-item data frame, alongside aggregate, criterion, and threshold attributes. "mean" and "median" round up via ceiling() so the recommendation never falls below the central tendency. Pass aggregate = "none" to recover the previous per-item data frame return. The new default closes the footgun where users could under-power a study by forgetting to take max() across items.
Removed the paper-reproduction-gaps vignette. It was a scorecard of reproduction gaps for paper Examples 2 and 3 that pointed at deferred objectives (Obj 30/31). Those objectives are now superseded by a planned pluggable fit_fn / extract_fn hook (Obj 39, targeted for v0.3.0), so the standalone gaps vignette no longer reflects the roadmap. Cross-references to it from paper-example-2-mcar and paper-example-3-grm have also been removed.
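
The aggregate options of recommended_n() noted in this release can be illustrated in base R. This is a minimal sketch and not the package's implementation: the per_item data frame, its column names, and the aggregate_n() helper are hypothetical. Only the max / mean / median rules and the ceiling() rounding are taken from the entry above.

```r
# Hypothetical per-item recommendations (column names are assumptions,
# not the actual structure returned by recommended_n()).
per_item <- data.frame(
  item = 1:4,
  recommended_n = c(200, 350, 275, 500)
)

# Hypothetical helper sketching the aggregation rules described above.
aggregate_n <- function(n, aggregate = c("max", "mean", "median")) {
  aggregate <- match.arg(aggregate)
  switch(aggregate,
    max    = as.integer(max(n)),
    mean   = as.integer(ceiling(mean(n))),    # round up, never down
    median = as.integer(ceiling(median(n)))   # round up, never down
  )
}

n_max    <- aggregate_n(per_item$recommended_n, "max")     # 500
n_mean   <- aggregate_n(per_item$recommended_n, "mean")    # 332 (ceiling of 331.25)
n_median <- aggregate_n(per_item$recommended_n, "median")  # 313 (ceiling of 312.5)
```

In this toy example only "max" guarantees that the hardest-to-estimate item (n = 500) is adequately powered, which is why it is a sensible default.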

irtsim 0.1.1 (2026-04-23)

CRAN resubmission. Documentation-only changes; no user-facing API or behavior changes.

DESCRIPTION: expanded all acronyms on first use (API, IRT, 1PL, 2PL, MCAR, MAR, MSE, RMSE, SE) per CRAN reviewer request.
man/: replaced \dontrun{} with \donttest{} in irt_simulate, summary.irt_results, plot.irt_results, plot.summary_irt_results, recommended_n, print.irt_results, and print.summary_irt_results examples per CRAN reviewer request. Examples remain wrapped rather than run unconditionally because each depends on a ~300-fit irt_simulate() call that exceeds the 5-second CRAN example-execution budget.

irtsim 0.1.0

Initial CRAN release.

Core pipeline

irt_design() specifies the data-generating IRT model (items, parameters, theta distribution).
irt_study() adds study conditions (sample sizes, missing-data mechanism, optional separate estimation model).
irt_simulate() runs the Monte Carlo simulation loop with deterministic seeding and optional parallelism.
summary(), plot(), and recommended_n() methods extract simulation-based sample-size recommendations from irt_results objects.

Supported IRT models

1PL (Rasch)
2PL
Graded response model (GRM)

Supported missing-data mechanisms

"none" — complete data
"mcar" — missing completely at random
"mar" — missing at random (monotone, trait-dependent)
"booklet" — structured booklet assignment with common-item overlap
"linking" — two-form linked design with user-supplied linking matrix

Performance criteria

Mean squared error (mse), root mean squared error (rmse), bias, absolute bias, standard error (se), empirical coverage, Monte Carlo SE of MSE (mcse_mse).
Criterion metadata (direction of improvement, display label) centralized in R/criterion_registry.R.
Custom per-iteration criteria via the criterion_fn argument to summary.irt_results(): callbacks receive estimates, true_value, ci_lower, ci_upper, and converged, and return named numeric vectors that are appended to item_summary.
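
A custom criterion callback of the kind described above might look like the following. This is a sketch under stated assumptions: the argument names come from the entry above, but the exact calling convention (named arguments, each a vector across iterations) and the function name my_criterion are assumptions, so check the summary.irt_results() help page before relying on it.

```r
# Hypothetical callback: argument names follow the NEWS entry; the calling
# convention used by summary.irt_results() is assumed, not confirmed.
my_criterion <- function(estimates, true_value, ci_lower, ci_upper, converged) {
  ok <- converged & !is.na(estimates)   # keep converged, non-missing iterations
  c(
    median_abs_error = median(abs(estimates[ok] - true_value)),
    mean_ci_width    = mean(ci_upper[ok] - ci_lower[ok])
  )
}

# Stand-alone check with made-up numbers (no package call involved).
out <- my_criterion(
  estimates  = c(0.9, 1.1, 1.4, NA),
  true_value = 1.0,
  ci_lower   = c(0.5, 0.7, 1.0, NA),
  ci_upper   = c(1.3, 1.5, 1.8, NA),
  converged  = c(TRUE, TRUE, TRUE, FALSE)
)
```

Returning a named numeric vector matches the contract stated above; the names would presumably become the appended column names in item_summary.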

Model misspecification

irt_study(estimation_model = ...) allows fitting a different IRT model than the one used to generate data (e.g., generate 2PL, fit 1PL). Compatible pairs: (1PL, 2PL), (2PL, 1PL), and any same-model pair. GRM is not cross-compatible with dichotomous models.

Parallelization

irt_simulate(parallel = TRUE) dispatches iterations across workers via future.apply::future_lapply().
Reproducibility contract: results are guaranteed identical on re-run within a mode (that is, for a given parallel setting). Results differ across modes because serial execution uses Mersenne-Twister and parallel execution uses L'Ecuyer-CMRG substreams; both are statistically valid.
Users control the backend via future::plan().
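
The within-mode versus cross-mode distinction in the reproducibility contract can be seen in base R alone. The sketch below only contrasts the two generator kinds named above under a shared seed; it does not reproduce irtsim's internal seeding or the substream scheme used by future.apply, and the seed value is arbitrary.

```r
# Same seed, Mersenne-Twister: reproducible on re-run.
set.seed(2025, kind = "Mersenne-Twister")
serial_a <- runif(3)
set.seed(2025, kind = "Mersenne-Twister")
serial_b <- runif(3)

# Same seed, L'Ecuyer-CMRG: also reproducible on re-run.
set.seed(2025, kind = "L'Ecuyer-CMRG")
parallel_a <- runif(3)
set.seed(2025, kind = "L'Ecuyer-CMRG")
parallel_b <- runif(3)

within_serial   <- identical(serial_a, serial_b)      # TRUE
within_parallel <- identical(parallel_a, parallel_b)  # TRUE
across_modes    <- identical(serial_a, parallel_a)    # FALSE

# Restore the session's default generator.
RNGkind("default")
```

Each generator is deterministic given its seed, but the two produce different streams, so serial and parallel runs should be compared by their statistical summaries and not draw by draw.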

User experience

cli::cli_progress_bar() replaces cat()-based progress reporting (suppressible with progress = FALSE).
Structured cli::cli_abort() error messages with valid-option enumerations for invalid model, criterion, missing mechanism, and estimation_model arguments.

Documentation

Five vignettes reproduce or extend the three examples from Schroeders and Gnambs (2025):
- Paper Example 1 — faithful reproduction of the linked-test design with 1PL estimation.
- Paper Example 1b — extension showing bias-variance tradeoff when a 2PL-generated dataset is fit with a 1PL model.
- Paper Example 2 — MCAR-only partial reproduction with custom-criterion-callback feature demonstration.
- Paper Example 3 — GRM item parameter recovery partial reproduction.
- Paper reproduction status — scorecard documenting what the current API can and cannot reproduce end-to-end.
Vignettes are shipped as static HTML via R.rsp::asis because re-running the Monte Carlo simulations during package checks would exceed CRAN's build-time budget. The source .Rmd files and data-raw/precompute_vignettes.R are available in the GitHub repository for users who wish to reproduce results locally.

Dependencies

Imports: cli, future.apply, ggplot2, mirt, rlang
Suggests: future, knitr, R.rsp, rmarkdown, scales, testthat

Reference

Schroeders, U., and Gnambs, T. (2025). Sample size planning in item response theory: A 10-decision framework. Advances in Methods and Practices in Psychological Science. https://doi.org/10.1177/25152459251314798