                     Changes in version 0.1.2.9000                      

                 Changes in version 0.1.2 (2026-05-05)                  

  - New choosing-item-parameters vignette
    (vignette("choosing-item-parameters")) — deeper reference for the
    three item-parameter specification workflows introduced in the
    getting-started vignette: import from a prior mirt fit (with a
    slope-intercept-to-IRT conversion worked example) or a CSV / Excel
    parameter table; domain-typical preset values for cognitive ability,
    personality, clinical, and achievement assessments with cited
    reference ranges; and hypothesized / content-based specification
    with explicit translation from item-review judgements to
    distribution arguments. The vignette builds with eval=TRUE; this
    release adds no new exported helpers.
  - New irtsim getting-started vignette (vignette("irtsim")) walking a
    stranger from "I need to plan an IRT study" through to
    recommended_n(). The vignette builds live (eval=TRUE) so it cannot
    drift from the package API. Three item-parameter specification paths
    are demonstrated: by hand, via irt_params_2pl(), and from a prior
    mirt calibration.
  - irt_design() now aborts with an informative error if n_factors != 1.
    Multidimensional IRT support is planned for v0.4.0; until then, the
    parameter is retained on the design for forward compatibility.
    Previously, n_factors > 1 was silently accepted and produced a
    cryptic mirt internal error downstream. The new abort fires up front
    and points users at the planned support.
  - recommended_n() gains an aggregate parameter ("max" / "mean" /
    "median" / "none", default "max"). The default return is now an
    integer scalar — the smallest sample size that powers every
    item/param at the requested threshold — with a details attribute
    carrying the per-item data frame plus aggregate, criterion, and
    threshold attributes. "mean" and "median" round up via ceiling() so
    the recommendation never falls below the central tendency. Pass
    aggregate = "none" to recover the previous per-item data frame
    return. Behavior change: the default return shape changed from a
    per-item data frame to a scalar; closes the footgun where users
    could under-power by forgetting to take max() across items.
  - Removed the paper-reproduction-gaps vignette. Its content was a
    scorecard of paper Examples 2 and 3 reproduction gaps that pointed
    at deferred objectives (Obj 30/31). Those objectives are now
    superseded by a planned pluggable fit_fn / extract_fn hook (Obj 39,
    targeted for v0.3.0); the standalone gaps vignette no longer
    reflects the roadmap. Cross-references to it from
    paper-example-2-mcar and paper-example-3-grm have also been removed.
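
The aggregate options of recommended_n() described above can be
illustrated with a minimal base-R sketch. It does not call the package:
the per-item values are invented, and aggregate_n() is a hypothetical
helper that only mirrors the documented rounding rule, not the actual
implementation.

```r
# Hypothetical per-item recommendations (values invented for illustration).
per_item <- data.frame(
  item = c("item1", "item2", "item3", "item4"),
  param = "b",
  recommended_n = c(200, 350, 250, 505)
)

# Sketch of the aggregation rule described above; not the package's code.
# "none" simply returns the per-item values unchanged in this sketch.
aggregate_n <- function(n, aggregate = c("max", "mean", "median", "none")) {
  aggregate <- match.arg(aggregate)
  switch(aggregate,
    max    = as.integer(max(n)),
    mean   = as.integer(ceiling(mean(n))),    # round up, never down
    median = as.integer(ceiling(median(n))),
    none   = n
  )
}

n_max    <- aggregate_n(per_item$recommended_n, "max")     # 505
n_mean   <- aggregate_n(per_item$recommended_n, "mean")    # 327 (326.25 up)
n_median <- aggregate_n(per_item$recommended_n, "median")  # 300
```

With these invented values, "mean" and "median" fall well below the
largest per-item requirement, which is why "max" is the conservative
default.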

                 Changes in version 0.1.1 (2026-04-23)                  

CRAN resubmission. Documentation-only changes; no user-facing API or
behavior changes.

  - DESCRIPTION: expanded all acronyms on first use (API, IRT, 1PL, 2PL,
    MCAR, MAR, MSE, RMSE, SE) per CRAN reviewer request.
  - man/: replaced \dontrun{} with \donttest{} in irt_simulate,
    summary.irt_results, plot.irt_results, plot.summary_irt_results,
    recommended_n, print.irt_results, and print.summary_irt_results
    examples per CRAN reviewer request. Examples remain wrapped in
    \donttest{} because each depends on a ~300-fit irt_simulate() call
    that exceeds the 5-second CRAN example-execution budget.

                        Changes in version 0.1.0                        

Initial CRAN release.

Core pipeline

  - irt_design() specifies the data-generating IRT model (items,
    parameters, theta distribution).
  - irt_study() adds study conditions (sample sizes, missing-data
    mechanism, optional separate estimation model).
  - irt_simulate() runs the Monte Carlo simulation loop with
    deterministic seeding and optional parallelism.
  - summary(), plot(), and recommended_n() methods extract
    simulation-based sample-size recommendations from irt_results
    objects.

Supported IRT models

  - 1PL (Rasch)
  - 2PL
  - Graded response model (GRM)

Supported missing-data mechanisms

  - "none" — complete data
  - "mcar" — missing completely at random
  - "mar" — missing at random (monotone, trait-dependent)
  - "booklet" — structured booklet assignment with common-item overlap
  - "linking" — two-form linked design with user-supplied linking matrix

Performance criteria

  - Mean squared error (mse), root mean squared error (rmse), bias,
    absolute bias, standard error (se), empirical coverage, Monte Carlo
    SE of MSE (mcse_mse).
  - Criterion metadata (direction of improvement, display label)
    centralized in R/criterion_registry.R.
  - Custom per-iteration criteria via the criterion_fn argument to
    summary.irt_results() — callbacks receive estimates, true_value,
    ci_lower, ci_upper, and converged and return named numeric vectors
    appended to item_summary.
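
As a rough illustration of the custom-criterion item above, a callback
might look like the sketch below. The argument names come from the list
above, but the exact calling convention (named arguments, one call per
item parameter) is an assumption, and median_abs_error is a hypothetical
name. The function is exercised here on invented values rather than
through summary().

```r
# Hypothetical callback; `...` absorbs arguments it does not use
# (ci_lower, ci_upper). Returns a named numeric vector of length one.
median_abs_error <- function(estimates, true_value, converged, ...) {
  ok <- converged & !is.na(estimates)   # keep converged replications only
  c(median_abs_error = median(abs(estimates[ok] - true_value)))
}

# Stand-alone check with invented replications for one item parameter.
est  <- c(0.9, 1.2, 1.05, NA, 0.8)
conv <- c(TRUE, TRUE, TRUE, FALSE, TRUE)
res  <- median_abs_error(
  estimates = est, true_value = 1.0, converged = conv,
  ci_lower = est - 0.3, ci_upper = est + 0.3
)
```

Returning a named vector matters because, per the entry above, the
names identify the values appended to item_summary.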

Model misspecification

  - irt_study(estimation_model = ...) allows fitting a different IRT
    model than the one used to generate data (e.g., generate 2PL,
    fit 1PL). Compatible cross-model pairs are (1PL, 2PL) and
    (2PL, 1PL); same-model pairs are also supported. GRM is not
    cross-compatible with dichotomous models.

Parallelization

  - irt_simulate(parallel = TRUE) dispatches iterations across workers
    via future.apply::future_lapply().
  - Reproducibility contract: within a mode (that is, for a given
    parallel setting), re-runs are guaranteed to give identical results.
    Results differ across modes because serial execution uses
    Mersenne-Twister and parallel execution uses L'Ecuyer-CMRG
    substreams; both are statistically valid.
  - Users control backend via future::plan().
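
The within-mode versus cross-mode distinction above can be seen in base
R alone. The sketch below is only an analogy: it does not call
irt_simulate() or future.apply, and it switches the generator by hand to
show that the same seed reproduces draws under one generator but not
across the two.

```r
# Base-R illustration only; not the package's seeding code.
old_kind <- RNGkind()  # remember the current generator settings

RNGkind("Mersenne-Twister")
set.seed(42)
serial_a <- runif(3)
set.seed(42)
serial_b <- runif(3)

RNGkind("L'Ecuyer-CMRG")
set.seed(42)
cmrg_a <- runif(3)
set.seed(42)
cmrg_b <- runif(3)

# Restore the generator that was active before the demonstration.
RNGkind(old_kind[1], old_kind[2], old_kind[3])

same_within_serial <- identical(serial_a, serial_b)  # same kind, same seed
same_within_cmrg   <- identical(cmrg_a, cmrg_b)      # same kind, same seed
same_across_kinds  <- identical(serial_a, cmrg_a)    # different generators
```

Both sequences are valid uniform draws; they differ only because the
underlying generators differ, which is the sense in which cross-mode
results are not expected to match.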

User experience

  - cli::cli_progress_bar() replaces cat()-based progress reporting
    (suppressible with progress = FALSE).
  - Structured cli::cli_abort() error messages with valid-option
    enumerations for invalid model, criterion, missing mechanism, and
    estimation_model arguments.

Documentation

  - Five vignettes reproduce or extend the three examples from
    Schroeders and Gnambs (2025):
      - Paper Example 1 — faithful reproduction of the linked-test
        design with 1PL estimation.
      - Paper Example 1b — extension showing bias-variance tradeoff when
        a 2PL-generated dataset is fit with a 1PL model.
      - Paper Example 2 — MCAR-only partial reproduction with
        custom-criterion-callback feature demonstration.
      - Paper Example 3 — GRM item parameter recovery partial
        reproduction.
      - Paper reproduction status — scorecard documenting what the
        current API can and cannot reproduce end-to-end.
  - Vignettes are shipped as static HTML via R.rsp::asis because
    re-running the Monte Carlo simulations during package checks would
    exceed CRAN's build-time budget. The source .Rmd files and
    data-raw/precompute_vignettes.R are available in the GitHub
    repository for users who wish to reproduce results locally.

Dependencies

  - Imports: cli, future.apply, ggplot2, mirt, rlang
  - Suggests: future, knitr, R.rsp, rmarkdown, scales, testthat

Reference

Schroeders, U., and Gnambs, T. (2025). Sample size planning in item
response theory: A 10-decision framework. Advances in Methods and
Practices in Psychological Science.
https://doi.org/10.1177/25152459251314798