restrictR lets you define reusable input contracts from
small building blocks using the base pipe |>. A contract
is defined once and called like a function to validate data at runtime.
Validators are immutable: each |> returns a new
validator, so you can safely branch from a shared base without side
effects.

| Section | What you’ll learn |
|---|---|
| Reusable schemas | Define and reuse data.frame contracts |
| Dependent validation | Constraints that reference other arguments |
| Enum arguments | Restrict string arguments to a fixed set |
| Arbitrary classes | Validate factors, dates, and model objects |
| Data frame with mixed constraints | Columns + enums + ranges in one contract |
| Checking without stopping | Collect all failures; test without throwing |
| Custom steps | Domain-specific invariants |
| Self-documentation | Print, as_contract_text(), as_contract_block() |
| Using contracts in packages | The recommended pattern for R packages |

The most common use case: validating a newdata argument in a predict-like function. Instead of scattering if/stop() blocks, define the contract once:
require_newdata <- restrict("newdata") |>
  require_df() |>
  require_has_cols(c("x1", "x2")) |>
  require_col_numeric("x1", no_na = TRUE, finite = TRUE) |>
  require_col_numeric("x2", no_na = TRUE, finite = TRUE) |>
  require_nrow_min(1L)

The result is a callable function. Valid input passes silently:
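As a minimal sketch of the passing case, the call below applies the contract defined above to a small data frame that satisfies every step. The example values are illustrative, and the block assumes the restrictR package is installed.

```r
library(restrictR)

# Contract from above, repeated so the block runs on its own
require_newdata <- restrict("newdata") |>
  require_df() |>
  require_has_cols(c("x1", "x2")) |>
  require_col_numeric("x1", no_na = TRUE, finite = TRUE) |>
  require_col_numeric("x2", no_na = TRUE, finite = TRUE) |>
  require_nrow_min(1L)

# Illustrative valid input: both columns numeric, no NA, at least one row
ok <- data.frame(x1 = c(1, 2), x2 = c(3, 4))

# No error is raised, so execution simply continues
require_newdata(ok)
```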
Invalid input produces a structured error with the exact path and position:
require_newdata(data.frame(x1 = c(1, NA), x2 = c(3, 4)))
#> Error:
#> ! newdata$x1: must not contain NA
#> At: 2

require_newdata(data.frame(x1 = c(1, 2), x2 = c("a", "b")))
#> Error:
#> ! newdata$x2: must be numeric, got character

Every error follows the same format: path: message, optionally followed by Found: and At: lines. This makes errors instantly recognizable and grep-friendly.
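One way to use the fixed format programmatically is to capture the error and search its message for the path prefix. This is a sketch using base R's tryCatch() and conditionMessage(). It assumes the condition message carries the same path: message text that is printed above, which this text does not state explicitly.

```r
library(restrictR)

# Shortened version of the contract above, enough to trigger the NA error
require_newdata <- restrict("newdata") |>
  require_df() |>
  require_has_cols(c("x1", "x2")) |>
  require_col_numeric("x1", no_na = TRUE, finite = TRUE)

bad <- data.frame(x1 = c(1, NA), x2 = c(3, 4))

# Capture the error message instead of stopping; NA means no error occurred
msg <- tryCatch(
  {
    require_newdata(bad)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

# fixed = TRUE matters here: "$" would otherwise be read as a regex anchor
grepl("newdata$x1", msg, fixed = TRUE)
```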
Some contracts depend on context. A prediction vector must have the
same length as the rows in newdata:
require_pred <- restrict("pred") |>
  require_numeric(no_na = TRUE, finite = TRUE) |>
  require_length_matches(~ nrow(newdata))

The formula ~ nrow(newdata) declares a dependency on newdata. Pass it explicitly when calling the validator:

newdata <- data.frame(x1 = 1:5, x2 = 6:10)
require_pred(c(0.1, 0.2, 0.3, 0.4, 0.5), newdata = newdata)

Mismatched lengths produce a precise diagnostic:

require_pred(c(0.1, 0.2, 0.3), newdata = newdata)
#> Error:
#> ! pred: length must match nrow(newdata) (5)
#> Found: length 3

Missing context is caught before any checks run:

require_pred(c(0.1, 0.2, 0.3))
#> Error:
#> ! `pred` depends on: newdata. Pass newdata = ... when calling the validator.

Context can also be passed as a named list via .ctx:
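A minimal sketch of the .ctx form, reusing the validator and data from the preceding example. It assumes the list names must match the dependency names declared in the formula, here newdata.

```r
library(restrictR)

require_pred <- restrict("pred") |>
  require_numeric(no_na = TRUE, finite = TRUE) |>
  require_length_matches(~ nrow(newdata))

newdata <- data.frame(x1 = 1:5, x2 = 6:10)

# Same dependency as before, supplied as a named list through .ctx
require_pred(c(0.1, 0.2, 0.3, 0.4, 0.5), .ctx = list(newdata = newdata))
```

This form can be convenient when the context is assembled elsewhere and passed along as one object.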
For string arguments that must be one of a fixed set:
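The dedicated enum step is not reproduced in this text, so the sketch below expresses the same rule with require_custom() and fail(), both described later in this document. The names require_method and allowed_methods, and the allowed values themselves, are hypothetical.

```r
library(restrictR)

# Hypothetical set of allowed values
allowed_methods <- c("linear", "logistic", "poisson")

require_method <- restrict("method") |>
  require_custom(
    label = "must be one of the allowed methods",
    fn = function(value, name, ctx) {
      # A single non-missing string that appears in the allowed set
      ok <- is.character(value) && length(value) == 1L &&
        !is.na(value) && value %in% allowed_methods
      if (!ok) {
        fail(name,
             sprintf("must be one of: %s",
                     paste(allowed_methods, collapse = ", ")),
             found = paste(format(value), collapse = ", "))
      }
    }
  )

require_method("logistic")         # an allowed value
is_valid(require_method, "cubic")  # a value outside the set
```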
require_class() covers types without a dedicated check,
such as factors, dates, and fitted-model objects. By default it tests
inheritance, so a subclass passes; set exact = TRUE to
require the first class exactly.
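A short sketch of both modes follows. It assumes the class name is passed as the first argument of require_class(), which this text does not show. The validator names and the use of the built-in mtcars data are illustrative.

```r
library(restrictR)

# Assumption: the class name is the first argument of require_class()
require_start <- restrict("start") |>
  require_class("Date")

require_start(as.Date("2024-01-15"))

# Default mode tests inheritance
require_fit <- restrict("fit") |>
  require_class("lm")

# Exact mode looks at the first class only
require_fit_exact <- restrict("fit") |>
  require_class("lm", exact = TRUE)

lm_fit  <- lm(mpg ~ wt, data = mtcars)   # class "lm"
glm_fit <- glm(mpg ~ wt, data = mtcars)  # class c("glm", "lm")

is_valid(require_fit, lm_fit)
is_valid(require_fit, glm_fit)        # inherits from "lm"
is_valid(require_fit_exact, glm_fit)  # first class is "glm", not "lm"
```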
Contracts work well for functions that accept a data frame with typed columns, value ranges, and categorical fields in one go:
require_survey <- restrict("survey") |>
  require_df() |>
  require_has_cols(c("age", "income", "status")) |>
  require_col_numeric("age", no_na = TRUE) |>
  require_col_between("age", lower = 0, upper = 150) |>
  require_col_numeric("income", no_na = TRUE, finite = TRUE) |>
  require_col_one_of("status", c("active", "inactive", "pending"))

By default a validator stops at the first failing step. Pass .on_fail = "all" to run every step and collect all violations in one report:
messy_survey <- data.frame(
  age = c(25, -5, 200),
  income = c(35000, NA, 45000),
  status = c("active", "banned", "active")
)
require_survey(messy_survey, .on_fail = "all")
#> Error:
#> ! 3 validation failures:
#> survey$age: must be >= 0 and <= 150
#> Found: -5
#> At: 2, 3
#> survey$income: must not contain NA
#> At: 2
#> survey$status: must be one of ["active", "inactive", "pending"]
#> Found: "banned"
#> At: 2

To branch on validity in code, is_valid() returns a logical and validation_errors() returns the messages as a character vector, empty when the value passes:
# good_survey is not defined above; these illustrative rows satisfy every step
good_survey <- data.frame(
  age = c(25, 40),
  income = c(35000, 52000),
  status = c("active", "pending")
)
is_valid(require_survey, good_survey)
#> [1] TRUE

validation_errors(require_survey, messy_survey)
#> [1] "survey$age: must be >= 0 and <= 150\n Found: -5\n At: 2, 3"
#> [2] "survey$income: must not contain NA\n At: 2"
#> [3] "survey$status: must be one of [\"active\", \"inactive\", \"pending\"]\n Found: \"banned\"\n At: 2"

For domain-specific invariants that don’t belong in the built-in set, use require_custom(). The step function receives (value, name, ctx) and should call fail() on failure to produce the same structured errors as built-in steps:
require_weights <- restrict("weights") |>
  require_numeric(no_na = TRUE) |>
  require_between(lower = 0, upper = 1) |>
  require_custom(
    label = "must sum to 1",
    fn = function(value, name, ctx) {
      if (abs(sum(value) - 1) > 1e-8) {
        fail(name, "must sum to 1",
             found = sprintf("sum = %g", sum(value)))
      }
    }
  )

Custom steps can also declare dependencies:

require_probs <- restrict("probs") |>
  require_numeric(no_na = TRUE) |>
  require_custom(
    label = "length must match number of classes",
    deps = "n_classes",
    fn = function(value, name, ctx) {
      if (length(value) != ctx$n_classes) {
        fail(name, sprintf("expected %d probabilities", ctx$n_classes),
             found = sprintf("length %d", length(value)))
      }
    }
  )
require_probs(c(0.3, 0.7), n_classes = 2L)

Print a validator to see its full contract:
require_newdata
#> <restriction newdata>
#> 1. must be a data.frame
#> 2. must have columns: "x1", "x2"
#> 3. $x1 must be numeric (no NA, finite)
#> 4. $x2 must be numeric (no NA, finite)
#> 5. must have at least 1 row

Use as_contract_text() to generate a one-line summary for roxygen @param:

as_contract_text(require_newdata)
#> [1] "Must be a data.frame. Must have columns: \"x1\", \"x2\". $x1 must be numeric (no NA, finite). $x2 must be numeric (no NA, finite). Must have at least 1 row."

Use as_contract_block() for multi-line output suitable for @details:
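A minimal sketch of the call, assuming as_contract_block() takes the validator as its only required argument in the same way as as_contract_text(). The exact layout of the result is not reproduced in this text, so no output is shown.

```r
library(restrictR)

# Contract from above, repeated so the block runs on its own
require_newdata <- restrict("newdata") |>
  require_df() |>
  require_has_cols(c("x1", "x2")) |>
  require_col_numeric("x1", no_na = TRUE, finite = TRUE) |>
  require_col_numeric("x2", no_na = TRUE, finite = TRUE) |>
  require_nrow_min(1L)

# Multi-line description of the contract, intended for a roxygen @details section
block <- as_contract_block(require_newdata)
block
```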
The recommended pattern: define contracts near the top of the file
that uses them, or in a dedicated R/contracts.R if several
files share the same validators. Call them at the top of exported
functions.
# R/contracts.R
require_newdata <- restrict("newdata") |>
  require_df() |>
  require_has_cols(c("x1", "x2")) |>
  require_col_numeric("x1", no_na = TRUE, finite = TRUE) |>
  require_col_numeric("x2", no_na = TRUE, finite = TRUE) |>
  require_nrow_min(1L)  # keeps the contract in line with the @param text below

require_pred <- restrict("pred") |>
  require_numeric(no_na = TRUE, finite = TRUE) |>
  require_length_matches(~ nrow(newdata))

# R/predict.R
#' Predict from a fitted model
#'
#' @param object A fitted model object.
#' @param newdata Must be a data.frame. Must have columns: "x1", "x2". $x1 must be numeric (no NA, finite). $x2 must be numeric (no NA, finite). Must have at least 1 row.
#' @param ... additional arguments passed to the underlying model.
#'
#' @export
my_predict <- function(object, newdata, ...) {
  # Validate the input before doing any work
  require_newdata(newdata)
  pred <- do_prediction(object, newdata)
  # Validate the output against the input it was computed from
  require_pred(pred, newdata = newdata)
  pred
}

Contracts compose naturally with the pipe and branch safely (each |> creates a new validator):
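The sketch below branches two validators from one shared base, using only steps that appear earlier in this document. The names require_base, require_prob, and require_percent are hypothetical.

```r
library(restrictR)

# Shared base: any numeric vector without NA
require_base <- restrict("x") |>
  require_numeric(no_na = TRUE)

# Two branches extend the base independently
require_prob    <- require_base |> require_between(lower = 0, upper = 1)
require_percent <- require_base |> require_between(lower = 0, upper = 100)

x <- c(0.5, 42)

# The base carries no range step, so neither branch has altered it
is_valid(require_base, x)
is_valid(require_prob, x)     # 42 lies outside [0, 1]
is_valid(require_percent, x)  # both values lie inside [0, 100]
```

Because each branch is a separate object, adding a step to one leaves the base and the other branch as they were.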
The checkmate
package covers similar ground with a different emphasis. Its
assert*/check*/test* families
provide fast, C-backed checks that you call inline, one per argument,
where the check is needed. restrictR instead lets you name
a contract once as a |> chain and reuse that callable
validator across functions, compose and branch it immutably, and have it
print and document itself via as_contract_text(). Use
checkmate when you want quick inline assertions; use
restrictR when the same contract recurs across functions
and you want a single definition that also serves as documentation. The
two interoperate: a checkmate assertion can live inside a
require_custom() step.
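As a sketch of that interoperation, the step below delegates the actual test to checkmate::check_number(), which returns TRUE on success and a message string otherwise, and forwards any message to fail(). It assumes checkmate is installed and that fail() can be called without the found argument. The name require_rate is hypothetical.

```r
library(restrictR)

require_rate <- restrict("rate") |>
  require_custom(
    label = "must be a single number in [0, 1]",
    fn = function(value, name, ctx) {
      # check_number() returns TRUE on success, otherwise a message string
      res <- checkmate::check_number(value, lower = 0, upper = 1)
      if (!isTRUE(res)) {
        fail(name, res)
      }
    }
  )

require_rate(0.25)
is_valid(require_rate, 1.5)
```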
sessionInfo()
#> R version 4.6.1 (2026-06-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 26.04 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.32.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] restrictR_0.2.0 rmarkdown_2.31
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 R6_2.6.1 fastmap_1.2.0 xfun_0.59
#> [5] maketools_1.3.2 cachem_1.1.0 knitr_1.51 htmltools_0.5.9
#> [9] buildtools_1.0.0 lifecycle_1.0.5 cli_3.6.6 svglite_2.2.2
#> [13] sass_0.4.10 textshaping_1.0.5 jquerylib_0.1.4 systemfonts_1.3.2
#> [17] compiler_4.6.1 sys_3.4.3 tools_4.6.1 evaluate_1.0.5
#> [21] bslib_0.11.0 yaml_2.3.12 otel_0.2.0 jsonlite_2.0.0
#> [25] rlang_1.2.0