Fixes two solver paths that could stall indefinitely on match_couples()
inputs with max_distance, calipers, or other forbidden-edge constraints.
These stalls caused the M1mac and linux-arm64 additional CRAN checks for
1.4.0 to hit the 1.5-hour test timeout.
Forbidden-cell marker is now Inf instead of a large finite value.
apply_max_distance(), apply_calipers(), and mark_forbidden_pairs()
previously wrote a large finite BIG_COST into forbidden cells. The
Jonker-Volgenant and small-n SSP solvers treated BIG_COST as a
regular expensive edge and could degenerate on sparse, near-square
inputs instead of short-circuiting on infeasibility. Switched to Inf
so the C++ solvers' non-finite check fires.
Auto-dispatch no longer routes sparse inputs through SSP for small n.
Previously lap_solve() with method = "auto" selected "sap"
(lap_solve_ssp) for sparse matrices with n <= 100. SSP has its own
worst-case stall on near-square, highly-sparse cost matrices. All
sparse inputs now go through lapmod regardless of size.
match_couples() now drops fully-forbidden rows/columns before LAP.
match_couples() and .couples_from_distance() route through a new
internal .solve_with_partial_feasibility() helper. It removes rows
and columns with no allowed edges before the LAP call and falls back
to greedy_matching() if the optimal solver still cannot find a
perfect matching on the feasibility-pruned submatrix (Hall's-condition
violation). Dropped rows/columns are returned as unmatched, preserving
the partial-matching semantics that tests with tight max_distance /
caliper constraints already expected.
jv_core: drop the same-pass reprocess in AUGMENTING ROW REDUCTION.
The reprocess could revisit a freshly-reduced row in the same pass and
delay convergence on degenerate inputs without changing the final
assignment.lap_animate() now covers every method that assignment() accepts.
Ten new step-by-step traces ship: auction_gs, ramshaw_tarjan,
ssap_bucket, hk01, csflow, cycle_cancel, push_relabel, csa,
orlin, network_simplex. animated_methods() returns all 20 method
strings.testthat suite (tests/testthat/test-trace-parity.R) on a
battery of small cost matrices including forbidden cells. Each frame's
matching is validated for in-range entries, no double-bookings, and no
use of forbidden edges; the final-frame total is compared to the C++
oracle within tolerance.R/trace_helpers_frame.R (make_frame(), make_meta(),
prepare_cost_work(), matching_total_cost(), validate_cost_input())
and R/trace_helpers_mcf.R (min-cost-flow graph, residual edges,
Dijkstra with Johnson potentials, Bellman-Ford, negative-cycle finder,
push/extract). Used by all min-cost-flow traces.prepare_cost_matrix.cpp: entries equal to +Inf were treated as
regular very-large costs rather than forbidden, which made cmax
become Inf and silently skipped the maximize flip. Result:
assignment(method = X, maximize = TRUE) on matrices containing
Inf returned the minimizing answer for any solver routing through
prepare_cost_matrix_impl (auction, auction_scaled, sap,
csflow, hk01, bruteforce). Now NA and any non-finite value are
marked forbidden consistently.lap_solve_orlin and lap_solve_network_simplex_wrapper: the
R-side wrapper used work[is.na(work)] <- Inf which missed the -Inf
produced by negating +Inf in maximize mode, letting forbidden cells
slip through as extreme-cost real edges. Fixed to
work[!is.finite(work)] <- Inf.network_simplex initial spanning tree: the greedy initialiser in
ns_init.h built a partial matching (any row that couldn't claim a
fresh column was left unmatched) and connected unmatched columns to row
0. The resulting starting basis violated flow conservation, and pivots
could not recover a perfect matching even when one existed - e.g. on a
5x5 cost matrix with two forbidden cells under maximize,
assignment(method = "network_simplex") returned an infeasible result
with one row unmatched. Fixed by adding an augmenting-path repair after
the greedy pass: every still-unmatched row runs BFS for an augmenting
path on the allowed-edge bipartite graph, extending the initial matching
to a perfect matching whenever one exists.method = "hungarian"
now uses the shortest-augmenting-path solver shared with JV; the original
O(n^4) Munkres implementation remains available as method = "munkres".
At n = 2000 the new Hungarian runs orders of magnitude faster than 1.3.2.auction and auction_gs.
Cleaner inner loop; no behaviour change.solve_auction_scaled collapsed into a thin wrapper over
scaled_params (~200 lines removed); behaviour identical.r > bn pruning is the algorithm, not a wart); added the 6n
pruning heuristic from p.9.paper/benchmark-table.csv and paper/scaling-results.csv re-measured
on the current development machine for n <= 2000 (per-method table) and
n_total <= 2000 (cross-package table). Larger-n rows in both files are
carried over from the previous machine and not directly comparable.test-lap-solve-batch-coverage.R. Debian r-devel, local r-release, and
local R CMD check --as-cran all pass; the crash did not reproduce off
win-builder.Config/testthat/parallel: true
removed from DESCRIPTION) to eliminate cross-file worker-state leakage as
a possible cause of the win-builder crash.skip_on_cran() at the top of
test-lap-solve-batch-coverage.R. Equivalent coverage is exercised
off-CRAN by test-lap-solve-batch-coverage-2.R,
test-lap-solve-batch-coverage-3.R, test-lap-solve-batch-extended.R,
test-batch-coverage-final.R, test-batch-processing.R, and
test-batch-kbest-extended.R.rbind(left, right). The pooled within-group estimator
((n_L-1)*S_L + (n_R-1)*S_R) / (n_L+n_R-2) is the convention used by
optmatch::match_on() and aligns Mahalanobis behaviour across the matching
packages a user is likely to compare against. Users who relied on the old
default can recover it explicitly with
match_couples(..., sigma = cov(rbind(left[, vars], right[, vars]))).
The previous docstring already documented the default as "pooled
covariance"; this release makes the code match the documentation.full_match() gains method = "optimal" (new default) using a min-cost
max-flow solver (Dijkstra + Johnson potentials) that finds the globally
optimal group assignment minimizing total distance:
min_controls per groupn_left > n_rightsolve_full_matching.cpp (self-contained MCMF)method = "greedy" preserved for fast approximate matchingfull_match() examplefull_match() function assigns every unit to a matched group
with variable ratios (1:k or k:1):
caliper (absolute) or caliper_sd (SD-based)min_controls, max_controlsfull_matching_result S3 classcem_match() function implements coarsened exact matching:
grouping parametercutpoints parametercem_result S3 class with matched units and strata summarysubclass_match() function divides units into propensity score
strata:
subclass_result S3 class with subclass summarymatch_data() generic converts any couplr result to analysis-ready
format with treatment, weights, subclass, and distance columns.
Methods for all result types (matching, full, CEM, subclass).as_matchit() converter creates matchit-class objects from couplr
results, enabling interop with cobalt, marginaleffects, and other MatchIt
ecosystem packages.bal.tab() methods for all couplr result types. Requires
cobalt package (in Suggests).rcond() instead of fragile det() == 0sigma parameter in match_couples(), greedy_couples(), and
compute_distance_matrix() for user-supplied covariance matricesbalance_diagnostics() and join_matched() are now S3 generics with
methods for all result types. Existing code is 100% backward-compatible.full_match() - Variable-ratio full matchingcem_match() - Coarsened exact matchingsubclass_match() - Propensity score subclassificationmatch_data() - Unified analysis-ready outputas_matchit() - Convert to MatchIt formatratio parameter in match_couples() and
greedy_couples(). Matches k control units to each treated unit by
replicating the cost matrix, then deduplicates assignments.replace parameter. Each treated unit
independently selects its nearest control, allowing controls to be reused
across multiple treated units.ps_match() function wraps match_couples() with logistic regression:
glm objectcardinality_match() function maximizes sample size subject to
balance constraints:
max_std_diff (default: 0.1 for excellent balance)batch_fractionsensitivity_analysis() function implements Rosenbaum bounds:
print(), summary(), plot()autoplot() methods for ggplot2-based visualizations (requires ggplot2):
autoplot.matching_result(): histogram, density, or ecdf of distancesautoplot.balance_diagnostics(): love plot, histogram, or variance ratio plotautoplot.sensitivity_analysis(): gamma vs p-value curvesummary.matching_result() now reports match rate and distance
percentilesps_match() - Propensity score matching with logit calipercardinality_match() - Balance-constrained cardinality matchingsensitivity_analysis() - Rosenbaum bounds sensitivity analysisselect() in vignettes by using explicit
dplyr::select() to prevent masking by MASS or other packagesThe package now includes intelligent preprocessing to improve matching quality:
auto_scale parameter in match_couples() and greedy_couples() enables automatic preprocessingpreprocess_matching_vars() function for manual preprocessing controlComprehensive tools to assess matching quality:
balance_diagnostics() function computes multiple balance metrics:
balance_table() creates publication-ready formatted tablesCreate analysis-ready datasets directly from matching results:
join_matched() function automates data preparation:
left_vars and right_vars parameters_left, _right) for overlapping columnspair_id, distance, block_idaugment() method for tidymodels integration:
join_matched() parametersinclude_distance - Include/exclude matching distanceinclude_pair_id - Include/exclude sequential pair IDsinclude_block_id - Include/exclude block identifiersleft_id and right_idPerformance optimization for exploring multiple matching strategies:
compute_distances() function precomputes and caches distance matrices:
join_matched()distance_object):
match_couples() and greedy_couples()match_couples(dist_obj, max_distance = 5)update_constraints():
max_distance or calipers without recomputing distancesmatch_couples(left, right = NULL, vars = NULL, ...)Speed up blocked matching with multi-core processing:
parallel parameter in match_couples() and greedy_couples():
parallel = TRUE for automatic configurationparallel = "multisession" or other future planfuture package:
Like testthat, couplr makes errors light, memorable, and helpful with couple-themed messages:
check_costs parameter (default: TRUE) in match_couples() and greedy_couples():
FALSE to skip checks in production codediagnose_distance_matrix():
options(couplr.emoji = FALSE) if preferredpreprocess_matching_vars() - Main preprocessing orchestratorbalance_diagnostics() - Comprehensive balance assessmentbalance_table() - Formatted balance tables for reportingjoin_matched() - Create analysis-ready datasets from matching resultsaugment.matching_result() - Broom-style interface for joined datacompute_distances() - Precompute and cache distance matricesupdate_constraints() - Modify constraints on distance objectsis_distance_object() - Type checking for distance objectsdiagnose_distance_matrix() - Comprehensive distance diagnosticscheck_cost_distribution() - Check for distribution problemsexamples/auto_scale_demo.R - 5 preprocessing demonstrationsexamples/balance_diagnostics_demo.R - 6 balance diagnostic examplesexamples/join_matched_demo.R - 8 joined dataset demonstrationsexamples/distance_cache_demo.R - Distance caching and reuse examplesexamples/parallel_matching_demo.R - 7 parallel processing examplesexamples/error_messages_demo.R - 10 fun error message demonstrationsThe package has been renamed from lapr to couplr to better reflect its purpose as a general pairing and matching toolkit.
couplr = Optimal pairing and matching via linear assignment
First official stable release with clean, well-organized codebase.
morph_* naming prefixassignment() (low-level) + lap_solve() (tidy)src/core/ - Utilities and headerssrc/interface/ - Rcpp exportssrc/solvers/ - 14 LAP algorithmssrc/gabow_tarjan/ - Gabow-Tarjan solversrc/morph/ - Image morphingHungarian, Jonker-Volgenant, Auction (3 variants), SAP/SSP, SSAP-Bucket, Cost-scaling, Cycle-cancel, Gabow-Tarjan, Hopcroft-Karp, Line-metric, Brute-force, Auto-select
✅ Tidy tibble interface
✅ Matrix & data frame inputs
✅ Grouped data frames
✅ Batch solving + parallelization
✅ K-best solutions (Murty, Lawler)
✅ Rectangular matrices
✅ Forbidden assignments (NA/Inf)
✅ Maximize/minimize
✅ Pixel morphing visualization
lap_solve() - Main tidy interfacelap_solve_batch() - Batch solvinglap_solve_kbest() - K-best solutionsassignment() - Low-level solverget_total_cost(), as_assignment_matrix(), etc.pixel_morph(), pixel_morph_animate()Development history under "lapr" available in git log before v1.0.0.