Package: joinspy 0.8.3

joinspy: Diagnostic Tools for Data Frame Joins

Provides diagnostic tools for understanding and debugging data frame joins. Analyzes key columns before joining to detect duplicates, mismatches, encoding issues, and other common problems. Explains unexpected row count changes and provides safe join wrappers with cardinality enforcement. Concepts and diagnostics build on tidy data principles as described in Wickham (2014) <doi:10.18637/jss.v059.i10>.
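As a rough illustration of the kinds of problems named above (padded keys, duplicate keys, unmatched keys, and an unexpected row count), here is a minimal sketch in base R. It does not call joinspy, because this page lists the exported names but not their signatures. The tables `orders` and `customers` and their columns are hypothetical.

```r
# Base R only; this is NOT joinspy's API. All data and names are hypothetical.
orders <- data.frame(
  customer_id = c("A1", "A2 ", "A3", "A3"),  # "A2 " has a trailing space
  amount = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)
customers <- data.frame(
  customer_id = c("A1", "A2", "A3", "A3"),
  region = c("north", "south", "east", "west"),
  stringsAsFactors = FALSE
)

# Keys that change when leading/trailing whitespace is trimmed
has_padding <- orders$customer_id != trimws(orders$customer_id)

# Key values that occur more than once in each table
dup_orders <- unique(orders$customer_id[duplicated(orders$customer_id)])
dup_customers <- unique(customers$customer_id[duplicated(customers$customer_id)])

# Left-side keys with no exact match on the right
unmatched <- setdiff(orders$customer_id, customers$customer_id)

# Inner join: "A1" matches once, "A2 " matches nothing,
# and "A3" is duplicated on both sides, so it contributes 2 x 2 rows
joined <- merge(orders, customers, by = "customer_id")
nrow(joined)  # 5
```

Both input tables have four rows, yet the join returns five: one row is lost to the padded key and three are gained from the duplicated one. A pre-join key check is meant to surface exactly this kind of discrepancy before it propagates.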

Authors: Gilles Colling [aut, cre, cph]


# Install 'joinspy' in R:
install.packages('joinspy', repos = c('https://gcol33.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/gcol33/joinspy/issues

Pkgdown/docs site: https://gillescolling.com

Topics: data-wrangling, diagnostics, dplyr, joins

5.83 score | 3 stars | 9 scripts | 309 downloads | 20 exports | 2 dependencies

Last updated from: 735c74980c. Checks: 9 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 136
source / vignettes | OK | 181
linux-release-x86_64 | OK | 135
macos-release-arm64 | OK | 104
macos-oldrel-arm64 | OK | 77
windows-devel | OK | 85
windows-release | OK | 88
windows-oldrel | OK | 91
wasm-release | OK | 131

Exports: analyze_join_chain, check_cartesian, detect_cardinality, full_join_spy, get_log_file, inner_join_spy, is_join_report, join_diff, join_explain, join_repair, join_spy, join_strict, key_check, key_duplicates, last_report, left_join_spy, log_report, right_join_spy, set_log_file, suggest_repairs

Dependencies: cli, rlang

Common Join Problems
Trailing and leading whitespace | Case mismatches | Encoding and invisible characters | Empty strings masquerading as data | Factor keys | Near-matches and typos | Duplicate keys | NA keys | Type mismatches | Numeric keys with floating-point noise | Many-to-many explosions | No matches at all | Differently named key columns | Troubleshooting workflow | See Also

Last update: 2026-06-12
Started: 2026-01-14

Joins in Production
Assertions with key_check() | Silent Joins in Pipelines | Inspecting Reports Programmatically | Cardinality Guards | Testing Join Contracts with testthat | Logging and Audit Trails | Manual logging | Automatic logging | Reading JSON Logs Downstream | Diagnosing a Multi-Join Pipeline | Sampling for Large Datasets | A Complete Production Pattern | The Cost of Diagnostics

Last update: 2026-06-12
Started: 2026-03-31

Quick Start
String Diagnostics | Whitespace | A Quick Gate: key_check() | Keys with Different Names | Case Mismatches | Encoding and Invisible Characters | Near Matches | Type Mismatches | Combining Multiple Issues | Duplicate Keys | The JoinReport Object | print(), summary(), and plot() | Auto-Repair | Dry Run | Applying Repairs | Repairing a Single Table | What Repair Does Not Fix | Repair Suggestions from a Report | Row Count Predictions | Sampling Large Tables | Post-Join Diagnostics | Explaining Row Count Changes | Before/After Comparison | Safe Join Wrappers | Quiet Mode and Deferred Reports | Cardinality Enforcement | Advanced Features | Cartesian Product Detection | Multi-Table Join Chains | Backend Support | Logging and Audit Trails | Quick Reference | Further Reading

Last update: 2026-06-12
Started: 2026-01-23

Why Your Keys Don't Match
What a Join Actually Compares | What join_spy() Scans For | Whitespace | Case | Invisible characters and encodings | Empty strings | Types, factors, and numeric keys | Duplicates, NAs, and predicted row counts | Near matches | Compound keys | Scenario 1: The Excel Export | Scenario 2: Two Databases, Two Conventions | Scenario 3: The PDF Copy-Paste | Scenario 4: The Slowly Growing Mismatch | Scenario 5: Compound Keys | The Pattern

Last update: 2026-06-12
Started: 2026-03-31

Working with Backends
Auto-detection | From wrapper to engine call | Explicit override | Class preservation | Backend differences, demonstrated | Column-name collisions | Row ordering | NA keys match each other | join_strict(), .quiet, and the report flow | Diagnostics are backend-agnostic | Choosing a backend | See Also

Last update: 2026-06-12
Started: 2026-03-31