--- title: "Coverage and topology" author: "Gilles Colling" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Coverage and topology} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} has_sf <- requireNamespace("sf", quietly = TRUE) knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = has_sf ) ``` The per-feature verbs in [Streaming spatial operations](streaming-spatial.html) treat each geometry on its own. A second family of verbs reads a whole feature set at once: the constructions and topology operations that need every feature in a group together, not one row at a time. These ride a shared partition tier. A `by` argument routes the layer into one shard per group, each shard is processed as an independent coverage, and the shards stream so the resident budget is one group rather than the whole layer. This vignette covers building polygons from lines, cleaning a polygon coverage, decomposing geometry into its parts, set-wise constructions, and linear referencing. Every block runs on the optional [sf](https://r-spatial.github.io/sf/) package. ```{r libs, message = FALSE} library(vectra) library(sf) ``` ## From lines to polygons `spatial_polygonize()` builds the polygonal faces enclosed by a line network, the inverse of taking polygon boundaries. A group's lines are unioned and noded, then the faces of that arrangement are returned, one per row. ```{r polygonize} grid <- st_sfc( st_linestring(rbind(c(0, 0), c(2, 0))), st_linestring(rbind(c(0, 1), c(2, 1))), st_linestring(rbind(c(0, 2), c(2, 2))), st_linestring(rbind(c(0, 0), c(0, 2))), st_linestring(rbind(c(1, 0), c(1, 2))), st_linestring(rbind(c(2, 0), c(2, 2)))) f <- tempfile(fileext = ".vtr") write_vtr(data.frame(geometry = st_as_binary(grid, hex = TRUE)), f) tbl(f) |> spatial_polygonize() |> collect_sf() ``` `spatial_line_merge()` is the line counterpart of a dissolve: it sews segments that meet end to end into maximal linestrings, one row per chain. Segments meeting at a junction of degree greater than two stay separate. ```{r line-merge} seg <- st_sfc( st_linestring(rbind(c(0, 0), c(1, 0))), st_linestring(rbind(c(1, 0), c(2, 0))), st_linestring(rbind(c(2, 0), c(3, 0)))) g <- tempfile(fileext = ".vtr") write_vtr(data.frame(geometry = st_as_binary(seg, hex = TRUE)), g) tbl(g) |> spatial_line_merge() |> collect_sf() ``` ## Cleaning a coverage Real coverages arrive with jittered shared borders, sub-pixel slivers, and more detail than a map needs. Three verbs clean them while keeping adjacent polygons edge-matched. `spatial_snap_grid()` rounds coordinates to a regular grid and repairs the result, one batch at a time. It is the fixed-precision snap-rounding that `spatial_overlay()` applies internally, exposed as a standalone verb, so a layer can be pre-noded to a common precision without running a full overlay. ```{r snap-grid} p <- st_polygon(list(rbind(c(0.04, 0.03), c(1.02, 0.01), c(0.98, 1.03), c(0.01, 0.97), c(0.04, 0.03)))) h <- tempfile(fileext = ".vtr") write_vtr(data.frame(id = 1L, geometry = st_as_binary(st_sfc(p), hex = TRUE)), h) tbl(h) |> spatial_snap_grid(0.1) |> collect_sf() ``` `spatial_snap()` instead pulls a streamed layer onto a resident reference layer: vertices and edges within `tolerance` of `y` are moved onto it, so two layers that should share a boundary line up exactly. ```{r snap} ref <- st_sfc(st_linestring(rbind(c(0, 0), c(10, 0)))) line <- st_linestring(rbind(c(0, 0.2), c(5, 0.1), c(10, 0.2))) sn <- tempfile(fileext = ".vtr") write_vtr(data.frame(id = 1L, geometry = st_as_binary(st_sfc(line), hex = TRUE)), sn) tbl(sn) |> spatial_snap(ref, tolerance = 0.5) |> collect_sf() ``` `spatial_eliminate()` cleans a polygon coverage by absorbing every feature smaller than `max_area` into a neighbour, the QGIS "Eliminate". Each sliver joins the neighbour it shares the longest border with, or the largest-area neighbour with `into = "largest_area"`. An area-rooted union-find collapses chains of slivers so a connected run flows to its single largest member, whose attributes survive. ```{r eliminate} big <- st_polygon(list(rbind( c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0)))) sliver <- st_polygon(list(rbind( c(10, 0), c(10.3, 0), c(10.3, 10), c(10, 10), c(10, 0)))) e <- tempfile(fileext = ".vtr") write_vtr(data.frame( id = c("keep", "sliver"), geometry = st_as_binary(st_sfc(big, sliver), hex = TRUE)), e) tbl(e) |> spatial_eliminate(max_area = 5) |> collect_sf() ``` The thin sliver is gone, absorbed into the square it bordered, and the survivor keeps the `keep` attributes. `spatial_simplify()` simplifies a polygon coverage without tearing shared edges. Boundaries are unioned so a shared border is one line, noded into arcs, each arc simplified once with its junction endpoints pinned, then re-polygonized. Adjacent polygons stay edge-matched with no slivers. This is the topology-preserving simplification a per-feature `spatial_map(~ sf::st_simplify())` cannot give, because that simplifies each polygon's copy of a shared border independently. ```{r simplify} p1 <- st_polygon(list(rbind( c(0, 0), c(1, 0), c(1, 0.5), c(1, 1), c(0, 1), c(0, 0)))) p2 <- st_polygon(list(rbind( c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0.5), c(1, 0)))) s <- tempfile(fileext = ".vtr") write_vtr(data.frame( id = c("a", "b"), geometry = st_as_binary(st_sfc(p1, p2), hex = TRUE)), s) tbl(s) |> spatial_simplify(tolerance = 0.6) |> collect_sf() ``` ## Decomposing geometry `spatial_explode()` splits every multipart geometry into its single-part components, a `MULTIPOLYGON` into one row per polygon and so on, copying the source attributes onto each part. An optional `part` argument names a 1-based part-index column. It is the streaming "multipart to singleparts" tool and the inverse of `spatial_dissolve()`. ```{r explode} mp <- st_multipolygon(list( list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))), list(rbind(c(2, 2), c(3, 2), c(3, 3), c(2, 3), c(2, 2))))) m <- tempfile(fileext = ".vtr") write_vtr(data.frame( id = 1L, geometry = st_as_binary(st_sfc(mp), hex = TRUE)), m) tbl(m) |> spatial_explode(part = "part_id") |> collect_sf() ``` `spatial_topology()` decomposes a polygon coverage into the arcs of its planar topology. The unioned boundaries are noded so a shared border becomes one arc, tagged with the identifiers of the (up to two) polygons on either side: two for an internal shared edge, one for an outer edge. It is the inverse of `spatial_polygonize()`. ```{r topology} q1 <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))) q2 <- st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0)))) tp <- tempfile(fileext = ".vtr") write_vtr(data.frame( id = c("a", "b"), geometry = st_as_binary(st_sfc(q1, q2), hex = TRUE)), tp) tbl(tp) |> spatial_topology(id = "id") |> collect() ``` The shared edge between `a` and `b` appears once, with both face columns filled; each outer edge appears once with the second face `NA`. `spatial_centerline()` traces the centerline (medial axis) of each polygon from the Voronoi diagram of its densified boundary: the Voronoi edges that fall inside the polygon are its skeleton, merged into maximal lines. `density` sets the boundary sampling and `prune` drops the short spurs the skeleton grows toward convex corners. It is the usual approximation for a river or road centerline from a filled shape; non-polygon geometry passes through unchanged. ```{r centerline} strip <- st_polygon(list(rbind( c(0, 0), c(10, 0), c(10, 2), c(0, 2), c(0, 0)))) cl <- tempfile(fileext = ".vtr") write_vtr(data.frame(geometry = st_as_binary(st_sfc(strip), hex = TRUE)), cl) tbl(cl) |> spatial_centerline(density = 0.25, prune = 0.5) |> collect_sf() ``` ## Set-wise constructions `spatial_construct()` builds a geometry construction from a whole feature set, the constructions a per-feature `spatial_map()` cannot express. A `kind` argument selects it: `"convex_hull"`, `"concave_hull"`, `"envelope"`, `"oriented_box"`, `"enclosing_circle"`, `"inscribed_circle"`, `"pole"` (the pole of inaccessibility, the centre of the maximum inscribed circle), `"voronoi"`, and `"delaunay"`. A `by` argument builds one construction per group; the enclosing kinds emit one feature per group, the tessellations one polygon per cell. ```{r construct-hull} nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE) nc$band <- nc$SID74 > 5 n <- tempfile(fileext = ".vtr") write_vtr(data.frame( band = nc$band, geometry = st_as_binary(st_geometry(nc), hex = TRUE)), n) tbl(n) |> spatial_construct("convex_hull", by = "band", crs = st_crs(nc)) |> collect_sf() ``` A tessellation kind returns one polygon per cell, each carrying the group's `by` values: ```{r construct-voronoi} xy <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0.4, 0.6)) cloud <- st_sfc(lapply(seq_len(nrow(xy)), function(i) st_point(xy[i, ]))) v <- tempfile(fileext = ".vtr") write_vtr(data.frame(geometry = st_as_binary(cloud, hex = TRUE)), v) tbl(v) |> spatial_construct("voronoi") |> collect_sf() ``` ## Linear referencing `spatial_locate()` places streamed points along a resident line layer: each point gets its nearest line's identifier, the measure (distance along that line to the point's projection), and the perpendicular offset, with an optional `snap` onto the line. The inverse, a measure back to a point, is `sf::st_line_interpolate()` through `spatial_map()`. ```{r locate} roads <- st_sf(road = c("main", "side"), geometry = st_sfc( st_linestring(rbind(c(0, 0), c(10, 0))), st_linestring(rbind(c(0, 5), c(0, 15))))) pts <- data.frame(id = 1:2, x = c(3, 1), y = c(1, 9)) l <- tempfile(fileext = ".vtr") write_vtr(pts, l) tbl(l) |> spatial_locate(roads, coords = c("x", "y"), y_id = "road") |> collect() ``` Point 1 is one unit off `main` at measure 3; point 2 is one unit off `side` at measure 4. With `snap = TRUE` each point's geometry is replaced by its projection onto the matched line. ```{r cleanup, include = FALSE} unlink(c(f, g, h, sn, e, s, m, tp, cl, n, v, l)) ```