| Title: | Tidy Schema Validation for Data Frames |
|---|---|
| Description: | Validate data.frames against schemas to ensure that data matches expectations. Define schemas using 'tidyselect' and predicate functions for type consistency, nullability, and more. Schema failure messages can be tailored for non-technical users and are ideal for user-facing applications such as in 'shiny' or 'plumber'. |
| Authors: | Will Hipson [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-3931-2189>) |
| Maintainer: | Will Hipson <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.2 |
| Built: | 2026-05-17 06:38:21 UTC |
| Source: | https://github.com/whipson/schematic |
Validate a data.frame against a schema
check_schema(data, schema)
| Argument | Description |
|---|---|
| data | A data.frame to check |
| schema | A Schema object created with 'schema()' |
Returns invisibly if validation passes; otherwise stops with an error.
my_schema <- schema(
  mpg ~ is.numeric
)
check_schema(mtcars, my_schema)
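Because a failed check stops with an error, callers in a user-facing app will usually want to catch it. The following is a minimal sketch, assuming the 'schematic' package is installed. The failing schema and the `result` variable are illustrative, and the exact wording of the error message is left unspecified because it is produced by the package.

```r
library(schematic)

# Illustrative schema that mtcars cannot satisfy: mpg is numeric, not character
bad_schema <- schema(
  mpg ~ is.character
)

# check_schema() stops with an error on failure, so wrap it in tryCatch()
result <- tryCatch(
  {
    check_schema(mtcars, bad_schema)
    "ok"
  },
  error = function(e) conditionMessage(e)
)

# `result` holds the failure message as a character string
print(result)
```

In a 'shiny' or 'plumber' handler, the captured message could then be shown to the user instead of letting the error propagate.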
Check if all values in a vector are distinct
is_all_distinct(x)
| Argument | Description |
|---|---|
| x | A vector |
TRUE if the vector has all unique values
is_all_distinct(c(1:5)) # TRUE
is_all_distinct(c(1, 1, 2)) # FALSE
'NA's are not ignored and any vector with 'NA's will fail unless the whole vector is 'NA'.
is_incrementing(x)
| Argument | Description |
|---|---|
| x | A vector |
TRUE if the vector is sorted
is_incrementing(1:5) # TRUE
is_incrementing(letters[1:5]) # TRUE
is_incrementing(c(4, 3, 0)) # FALSE
Check if all values are not NA
is_non_null(x)
| Argument | Description |
|---|---|
| x | A vector |
TRUE if the vector has no NA values
is_non_null(1:5) # TRUE
is_non_null(c(1, NA, 3)) # FALSE
A positive integer is a whole number that is greater than 0.
is_positive_integer(x)
| Argument | Description |
|---|---|
| x | A vector |
This check requires 'is.integer(x)' to be true. If you want a more flexible check that allows for numbers of type 'numeric' but still want them to be integers, then use 'is_whole_number()'.
'NA's are ignored as long as they are 'NA_integer_'.
TRUE if all elements are positive integers (NA ignored)
is_positive_integer(c(1L, 2L, 4L)) # TRUE
is_positive_integer(2.4) # FALSE
is_positive_integer(-3) # FALSE
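The difference between a strict integer check and a whole-number check comes down to how R stores numbers. A short base R sketch (no package functions involved; all variable names are illustrative) shows why a value written as `2` is not an integer as far as 'is.integer()' is concerned:

```r
x_dbl <- c(1, 2, 4)     # plain numeric literals are stored as double
x_int <- c(1L, 2L, 4L)  # the L suffix requests integer storage

is.integer(x_dbl)  # FALSE
is.integer(x_int)  # TRUE

# Both vectors hold whole numbers, regardless of storage type
whole_dbl <- all(x_dbl == round(x_dbl))
whole_int <- all(x_int == round(x_int))

# Convert explicitly when integer storage is required
x_conv <- as.integer(x_dbl)
is.integer(x_conv)  # TRUE
```

Given the requirement described above, a double vector such as `x_dbl` would be expected to fail 'is_positive_integer()' even though every element is whole, which is the case 'is_whole_number()' is meant to cover.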
'NA's are ignored as long as they are 'NA_character_'.
is_text(x)
| Argument | Description |
|---|---|
| x | A vector |
TRUE if vector is either character or factor
is_text(letters[1:4]) # TRUE
is_text(as.factor(letters[1:4])) # TRUE
is_text(1) # FALSE
Similar to 'is_positive_integer()' but without the constraint that the underlying data type is actually integer. Useful if the numbers are stored as 'numeric' but you want to check that they are whole.
is_whole_number(x)
| Argument | Description |
|---|---|
| x | A vector |
'NA's are ignored.
TRUE if all elements are whole numbers (NA ignored)
is_whole_number(c(2.0, 4.0)) # TRUE
is_whole_number(c(-1.4)) # FALSE
Predicates that error will store the error messages internally and these can be accessed here.
last_check_errors()
The stored error messages
last_check_errors()
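To see how this might be used, the sketch below runs a check with a predicate that deliberately raises an error and then inspects what was recorded. It assumes the 'schematic' package is installed; the erroring predicate and the `outcome` and `errs` variables are illustrative, and the shape of the returned value is not documented here, so it is only printed.

```r
library(schematic)

# Illustrative schema whose predicate raises an error
# instead of returning TRUE or FALSE
my_schema <- schema(
  mpg ~ function(x) stop("predicate could not be evaluated")
)

# try() keeps the script running whether or not check_schema() stops
outcome <- try(check_schema(mtcars, my_schema), silent = TRUE)

# Inspect whatever messages were recorded by the check
errs <- last_check_errors()
print(errs)
```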
This modifies a predicate function to ignore Inf.
mod_infinitable(pred)
| Argument | Description |
|---|---|
| pred | A predicate function |
A new predicate that ignores infinite values
# The `is_incrementing` predicate will fail here
x <- c(1, Inf, 3)
is_incrementing(x) # FALSE
is_incrementing_inf <- mod_infinitable(is_incrementing)
is_incrementing_inf(x) # TRUE
This modifies a predicate function to ignore NAs.
mod_nullable(pred)
| Argument | Description |
|---|---|
| pred | A predicate function |
A new predicate that allows NAs
# The `is_incrementing` predicate will fail if there are NAs
x <- c(1, NA, 3)
is_incrementing(x) # FALSE
is_incrementing_null <- mod_nullable(is_incrementing)
is_incrementing_null(x) # TRUE
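Modifiers of this kind take a predicate and return a new predicate. The general pattern can be sketched in base R as below. This is not the package's implementation: `is_sorted_strict()` and `my_nullable()` are hypothetical stand-ins, and dropping the 'NA's before delegating is an assumption about one reasonable way to "ignore" them.

```r
# Hypothetical predicate: TRUE only for a sorted vector with no NAs
is_sorted_strict <- function(x) !anyNA(x) && !is.unsorted(x)

# Hypothetical modifier: returns a new predicate that drops NAs
# before delegating to the original predicate
my_nullable <- function(pred) {
  function(x) pred(x[!is.na(x)])
}

x <- c(1, NA, 3)

# The strict predicate rejects the vector because of the NA
strict_result <- is_sorted_strict(x)

# The wrapped predicate only sees c(1, 3), which is sorted
is_sorted_nullable <- my_nullable(is_sorted_strict)
nullable_result <- is_sorted_nullable(x)
```

Because the wrapper leaves the original predicate untouched, the strict and the relaxed versions can be used side by side.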
Print method for Schema
## S3 method for class 'Schema'
print(x, ...)
| Argument | Description |
|---|---|
| x | Object of class Schema |
| ... | Other arguments passed to 'print()' |
invisible
Create a schema object
schema(...)
| Argument | Description |
|---|---|
| ... | Formulae of the form tidyselect_expr ~ predicate |
A Schema object
# Simple schema with one declared column
my_schema <- schema(
  mpg ~ is.double
)
# Multiple columns
my_schema <- schema(
  Sepal.Length ~ is.numeric,
  Species ~ is.factor
)
# Use tidyselect syntax and anonymous functions
my_schema <- schema(
  starts_with("Sepal") ~ is.numeric,
  c(Petal.Length, Petal.Width) ~ function(x) all(x > 0)
)
# Use named arguments to customize error messages
my_schema <- schema(
  `Must be a positive number` = cyl ~ function(x) all(x > 0)
)
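A schema only does something once it is passed to 'check_schema()'. The sketch below combines named rules with a deliberately failing check so the customized message can be inspected. It assumes the 'schematic' package is installed; the rule names and the `msg` variable are illustrative, and the exact layout of the resulting message is left to the package, so it is only printed.

```r
library(schematic)

# Each argument name is meant to serve as the message for its rule
my_schema <- schema(
  `Sepal measurements must be numeric` = starts_with("Sepal") ~ is.numeric,
  `Species must be text` = Species ~ is.character
)

# iris$Species is a factor, so is.character() returns FALSE for it
# and the second rule is expected to fail
msg <- tryCatch(
  {
    check_schema(iris, my_schema)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

print(msg)
```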