The {awdb} package provides functions for querying the four endpoints of
the Air and Water Database (AWDB) REST API maintained by the National Water
and Climate Center (NWCC) at the United States Department of Agriculture
(USDA). The endpoints are data, forecasts, reference-data, and metadata.
The package is extremely lightweight, with Rust via extendr doing most of
the heavy lifting to deserialize and flatten deeply nested JSON
responses. The package is also designed to support pretty printing of
tibbles if you import the {tibble} package.
You can install the release version of {awdb} from CRAN with:
install.packages("awdb")
Or you can get the development version from GitHub with:
# install.packages("pak")
pak::pak("kbvernon/awdb")
This package provides a separate function to query each endpoint at the USDA AWDB REST API:
Endpoint | Function |
---|---|
data | get_elements() |
forecasts | get_forecasts() |
reference-data | get_references() |
metadata | get_stations() |
Because the API does not provide for spatial queries, requests made
with an area of interest (aoi) first ask the API metadata endpoint for
all stations in the database and their spatial coordinates. The package
converts that set to an sf object, performs a spatial filter with the
aoi, and then sends another request for elements or forecasts at the
stations that fall inside the aoi.
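The spatial filtering step can be sketched with {sf} alone. This is only an illustration of the idea, not the package's internal code: the station table, its coordinates, and the bounding box below are made up, and the names stations_toy, stations_sf, aoi, and in_aoi are hypothetical.

```r
library(sf)

# Hypothetical station table standing in for the metadata response.
# The triplets and coordinates are invented for illustration.
stations_toy <- data.frame(
  station_triplet = c("A:UT:SNTL", "B:UT:SNTL", "C:ID:SNTL"),
  longitude = c(-111.40, -112.50, -111.30),
  latitude = c(42.00, 40.70, 41.90)
)

# Convert the plain table to an sf object with point geometries (WGS 84).
stations_sf <- st_as_sf(
  stations_toy,
  coords = c("longitude", "latitude"),
  crs = 4326
)

# A rectangular area of interest built from an assumed bounding box.
aoi <- st_as_sfc(st_bbox(
  c(xmin = -111.7, ymin = 41.6, xmax = -111.1, ymax = 42.5),
  crs = st_crs(4326)
))

# Keep only the stations that intersect the area of interest.
in_aoi <- st_filter(stations_sf, aoi)
in_aoi$station_triplet
```

The station triplets that survive the filter are what a follow-up request for elements or forecasts would be limited to.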
Find all AWDB stations around Bear Lake in northern Utah that measure soil moisture percent at various depths.
library(awdb)
library(sf)
library(tibble)
stations <- get_stations(bear_lake, elements = "SMS:*")

stations
#> Simple feature collection with 9 features and 14 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -111.6296 ymin: 41.68541 xmax: -111.1663 ymax: 42.4132
#> Geodetic CRS: WGS 84
#> # A tibble: 9 × 15
#> station_triplet station_id state_code network_code name dco_code county_name
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 374:UT:SNTL 374 UT SNTL Bug L… UT Rich
#> 2 484:ID:SNTL 484 ID SNTL Frank… UT Franklin
#> 3 1114:UT:SNTL 1114 UT SNTL Garde… UT Cache
#> 4 493:ID:SNTL 493 ID SNTL Giveo… ID Bear Lake
#> 5 1115:UT:SNTL 1115 UT SNTL Klond… UT Cache
#> 6 1013:UT:SNTL 1013 UT SNTL Templ… UT Cache
#> 7 823:UT:SNTL 823 UT SNTL Tony … UT Cache
#> 8 1113:UT:SNTL 1113 UT SNTL Tony … UT Cache
#> 9 1098:UT:SNTL 1098 UT SNTL Usu D… UT Rich
#> # ℹ 8 more variables: huc <chr>, elevation <dbl>, data_time_zone <dbl>,
#> # pedon_code <chr>, shef_id <chr>, begin_date <chr>, end_date <chr>,
#> # geometry <POINT [°]>
USDA NWCC refers to soil, snow, stream, and weather variables measured at AWDB stations as “elements.” Here we get snow water equivalent and soil moisture measurements around Bear Lake in early May of 2015.
elements <- get_elements(
  bear_lake,
  elements = c("WTEQ", "SMS:8"),
  awdb_options = set_options(
    begin_date = "2015-05-01",
    end_date = "2015-05-07"
  )
)

elements[c(
  "station_triplet",
  "element_code",
  "element_values"
)]
#> # A tibble: 10 × 3
#> station_triplet element_code element_values
#> <chr> <chr> <list>
#> 1 374:UT:SNTL WTEQ <tibble [7 × 2]>
#> 2 471:ID:SNTL WTEQ <tibble [7 × 2]>
#> 3 484:ID:SNTL WTEQ <tibble [7 × 2]>
#> 4 1114:UT:SNTL WTEQ <tibble [7 × 2]>
#> 5 493:ID:SNTL WTEQ <tibble [7 × 2]>
#> 6 1115:UT:SNTL WTEQ <tibble [7 × 2]>
#> 7 1013:UT:SNTL WTEQ <tibble [7 × 2]>
#> 8 823:UT:SNTL WTEQ <tibble [7 × 2]>
#> 9 1113:UT:SNTL WTEQ <tibble [7 × 2]>
#> 10 1098:UT:SNTL WTEQ <tibble [7 × 2]>

elements[["element_values"]][[1]]
#> # A tibble: 7 × 2
#> date value
#> <chr> <dbl>
#> 1 2015-05-01 2.1
#> 2 2015-05-02 1.1
#> 3 2015-05-03 0
#> 4 2015-05-04 0
#> 5 2015-05-05 0
#> 6 2015-05-06 0
#> 7 2015-05-07 0
These are time series, so the element values come in a list column
containing data.frames with at least date and value columns. Using
tidyr::unnest() is helpful for unpacking all of them.
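As a minimal sketch of that unpacking, the example below builds a small stand-in for the get_elements() result and unnests it. The object names elements_toy and elements_long are hypothetical, and the values for the second station are invented; only the column layout mirrors the output shown above.

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for a get_elements() result: one row per station
# and element, with each time series nested in a list column.
elements_toy <- tibble(
  station_triplet = c("374:UT:SNTL", "484:ID:SNTL"),
  element_code = c("WTEQ", "WTEQ"),
  element_values = list(
    tibble(date = c("2015-05-01", "2015-05-02"), value = c(2.1, 1.1)),
    tibble(date = c("2015-05-01", "2015-05-02"), value = c(0.5, 0.3)) # invented
  )
)

# Expand the list column so that every observation gets its own row.
elements_long <- unnest(elements_toy, cols = element_values)

# Dates arrive as character strings, so convert them before plotting
# or doing date arithmetic.
elements_long$date <- as.Date(elements_long$date)

elements_long
```

The result has one row per station, element, and date, which is usually the shape needed for plotting or summarizing.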
Get streamflow forecasts for the Cascades in west central Oregon. As
with get_elements(), this returns a list column.
forecasts <- get_forecasts(cascades, elements = "SRVO")

forecasts[c(
  "station_triplet",
  "element_code",
  "publication_date",
  "forecast_period",
  "forecast_values"
)]
#> # A tibble: 63 × 5
#> station_triplet element_code publication_date forecast_period forecast_values
#> <chr> <chr> <chr> <chr> <list>
#> 1 14050000:OR:US… SRVO 2025-01-01 00:00 02-01:07-31 <tibble>
#> 2 14050000:OR:US… SRVO 2025-02-01 00:00 02-01:07-31 <tibble>
#> 3 14050000:OR:US… SRVO 2025-01-01 00:00 02-01:09-30 <tibble>
#> 4 14050000:OR:US… SRVO 2025-02-01 00:00 02-01:09-30 <tibble>
#> 5 14050000:OR:US… SRVO 2025-01-01 00:00 04-01:07-31 <tibble>
#> 6 14050000:OR:US… SRVO 2025-02-01 00:00 04-01:07-31 <tibble>
#> 7 14050000:OR:US… SRVO 2025-01-01 00:00 04-01:09-30 <tibble>
#> 8 14050000:OR:US… SRVO 2025-02-01 00:00 04-01:09-30 <tibble>
#> 9 14053500:OR:US… SRVO 2025-01-01 00:00 02-01:07-31 <tibble>
#> 10 14053500:OR:US… SRVO 2025-02-01 00:00 02-01:07-31 <tibble>
#> # ℹ 53 more rows

forecasts[["forecast_values"]][[1]]
#> # A tibble: 5 × 2
#> probability value
#> <chr> <dbl>
#> 1 10 50
#> 2 30 43
#> 3 50 38
#> 4 70 34
#> 5 90 28
A somewhat unusual endpoint in this REST API is called “References.”
If you have ever worked with government employees or the military, you
may be aware that they prefer an extremely condensed form of speech
jammed full of acronyms and other codes. The references endpoint helps
clarify their cryptic language with data dictionaries that explain what
each code used in the database actually means. In the process, it also
provides an exhaustive list of available options. You can access all of
this with get_references(). For instance, if you want a table showing
all possible station elements in the AWDB, it is as simple as this.
get_references("elements")
#> # A tibble: 115 × 9
#> code name physical_element_name function_code data_precision description
#> <chr> <chr> <chr> <chr> <int> <chr>
#> 1 TAVG AIR TEM… air temperature V 1 Average Ai…
#> 2 TMAX AIR TEM… air temperature X 1 Maximum Ai…
#> 3 TMIN AIR TEM… air temperature N 1 Minimum Ai…
#> 4 TOBS AIR TEM… air temperature C 1 Instantane…
#> 5 PRES BAROMET… barometric pressure C 2 Barometric…
#> 6 BATT BATTERY battery C 2 Battery Vo…
#> 7 BATV BATTERY… battery V 2 <NA>
#> 8 BATX BATTERY… battery X 2 Maximum Ba…
#> 9 BATN BATTERY… battery N 2 Minimum Ba…
#> 10 ETIB BATTERY… battery-eti precip g… C 2 <NA>
#> # ℹ 105 more rows
#> # ℹ 3 more variables: stored_unit_code <chr>, english_unit_code <chr>,
#> # metric_unit_code <chr>
In the above examples, we use set_options() to pass additional query
parameters. If you don’t pass any arguments, it uses the defaults
assumed by the AWDB REST API. Note that not all parameters are passed
to every endpoint. The references endpoint, for example, doesn’t take
any query parameters other than reference_type. To see what goes where,
you can print the awdb_options list returned by set_options(). This
will also show you the current values for each parameter.
set_options()
#>
#> ── AWDB Query Parameter Set ────────────────────────────────────────────────────
#> Options passed to each endpoint.
#>
#> VALUE STATION ELEMENT FORECAST
#> networks * [X] [X] [X]
#> duration DAILY [X] [X] [ ]
#> begin_date NULL [ ] [X] [ ]
#> end_date NULL [ ] [X] [ ]
#> period_reference END [ ] [X] [ ]
#> central_tendency NULL [ ] [X] [ ]
#> return_flags FALSE [ ] [X] [ ]
#> return_original_values FALSE [ ] [X] [ ]
#> return_suspect_values FALSE [ ] [X] [ ]
#> begin_publication_date NULL [ ] [ ] [X]
#> end_publication_date NULL [ ] [ ] [X]
#> exceedence_probabilities NULL [ ] [ ] [X]
#> forecast_periods NULL [ ] [ ] [X]
#> station_names NULL [X] [X] [X]
#> dco_codes NULL [X] [X] [X]
#> county_names NULL [X] [X] [X]
#> hucs NULL [X] [X] [X]
#> return_forecast_metadata FALSE [X] [ ] [ ]
#> return_reservoir_metadata FALSE [X] [ ] [ ]
#> return_element_metadata FALSE [X] [ ] [ ]
#> active_only TRUE [X] [X] [X]
#> request_size 10 [ ] [X] [X]