This vignette illustrates how to estimate bid-ask spreads from open, high, low, and close prices using the efficient estimator described in Ardia, Guidotti, & Kroencke (JFE, 2024): https://doi.org/10.1016/j.jfineco.2024.103916.
The function edge() computes a single bid-ask spread estimate from vectors of open, high, low, and close prices. The functions edge_rolling() and edge_expanding() are optimized for fast calculations over rolling and expanding windows, respectively. The function spread() provides additional functionality for xts objects and implements additional estimators. For all functions, an output value of 0.01 corresponds to a spread estimate of 1%.
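As a quick check of the units, the following minimal sketch estimates the spread on simulated prices. It assumes that the package's sim() function returns columns named Open, High, Low, and Close; the sample size and the seed are arbitrary choices made for illustration.

```r
library(bidask)

set.seed(123)  # arbitrary seed, only for reproducibility of the sketch

# simulate open, high, low, and close prices with a true spread of 1%
# (column names Open/High/Low/Close are an assumption about sim()'s output)
x <- sim(n = 1000, spread = 0.01)

# a single estimate from the four price vectors
s <- edge(
  open  = as.numeric(x$Open),
  high  = as.numeric(x$High),
  low   = as.numeric(x$Low),
  close = as.numeric(x$Close)
)

# outputs are decimals: multiply by 100 to read the estimate in percent
sprintf("estimated spread: %.2f%%", 100 * s)
```

The estimate should be of the same order as the simulated spread, although it varies with the simulated sample.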
edge, edge_rolling, edge_expanding
These functions can be easily used with tidy data. For instance, download daily prices for Bitcoin and Ethereum using the crypto2 package:
library(dplyr)
library(crypto2)
df <- crypto_list(only_active = TRUE) %>%
  filter(symbol %in% c("BTC", "ETH")) %>%
  crypto_history(start_date = "20200101", end_date = "20221231")
head(df)
#> # A tibble: 6 × 17
#>      id slug    name    symbol timestamp           ref_cur_id ref_cur_name
#>   <int> <chr>   <chr>   <chr>  <dttm>              <chr>      <chr>
#> 1     1 bitcoin Bitcoin BTC    2020-01-01 23:59:59 2781       USD
#> 2     1 bitcoin Bitcoin BTC    2020-01-02 23:59:59 2781       USD
#> 3     1 bitcoin Bitcoin BTC    2020-01-03 23:59:59 2781       USD
#> 4     1 bitcoin Bitcoin BTC    2020-01-04 23:59:59 2781       USD
#> 5     1 bitcoin Bitcoin BTC    2020-01-05 23:59:59 2781       USD
#> 6     1 bitcoin Bitcoin BTC    2020-01-06 23:59:59 2781       USD
#> # ℹ 10 more variables: time_open <dttm>, time_close <dttm>, time_high <dttm>,
#> #   time_low <dttm>, open <dbl>, high <dbl>, low <dbl>, close <dbl>,
#> #   volume <dbl>, market_cap <dbl>
Estimate the spread for each coin in each year:
df %>%
  mutate(yyyy = format(timestamp, "%Y")) %>%
  group_by(symbol, yyyy) %>%
  arrange(timestamp) %>%
  summarise("EDGE" = edge(open, high, low, close))
#> # A tibble: 6 × 3
#> # Groups:   symbol [2]
#>   symbol yyyy  EDGE
#>   <chr>  <chr> <dbl>
#> 1 BTC    2020  0.00319
#> 2 BTC    2021  0.00376
#> 3 BTC    2022  0.000200
#> 4 ETH    2020  0.00223
#> 5 ETH    2021  0.00628
#> 6 ETH    2022  0.00262
Estimate the spread using a rolling window of 30 days for each coin and plot the results:
library(ggplot2)
df %>%
  group_by(symbol) %>%
  arrange(timestamp) %>%
  mutate("EDGE (rolling)" = edge_rolling(open, high, low, close, width = 30)) %>%
  ggplot(aes(x = timestamp, y = `EDGE (rolling)`, color = symbol)) +
  geom_line() +
  theme_minimal()
Estimate the spread using an expanding window for each coin and plot the results:
df %>%
  group_by(symbol) %>%
  arrange(timestamp) %>%
  mutate("EDGE (expanding)" = edge_expanding(open, high, low, close)) %>%
  ggplot(aes(x = timestamp, y = `EDGE (expanding)`, color = symbol)) +
  geom_line() +
  theme_minimal()
Note that using intraday rather than daily data generally improves the estimation accuracy, especially when the spread is expected to be small (see the intraday example below).
spread
The function spread() provides additional functionality for xts objects and implements additional estimators. For instance, download daily data for Microsoft (MSFT) using the quantmod package, which returns an xts object:
library(quantmod)
# getSymbols() selects the date range with `from` and `to`
x <- getSymbols("MSFT", auto.assign = FALSE, from = "2019-01-01", to = "2022-12-31")
head(x)
#>            MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
#> 2007-01-03     29.91     30.25    29.40      29.86    76935100      21.23530
#> 2007-01-04     29.70     29.97    29.44      29.81    45774500      21.19974
#> 2007-01-05     29.63     29.75    29.45      29.64    44607200      21.07885
#> 2007-01-08     29.65     30.10    29.53      29.93    50220200      21.28508
#> 2007-01-09     30.00     30.18    29.73      29.96    44636600      21.30642
#> 2007-01-10     29.80     29.89    29.43      29.66    55017400      21.09307
class(x)
#> [1] "xts" "zoo"
Estimate the spread with:
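A minimal sketch of this call, assuming x is the xts object downloaded above and using quantmod's Op(), Hi(), Lo(), and Cl() extractors to pull out the four price columns:

```r
library(bidask)
library(quantmod)

# `x` is the xts object of daily MSFT prices downloaded above
est <- edge(
  open  = as.numeric(Op(x)),
  high  = as.numeric(Hi(x)),
  low   = as.numeric(Lo(x)),
  close = as.numeric(Cl(x))
)
est
```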
or, equivalently:
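Assuming spread() applies the EDGE estimator to the full sample when called with its default arguments, the equivalent call would look like this:

```r
library(bidask)

# `x` is the xts object of daily MSFT prices downloaded above;
# spread() is assumed to locate the open, high, low, and close columns itself
sp <- spread(x)
sp
```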
Estimate the spread for each month and plot the estimates:
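A sketch along the lines of the multi-estimator call shown later in this vignette, assuming the width argument accepts the monthly endpoints returned by xts::endpoints():

```r
library(bidask)
library(xts)

# `x` is the xts object of daily MSFT prices downloaded above;
# endpoints() marks the last observation of each month, so spread()
# is expected to return one estimate per month
sp <- spread(x, width = endpoints(x, on = "months"))
plot(sp)
```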
Estimate the spread using a rolling window of 21 observations:
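A sketch of the rolling estimate, assuming an integer width is interpreted as the number of observations in each window:

```r
library(bidask)

# `x` is the xts object of daily MSFT prices downloaded above;
# each estimate uses the 21 most recent observations
sp <- spread(x, width = 21)
plot(sp)
```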
To illustrate higher-frequency estimates, download intraday data from Alpha Vantage. You must register with Alpha Vantage in order to download their data, but the one-time registration is fast and free. Register at https://www.alphavantage.co/ to receive your key. You can set the API key globally as follows:
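One way to do this is with quantmod's setDefaults(). The key below is a placeholder, not a working value:

```r
library(quantmod)

# replace the placeholder with the key received from Alpha Vantage
setDefaults(getSymbols.av, api.key = "<API-KEY>")
```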
Download minute data for Microsoft:
x <- getSymbols(
  Symbols = "MSFT",
  auto.assign = FALSE,
  src = "av",
  periodicity = "intraday",
  interval = "1min",
  output.size = "full")
Keep only prices during regular market hours:
x <- x["T09:30/T16:00"]
head(x)
#>                     MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume
#> 2023-08-17 09:30:00   320.540   321.870  320.405     321.75      364230
#> 2023-08-17 09:31:00   321.780   321.781  320.890     321.04       66948
#> 2023-08-17 09:32:00   321.080   321.330  320.805     321.16       61487
#> 2023-08-17 09:33:00   321.220   321.220  320.450     320.63       51775
#> 2023-08-17 09:34:00   320.625   320.920  320.480     320.60       57119
#> 2023-08-17 09:35:00   320.570   320.860  320.455     320.71       90454
Estimate the spread for each day and plot the estimates:
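A sketch of the daily estimate, assuming the default estimator is EDGE and that the width argument accepts the daily endpoints of the minute-level series:

```r
library(bidask)
library(xts)

# `x` is the xts object of one-minute MSFT prices filtered above;
# endpoints() marks the last minute of each trading day, so spread()
# is expected to return one estimate per day
sp <- spread(x, width = endpoints(x, on = "day"))
plot(sp, type = "b")
```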
Use multiple estimators and plot the estimates:
sp <- spread(x, width = endpoints(x, on = "day"), method = c("EDGE", "AR", "CS", "ROLL"))
plot(sp, type = "b", legend.loc = "topright")
If you find this package useful, please star the repo! The repository also contains implementations for Python, C++, MATLAB, and more, as well as open data containing bid-ask spread estimates for crypto pairs on Binance and for U.S. stocks in CRSP.
Ardia, D., Guidotti, E., Kroencke, T.A. (2024). Efficient Estimation of Bid-Ask Spreads from Open, High, Low, and Close Prices. Journal of Financial Economics, 161, 103916. doi: 10.1016/j.jfineco.2024.103916
A BibTeX entry for LaTeX users is:
@article{edge,
  title = {Efficient estimation of bid–ask spreads from open, high, low, and close prices},
  journal = {Journal of Financial Economics},
  volume = {161},
  pages = {103916},
  year = {2024},
  doi = {https://doi.org/10.1016/j.jfineco.2024.103916},
  author = {David Ardia and Emanuele Guidotti and Tim A. Kroencke},
}