Asset Pricing
- Kenneth French Data Library CSV US Equities — Fama/French factor returns and research portfolios (3-factor, 5-factor, momentum, industry portfolios, and more). Downloadable as CSV. Updated monthly. Coverage: 1926–present. Also available via WRDS with Python API access (see our Fama-French WRDS guide) and in Python/R via community packages.
- Open Source Asset Pricing CSV Python Equities — Replicated portfolio returns and stock-level signals for 200+ cross-sectional predictors from the academic literature, by Chen & Zimmermann (2022, Critical Finance Review). Downloadable as CSV and accessible via Python package. Updated periodically. (GitHub)
- JKP Factors CSV Global Equities — Factor return data for 153 equity anomalies, organized into 13 themes (value, momentum, profitability, investment, and more) for 93 countries. Accompanies Jensen, Kelly & Pedersen (2023, Journal of Finance), which studies the replicability and significance of equity factors. Includes long-short portfolio returns and stock-level signal data.
- Global Factor Premia CSV Global Equities — Factor data for global equities, covering firm characteristics and stock returns across international markets. Associated papers: Hanauer & Windmüller (2020, "Enhanced Momentum Strategies"), Hanauer (2020, "A Comparison of Global Factor Models"), and Windmüller (2022, "Firm Characteristics and Global Stock Returns: A Conditional Asset Pricing Model", The Review of Asset Pricing Studies).
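The Kenneth French Data Library CSVs listed above are typically not plain rectangular tables: a free-text preamble comes first, followed by a monthly block and then further blocks (such as annual factors) stacked in the same file. The sketch below shows one way to pull out only the monthly block using the standard library. The layout assumptions (a header row with an empty first cell, rows keyed by `YYYYMM`, returns quoted in percent) and the function name `parse_french_monthly` are illustrative, not an official loader, and the sample numbers are made up.

```python
import csv
import io


def parse_french_monthly(text):
    """Extract the monthly block from Ken French-style CSV text.

    Assumed layout (verify against the file you actually download):
    free-text preamble, a header row whose first cell is empty,
    then rows keyed by YYYYMM until a blank line or a non-YYYYMM row.
    Returns a list of dicts with returns converted from percent to decimals.
    """
    header = None
    records = []
    for row in csv.reader(io.StringIO(text)):
        cells = [c.strip() for c in row]
        if header is None:
            # The header row starts with an empty cell followed by column names.
            if len(cells) > 1 and cells[0] == "" and all(cells[1:]):
                header = cells[1:]
            continue
        key = cells[0] if cells else ""
        if len(key) == 6 and key.isdigit():
            values = [float(v) / 100.0 for v in cells[1:len(header) + 1]]
            records.append({"date": key, **dict(zip(header, values))})
        else:
            break  # blank line or the start of the next (e.g. annual) block
    return records


# Made-up sample in the assumed layout; these are not real factor returns.
sample = """Illustrative preamble line.

,Mkt-RF,SMB,HML,RF
200001,    1.25,   -0.50,    0.30,    0.10
200002,   -2.00,    0.75,   -0.40,    0.10

 Annual Factors: January-December
,Mkt-RF,SMB,HML,RF
   2000,    3.00,    1.00,   -1.00,    1.20
"""

for record in parse_french_monthly(sample):
    print(record)
```

Stopping at the first non-`YYYYMM` row keeps annual observations from being mixed into the monthly series. Any missing-value codes in the real files would still need to be handled separately.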
Volatility & Options Data
- CBOE Volatility Indices CSV US Equities — Free historical data for VIX (30-day implied volatility of S&P 500 options, 1990–present), VIX9D, VIX3M, VIX6M (term structure), SKEW (option-implied tail risk), and put/call ratios. Daily frequency. Downloadable as CSV from the CBOE website. The VIX is the standard "fear gauge" in asset pricing and risk management research.
- VOLARE (VOLatility Archive for Realized Estimates) CSV US Global Equities — Open research infrastructure providing standardized realized volatility and covariance measures constructed from ultra-high-frequency (tick-level) financial data. Covers U.S. equities, exchange rates, and futures (including commodities). Delivers realized variance, bipower variation, semivariances, realized quarticity, realized kernels, range-based measures, and multivariate realized covariance — computed at 1-minute and 5-minute sampling frequencies. Built from Kibot tick data with documented outlier detection and microstructure noise handling. The accompanying paper lists covered assets in Appendix Tables A.1 (stocks), A.2 (exchange rates), and A.3 (futures, including an S&P 500 ETF). Reference: arXiv:2602.19732. See also: Commodities & Futures | Foreign Exchange.
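To make the realized measures named in the VOLARE entry concrete, here is a minimal sketch that computes realized variance, bipower variation, semivariances, and realized quarticity from one day of intraday log returns. It uses the common textbook definitions without finite-sample corrections. VOLARE's own estimators, sampling scheme, and noise handling may differ, and the function name `realized_measures` and the input returns are hypothetical.

```python
import math


def realized_measures(returns):
    """Compute basic realized measures from one day's intraday log returns.

    Textbook definitions, no finite-sample or microstructure-noise corrections.
    """
    n = len(returns)
    # Realized variance: sum of squared intraday returns.
    rv = sum(r * r for r in returns)
    # Bipower variation: (pi / 2) * sum of products of adjacent absolute returns.
    bv = (math.pi / 2.0) * sum(
        abs(a) * abs(b) for a, b in zip(returns[1:], returns[:-1])
    )
    # Semivariances split realized variance by the sign of the return.
    rs_pos = sum(r * r for r in returns if r > 0)
    rs_neg = sum(r * r for r in returns if r < 0)
    # Realized quarticity: (n / 3) * sum of fourth powers.
    rq = (n / 3.0) * sum(r ** 4 for r in returns)
    return {
        "rv": rv,
        "bv": bv,
        "rs_pos": rs_pos,
        "rs_neg": rs_neg,
        "rq": rq,
        # Crude jump proxy: the part of RV not explained by BV, floored at zero.
        "jump": max(rv - bv, 0.0),
    }


# Hypothetical 5-minute log returns for a single session.
example_returns = [0.001, -0.002, 0.0005, 0.003, -0.001]
print(realized_measures(example_returns))
```

Because bipower variation is robust to jumps while realized variance is not, the gap between the two is often used as a simple jump indicator.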
Short Selling
- FINRA Short Interest CSV US Equities — Twice-monthly (mid-month and end-of-month) short interest positions for all equity securities, published by FINRA. Reports total shares sold short per security. Useful for studying short selling constraints, investor disagreement, and overvaluation.
- FINRA Daily Short Sale Volume CSV US Equities — Daily aggregate short sale volume by security across all FINRA-regulated venues. Published next business day. Covers total and short volume per ticker. Note: short sale volume ≠ short interest (positions); volume captures daily flow, not outstanding positions.
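A common first step with the daily short sale volume files is to compute the short volume ratio (short volume divided by total volume) per ticker. This is a flow measure and should not be read as short interest. The sketch below assumes a pipe-delimited layout with the columns `Date|Symbol|ShortVolume|ShortExemptVolume|TotalVolume|Market` and a possible trailer line. Check this against the file you download, since the layout is an assumption here. The function name `short_volume_ratios` and the tickers are hypothetical.

```python
import csv
import io


def short_volume_ratios(text):
    """Return {symbol: short volume / total volume} from pipe-delimited text.

    Assumed columns: Date|Symbol|ShortVolume|ShortExemptVolume|TotalVolume|Market.
    Rows for the same symbol (e.g. from several venues) are summed first.
    """
    short_by_symbol = {}
    total_by_symbol = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="|"):
        try:
            short = float(row["ShortVolume"])
            total = float(row["TotalVolume"])
        except (KeyError, TypeError, ValueError):
            continue  # skip trailer lines or malformed rows
        symbol = row["Symbol"]
        short_by_symbol[symbol] = short_by_symbol.get(symbol, 0.0) + short
        total_by_symbol[symbol] = total_by_symbol.get(symbol, 0.0) + total
    return {
        symbol: short_by_symbol[symbol] / total
        for symbol, total in total_by_symbol.items()
        if total > 0
    }


# Made-up rows in the assumed layout; AAA and BBB are placeholder tickers.
sample = """Date|Symbol|ShortVolume|ShortExemptVolume|TotalVolume|Market
20240102|AAA|500|0|2000|Q
20240102|BBB|30|0|40|N
2
"""
print(short_volume_ratios(sample))
```

Summing volumes before dividing avoids averaging ratios across venues with very different activity levels.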
SEC Market Structure Data CSV US
The SEC provides free data on U.S. equity market microstructure through MIDAS (Market Information Data Analytics System) and the SEC Data Library. These datasets are valuable for research on market microstructure, institutional ownership, and trading frictions.
- Fails-to-Deliver Data — Twice-monthly settlement failure data for all equity securities, including date, CUSIP, ticker, issuer name, price, and total fails outstanding. Useful for short selling and settlement risk research.
- Market Structure Data Downloads — Overview page for all SEC equity market structure datasets derived from MIDAS. Includes interactive visualization tools and downloadable data files.
- Metrics by Individual Security — Market microstructure metrics for 4,800+ individual equity securities, including cancel-to-trade ratios, hidden order rates, and odd-lot statistics.
- Summary Metrics by Exchange — Market microstructure metrics aggregated by security type (stock/ETP) and exchange, covering trade-to-order volume, cancel-to-trade ratios, odd-lot rates, and hidden order rates.
- Summary Metrics by Decile and Quartile — Market structure metrics partitioned by security characteristics (market capitalization, price level, volatility, and turnover).
- Hazard, Survivor & Cumulative Distribution Data — Hazard, survivor, and cumulative distribution functions for quote lifetimes across market cap groups, from microseconds to full trading days.
- Spreads and Depth by Individual Security — Bid-ask spreads and market depth data by individual security and date, useful for studying liquidity and transaction costs.
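For the fails-to-deliver files in the list above, a typical first computation is the dollar value of fails per security and settlement date (quantity times price). The sketch below assumes a pipe-delimited file with the header `SETTLEMENT DATE|CUSIP|SYMBOL|QUANTITY (FAILS)|DESCRIPTION|PRICE` and skips rows with a non-numeric price. Both the layout and the function name `fails_dollar_value` are assumptions for illustration, and the sample rows are made up.

```python
import csv
import io


def fails_dollar_value(text):
    """Return {(settlement_date, symbol): fails quantity * price}.

    Assumed header:
    SETTLEMENT DATE|CUSIP|SYMBOL|QUANTITY (FAILS)|DESCRIPTION|PRICE.
    Rows with a missing or non-numeric quantity or price are skipped.
    """
    values = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="|"):
        try:
            quantity = float(row["QUANTITY (FAILS)"])
            price = float(row["PRICE"])
            key = (row["SETTLEMENT DATE"], row["SYMBOL"])
        except (KeyError, TypeError, ValueError):
            continue  # trailer text, missing price, or unexpected layout
        values[key] = quantity * price
    return values


# Made-up rows in the assumed layout; the CUSIPs and tickers are placeholders.
sample = """SETTLEMENT DATE|CUSIP|SYMBOL|QUANTITY (FAILS)|DESCRIPTION|PRICE
20240102|000000000|AAA|1000|EXAMPLE CO A|2.50
20240102|111111111|BBB|200|EXAMPLE CO B|.
20240103|000000000|AAA|400|EXAMPLE CO A|3.00
"""
print(fails_dollar_value(sample))
```

The result is keyed by date and symbol rather than summed across dates, because the reported quantity is an outstanding balance on each settlement date and adding balances over time would double count.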
Researcher Datasets
- Amit Goyal — Predictive variables for stock returns and bond returns, widely used in empirical asset pricing.
- Grigory Vilkov — Datasets and code on option pricing, volatility, and risk management.
- Zhiguo He — Data on intermediary capital ratios and associated risk factors (He, Kelly & Manela).
- Robert Stambaugh — Datasets on asset pricing, market efficiency, and liquidity.
- Vincent Bogousslavsky — Datasets on intraday market microstructure and trading patterns.
See also: Bonds & Fixed Income for corporate bond factor data | Company Filings (SEC EDGAR) for institutional holdings data (Form 13F) | Data Collections for general-purpose data repositories | Fama-French on WRDS for WRDS access to Fama-French data.