Data & Code
Interactive datasets, replication materials, and code repositories for my research
This page provides access to datasets, interactive tools, and replication code from my research. All datasets include documentation, and many feature interactive notebooks that run entirely in your browser.
Featured
Teleworkability Index
A continuous, occupation-level index measuring the fraction of work that can be performed remotely. It is constructed from O*NET features and ORS survey labels using a two-stage Random Forest pipeline and covers 873 SOC occupations. Stage 1 classifies each occupation as zero or non-zero teleworkability (F1 = 0.96); Stage 2 predicts the teleworkability fraction for the non-zero occupations (MAE = 0.046). Overall model performance: R² = 0.93, correlation = 0.97.
Related paper: View publication →
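For readers unfamiliar with two-stage models, the following is a minimal sketch of the general idea described above: a classifier first separates zero from non-zero occupations, and a regressor trained only on the non-zero cases then predicts the fraction. It is not the replication code. The synthetic data, hyperparameters, and the function names `fit_two_stage` and `predict_two_stage` are all assumptions made for illustration, using scikit-learn's Random Forest estimators.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor


def fit_two_stage(X, y, random_state=0):
    """Fit a zero/non-zero classifier, then a regressor on non-zero rows only.

    Hypothetical helper: settings here are illustrative, not the paper's.
    """
    nonzero = y > 0
    # Stage 1: classify whether the target is zero or non-zero.
    clf = RandomForestClassifier(n_estimators=100, random_state=random_state)
    clf.fit(X, nonzero)
    # Stage 2: regress the fraction using only the non-zero training rows.
    reg = RandomForestRegressor(n_estimators=100, random_state=random_state)
    reg.fit(X[nonzero], y[nonzero])
    return clf, reg


def predict_two_stage(clf, reg, X):
    """Predict 0 where stage 1 says zero, otherwise the stage 2 fraction."""
    is_nonzero = clf.predict(X).astype(bool)
    pred = np.zeros(X.shape[0])
    if is_nonzero.any():
        pred[is_nonzero] = reg.predict(X[is_nonzero])
    # Keep predictions inside the valid range for a fraction.
    return np.clip(pred, 0.0, 1.0)


# Synthetic stand-in data (assumption): 5 features, target in [0, 1]
# with a large mass of exact zeros, loosely mimicking a fraction index.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
latent = 0.5 + 0.3 * X[:, 0] - 0.2 * X[:, 1]
y = np.where(X[:, 2] > 0, np.clip(latent, 0.01, 1.0), 0.0)

# Train on the first 200 rows, predict the remaining 100.
clf, reg = fit_two_stage(X[:200], y[:200])
pred = predict_two_stage(clf, reg, X[200:])
```

Splitting the problem this way lets the regressor focus on the non-zero range instead of being pulled toward zero by the many occupations with no remote-capable work.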
Data Use & Citation
All datasets are provided for research and educational purposes. If you use any of these datasets in your work, please cite the corresponding paper. For questions or data requests, please contact me.