Research

If you are looking for a copy of one or more of my existing publications, please visit the Reprints page.

Select discussions of some of my published and in-progress work are located on my Thoughts page.


Presentation archive

If you are looking for a PDF copy of a presentation made at a previous conference, you will find it below, sorted by year.

2024

  • (all presentations were symposia introductions; I can send reference materials in my possession upon request)

2023

2022

2021

2020

  • All presentations were made by colleagues or were deferred/cancelled due to COVID

2019

2018

2017

2016

2015

2014


Supplemental Materials

Supplemental materials, if any, for existing publications can be found below.


Available datasets used in my papers

All datasets available for download are provided in the list below.


Helpful research tools

The following lists provide links to software, protocols, algorithms and other tools that may be useful, sorted by type of research being performed.

Finding and summarizing relevant literature

These tools are (very roughly) ordered in terms of how useful I have found them in prior work.

  • Web of Science – (the stand-by solution for searching high-quality academic outlets, very helpful when paired with Google Scholar or similar broader searches to capture relevant grey literature)
  • SCite – extracts citations and the context in which they are used with some rudimentary tagging (e.g., supporting, contrasting, self-cite); now has a generative AI component that helps do the first draft of a literature review for a targeted question (pretty nifty!) – can also be used as a proxy for research impact (more tools on that front below)
  • SciSpace – Found by my colleague (Rasha Al-Aswad), this tool can perform plain text searches of abstracts, process PDFs, and offers other generative AI features
  • getAbstract – traditionally provided summaries of books but it looks like they have expanded their library of covered content significantly
  • Bentley AI Tech Accelerator – a curated list of tools, several of which (e.g., Elicit and Scholarcy) are useful for literature review and summarization

Tools for performing systematic reviews and meta-analyses

These tools can also be used to find and summarize relevant literature, but they do so by relying on citation networks or other forms of connectedness. The natural language and keyword-based search tools above tend to be the first step and work well for any project, whereas the tools below cater more to a systematic, methodical scrub of the literature.

  • Effect size converter
  • Correlational effect size benchmarks
  • SUMARI
  • Research Transparency Index (tool to determine how transparent a paper is with respect to its methods, reporting, etc.)
  • Research Rabbit – found by one of my colleagues (Booma Yandava), this is another way to find relevant literature and also provides some nice visualizations (network analysis and timeline development, among others) and suggestions for future reading
  • Bibliometrix – Another great find by Booma, this R package lets you analyze and visualize the literature that you have found and explore by concept, author, semantic relationships and more through tabular, statistical, and network views
  • ConnectedPapers – Provides a nice visualization and trawling of citation networks and will conveniently dump the results to BibTeX files
  • Semantic Scholar (an open source effort to map the graph of scientific papers)
  • VOSViewer (a way to map scientific literature landscapes)

Simulations, systems dynamics, and modeling

Quantitative analyses of primary data

General purpose statistical software

Working with a number of different colleagues has exposed me to several software packages, from basics like SPSS and EViews to the more typical workhorses of Stata, R, SAS, and Python via Anaconda (specifically the Jupyter IDE). Projects end up going in different directions, but I tend to land on Stata or R, since these seem to be the most common in the field. But if I am the one who gets to pick, I increasingly lean towards R. Why?

  • It’s free! Full stop.
  • Extensibility – R is constantly being expanded and is among the top picks for data scientists, social scientists, and applied researchers. This means packages are always coming out and being updated at the cutting edge of what is possible not only in terms of more ‘classical’ statistics (linear models and their extensions) but new tools like machine learning, topic modeling, etc.
  • Flexibility – Essentially everything can be a variable in R and there is no need to be referencing one and only one data frame at a time. This allows for a project to “live” totally in R without the need for import or export. Side calculations can happen in line too. Yes, Stata can use its matrix language to do some of this, but I find it much clunkier.
  • Good IDEs – Base R from the command line is not a great user experience. An IDE or GUI interface makes it much more approachable. I “grew up” on and have used RStudio extensively and would highly recommend. I have started to work on some projects in Visual Studio, specifically the Cursor branch.
  • Quarto – the most comprehensive solution I’ve found in R yet to produce websites, PDFs, and presentations that incorporate R code and other elements using Markdown. It takes some getting used to, but the learning curve is not that steep, and it provides ready access to lots of other power tools, offering many of the same features as (for example):
    • LaTeX and the Beamer package (you can use an online LaTeX typesetter like Overleaf for this purpose) – a way to quickly make nice looking and reusable presentations
    • Bookdown – Tools to convert your R code to LaTeX, and then onwards to PDFs and book formats
  • Downsides – I won’t lie, R is not an unalloyed good. Here are some things to keep in mind
    • Fewer guardrails: In many cases when you are splicing together functions and routines, you can get R to spit out a result, but that result may not be what you bargained for. When running a new routine for the first time, I would recommend checking the result in another software or using a different package to make sure you are getting a sensible result.
    • Data type mismatches: On the other hand, you may be doing what seems like a super simple task but R will refuse to give you a result because the data types are not compatible. Cue the search on StackOverflow or Copilot.
    • Learning curve for others: Just because you like R and have invested the time in understanding it does not mean your coauthors have.
    • Less “polish”: Since it is free software with user-contributed packages, the ‘polish’ isn’t always there in terms of user experience, output verbosity, or (especially) detailed documentation with good referencing to statistical textbooks or worked examples. Some packages are amazing in this regard, others less so. Not a universal issue but I have run across it.
    • Not quite as general: As far as I understand it (I’m no expert here), you can basically do anything in Python that you could do in a general programming language like C. I’m less certain you have that level of generality for R – but for my day-to-day purposes I don’t come up against this limitation.
    • Wading through duplicate and overlapping packages: Half of the battle with R is picking which package you want to use to achieve a certain result. Do you use the base linear modeling function (lm) or an econometrics-forward package (fixest)? Which is better for hierarchical or multilevel modeling (lme4, nlme)? And for output, do you use stargazer or modelsummary? To be clear, in Stata different commands also run different models (regress, ivreg2, heckman), but the comparison here is that there may be 3, 4, or 5 alternatives to ‘regress’ without a clear understanding of which is best. Many forks in the road to consider (see the short sketch after this list).
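
To make the “forks in the road” point concrete, here is a minimal sketch (using R’s built-in mtcars data, purely for illustration) of the same regression estimated with base lm and with fixest, then reported side by side:

```r
library(fixest)        # feols(): econometrics-oriented estimation
library(modelsummary)  # side-by-side regression tables

# The same linear model via two different "forks"
m_lm    <- lm(mpg ~ wt + hp, data = mtcars)
m_feols <- feols(mpg ~ wt + hp, data = mtcars)

# Coefficients match; the packages differ in syntax, defaults, and extras
# (fixest makes fixed effects and clustered standard errors easy to add later)
modelsummary(list("lm" = m_lm, "feols" = m_feols))
```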

My current “core stack” of R packages (last updated November 2025)

  • Quarto: Built into RStudio (no loading required per se), allows for near-seamless creation of HTML pages, PDFs, and presentations from Markdown (based on the R Markdown package, among others). Some useful tools included:
    • Graphviz
    • Native LaTeX output: Great for formal modeling since it will render your models in good-looking mathematical format
    • beamer and Revealjs: Allow for the creation of slide presentations from Markdown; beamer is LaTeX-based, Revealjs is JavaScript-based.
    • Shiny: custom applications, which require a hosting service (shinyapps.io is good for simple applications)
  • Tidyverse: For general data wrangling and graphics, allows for SQL-like manipulation of datasets, includes (among many other useful packages):
    • dplyr for data manipulation (very powerful and general, once you get used to the syntax)
    • ggplot2 for graphics generation (more details below)
    • readxl: For importing datasets from Excel
  • Workhorse econometric packages
    • Fixest: Workhorse function with many convenient features – more convenient than lm/glm in most cases. Most typical econometric analyses can be run through this interface (see the workflow sketch after this list). Light on diagnostics, which may need to be run in other packages. Also note that random effects panel analyses are NOT an option; you need to use lme4 or plm for those.
      • Estimatr is similar in many respects to Fixest but with a slightly different emphasis. I see these largely as substitutes for one another but have picked Fixest basically since I found it first.
    • plm: Provides a bevy of panel data diagnostic tests to select between various panel model options (e.g., pooled v. FE, FE v RE); also allows for random effects models (equivalent to lme4 with (1|group) using REML=FALSE, i.e., FIML). More use cases provided in this textbook
    • lmtest: Provides the means to perform many standard coefficient and linear hypothesis tests such as Wald tests
    • car: John Fox’s [no relation] very helpful Companion for Applied Regression – tons of good regression diagnostic tools
  • Graphics and Reporting
    • ggplot2: Included in the tidyverse package above, makes “prettier” and more flexible graphics than the standard graphics package in R. Has a unique syntax but is worth learning
    • ggdag: Package to create directed acyclic graphs (DAGs) using a model-based syntax
    • marginaleffects: Allows for computation of marginal effects in complex models in the style of Stata’s margins command
    • modelsummary: Outputs tables in publication-ready format. N.B.: I liked and used to use stargazer, but it does not support fixest; stargazer could still be a good alternative for you if you use lm, glm, etc.
  • Other commonly used analytical tools in my field
    • lme4: The essential add-on to fixest that allows for random effects panel data models and mixed models to be run. Think “advanced moderation”
    • lavaan: Structural equation modeling (SEM) package that allows for path analyses, CFAs, and full SEMs with multilevel capabilities, all with a fairly intuitive syntax. Think “advanced mediation”
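
For a sense of how these pieces fit together, here is a hedged sketch of a typical workflow with the core stack: tidyverse for wrangling, fixest for a fixed-effects model with clustered standard errors, and modelsummary plus marginaleffects for reporting. The data (ggplot2’s bundled mpg set) and variables are purely illustrative, not from any actual project.

```r
library(tidyverse)        # dplyr, ggplot2, readxl, ...
library(fixest)           # feols() with fixed effects and clustered SEs
library(modelsummary)     # publication-ready tables
library(marginaleffects)  # marginal effects in the style of Stata's margins

# Illustrative wrangling step on ggplot2's bundled mpg data
dat <- ggplot2::mpg |>
  mutate(is_suv = as.integer(class == "suv")) |>
  filter(year %in% c(1999, 2008))

# Two-way fixed effects (manufacturer and year), SEs clustered by manufacturer
m <- feols(hwy ~ displ + is_suv | manufacturer + year,
           data = dat, cluster = ~manufacturer)

modelsummary(m, stars = TRUE)        # publication-style table
avg_slopes(m, variables = "displ")   # average marginal effect of displacement
```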

Some reference materials for using R

Beyond the specific items above, there are a number of good textbooks for performing econometric analyses in R. These include compendia that employ multiple packages such as Introduction to Econometrics with R or Principles of Econometrics with R. But there are also texts more closely tied to particular packages such as:

At the end of the day, there is substantial overlap between what these packages can do (some are even partially dependent on others), but they have different function calls and syntax, and some are uniquely able to perform certain functions. In my experience, you will pick one as your “workhorse” and call on the others when needed. Just be careful to keep track of which you are using when! For now, here are my recommendations for getting most of what you would want accomplished on a day-to-day basis (more references regarding data input and reporting are provided below):

  • Data wrangling: tidyverse. I personally have gotten used to tidyverse (e.g., dplyr), which has a bit of a learning curve but plays nicely with other packages like ggplot2, and the verbosity actually helps once you can “translate” the commands back into plain English. The major alternative (as far as I understand it) is data.table, which has a different, more compact syntax and is supposedly superior for very large datasets. I do not work with ‘big’ datasets enough to really need the optimized speed of data.table. I’d suggest picking one and sticking with it; both work, and it seems to be a matter of preference except for edge cases (see the short side-by-side sketch after this list).
  • Modeling: fixest for standard OLS, IV, DiD, panel data, logit, and probit models with “typical” standard error arrangements (a nice introduction for Stata users is here, which also provides references to more specialty modeling like lme4 for hierarchical linear modeling). Other specialties like SEM (lavaan), time series, or survival analysis have their own packages that go beyond my coverage here.
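
To illustrate the wrangling choice above, here is a small sketch (on the built-in mtcars data, not any real project) of the same grouped summary written in both dialects:

```r
library(dplyr)
library(data.table)

# dplyr: verbose, but reads almost like plain English once you know the verbs
mtcars |>
  filter(hp > 100) |>
  group_by(cyl) |>
  summarise(mean_mpg = mean(mpg), n = n())

# data.table: compact dt[i, j, by] syntax, generally faster on large data
dt <- as.data.table(mtcars)
dt[hp > 100, .(mean_mpg = mean(mpg), n = .N), by = cyl]
```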

Some other solutions for general data “wrangling”, analysis, visualization, and reporting

Content analyses and qualitative data

Generic automation tools and useful applications

  • Zapier (basically a web-based AppleScript)
  • AppleScript and Automator (must-learn tools if you are using OS X; Bookends can be scripted)
  • axiom.ai (browser based automation)
  • Bookends (my reference software of choice, OS X only)

Tools for reviewing the work of others

Writing and crafting articles

Construct and variable repositories

How to connect to essential databases

Accessing and processing publicly available data

Research impact

  • Altmetric – Not a link but a JavaScript bookmarklet that you can add to your browser to call up the Altmetric score for any particular article you are looking at (you can see more details at the Altmetric website, with information on the API available here; a short R sketch of calling the API follows this list)
  • Grobid – An automated way to extract bibliographic data from individual articles
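
As a rough illustration of the API mentioned above, here is a hedged R sketch that looks up a single DOI. The endpoint and response fields are my assumptions based on Altmetric’s public documentation, so check the linked API details (rate limits, key requirements) before relying on it:

```r
library(httr)      # HTTP requests
library(jsonlite)  # JSON parsing

# Assumed public endpoint: https://api.altmetric.com/v1/doi/<doi>
doi  <- "10.1038/nature12373"  # example DOI, purely illustrative
resp <- GET(paste0("https://api.altmetric.com/v1/doi/", doi))

if (status_code(resp) == 200) {
  res <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
  res$score  # assumed field holding the Altmetric attention score
} else {
  message("No Altmetric record found (or the request failed).")
}
```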

Other interesting databases for specific purposes

Interesting initiatives started by various groups

Reaching beyond the scholarly community

  • Faculti (a video streaming platform that seeks to bring relevant and timely academic insights to the fore, with interviews from scholars regarding their recent work)
  • Research Outreach (a similar intent to Faculti, with a compendium of articles that complement original research and provide a connection between academics and practice)