Research
If you are looking for a copy of one or more of my existing publications, please visit the Reprints page.
Select discussions of some of my published and in-progress work are located on my Thoughts page.
Presentation archive
If you are looking for a PDF copy of a presentation made at a previous conference, you will find it below, sorted by year.
2024
- (all presentations were symposium introductions; I can send reference materials in my possession upon request)
2023
2022
2021
2020
- All presentations were made by colleagues or were deferred/cancelled due to COVID
2019
- SMS Special Conference - Coordination Equilibria
- SMS - A Two Study Investigation of Executive CSE
- AOM - Founding Teams and Venture Innovation
- ISA - Competitive Repertoires across the Life Cycle
2018
- SMS - TMTs Competitive Repertoires and Performance: A Requisite Variety Lens
- AOM - Competitive Repertoires and Performance across the Life Cycle
2017
2016
- AOM - Strategic Leader Interfaces
- AOM - A Literature Review of Competitive Actions
- EURAM - A Typology of SE
2015
2014
Supplemental Materials
Supplemental materials, if any, for existing publications can be found below.
- Fox Simsek and Heavey 2023 - Online Appendix
- Simsek Fox and Heavey 2023 - Appendices A and B
- Supplemental Material - Simsek Heavey and Fox 2021 - Appendix 1
- Supplemental Material - Simsek Heavey and Fox 2018 - Appendix 1
Available datasets used in my papers
All datasets available for download are provided in the list below.
- Kauffman Firm Survey data (directly from Kauffman Foundation)
- Wohlers Reports (available for purchase)
Helpful research tools
The following lists provide links to software, protocols, algorithms and other tools that may be useful, sorted by type of research being performed.
Finding and summarizing relevant literature
These tools are (very roughly) ordered in terms of how useful I have found them in prior work.
- Web of Science – (the stand-by solution for searching high-quality academic outlets, very helpful when paired with Google Scholar or similar broader searches to capture relevant grey literature)
- SCite – extracts citations and the context in which they are used with some rudimentary tagging (e.g., supporting, contrasting, self-cite); now has a generative AI component that helps draft a first pass of a literature review for a targeted question (pretty nifty!) – can also be used as a proxy for research impact (more tools on that front below)
- SciSpace – Found by my colleague (Rasha Al-Aswad), this tool can perform plain-text searches of abstracts, process PDFs, and handle other tasks using generative AI
- getAbstract – traditionally provided summaries of books but it looks like they have expanded their library of covered content significantly
- Bentley AI Tech Accelerator – a curated list of tools, several of which (e.g., Elicit and Scholarcy) are useful for literature review and summarization
Tools for performing systematic reviews and meta-analyses
These tools can also be used to find and summarize relevant literature, but they rely on citation networks or other forms of connectedness. The natural-language and keyword-based search tools above therefore tend to be the first step and are good for all projects, whereas these cater to a systematic, methodical scrub of the literature.
- Effect size converter
- Correlational effect size benchmarks
- SUMARI
- Research Transparency Index (tool to determine how transparent a paper is with respect to its methods, reporting, etc.)
- Research Rabbit – found by one of my colleagues (Booma Yandava), this is another way to find relevant literature and also provides some nice visualizations (network analysis and timeline development, among others) and suggestions for future reading
- Bibliometrix – Another great find by Booma, this R package lets you analyze and visualize the literature that you have found and explore by concept, author, semantic relationships and more through tabular, statistical, and network views
- ConnectedPapers – Provides a nice visualization and trawling of citation networks and will conveniently dump the results to BiBTeX files
- Semantic Scholar (open source mapping the graph of scientific papers)
- VOSViewer (a way to map scientific literature landscapes)
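As a flavor of what an effect size converter does, the standard conversion between a correlation r and Cohen's d (assuming equal group sizes) is simple enough to sketch in a few lines of base R — a sketch, not a replacement for the tools above:

```r
# Convert a correlation r to Cohen's d and back (equal-group-size case)
r_to_d <- function(r) 2 * r / sqrt(1 - r^2)
d_to_r <- function(d) d / sqrt(d^2 + 4)

r_to_d(0.3)          # ~0.63
d_to_r(r_to_d(0.3))  # recovers 0.3
```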
Simulations, systems dynamics, and modeling
- Vensim PLE
- Judea Pearl on causal models
- Meta-Log distributions
- Bayesian Modeling and MCMC
- DAGitty (free, browser-based software that helps you draw and verify directed acyclic graphs – DAGs)
Quantitative analyses of primary data
- KonFound It!
- Kristopher Preacher’s grab bag of mediation and moderation tools
- Falk’s Mediation calculators
- Jeremy Dawson’s moderation workbooks
- David Kenny’s tutorials on SEM
- Interactions in non-linear models using margins
- Dealing with ill-conditioned data matrices
- Natural experiments
- Panel data 101
- Worked examples for D-i-D, 2SLS, and similar
- A no-nonsense approachable discussion of data analysis (draws significantly on Angrist and Pischke, 2009)
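To make the "interactions in non-linear models" point concrete, here is a base-R sketch of an average marginal effect in a logit model, computed by finite differences — the quantity Stata's margins, dydx() reports. The data are toy simulated values, not from any paper:

```r
set.seed(42)
n <- 500
x <- rnorm(n)
z <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x + 0.3 * z))

m <- glm(y ~ x + z, family = binomial)

# Average marginal effect of x: nudge x by a small amount, average the
# change in predicted probability over the observed data
eps <- 1e-5
d0 <- data.frame(x = x, z = z)
d1 <- transform(d0, x = x + eps)
ame_x <- mean((predict(m, d1, type = "response") -
               predict(m, d0, type = "response")) / eps)
ame_x
```

In a linear model the coefficient on x would itself be the marginal effect; in a logit it varies across observations, which is why the averaging step matters.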
General purpose statistical software
Working with a number of different colleagues has exposed me to several software packages, from basics like SPSS and eViews to the more typical workhorses of Stata, R, SAS, and Python via Anaconda (specifically the Jupyter IDE). Projects end up going in different directions, but I tend to land on Stata or R, since these seem to be the most common in the field. But if I am the one who gets to pick, I increasingly lean towards R. Why?
- It’s free! Full stop.
- Extensibility – R is constantly being expanded and is among the top picks for data scientists, social scientists, and applied researchers. This means packages are always coming out and being updated at the cutting edge of what is possible, not only in terms of more ‘classical’ statistics (linear models and their extensions) but also new tools like machine learning, topic modeling, etc.
- Flexibility – Essentially everything can be a variable in R, and there is no need to reference one and only one data frame at a time. This allows a project to “live” totally in R without the need for import or export, and side calculations can happen inline too. Yes, Stata can use its matrix language to do some of this, but I find it much clunkier.
- Good IDEs – Base R from the command line is not a great user experience; an IDE or GUI makes it much more approachable. I “grew up” on and have used RStudio extensively and would highly recommend it. I have also started working on some projects in Visual Studio Code, specifically the Cursor fork.
- Quarto – the most comprehensive solution I’ve found in R yet to produce websites, PDFs, and presentations that incorporate R code and other elements using Markdown. It takes some getting used to, but the learning curve is not that steep and it provides ready access to lots of other power tools.
- Downsides – I won’t lie, R is not an unalloyed good. Here are some things to keep in mind:
- Fewer guardrails: In many cases when you are splicing together functions and routines, you can get R to spit out a result, but that result may not be what you bargained for. When running a new routine for the first time, I would recommend checking the result in another software or using a different package to make sure you are getting a sensible result.
- Data type mismatches: On the other hand, you may be doing what seems like a super simple task but R will refuse to give you a result because the data types are not compatible. Cue the search on StackOverflow or Copilot.
- Learning curve for others: Just because you like R and have invested the time in understanding it does not mean your coauthors have.
- Less “polish”: Since it is free software with user-contributed packages, the ‘polish’ isn’t always there in terms of user experience, output verbosity, or (especially) detailed documentation with good referencing to statistical textbooks or worked examples. Some packages are amazing in this regard, others less so. Not a universal issue but I have run across it.
- Not quite as general: As far as I understand it (I’m no expert here), you can basically do anything in Python that you could do in a general programming language like C. I’m less certain you have that level of generality for R – but for my day-to-day purposes I don’t come up against this limitation.
- Wading through duplicate and overlapping packages: Half of the battle with R is picking which package to use to achieve a certain result. Do you use the base linear modeling function (lm) or an econometrics-forward one (fixest)? Which is better for hierarchical or multilevel modeling (lme4, nlme)? And for output, do you use stargazer or modelsummary? To be clear, in Stata different commands also run different models (regress, ivreg2, heckman), but the point here is that there may be 3, 4, or 5 alternatives to ‘regress’ without a clear understanding of which is best. Many forks in the road to consider.
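A toy base-R illustration of the flexibility point above — two unrelated data frames coexist in the same session and side calculations happen inline, with no single "active" dataset (all numbers hypothetical):

```r
firms <- data.frame(id = 1:3, sales = c(120, 95, 210))
macro <- data.frame(year = 2021:2023, cpi = c(271, 293, 305))

# Side calculation inline, no separate dataset or matrix language needed
deflator <- macro$cpi / macro$cpi[1]

# Both frames (and the deflator vector) live side by side
firms$real_sales <- firms$sales / deflator[3]
firms$real_sales
```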
My current “core stack” of R packages (last updated November 2025)
- Quarto: Built into RStudio (no loading required per se), allows for near seamless creation of HTML pages, PDFs, presentations from Markdown (based on the R Markdown package, among others). Some useful tools included:
- Graphviz
- Native LaTeX output: Great for formal modeling since it will render your models in good-looking mathematical format
- beamer and Revealjs: Allow for the creation of slide presentations from Markdown; beamer is LaTeX-based, Revealjs is JavaScript-based
- Shiny: custom applications, which require a hosting service (shinyapps.io is good for simple applications)
- Tidyverse: For general data wrangling and graphics, allows for SQL-like manipulation of datasets, includes (among many other useful packages):
- Workhorse econometric packages
- Fixest: Workhorse package with many convenient features – more convenient than lm/glm in most cases. Most typical econometric analyses can be run through this interface, though it is light on diagnostics, which may need to be run in other packages. Also note that random effects panel analyses are NOT an option; use lme4 or plm for those.
- Estimatr is similar in many respects to Fixest but with a slightly different emphasis. I see these largely as substitutes for one another but have picked Fixest basically since I found it first.
- plm: Provides a bevy of panel data diagnostic tests to select between various panel model options (e.g., pooled v. FE, FE v RE); also allows for random effects models (equivalent to lme4 with (1|group) using REML=FALSE, i.e., FIML). More use cases provided in this textbook
- lmtest: Provides the means to perform many standard coefficient and linear hypothesis tests such as Wald tests
- car: John Fox’s [no relation] very helpful Companion for Applied Regression – tons of good regression diagnostic tools
- Graphics and Reporting
- ggplot2: Included in the tidyverse package above, makes “prettier” and more flexible graphics than the standard graphics package in R. Has a unique syntax but is worth learning
- ggdag: Package to create directed acyclic graphs (DAGs) using a model-based syntax
- marginaleffects: Allows for computation of marginal effects in complex models in the style of Stata’s margins command
- modelsummary: Outputs tables in publication-ready format. N.B.: I liked and used to use stargazer, but it does not support fixest; stargazer could still be a good alternative if you use lm, glm, etc.
- Other commonly used analytical tools in my field
- lme4: The essential complement to fixest, allowing random effects panel data models and mixed models to be run. Think “advanced moderation”
- lavaan: Structural equation modeling (SEM) package that allows for path analyses, CFAs, and full SEMs with multilevel capabilities and a fairly intuitive syntax. Think “advanced mediation”
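Since lavaan comes up above, a sketch of why its syntax feels intuitive: the model is just a plain character string, with labels used to define the indirect effect (variable names hypothetical; actually fitting the model requires the lavaan package itself):

```r
# lavaan model syntax is an ordinary string; labels (a, b, c) let you
# define derived quantities like the indirect effect with `:=`
mediation_model <- '
  m ~ a * x           # predictor -> mediator
  y ~ b * m + c * x   # mediator and direct effect on outcome
  indirect := a * b   # defined (indirect) effect
'
# With lavaan installed: fit <- lavaan::sem(mediation_model, data = mydata)
nchar(mediation_model) > 0
```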
Some reference materials for using R
Beyond the specific items above, there are a number of good textbooks for performing econometric analyses in R. These include compendia that employ multiple packages such as Introduction to Econometrics with R or Principles of Econometrics with R. But there are also texts more closely tied to particular packages such as:
- Applied Econometrics with R and the AER package
- An R Companion to Applied Regression and the car package
- the plm package for panel data econometrics
- and a nice translation manual between Stata and R that introduces the fixest package, which is very flexible and also pairs with tidyverse, marginaleffects, and modelsummary (all very useful packages)
- R for Data Science (a nice overall summary – basically a book in HTML form, which is intimately tied to the packages that comprise the tidyverse, including well known and used packages like dplyr and ggplot2)
- R Studio cheat sheets for several of the key R packages
- Tidymodels – I have not used this extensively but it looks to be a gentle introduction into Big Data and Machine Learning based methods in R
At the end of the day, there is substantial overlap between what these packages can do (some are even partially dependent on others), but they have different function calls and syntax, and some are uniquely able to perform certain functions. In my experience, you will pick one as your “workhorse” and call on the others when needed. Just be careful to keep track of which you are using when! For now, here are my recommendations for accomplishing most of what you would want on a day-to-day basis (more references regarding data input and reporting are provided below):
- Data wrangling: tidyverse. I personally have gotten used to the tidyverse (e.g., dplyr), which has a bit of a learning curve but plays nicely with other packages like ggplot2, and the verbosity actually helps once you can “translate” the commands back into plain English. The major alternative (as far as I understand it) is data.table, which has a different, more compact syntax and is supposedly superior for very large datasets. I do not work with ‘big’ datasets enough to really need the optimized speed of data.table. I’d suggest picking one and sticking with it; both work, and it seems to be a matter of preference except for edge cases.
- Modeling: fixest for standard OLS, IV, DiD, panel data, logit, and probit models with “typical” standard error arrangements (a nice introduction for Stata users is here, which also provides references to more specialty modeling like lme4 for hierarchical linear modeling). Other specialties like SEM (lavaan), time series, or survival analysis have their own packages that go beyond my coverage here.
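A toy illustration of the wrangling trade-off above — the same group-and-summarize step written in base R, with the dplyr (tidyverse) equivalent shown as a comment (data hypothetical):

```r
# Hypothetical firm-year panel
df <- data.frame(firm = c("A", "A", "B", "B"),
                 year = c(2022, 2023, 2022, 2023),
                 rev  = c(10, 12, 20, 18))

# Base R: average revenue by firm
agg <- aggregate(rev ~ firm, data = df, FUN = mean)

# dplyr (tidyverse) equivalent:
#   df |> group_by(firm) |> summarise(rev = mean(rev))
agg
```

The dplyr version reads almost like English once you learn the verbs, which is the "verbosity actually helps" point above.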
Some other solutions for general data “wrangling”, analysis, visualization, and reporting
- Tableau
- esttab and tabout (results table generators for Stata)
- A seemingly simple tool for creating websites (I use WordPress but this seems effective with fewer startup costs)
Content analyses and qualitative data
Generic automation tools and useful applications
- Zapier (basically a web-based AppleScript)
- AppleScript and Automator (must learn tools if you are using OS X, Bookends can be scripted)
- axiom.ai (browser based automation)
- Bookends (my reference software of choice, OS X only)
Tools for reviewing the work of others
- G*Power
- CORVIDS – dataset reconstruction under certain conditions
- RSprite – creation of datasets consistent with reported summary statistics for validation
- Stanley and Wang (1969) – bounds on unknown correlations given known covariates
- Using reported statistics to reproduce results
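The idea behind the correlation-bounds tool above can be sketched in base R: given two known correlations with a common variable, the unknown third correlation is constrained by the requirement that the 3x3 correlation matrix remain positive semidefinite. This is my own sketch of the underlying algebra, not the original authors' method or code:

```r
# Bounds on an unknown correlation r_yz given known r_xy and r_xz,
# from positive semidefiniteness of the 3x3 correlation matrix
cor_bounds <- function(r_xy, r_xz) {
  mid  <- r_xy * r_xz
  half <- sqrt((1 - r_xy^2) * (1 - r_xz^2))
  c(lower = mid - half, upper = mid + half)
}

cor_bounds(0.5, 0.4)   # r_yz must lie in roughly [-0.59, 0.99]
```

Note that the bounds only bite when the known correlations are strong; with r_xy = r_xz = 0, the unknown correlation is unconstrained on [-1, 1].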
Writing and crafting articles
- PowerThesaurus
- Stock cartoons
- Thoughts on academic workflows
- Meme finder
- Quote sites 1, 2, 3, 4
- The CRediT framework
Construct and variable repositories
- Decision making and individual differences repository
- International personality item pool
- Individual and organizational assessment scales
- O*Net
- Quartr (helps search through earnings call data among other things)
- The many ways to classify industries (NAICS, Fama-French 49, TNIC per Hoberg and Phillips (2010), trade-based measures)
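To illustrate the hierarchical structure behind one of those schemes: NAICS codes nest, so the first two digits of a 6-digit code give the sector, which a base-R one-liner recovers (the codes are real NAICS codes; the "firms" holding them are hypothetical):

```r
# Hypothetical firms with real 6-digit NAICS codes
naics  <- c("336111", "541715", "336411")
sector <- substr(naics, 1, 2)   # 2-digit sector prefix
sector          # "33" "54" "33"
table(sector)
```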
How to connect to essential databases
- Downloading data directly from WRDS databases into R (with a sample query and overall workflow here)
- Mike Nguyen’s collection of patent-related data links and APIs
Accessing and processing publicly available data
Research impact
- Altmetric – Not a link but a JavaScript bookmarklet that you can add to your browser to call up the Altmetric score for any particular article you are looking at (you can see more details at the Altmetric website, with information on the API available here)
- Grobid – An automated way of extracting bibliographic data from individual articles
Other interesting databases for specific purposes
- USPTO Patent Litigation Docket Database
- Other useful patent data sources include the full-text patent database, the underlying USPTO data source with information and forms documenting the process by which specific patents are pending, approved, or rejected
- Aggregated research datasets from the USPTO, including the useful PatentsView API as a place to start – along with the more general compilation of patent and innovation data put together by the World Intellectual Property Organization
- The American Time Use Survey is an interesting dataset on how Americans allocate their time to various activities
- See also the list of more general purpose databases in the Tools section of this site
Interesting initiatives started by various groups
- New Methods and Data in Strategic Management Research
- Carnegie School of Organizational Learning
- Organizational Design Community
- Competitive Dynamics Conference
Reaching beyond the scholarly community
- Faculti (a video streaming platform that seeks to bring relevant and timely academic insights to the fore, with interviews from scholars regarding their recent work)
- Research Outreach (a similar intent to Faculti, with a compendium of articles that complement original research and provide a connection between academics and practice)