Professor Alyssa Carlson publishes a new method for estimating binary response models with endogeneity in the Journal of Econometrics. Binary outcomes, variables that take on only the values 0 or 1, appear in many economic questions and datasets (e.g., employment, attending college, purchasing products) in which we wish to estimate a causal effect (e.g., the effect of job training on employment, of financial aid on college attendance, or of price on purchasing products). The methodological contribution of this paper is the identification and valid estimation of these causal effects without the strong assumptions previously used in the literature. Those assumptions are not realistic in some settings. For example, when modeling the effect of price on purchasing a product, they place a functional form restriction on the utility function: unobserved market/product/individual characteristics must be additively separable from observed market/product/individual characteristics. The figure below shows how the proposed estimator (gcfprobit) does a much better job of estimating the true average structural function (the predicted probability that the outcome equals 1 at different levels of the endogenous variable) than other estimators in the literature (ivprobit and sml).
Formal Abstract: For binary response models, the literature primarily addresses endogeneity by a control function approach assuming conditional independence (CF-CI). However, as the literature also notes, CF-CI implies conditions like homoskedasticity (of the latent error with respect to the instruments) that fail in many empirical settings. I propose an alternative approach that allows for heteroskedasticity, achieving identification with a conditional mean restriction. These identification results apply to a latent Gaussian error term with flexibly parametrized heteroskedasticity. I propose a two-step conditional maximum likelihood estimator and derive its asymptotic distribution. In simulations, the new estimator outperforms others when CF-CI fails and is fairly robust to distributional misspecification.
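
To make the two-step idea concrete, here is a minimal Python sketch of a control function probit in which the scale of the latent error varies with the instrument. It is an illustration only and is not the paper's gcfprobit estimator. The data-generating process, the linear first stage, the exp(gamma * z) scale function, and all function names (`simulate`, `fit_two_step`, `asf`) are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def simulate(n=4000, seed=0):
    """Hypothetical DGP: the latent error's mean depends on v, its scale on z."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # instrument
    v = rng.normal(size=n)                      # first-stage error
    x = 0.5 + 1.0 * z + v                       # endogenous regressor
    e = rng.normal(size=n)
    u = 0.5 * v + np.exp(0.3 * z) * e           # heteroskedastic latent error
    y = (0.2 + 1.0 * x + u > 0).astype(float)   # binary outcome
    return y, x, z


def first_stage_residuals(x, z):
    """Step 1: OLS of x on (1, z); the residuals act as the control function."""
    Z = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    return x - Z @ coef


def neg_loglik(theta, y, x, z, vhat):
    """Step 2 objective: average negative log-likelihood of a probit whose
    index is divided by an assumed scale function exp(gamma * z)."""
    b0, b1, rho, gamma = theta
    index = (b0 + b1 * x + rho * vhat) / np.exp(gamma * z)
    ll = np.where(y == 1, norm.logcdf(index), norm.logcdf(-index))
    return -ll.mean()


def fit_two_step(y, x, z):
    """Run both steps and return (b0, b1, rho, gamma) with the residuals."""
    vhat = first_stage_residuals(x, z)
    res = minimize(neg_loglik, x0=np.zeros(4), args=(y, x, z, vhat), method="BFGS")
    return res.x, vhat


def asf(x0, theta, z, vhat):
    """Average structural function at x0: average the predicted probability
    over the sample distribution of (z, vhat), holding x fixed at x0."""
    b0, b1, rho, gamma = theta
    return norm.cdf((b0 + b1 * x0 + rho * vhat) / np.exp(gamma * z)).mean()


y, x, z = simulate()
theta_hat, vhat = fit_two_step(y, x, z)
print("estimates (b0, b1, rho, gamma):", np.round(theta_hat, 2))
print("ASF at x = -1, 0, 1:",
      [round(float(asf(x0, theta_hat, z, vhat)), 3) for x0 in (-1.0, 0.0, 1.0)])
```

Setting gamma to zero in this sketch gives a standard homoskedastic control function probit, which is one way to see what a homoskedasticity restriction rules out. Standard errors are omitted here, since they would need to account for the estimated first-stage residuals.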