10 MLE

10.1 Binary Dependent Variables

While Stata has separate commands for different MLE models (logit, nbreg, etc.), R combines some models into single commands. We can use zelig(), the command we learned earlier, and just change the model = argument. Alternatively, there are functions such as glm(), which do the same thing outside of the Zeligverse. We’ll look at some examples with the iris dataset.

This dataset, originally collected by Edgar Anderson and made famous by Ronald Fisher, looks at three different species of iris: setosa, versicolor, and virginica. It provides information on the length and width of the flowers’ petals and sepals. Looking at the data, we can see that setosas are rather distinct and easy to separate graphically. Versicolor and virginica are more similar, so we’ll examine them statistically. We can create a new dataframe containing only those two species with the filter() command we learned in Modeling and Wrangling.
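As a minimal sketch of that filtering step, assuming the dplyr package is installed: the name iris2 is a hypothetical choice for the new dataframe, and the droplevels() call is an added assumption so that Species ends up with exactly two levels.

```r
# Assumes dplyr is installed; the iris data ship with base R.
library(dplyr)

# Keep only versicolor and virginica, then drop the now-unused "setosa"
# factor level so that Species is a two-level (binary) outcome.
# `iris2` is a hypothetical name chosen for this illustration.
iris2 <- iris %>%
  filter(Species != "setosa") %>%
  droplevels()

# Check how many flowers of each species remain.
table(iris2$Species)
```

Dropping the unused level is not strictly required for fitting, but it keeps tables and summaries from listing an empty setosa category.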

To predict whether a flower is versicolor or virginica, we could use a logit model. In Stata, the command is:

logit Species Sepal.Length Sepal.Width Petal.Length Petal.Width

In R, we estimate a logit by specifying model = "logit" in zelig().
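The sketch below fits the same logit with glm(), the non-Zelig route mentioned earlier, so that it runs in base R without extra packages. With Zelig installed, the equivalent call would presumably pass the same formula and data to zelig() with model = "logit". The names iris2 and logit_fit are hypothetical, and the subset is rebuilt here so the block is self-contained.

```r
# Rebuild the two-species subset in base R (hypothetical name `iris2`).
iris2 <- droplevels(subset(iris, Species != "setosa"))

# With a two-level factor outcome, glm() treats the first level (versicolor)
# as "failure" and models the probability of the second level (virginica).
logit_fit <- glm(
  Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  data   = iris2,
  family = binomial(link = "logit")
)

# Coefficients, standard errors, and fit statistics.
summary(logit_fit)
```

If the subset matches the one behind the results table in this section, the estimates should be close to its Logit column.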

If we instead want to estimate a probit model, in Stata we change the command.

probit Species Sepal.Length Sepal.Width Petal.Length Petal.Width

In R, we keep the same command and change the model = argument to "probit".
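For the glm() route, the analogous change is to the link function inside family, not to the function being called. This is a minimal sketch under the same assumptions as before: iris2 and probit_fit are hypothetical names, and the subset is rebuilt so the block stands alone.

```r
# Rebuild the two-species subset in base R (hypothetical name `iris2`).
iris2 <- droplevels(subset(iris, Species != "setosa"))

# Same formula and data as the logit; only the link function changes.
probit_fit <- glm(
  Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  data   = iris2,
  family = binomial(link = "probit")
)

# Coefficients, standard errors, and fit statistics.
summary(probit_fit)
```

Because only the link differs, the two fits can be compared directly on the same data, for example with AIC() or logLik().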

                Logit        Probit
(Intercept)     −42.638+     −23.985+
                (25.707)     (13.843)
Sepal.Length    −2.465       −1.440
                (2.394)      (1.272)
Sepal.Width     −6.681       −3.778
                (4.480)      (2.556)
Petal.Length    9.429*       5.316*
                (4.737)      (2.435)
Petal.Width     18.286+      10.486+
                (9.743)      (5.614)
Num.Obs.        100          100
AIC             21.9         21.8
BIC             34.9         34.8
Log.Lik.        −5.949       −5.876

Standard errors in parentheses. + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

10.3 Rare-events and Zero-inflation