This article is part of the Stata for Students series. If you are new to Stata we strongly recommend reading all the articles in the Stata Basics section.
In principle, estimating the mean value of a variable in a population and calculating the mean value of a variable in a sample are very different tasks. In practice, this distinction is obscured by the fact that most of the time the sample mean is the best estimate for the population mean. In this section we'll discuss two commands that estimate the mean value of a variable for a population and give you a 95% confidence interval for that estimate.
Setting Up
If you plan to carry out the examples in this article, make sure you've downloaded the GSS sample to your U:\SFS folder as described in Managing Stata Files. Then create a do file called ci.do in that folder that loads the GSS sample as described in Doing Your Work Using Do Files. If you plan on applying what you learn directly to your homework, create a similar do file but have it load the data set used for your assignment.
Mean in a Sample
The summarize command gives you the sample mean, as described in the Descriptive Statistics section:
sum educ
Produces:
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        educ |        254    13.38583    3.336343          0         20
Mean in a Population
The mean command estimates the population mean:
mean educ
Produces:
Mean estimation                   Number of obs   =        254

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        educ |   13.38583   .2093408      12.97355     13.7981
--------------------------------------------------------------
Note how the estimated mean is exactly the same as that produced by sum. However, mean gives you a 95% confidence interval for that estimate.
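To see where the standard error and interval come from, the following is a minimal sketch that rebuilds them by hand from the numbers sum reported above. It assumes the usual t-based interval with n - 1 degrees of freedom; the scalar names (ci_n, ci_mean, and so on) are made up for this illustration and are not part of Stata.

```stata
* Sketch: rebuild the 95% confidence interval from summarize's output.
* The three inputs are copied from the sum educ results shown earlier.
scalar ci_n    = 254          // number of observations
scalar ci_mean = 13.38583     // sample mean
scalar ci_sd   = 3.336343     // sample standard deviation

* Standard error of the mean: sd / sqrt(n)
scalar ci_se = ci_sd / sqrt(ci_n)

* Critical value from the t distribution, 2.5% in each tail
scalar ci_t = invttail(ci_n - 1, 0.025)

* Lower and upper bounds of the interval
scalar ci_lo = ci_mean - ci_t * ci_se
scalar ci_hi = ci_mean + ci_t * ci_se

display "Std. Err. = " ci_se
display "95% CI    = " ci_lo " to " ci_hi
```

The displayed values should agree with the Std. Err. and [95% Conf. Interval] columns from mean, up to rounding. In real work you would let mean or ci do this for you; the sketch is only meant to show what the interval is made of.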
You can get the same results using the ci (confidence interval) command while specifying that you want the mean:
ci mean educ
This produces:
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
        educ |        254    13.38583    .2093408        12.97355     13.7981
The mean and ci commands can each do a variety of other things, but for this purpose they produce exactly the same results, so which one you use is purely a matter of taste (most likely your instructor's taste).
Complete Do File
The following is a complete do file for this section.
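* Close any log left open by a previous run, then start a new one
capture log close
log using ci.log, replace

* Start with empty memory and load the GSS sample
clear all
set more off
use gss_sample

* Sample mean, then the population mean estimate with its 95% confidence interval
sum educ
mean educ
ci mean educ

log close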
Last Revised: 9/2/2016