Stata for Students: Means and Confidence Intervals

This article is part of the Stata for Students series. If you are new to Stata, we strongly recommend reading all the articles in the Stata Basics section.

In principle, estimating the mean value of a variable in a population and calculating the mean value of a variable in a sample are very different tasks. In practice, this distinction is obscured by the fact that most of the time the sample mean is the best estimate for the population mean. In this section we'll discuss two commands that estimate the mean value of a variable for a population and give you a 95% confidence interval for that estimate.

Setting Up

If you plan to carry out the examples in this article, make sure you've downloaded the GSS sample to your U:\SFS folder as described in Managing Stata Files. Then create a do file called ci.do in that folder that loads the GSS sample as described in Doing Your Work Using Do Files. If you plan on applying what you learn directly to your homework, create a similar do file but have it load the data set used for your assignment.

Mean in a Sample

The summarize command (sum for short) gives you the sample mean, as described in the Descriptive Statistics section:

sum educ

This produces:

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        educ |        254    13.38583    3.336343          0         20

Mean in a Population

The mean command estimates the population mean:

mean educ

This produces:

Mean estimation                   Number of obs   =        254

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        educ |   13.38583   .2093408      12.97355     13.7981
--------------------------------------------------------------

Note that the estimated mean is exactly the same as the sample mean produced by sum. The difference is that mean also reports the standard error of that estimate and a 95% confidence interval for it.
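If you are curious where these numbers come from, you can reproduce them by hand from the results that summarize stores. The following is a minimal sketch, assuming the interval is the usual t-based one (the mean plus or minus a critical t value times the standard error, with n - 1 degrees of freedom). The scalar names ci_se, ci_t, ci_lb, and ci_ub are made up for this illustration.

```stata
// Recompute the standard error and 95% confidence interval by hand
quietly summarize educ

// Standard error of the mean: standard deviation divided by sqrt(n)
scalar ci_se = r(sd)/sqrt(r(N))

// Critical t value for a 95% interval with n-1 degrees of freedom
scalar ci_t = invttail(r(N) - 1, 0.025)

// Lower and upper bounds of the interval
scalar ci_lb = r(mean) - ci_t*ci_se
scalar ci_ub = r(mean) + ci_t*ci_se

display "Std. Err.:   " ci_se
display "Lower bound: " ci_lb
display "Upper bound: " ci_ub
```

The three values displayed should agree with the Std. Err. and [95% Conf. Interval] columns in the output from mean, apart from rounding in the display.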

You can get the same results using the ci (confidence interval) command while specifying that you want the mean:

ci mean educ

This produces:

    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
        educ |        254    13.38583    .2093408        12.97355     13.7981

The mean and ci commands can do a variety of other things, but for this purpose they produce exactly the same results, so which one you use is purely a matter of taste (most likely your instructor's taste).
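As one small example of those other things, both commands accept a level() option that changes the confidence level. The following is a brief sketch, assuming you want a 90% or 99% interval rather than the default 95%; the particular levels are only illustrative.

```stata
// 90% confidence interval for the population mean of educ
mean educ, level(90)

// 99% confidence interval for the same mean, using ci instead
ci mean educ, level(99)
```

The estimated mean and standard error do not change with the level. Only the interval does: it gets narrower at lower confidence levels and wider at higher ones.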

Complete Do File

The following is a complete do file for this section.

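// Close any log left open by a previous run, then start a new one
capture log close
log using ci.log, replace

// Start with a clean slate and turn off output paging
clear all
set more off

// Load the GSS sample
use gss_sample

// Sample mean, then population mean estimates with 95% confidence intervals
sum educ
mean educ
ci mean educ

log close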

Last Revised: 9/2/2016