Introduction to Stata

Author

Russell Dimond

Published

January 3, 2025

Introduction

This web book will introduce you to Stata and its core concepts. It has three goals:

First, to prepare you to excel in research methods and applied statistics courses that use Stata. You’ll go in already knowing how Stata works and why it does what it does, so you can focus on learning the material for the course.

Second, to prepare you to take advantage of the rest of the SSCC’s Stata curriculum, both online and in workshops. This includes Data Wrangling in Stata, which teaches critical skills for data-driven research that are often not covered in statistics classes.

Third, to teach you how to make your Stata work reproducible right from the beginning, so you never have to unlearn any bad habits.

There are two different approaches one can take to Stata. One is to use it as an interactive tool: you start Stata, load your data, and start typing or clicking on commands. This can be a good way to explore your data, figure out what you want to do, and check that your programs worked properly. It can also be useful when you’re trying to learn something new because you get immediate feedback. However, interactive work cannot be easily reproduced, or modified if you change your mind. It’s also very difficult to recover from mistakes—there’s no “undo” command in Stata.

The other approach is to treat Stata as a programming language. In this approach you write your programs, called do files, and run them. A do file contains the same commands you’d type in interactive Stata, but since they’re written in a permanent file they can be debugged or modified and then rerun at will. They also serve as an exact record of how you obtained your results: a lab notebook for the social scientist. Any work you intend to publish, present, or rely on in any way should be done using do files. Thus this web book will for the most part ignore Stata’s graphical user interface and prepare you to write do files.
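To make the idea of a do file concrete, here is a minimal sketch of what one might look like. It is only an illustration, not a template from this book: the file names example.do and example.log are hypothetical, and the data set is the auto example data that ships with Stata.

```stata
* example.do: a minimal do file (the file name is hypothetical)

clear all                 // start from a clean slate
capture log close         // close any log left open by an earlier run
log using example.log, replace text   // record commands and results in a plain-text log

sysuse auto, clear        // load the auto example data set that ships with Stata
summarize price mpg       // summary statistics for two variables
regress price mpg weight  // a simple linear regression

log close                 // finish the record of this run
```

If this file were saved in Stata’s current working directory, typing `do example` in the Command window would run every command in order. Changing a command and rerunning the file reproduces the whole analysis, which is what interactive work cannot easily offer.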

To get the most out of Introduction to Stata you need to be an active participant. Open Stata, and type in and run the example code yourself. This will help you retain more, and ensure you get all the details right—Stata is always happy to tell you when you’re wrong. Do the exercises (some of them are straightforward applications of what you just learned; others will require more creativity). Using Stata is not something you read and understand—it’s a skill you must practice.

Running Stata at the SSCC

The SSCC makes Stata available in our computer labs and on our Windows and Linux servers. You can also download it from the UW-Madison Campus Software Library and install it on your computer.

Winstat will let you use Stata in a familiar Windows environment. Winstat for Big Jobs gives you more memory and the ability to start a long job and then disconnect from the server while you wait for it to finish.

Linstat gives you much more computing power and memory. If you connect using Open OnDemand, you’ll get a Linux desktop and then you can start Stata with the same graphical user interface as in Windows or macOS. You can also submit Stata jobs to Slurm, where you can use up to 64 cores and a terabyte of memory, and jobs can run for up to 30 days.

For more information about the SSCC’s computing resources, including details about how to use them to run Stata, see the Guide to Research Computing at the SSCC.