Exercise Solutions

clear all
use dates

Exercise 1

gen birthdate = mdy(month_of_birth, day_of_birth, year_of_birth)
format birthdate %td
list birthdate


     +-----------+
     | birthdate |
     |-----------|
  1. | 19sep1975 |
     +-----------+

Exercise 2

gen birthmonth = ym(year_of_birth, month_of_birth)
format birthmonth %tm
list birthmonth


     +----------+
     | birthm~h |
     |----------|
  1. |   1975m9 |
     +----------+

Exercise 3

gen nd_time2 = mdyhms(nd_month, nd_day, nd_year, 8, 0, 0)
format nd_time2 %tc
list nd_time2


     +--------------------+
     |           nd_time2 |
     |--------------------|
  1. | 15feb2023 08:00:52 |
     +--------------------+

Without gen double, nd_time2 is created as the default variable type, float. A float has about seven digits of accuracy (compared to sixteen for a double), which is not enough to store a time in milliseconds precisely. The result is a rounding error of about 52 seconds (roughly 52,000 milliseconds): the listed time is 08:00:52 rather than the intended 08:00:00. Always use double for datetime variables!
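To see the fix, the same command can be rerun with gen double. This is only a sketch: nd_time3 is a made-up variable name, and the literal date in the display line assumes the 15feb2023 shown in the listing above.

gen double nd_time3 = mdyhms(nd_month, nd_day, nd_year, 8, 0, 0)
format nd_time3 %tc
list nd_time3

* Size of the float rounding error in milliseconds (assumes the date is 15feb2023)
display %9.0f float(mdyhms(2, 15, 2023, 8, 0, 0)) - mdyhms(2, 15, 2023, 8, 0, 0)

If the date is indeed 15feb2023, the difference works out to 52,224 milliseconds, which matches the 52-second error in the listing. Floats near that value are spaced 131,072 milliseconds apart, so a float datetime in this range can be off by more than a minute.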

Exercise 4

gen interview_date = date(interview, "DMY")
format interview_date %td
list interview_date


     +-----------+
     | interv~te |
     |-----------|
  1. | 01may2005 |
     +-----------+

Exercise 5

You can create a new variable to hold the combination of interview and interview_time, but you can also just pass them to clock directly.

gen double interview_datetime = clock(interview+interview_time, "DMYhm")
format interview_datetime %tc
list interview_datetime


     +--------------------+
     | interview_datetime |
     |--------------------|
  1. | 01may2005 10:15:00 |
     +--------------------+

Note that the combined string has no separator between year and hour:

display interview + interview_time

1 May, 200510:15AM

That’s okay: Stata is smart enough to know that 2005 is a year and what follows must be the hour.
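If relying on that feels fragile, one alternative is to insert a space yourself before calling clock. A minimal sketch, where interview_datetime2 is a made-up name chosen to avoid clashing with the variable created above:

gen double interview_datetime2 = clock(interview + " " + interview_time, "DMYhm")
format interview_datetime2 %tc
* Should match interview_datetime for every observation
assert interview_datetime2 == interview_datetime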

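The same approach can build a datetime for the current moment. Stata stores the current date and time as strings in c(current_date) and c(current_time):

gen double now = clock(c(current_date) + " " + c(current_time), "DMYhms")
format now %tc
list now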

Exercise 6

If you haven’t already created sd1_date, start with:

gen sd1_date = date(sd1, "MDY")
format sd1_date %td
list sd1 sd1_date

Now convert it to quarterly:

gen sd1_quarterly = qofd(sd1_date)
format sd1_quarterly %tq
list sd1_date sd1_quarterly, ab(30)

February is in the first quarter of the year.
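As a quick check on the conversion, qofd() can be applied to a literal date. The date below is a made-up example, not the value of sd1:

display %tq qofd(mdy(2, 15, 2023))

This prints 2023q1, and any other February date lands in the first quarter of its year in the same way.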

Exercise 7

clear
use claims

gen pandemic = (daten > mdy(3, 15, 2020)) & (daten < mdy(6, 1, 2021))
table pandemic, stat(mean ICSA)

Don’t let the alignment of those numbers fool you: the mean is MUCH higher in the pandemic period. (Yes, we could fix the table’s alignment if we wanted to. The table command is meant for building publication-quality tables as well as easy but useful ones.)
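Before trusting the comparison, it can be worth confirming that the indicator covers the intended dates. A minimal sketch using tabstat, assuming daten is a daily date variable as the mdy() comparisons imply:

tabstat daten, by(pandemic) statistics(min max) format(%td)

Because the comparisons are strict, 15mar2020 and 01jun2021 themselves fall outside the pandemic period. Use >= and <= if they should be included.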

Exercise 8

gen quarter = quarter(daten)
tab quarter, sum(ICSA)
tab quarter if !pandemic, sum(ICSA)

The massive spike in claims in the second quarter of 2020 gives quarter 2 the highest average for the entire period, but if you exclude the pandemic period, quarter 3 is higher.
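To see the spike itself, one option is to keep the year attached to the quarter with qofd(). In this sketch, yq is a made-up variable name:

gen yq = qofd(daten)
format yq %tq
tab yq if year(daten) == 2020, sum(ICSA)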

Exercise 9

clear
use atus_restructured

tab activity if time==hms(12, 0, 0), sort

tab activity if time==hms(22, 0, 0), sort

2:00AM is on day 2 of the study, so we need to switch over to mdyhms() and specify that:

tab activity if time==mdyhms(1, 2, 1960, 2, 0, 0), sort
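Both functions return milliseconds since midnight on 1 January 1960, which is why hms() only reaches times on the first day. A quick sketch of that relationship, assuming (as the command above implies) that day 1 of the study is stored as 01jan1960:

* hms() is the same as mdyhms() on 1 January 1960, so this displays 1
display hms(12, 0, 0) == mdyhms(1, 1, 1960, 12, 0, 0)
* Show the day-2 value being matched: 02jan1960 02:00:00
display %tc mdyhms(1, 2, 1960, 2, 0, 0)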