{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Coding practice \\#1: Due end of class, September 24\n", "\n", "Answer the questions below in a Jupyter notebook. You can simply add cells to this notebook and enter your answers. When you are finished, print the notebook and hand it in during class. [To print: From the File menu, choose 'Print Preview', which will open a new tab with the notebook ready to print. Please print on both sides of the paper if possible.]\n", "\n", "A reminder: Ruhl's office hours are T/R 2:30PM-3:30PM in Soc Sci 7444 and McWeeny's office hours are Monday 9:30AM-11:30AM in Soc Sci 6470. \n", "\n", "*You should feel free to discuss the coding practice with your classmates, but the work you turn in should be your own.*\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 1: Your name\n", "Replace 'Your name' above with your actual name. Enter it as last name, first name." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 2: Warm-ups\n", "\n", "1. What type is each of the following? Edit the cell and type your answer next to the statement in bold.\n", "    1. type('hello') **str**\n", " \n", "    2. type(['hello']) **list**\n", " \n", "    3. type(str(int('5'))) **str**\n", " \n", "    4. Explain your answer to C. **The '5' is a string. The int() converts it to an int and the str() converts it back to a str.**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Take the list below, and write code to print it as a sentence, so that it reads: Why are there so many questions?" 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Why are there so many questions?\n" ] } ], "source": [ "words = ['Why', 'are', 'there', 'so', 'many', 'questions', '?']\n", "\n", "for i in words[:-2]: # loop over every word except the last word and the '?'\n", "    print(i, end=' ') # end=' ' prints a space, rather than a newline, after each word\n", " \n", "print(words[-2], end='') # print the last word with no trailing space\n", "print(words[-1]) # print the '?' followed by a newline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. You are given the data dictionary below. Notice that the gdp data are stored as strings. Fix the data by replacing the strings with the appropriate floats and then print the dict. You might try using a loop or a list comprehension. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['8.95', '12.56', '14.78']\n", "{'year': [1990, 2000, 2010], 'gdp': [8.95, 12.56, 14.78]}\n" ] } ], "source": [ "data = {'year':[1990, 2000, 2010], 'gdp':['8.95', '12.56', '14.78']}\n", "print(data['gdp']) # the gdp data are strings (how can you tell?)\n", "\n", "# with a list comprehension\n", "# data['gdp'] = [float(i) for i in data['gdp']]\n", "\n", "# with a for loop\n", "for i in range(len(data['gdp'])):\n", "    data['gdp'][i] = float(data['gdp'][i])\n", "\n", "print(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "4. Use slicing to extract each word from the string below. (Do not extract the spaces.) Store each word in a string variable of the form: word_1, word_2, ... so that the print statement replicates the string. [This is an exhausting way to print a string!]" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No matter how you slice it\n" ] } ], "source": [ "truth = 'No matter how you slice it'\n", "\n", "# Add the definitions of word_1,... 
using slices and then uncomment the next line of code.\n", "word_1 = truth[0:2]\n", "word_2 = truth[3:9]\n", "word_3 = truth[10:13]\n", "word_4 = truth[14:17]\n", "word_5 = truth[18:23]\n", "word_6 = truth[24:]\n", "print(word_1, word_2, word_3, word_4, word_5, word_6)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "5. Print the sum of the integers between 0 and 50 that are multiples of 5, i.e., 0, 5, 10, ..., 50. " ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "275\n" ] } ], "source": [ "# as a for loop\n", "#total = 0\n", "#for i in range(0,51,5): # as with list slices, the end of range is exclusive, so we need '51' to get the '50'\n", "# total = total + i\n", "\n", "# using the built-in sum() function\n", "total = sum(range(0,51,5))\n", "\n", "print(total)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "6. The two lists below contain the most popular female and male names found in the U.S. Social Security database. Write code that compares the lengths of the male and female names at each rank (e.g., James v. Mary, Robert v. Jennifer, etc.) 
and prints a statement for each rank of the form: 'The name James is longer than the name Mary.'\n", "\n", "[Optional: If you would like to learn a more 'python-y' way to do this, look up the zip() function.]" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The name James is longer than the name Mary\n", "The name Patricia is longer than the name John\n", "The name Jennifer is longer than the name Robert\n", "The name Michael is longer than the name Linda\n", "The name Elizabeth is longer than the name William\n" ] } ], "source": [ "names_f = ['Mary', 'Patricia', 'Jennifer', 'Linda', 'Elizabeth']\n", "names_m = ['James', 'John', 'Robert', 'Michael', 'William']\n", " \n", "# Compare the names rank by rank (no two names at the same rank have equal length here)\n", "for i in range(len(names_f)):\n", "    if len(names_f[i]) > len(names_m[i]):\n", "        print('The name', names_f[i], 'is longer than the name', names_m[i])\n", "    else:\n", "        print('The name', names_m[i], 'is longer than the name', names_f[i])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 3: U.S. Exports\n", "\n", "The following code cell contains a dictionary with nominal exports of goods (in millions of dollars) from the United States to every other country in the world in 2017.\n", "\n", "Run the code cell below to read the dictionary into memory. 
" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "us_exports_2017 = {'Argentina': 9585.957928, 'Serbia': 125.604552, 'Bulgaria': 381.490292, 'Bangladesh': 1473.961033, \n", " 'Sierra Leone': 79.185252, 'Guyana': 377.349484, 'Turkey': 9741.491909, 'Syrian Arab Republic': 6.961023,\n", " 'Saint Vincent and the Grenadines': 81.348131, 'Eritrea': 5.021254, 'Kosovo': 10.070786, 'Bermuda': 693.65006,\n", " 'Zimbabwe': 39.926035, 'Suriname': 359.542397, 'Uganda': 107.869624, 'Palau': 19.432665, 'Guinea': 103.022431, \n", " 'Heard Island and Mcdonald Islands': 0.125829, 'Mongolia': 82.192362, 'Botswana': 93.045279, 'Cameroon': 158.845783, \n", " 'Ecuador': 4820.700634, 'Falkland Islands (Malvinas)': 0.482089, 'Iceland': 399.258756, 'South Korea': 48326.408702, \n", " 'Hungary': 1888.233492, 'Comoros': 1.375097, 'Malta': 293.009573, 'Luxembourg': 1078.074731, 'Tonga': 16.719457, \n", " 'Tanzania; United Republic of': 145.242068, 'Liberia': 138.340637, 'Mali': 62.01555, 'Algeria': 1059.840617, \n", " 'South Africa': 5020.062464, 'Kyrgyzstan': 26.066859, 'Swaziland': 24.073435, 'Central African Republic': 11.178603, \n", " 'Andorra': 3.285167, 'Seychelles': 15.488847, 'Other Countries': 0.0, 'Cook Islands': 5.243873, \n", " \"Lao People's Democratic Republic\": 25.67166, 'Ghana': 859.971235, 'Malaysia': 12964.461912, \n", " 'Micronesia (Federated States of)': 44.827548, 'Indonesia': 6863.831289, 'Tajikistan': 17.911115, 'Gambia': 39.235769, \n", " 'Thailand': 10991.613094, 'Cape Verde': 9.146393, 'West Bank': 2.19236, 'Sri Lanka': 336.25559, \n", " 'Burma (Myanmar)': 211.368799, 'Lebanon': 1221.529041, 'Poland': 4523.318939, 'Austria': 4275.26315, \n", " 'Panama': 6301.303156, 'Lithuania': 609.272207, 'Hong Kong': 39939.130227, 'Kenya': 454.453344, 'Netherlands': 41510.332962, \n", " 'Dominican Republic': 7827.520352, 'Cayman Islands': 885.800982, 'Reunion': 8.376879, 'Belarus': 72.668182, \n", " 'Burundi': 8.13276, 'Nigeria': 
2170.069073, 'Martinique': 168.255979, 'Equatorial Guinea': 111.93615, 'Vatican City': 1.146853, \n", " 'Taiwan': 25729.504439, 'Georgia': 382.865271, 'Aruba': 966.305683, 'Lesotho': 1.901616, 'Colombia': 13312.122946, \n", " 'Curacao': 692.167798, 'Cuba': 291.276873, 'French Southern Territories': 2.842721, 'China': 129893.586716, \n", " 'Norway': 5453.37072, 'Svalbard and Jan Mayen Island': 1.986719, \"Côte d'Ivoire\": 319.964133, 'Uruguay': 1599.284155, \n", " 'Montserrat': 9.708109, 'Madagascar': 53.367175, 'Vanuatu': 7.762425, 'Bhutan': 34.892754, 'Greenland': 9.884715, \n", " 'Kuwait': 5143.414006, 'Iraq': 1204.721644, 'Senegal': 207.93405, 'Ukraine': 1787.9993, 'Haiti': 1410.575337, \n", " 'Mauritius': 61.421605, 'Nicaragua': 1589.398811, 'Qatar': 3124.095821, 'Moldova; Republic of': 17.614764, \n", " 'Samoa': 38.309271, 'Saint Kitts and Nevis': 243.537203, 'Ethiopia': 877.152409, 'French Polynesia': 119.798493, \n", " 'Uzbekistan': 136.094699, 'East Germany': 0.0, 'Switzerland': 21684.835238, 'Rwanda': 66.140569, 'Solomon Islands': 9.303881, \n", " 'Macedonia': 41.225784, 'Christmas Island': 0.476695, 'Sao Tome and Principe': 3.140792, 'Papua New Guinea': 105.851898, \n", " 'Venezuela': 4133.114819, 'Iran': 136.000618, 'Grenada': 102.991413, 'New Caledonia': 59.362579, 'Kazakhstan': 552.092011, \n", " 'Italy': 18404.689805, 'Paraguay': 2719.236832, 'Guinea Bissau': 3.751655, 'India': 25688.861698, 'Slovenia': 371.602049, \n", " 'Canada': 282265.135262, 'United Arab Emirates': 20019.696481, 'Turks and Caicos Islands': 369.866125, \n", " 'Croatia': 447.955755, 'Nepal': 75.851857, 'Congo': 117.378013, 'Saint Helena': 0.587754, \n", " 'British Indian Ocean Territory': 36.334868, 'Singapore': 29805.915154, 'Slovakia': 441.620058, 'Faroe Islands': 2.034537, \n", " 'Gaza Strip': 0.555101, 'Saint Lucia': 572.438902, 'Dominica': 178.447184, 'Bolivia': 594.703634, 'Chad': 31.338485, \n", " 'British Virgin Islands': 377.227746, 'Wallis and Futuna': 0.300526, 'South 
Sudan': 13.57431, 'Kiribati': 4.928546, \n", " 'Fiji': 66.551898, 'French Guiana': 162.74323, 'United Kingdom': 56257.922547, 'Finland': 1534.930626, 'Jordan': 1920.88633, \n", " 'Serbia and Montenegro': 0.0, 'Norfolk Island': 0.138341, 'Pakistan': 2808.16696, 'Bahamas': 3057.271998, 'Mexico': 243314.438647,\n", " 'Honduras': 5079.663953, 'Afghanistan': 941.42806, 'Anguilla': 59.409382, 'Romania': 954.684233, 'Spain': 11063.740551, \n", " 'Tuvalu': 0.796156, 'Portugal': 1192.459324, 'Brunei': 121.111606, 'France': 33595.514399, 'Togo': 481.810854, \n", " 'Chile': 13605.342592, 'Marshall Islands': 556.988101, 'Morocco': 2219.661823, 'Peru': 8662.61201, \n", " 'Western Sahara': 0.418356, 'Angola': 809.444868, 'Malawi': 27.88084, 'Netherlands Antilles': 0.0, 'Belize': 294.920783, \n", " 'Azerbaijan': 354.555299, 'Macau; SAR of China': 502.136131, 'Bahrain': 898.1564, 'Niue': 0.086722, 'North Korea': 0.00266, \n", " 'Nauru': 0.198927, 'Cyprus': 82.40193, 'Gibraltar': 1126.417629, 'Somalia': 70.594917, 'Greece': 960.728649, \n", " 'Gabon': 89.00649, 'Saudi Arabia': 16348.000104, 'Japan': 67605.076964, 'Mauritania': 127.77739, 'Estonia': 273.958646, \n", " 'Tokelau Islands': 1.162557, 'Sint Maarten': 545.15619, 'Namibia': 118.561677, 'Australia': 24526.835921, \n", " 'Maldives': 34.373287, 'Sudan': 69.654352, 'Djibouti': 157.467862, 'New Zealand': 3924.611227, 'East Timor': 2.565401, \n", " 'Antigua and Barbuda': 425.623529, 'Czech Republic': 2273.780613, 'Guatemala': 6895.337671, 'Liechtenstein': 27.744402, \n", " 'Belgium': 29923.692464, 'Mozambique': 179.126068, 'Monaco': 42.263572, 'Brazil': 37221.566249, 'Germany': 53896.753486, \n", " 'Egypt': 3991.832766, 'San Marino': 2.109451, 'Yemen': 199.030638, 'Cocos (Keeling) Islands': 0.452079, \n", " 'Israel': 12550.082579, 'Mayotte': 0.493107, 'Barbados': 525.452357, 'Philippines': 8450.925039, \n", " 'Burkina Faso': 60.978397, 'Guadeloupe': 244.842196, 'Benin': 250.13587, 'Armenia': 55.005606, \n", " 'Democratic 
Republic of Congo': 76.260999, 'Niger': 44.082049, 'Viet Nam': 8133.364685, 'Albania': 62.086804, \n", " 'Montenegro': 9.148718, 'Cambodia': 400.220962, 'Bosnia and Herzegovina': 27.142924, 'El Salvador': 3057.896029, \n", " 'Pitcairn Islands': 0.001431, 'Trinidad and Tobago': 1816.745419, 'Tunisia': 543.575125, 'Sweden': 3734.457331, \n", " 'Zambia': 85.149483, 'Turkmenistan': 282.178944, 'Oman': 1985.049832, 'Denmark': 2210.191808, 'Latvia': 381.621395, \n", " 'Ireland': 10707.555883, 'Libyan Arab Jamahiriya': 134.952674, 'Saint Pierre and Miquelon': 0.390569, \n", " 'Costa Rica': 6169.60881, 'Jamaica': 2105.505325, 'Russia': 6998.497148}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the cell below, write a program to do the following:\n", "1. Report the value of U.S. exports of goods to Canada. Please print a complete English sentence. Report the value in billions of dollars and round to two decimal places." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The United States exported $282.27 billion of goods to Canada in 2017.\n", "The United States exported $282.27 billion of goods to Canada in 2017.\n" ] } ], "source": [ "# This version rounds explicitly (the data are in millions, so divide by 1000 to get billions)\n", "print(\"The United States exported $\" + str(round(us_exports_2017['Canada']/1000,2)) + \" billion of goods to Canada in 2017.\")\n", "\n", "# This version uses Python's str.format() method. We haven't covered it in class, but it may be useful at some point. \n", "print('The United States exported ${0:.2f} billion of goods to Canada in 2017.'.format(us_exports_2017['Canada']/1000))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the cell below, write a program to do the following:\n", "2. Report the **total** value of U.S. exports of goods to the \"BRICS\" countries: Brazil, Russia, India, China, and South Africa. 
Again, use a complete sentence, report the value in billions of dollars, and round to two decimal places." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The United States exported $204.82 billion of goods to the BRICS countries (Brazil, Russia, India, China, and South Africa) in 2017.\n" ] } ], "source": [ "exports_to_BRICS = us_exports_2017['Brazil'] + us_exports_2017['Russia'] + us_exports_2017['India'] + us_exports_2017['China'] + us_exports_2017['South Africa']\n", "print(\"The United States exported $\" + str(round(exports_to_BRICS/1000,2)) + \" billion of goods to the BRICS countries (Brazil, Russia, India, China, and South Africa) in 2017.\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the cell below, write a program to do the following:\n", "3. Report the total value of U.S. exports of goods to the entire world. Again, use a complete sentence and round to two decimal places, but this time report the value in trillions of dollars. **(Hint: use the .values() method.)**" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The United States exported $1.55 trillion of goods in 2017.\n" ] } ], "source": [ "total_us_exports = sum(us_exports_2017.values())\n", "print(\"The United States exported $\" + str(round(total_us_exports/1000000,2)) + \" trillion of goods in 2017.\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the cell below, write a program to do the following:\n", "4. Report U.S. exports of goods to the North American Free Trade Agreement (NAFTA) member countries (i.e., Canada and Mexico) as a percentage of total U.S. exports of goods. Again, use a complete sentence and round to two decimal places." 
] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Exports to NAFTA member countries accounted for 33.99% of all U.S. exports of goods in 2017.\n" ] } ], "source": [ "exports_to_NAFTA = us_exports_2017['Canada'] + us_exports_2017['Mexico']\n", "print(\"Exports to NAFTA member countries accounted for \" + str(round(100*exports_to_NAFTA/total_us_exports,2)) + \"% of all U.S. exports of goods in 2017.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 4: Numeric integration (challenging)\n", "\n", "Recall from your statistics or econometrics class that the probability density function (pdf) of a normal distribution with mean $\\mu$ and variance $\\sigma^2$ is given by \n", "$$f(x) = \\frac{1}{\\sqrt{2\\pi\\sigma^2}}e^{-\\frac{(x-\\mu)^2}{2\\sigma^2}}$$\n", "The corresponding cumulative distribution function (cdf) is \n", "$$F(x) = \\int_{-\\infty}^{x} f(y)dy = \\int_{-\\infty}^{x} \\frac{1}{\\sqrt{2\\pi\\sigma^2}}e^{-\\frac{(y-\\mu)^2}{2\\sigma^2}}dy.$$\n", "\n", "Unfortunately, the cdf $F(x)$ does not have a closed-form expression; we cannot compute this integral by hand. Instead, we use numeric integration, carried out on a computer, to approximate the value of the cdf at a given point.\n", "\n", "One simple but reasonably accurate numeric integration technique is the trapezoid rule, which is typically taught in introductory calculus. The trapezoid rule approximates the area under a curve $f(x)$ between $a$ and $b$ with the area of $N$ trapezoids of equal width. Formally,\n", "$$\\int_{a}^{b} f(x) dx \\approx \\sum_{k=0}^{N-1} \\frac{1}{2} \\Big(f(a + k \\Delta x) + f(a + (k+1) \\Delta x) \\Big) \\Delta x$$\n", "where $\\Delta x = \\frac{b-a}{N}$.\n", "\n", "In this exercise, your task is to calculate the value of the cdf of the standard normal distribution (i.e., $\\mu=0$ and $\\sigma^2 = 1$) at $x = 1.96$. We will do it two ways. 
First on our own (to practice some list comprehensions) and then the easy way, using the norm function from scipy.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Write your own code. Your goal is to compute the summation that approximates the integral. Use three list comprehensions. \n", "    * First, create and store the points ( $a+k\\Delta x$ ) at which the pdf is calculated, \n", "    * Second, create and store the values ( $f(a+k\\Delta x)$ ) of the pdf at each of those points, and \n", "    * Third, create and store the areas ( $0.5(f[k]+f[k+1])\\Delta x$ ) of each trapezoid. \n", "    * Fourth, sum up your trapezoids.\n", " \n", "\n", "Use $a = -10$, $b=1.96$, and $N=10,000$.\n", "\n", "Print the resulting value of the cdf. " ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The value computed by the trapezoid rule is: 0.97500209119795\n" ] } ], "source": [ "# Import required packages:\n", "import numpy as np # for sqrt, exp, pi\n", "from scipy.stats import norm # for norm.cdf()\n", "\n", "# Set parameter values:\n", "mu = 0 # the mean\n", "sigma = 1 # the std dev\n", "a = -10 # lower bound of integration for trapezoid rule\n", "b = 1.96 # upper bound of integration for trapezoid rule\n", "N = 10000 # number of trapezoids for trapezoid rule\n", "Delta_x = (b - a)/N # width of each trapezoid\n", "\n", "#-------------------------------------------------------------------------\n", "# Part 1: Writing my own code:\n", "\n", "# Create a list containing all the x points:\n", "x_points = [a + k*Delta_x for k in range(N+1)]\n", "\n", "# Calculate the value of the pdf at each of the x points:\n", "# This is a little messy looking, but it is just the f(x) function in the first equation. 
\n", "pdf = [np.exp(-((x - mu)**2)/(2*sigma**2))/np.sqrt(2*np.pi*sigma**2) for x in x_points]\n", "\n", "# Calculate the area of each trapezoid:\n", "trapezoid_areas = [0.5*(pdf[k] + pdf[k+1])*Delta_x for k in range(N)]\n", "\n", "# Sum the areas of all the trapezoids to obtain the value of the cdf:\n", "my_cdf = sum(trapezoid_areas)\n", "\n", "# Print result from Part 1:\n", "print('The value computed by the trapezoid rule is:', my_cdf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Use the function norm.cdf() from scipy.stats and print the result. SciPy is a package of functions useful in mathematics and engineering. To use this function, you will need to import it first with `from scipy.stats import norm`.\n", "\n", "You might use ? or Google to learn more about norm.cdf(). " ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The value as computed from scipy.stats is: 0.9750021048517795\n" ] } ], "source": [ "# Using norm.cdf() from scipy.stats [https://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.stats.norm.html]\n", "# We did not need to pass the mean and std to norm.cdf() because the default values are loc=0, scale=1\n", "print('The value as computed from scipy.stats is:', norm.cdf(b))\n", "\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }