Introduction to Web Scraping with R

Note: all SSCC training is in person unless the class description says otherwise.

Web scraping is the process of extracting data from websites, which can then be used for research.

In this workshop, you will learn best practices for ethical web scraping, the basics of HTML syntax and CSS selectors, and how to extract information from a web page with the rvest package in R.

To benefit from this workshop, you will need to understand the fundamentals of working with data in R, which are covered in our Data Wrangling in R workshop and online curriculum (https://sscc.wisc.edu/sscc/pubs/dwr/).

Experience with HTML, CSS selectors, the purrr R package, and running jobs on Linstat (https://kb.wisc.edu/sscc/page.php?id=102669) will help you take on larger and more advanced web scraping tasks, but none of it is required.

Instructor: Struck
Room: 3218 Sewell Social Sciences Building
Dates: 10/6, 10/13, 10/20
Time: 10:45 - 11:45

Each session of this class builds on the material taught in the previous sessions. If you cannot attend all of the class's sessions but still want to take the class, you must contact the Help Desk, find out what will be covered in the session(s) you will miss, and learn that material on your own before the next session. In most cases the material can be found in the SSCC Knowledge Base.

 
