SSCC News August 2021

Fall Training

SSCC’s statistical consultants are excited to be teaching in person again this fall and we have a full slate of workshops scheduled. Most are targeted at new graduate students, but some are also for veteran researchers. Visit our training page for details and to register.

Attendance at our Data Wrangling workshops was low during the pandemic. We get it (a person can only take so much Zoom), but we’ve also seen an increase in graduate students starting to do research and discovering they don’t have the skills to put real-world data in a form they can analyze. So we’re teaching the Data Wrangling workshops in big lecture halls this August, and we invite second-year grad students and everyone else who missed out during the pandemic to join the first-year grad students.

Not sure if you need to take a Data Wrangling workshop? Here’s a self-test: you’re working with panel data with incomes across many years, so you know you need to correct for inflation. You’ve downloaded data on the Consumer Price Index and looked up the formula, but how do you combine inflation data that has one observation per year with panel data that has one observation per person per year?

If your answer involves copy and paste, Excel, or doing anything “by hand,” you need to take a Data Wrangling workshop. Your answer should be a confident “I’ll use a one-to-many merge in Stata” or the equivalent in R, Python, or some other package—and if you take one of our workshops it will be.
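For Python users, the self-test above might be answered along these lines. This is only a minimal sketch using pandas: the column names (person_id, year, income, cpi), the choice of 2020 as the base year, and all of the numbers are made up for illustration and are not real CPI figures.

```python
import pandas as pd

# Hypothetical panel data: one row per person per year (made-up values).
panel = pd.DataFrame({
    "person_id": [1, 1, 2, 2],
    "year": [2019, 2020, 2019, 2020],
    "income": [50000.0, 52000.0, 40000.0, 41000.0],
})

# Hypothetical price index: one row per year (illustrative, not real CPI).
cpi = pd.DataFrame({
    "year": [2019, 2020],
    "cpi": [100.0, 102.0],
})

# Each year in `cpi` matches many rows in `panel`.
# validate="many_to_one" makes pandas raise an error if a year
# appears more than once in `cpi`, which would silently duplicate rows.
merged = panel.merge(cpi, on="year", how="left", validate="many_to_one")

# Convert incomes to base-year (here 2020) dollars.
base_cpi = cpi.loc[cpi["year"] == 2020, "cpi"].iloc[0]
merged["real_income"] = merged["income"] * base_cpi / merged["cpi"]

print(merged)
```

The validate argument is the useful habit here: it turns a wrong assumption about the structure of the data into an immediate error rather than a quietly incorrect result.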

SSCC Summer Tech Update

The SSCC will put our annual summer tech update into production from 7pm Tuesday 8/17/2021 to 7am Wednesday 8/18/2021. All SSCC services will be unavailable during this time. Highlights of the update include:

  • Stata is being updated to Stata 17, which includes new tools for creating customized publication-quality tables.
  • R is being updated to version 4.1.0, which adds a native pipe operator ( |> ) that can be used in place of the magrittr pipe ( %>% ) in most cases and has less overhead.
  • We are installing AlmaLinux 8 on all the Linstat, LinSilo, Condor, and CondorSilo servers. AlmaLinux 8 has newer versions of most libraries, including those needed to make maps.
  • PyCharm, a Python development environment, will be installed on all servers.
  • JMP, a graphical statistics package made by SAS Institute, will be installed on Winstat and WinSilo.
  • New laptops have been acquired for the Mobile Lab and are ready for use this coming fall.
  • NVivo will be upgraded, so you will be prompted to upgrade your project the first time you open it in the new version. You should be using NVivo Server on Winstat to avoid file corruption (if you aren’t, it’s time to switch), so follow the instructions in Converting NVivo Server Projects to NVivo Server R1.
  • Most other software is also being brought up to its latest version.

R and Python users will need to reinstall their packages to match the new versions.
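One way Python users might record what they have installed, so they know what to reinstall afterward, is sketched below. It uses only the standard library (importlib.metadata, available in Python 3.8 and later); the function name installed_packages is hypothetical, and this is not an official SSCC procedure.

```python
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings for the current environment."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


packages = installed_packages()
print("\n".join(packages))
```

Saving the printed list to a text file gives you something to work from after the update. The pinned versions may not all be available for the new Python version, so you may need to drop the version numbers and reinstall by name.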

Scientific Word has been removed from Winstat and WinSilo, as the company that made it has gone out of business. You can write LaTeX documents using TeXStudio or embed LaTeX equations in Markdown documents using many tools.

Caitlin Tefft Leaving the SSCC

We’re sad to announce that Caitlin Tefft is leaving the SSCC in September to focus on her graduate studies full time. We’re excited for her, but after more than ten years as “the face of the SSCC” she’ll be sorely missed.

Caitlin has been sharing her growing expertise in qualitative data analysis, but without her our ability to support QDA programs like NVivo will be limited to basic functionality. SSCC Director Andy Arnold, in his role as chair of the campus Research Technology Advisory Group, has been gathering data on the need for consulting on qualitative analysis and would be very interested in hearing how Caitlin’s support for it has impacted your work.

Future of the Help Desk

Due to budget uncertainties, we will not be hiring a replacement for Caitlin immediately. This will leave Amanda Todd to staff the Help Desk full-time, which will require adaptation. It’s also not at all clear what the future course of the COVID-19 pandemic will be or what pandemic-induced changes in the kinds of support needed will be permanent, so we will be flexible as we figure out how to best meet your needs with reduced staff.

We do anticipate that, compared to before the pandemic, there will be a greater need for remote help and less need for in-person help. We will start the semester with the Help Desk staffed in person, then move to in-person hours only on certain days as demand settles down. (Desktop Support, i.e. Cody Gerhartz and Paul Boyer, will continue to be in person.) Meanwhile, you can now make an appointment for help via video chat on short notice, typically less than an hour, depending on demand. See the SSCC Help Desk page for updated schedules or to make an appointment. Of course, you may continue to send emails to helpdesk@ssc.wisc.edu or leave a voice mail at 262-9917 as well.

Using Linstat From Home? Consider Connecting from Winstat

The X11 protocol used for graphics in Linux can be very sensitive to network lag, which can lead to poor performance if you’re connecting from home. It depends on the program, however: Stata and Matlab are generally okay, while graphical programs used for Python work (Spyder, PyCharm, even Firefox for running Jupyter Notebooks) tend to perform very poorly, for whatever reason.

Connections to Winstat are much more tolerant of network lag, and a connection from Winstat to Linstat is blazing fast. So, if you’re having trouble running Linux programs directly from home, try connecting to Winstat first and then using X-Win32 on Winstat to connect to Linstat.

UW-Madison Joins Dryad

The University has joined Dryad, an open-source repository for research data. Many funding agencies and publications are now requiring that data be made publicly available, and Dryad will help you meet those requirements (without using SSCC’s expensive, high-performance disk space for archival purposes). Your data might fit Dryad if:

  • You are a UW-Madison researcher with a netID.
  • The total dataset is 300GB or less.
  • The data can be made open access and:
    • Is not sensitive.
    • Does not contain personally identifiable human subject information.
  • The data can be licensed with the Creative Commons Zero waiver (CC0).

Research Data Services has more information for researchers interested in using Dryad.