SSCC News for September 2024

Welcome to the SSCC!

We want to extend a warm welcome to all the new members of the Social Science Computing Cooperative, whether you’re a new faculty member, staff member, or graduate student who will use our resources for research; or an undergraduate taking a class that uses SSCC resources.

What is the SSCC?

The SSCC provides servers, software, training, and consulting to support researchers (and future researchers) who do statistical analysis. If you didn’t attend an orientation session, feel free to email the SSCC Help Desk, tell us about yourself, and ask what we can do for you.

What is SSCC News?

SSCC News is one of our main ways of getting information to our members. It comes out about once every two months. Please look over the email when you get it and then read the articles that will affect you.

If you’d rather not receive SSCC News, email helpdesk@ssc.wisc.edu and they can take care of that for you. If you’re no longer interested in SSCC News because you no longer use your SSCC account, they can close it for you.

Most Project Directories Will Move to ResearchDrive

Last month, the SSCC Chairs & Directors Committee voted to direct the SSCC not to buy new storage space for at least two years. Instead, as many project directories as possible will be moved to ResearchDrive, freeing up space for projects that cannot be moved. The SSCC will only create new project directories for data that cannot be stored in ResearchDrive. This will save SSCC members an estimated $150,000 over two years and more going forward, by taking advantage of the large investment the University has made in ResearchDrive.

Note that projects in Silo, projects containing administrative data, and projects used for instruction cannot be moved to ResearchDrive. If your project is, or soon will be, larger than 25TB (the amount of space ResearchDrive provides for free), we’ll talk with you about what will best meet your needs. But you should plan on most other projects moving to ResearchDrive.

Only PIs can request ResearchDrive space (see the eligibility rules), but they can make it available to graduate students and others. Graduate students who need project space should speak with their advisor or another faculty member and request space in that faculty member’s ResearchDrive. Faculty, we are counting on you to make sure all grad students can store their data in ResearchDrive.

If you have a project directory, anticipate receiving an invitation to move your data to ResearchDrive in the near future. But you are welcome to start the process today: just read Using ResearchDrive and fill out the ResearchDrive Request Form. As always, if you have questions please send them to the SSCC Help Desk.

SSCC Training

The SSCC’s statistical consultants will teach a variety of topical workshops during the rest of this semester. Highlights include Using the SSCC Linux Servers, Functions and Iteration in R, Publication-Quality Tables in Stata, and Missing Data Analysis with Blimp.

If you missed our core R and Stata classes, we’ll teach them again in January before the spring semester starts. They will be online so you don’t have to be back in Madison yet. You can also find the material for those classes in our Knowledge Base.

Coming Soon: A Database Server for Research

In the coming weeks the SSCC will put a database server into production that is specifically designed for research and research data. It will run Postgres and MariaDB. This will allow researchers to work with large data sets that are best stored as databases and queried using SQL. If you’re interested in using the new database server, reach out to the SSCC Help Desk and we’ll keep you informed.
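
If you have not queried a database from code before, the following is a minimal sketch of the general workflow in Python. It is only an illustration: it uses Python’s built-in sqlite3 module and a tiny in-memory table as a stand-in so it can run anywhere, and the table name, column names, and data are hypothetical. Connecting to the new server will require a Postgres or MariaDB driver and connection details that have not been announced yet.

```python
import sqlite3


def mean_income_by_state(conn):
    """Return (state, mean income) rows, with the averaging done by the database."""
    query = """
        SELECT state, AVG(income) AS mean_income
        FROM households
        GROUP BY state
        ORDER BY state
    """
    return conn.execute(query).fetchall()


# Stand-in database: an in-memory SQLite table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE households (state TEXT, income REAL)")
conn.executemany(
    "INSERT INTO households VALUES (?, ?)",
    [("WI", 50000.0), ("WI", 70000.0), ("MN", 65000.0)],
)

result = mean_income_by_state(conn)
print(result)
```

The point of the pattern is that the database does the grouping and averaging and returns only the summary rows, so a large table never has to be loaded into memory in full.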

Reminder: Help Desk Changes

While the SSCC Help Desk is short-handed, it will close earlier (at 3:00 PM). It will still be available by email, voice mail, or appointment. See the Help Desk web page for more information.

Linux File Server Issues Resolved

Over the summer, we had recurring issues with the Linux file server that affected everything done in Linux. It was one of the longest-running problems in the SSCC’s history and we owe you an explanation.

In late spring, we put a new and powerful Linux file server into production. The new server had a higher default for the maximum number of threads (parallel processes) the file server can use. Given that the Slurm cluster can have hundreds of jobs trying to write files at the same time, this seemed like a good thing. What we didn’t know is that there is an obscure and apparently low-priority bug in Linux where file server threads can interfere with each other and cause the server to hang. Under “normal” circumstances the probability of this occurring is low, but the probability increases non-linearly with the number of threads being used.
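
To give a feel for what “increases non-linearly” can mean, here is a toy model in Python. It is not a description of the actual bug: it simply assumes, for illustration, that every pair of threads has the same small, independent chance of interfering in a given period. The function name and the per-pair probability are made up.

```python
def hang_probability(threads, p_pair=1e-6):
    """Toy model: chance that at least one pair of threads interferes.

    Assumes each pair interferes independently with probability p_pair
    (a hypothetical value chosen only for illustration).
    """
    pairs = threads * (threads - 1) // 2  # number of distinct thread pairs
    return 1 - (1 - p_pair) ** pairs


for n in (16, 64, 256):
    print(n, hang_probability(n))
```

Under this assumption the number of pairs grows roughly with the square of the thread count, so quadrupling the threads raises the risk by much more than a factor of four.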

At first we thought the problem was particular jobs that read and write lots of files: the problem would only occur when such jobs were running and went away when they were killed, but it was never consistent. (In reality the jobs only mattered in that they prompted the file server to use more threads.) Then we tried giving the server more resources: more cores, more memory, more network bandwidth…and more threads (which only made things worse). We developed a variety of new tools for monitoring the servers and detecting problems. The whole process was slow because the problem only occurred every few days at most, so it took a long time to test theories or see if the latest change did any good. Meanwhile, the system administrators were monitoring the servers and clearing problems when they occurred, including on nights and weekends. Finding a solution to this problem was their top priority.

Finally, we found the bug report that explained what we were seeing and immediately reduced the maximum number of threads to what it had been on the old file server. That was 27 days ago, and the problem has not recurred, so we’re reasonably confident it’s fixed. The file server will sometimes be slow when it’s being heavily used, but it no longer hangs. We apologize for the trouble this issue caused, and you can rest assured no one was more frustrated by it than the system administrators involved.