Improving campus services for working with sensitive data

Strands of DNA

Increasingly, researchers in a wide range of fields at UC Berkeley are applying novel data science approaches to very large sensitive and restricted data sets. Working closely with Berkeley Research Computing (BRC), the Research Data Management (RDM) Program has helped dozens of faculty, students, and postdocs who work with sensitive data by providing consulting expertise in disciplines including the biological sciences, public health, social welfare, demography, and computer science. Combining data management support with computation support helps researchers integrate data management and curation best practices into their larger research workflows while protecting their data.

Getting the sources you need for your text mining project

Text mining sources flowchart

You have a great research question that you want to answer with text data mining (TDM) methods, and you've got some Python under your belt or you've decided to see what you can learn from a browser-based tool like Voyant. You're ready to get started on a computational text analysis project. But wait!

Where do you get the texts?

Holistic approaches to Institutional Repositories at Open Repositories 2018

Open Repositories 2018

Open Repositories, an annual international conference that brings together users, developers, and librarians to discuss open digital repository platforms for institutional data and scholarship, was held in Bozeman, Montana, from June 4 to 8. Anna Sackmann, a librarian and RDM consultant, attended to learn how other institutions are incorporating repository deposit into the researcher workflow and to present related local efforts at UC Berkeley.

Guides to making research software reproducible, citable, and well-documented

Computational research workflow diagram

Preserving and maintaining research software is a challenge for researchers and academic libraries alike. In my role as a CLIR postdoctoral fellow in software curation, I recently discussed the technical side of that challenge in a UC Berkeley Library Update, Research Software in Practice, posted in May.

UC DLFx conference tackles the new frontier of data management

Vessela Y Ensberg (UC Davis), Emily Lin (UC Merced), Ho Jung Yoo (UC San Diego), Amy Neeser (UC Berkeley) at UCDLFx 2018

The inaugural 2018 University of California Digital Library Forum (UC DLFx), "Building the UC Digital Library: Theory and Practice," took place February 27th to March 1st at UC Riverside. The conference brought together librarians, digital technology experts, educators, policy-makers, and research support staff from the UC campuses and the California Digital Library. Participants discussed and explored how libraries and research support departments are engaging with the constantly changing and challenging world of data and digital scholarship.

Using rclone to transfer data to bDrive

RClone Browser screen shot

The bDrive repository offers everyone at UC Berkeley unlimited storage, strong search capabilities, and mobile access. This storage is an important data management resource for research teams. The standard web client, however, does not always work well with very large files, large numbers of files, or deep folder structures. The web client’s connection is slow and can disconnect partway through a lengthy transfer.
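
As a rough illustration of the command-line alternative, the sketch below assembles an `rclone copy` command in Python. It assumes rclone is installed and that a remote has already been set up with `rclone config`; the remote name `bdrive` and both paths are hypothetical placeholders, not names taken from the article. The command defaults to a dry run, so nothing is transferred until that flag is dropped.

```python
import shlex


def build_rclone_copy(source, remote, dest_path, transfers=4, retries=3, dry_run=True):
    """Return the argument list for an `rclone copy` invocation.

    `remote` is the name of a remote previously created with `rclone config`
    (hypothetical here); `dest_path` is a folder path inside that remote.
    """
    cmd = [
        "rclone", "copy", source, f"{remote}:{dest_path}",
        "--transfers", str(transfers),  # number of files copied in parallel
        "--retries", str(retries),      # re-run the whole sync if some files fail
        "--progress",                   # show live transfer statistics
    ]
    if dry_run:
        cmd.append("--dry-run")         # report what would be copied, copy nothing
    return cmd


if __name__ == "__main__":
    # Hypothetical local folder, remote name, and destination folder.
    cmd = build_rclone_copy("/data/run1", "bdrive", "project/run1")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Because `rclone copy` skips files that already exist unchanged at the destination, rerunning the same command after a dropped connection picks up roughly where the transfer left off rather than starting over.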