Improve Services for Researchers Working with Data - Proposed OneIT Priority Initiative for FY23


Note: This page describes an RTL project proposed for FY23 as a OneIT Priority Initiative. It will be updated as the project is further defined in collaboration with partners and stakeholders. For more information, contact Research IT: research-it@berkeley.edu

Project Description and Goals

In collaboration with bIT (Storage & Backup, bConnected teams), RTL will develop a framework and methodology for working with researchers to understand and express their data storage needs. Together, these tools will provide a systematic way to characterize a research data use case and its requirements. This allows for “one conversation held multiple times” (rather than a series of disparate conversations) and advances the campus storage strategy effort as it relates to research data.

Our goal is to establish a common lens through which we can understand storage characteristics and assess risk to research data, allowing both campus and investigators to place a value on that data. 

Project Charter and Schedule

Under development

FY23 Key Measures of Success

  • Identify a set of fundamental research data storage characteristics.

  • Develop a mechanism for quantifying the importance of each storage characteristic to help researchers select among storage options.

  • Align with and support the data storage options dashboard currently being developed by the bConnected team.

  • Develop a requirements-gathering template that produces a systematic and shareable characterization of the risks and requirements for research data storage.

  • Inform other service teams of this project’s outcomes.

  • Provide input into the development of storage and backup services, both on campus and system-wide (e.g., the UC Research Data Backup RFP).

Characteristics of research data storage 

An important outcome of this project will be a standardized template for characterizing the risks and requirements for research data storage. Because every project will be described along the same dimensions, campus will be able to compare individual research projects and build a holistic understanding of the broader needs across campus. This framework and the resulting methodology (e.g., a requirements-gathering template) will necessarily be multi-dimensional and will likely be expressed in the form of a data visualization, such as a radar chart (also known as a spider diagram). An initial draft of the dimensions to be included follows:

  • Data security

  • Availability

  • Access

  • Speed 

  • Durability

  • Integrity

  • Provenance

  • Disaster recovery (e.g., hardware failure, ransomware attack, fire/earthquake/flood, theft)

  • Collaboration

  • Sharing

  • Publication

  • Preservation

  • Long-term (“cold”) storage

  • Retention/deletion

  • Fit with research workflows
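
To illustrate how the importance of each characteristic might be quantified to help compare storage options, here is a minimal Python sketch. It is not the project's method, which is still under development. The dimension names follow the draft list above. Everything else is an assumption made for illustration: the 0 to 5 rating scale, the importance-weighted average, the function names, and the placeholder option names.

```python
# Illustrative sketch only. The dimension names follow the draft list;
# the 0-5 scale, the scoring formula, and all function names are
# hypothetical and are not part of the project's framework.

DIMENSIONS = [
    "Data security", "Availability", "Access", "Speed", "Durability",
    "Integrity", "Provenance", "Disaster recovery", "Collaboration",
    "Sharing", "Publication", "Preservation", "Long-term (cold) storage",
    "Retention/deletion", "Fit with research workflows",
]
MAX_RATING = 5  # assumed scale: 0 (not relevant) to 5 (critical)


def normalize(profile):
    """Return a rating for every dimension, using 0 for unrated ones.

    Raises ValueError for unknown dimension names or out-of-range ratings.
    """
    unknown = set(profile) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    for name, rating in profile.items():
        if not 0 <= rating <= MAX_RATING:
            raise ValueError(f"{name}: rating {rating} is outside 0-{MAX_RATING}")
    return {name: profile.get(name, 0) for name in DIMENSIONS}


def fit_score(importance, capability):
    """Importance-weighted average of an option's capability, from 0.0 to 1.0."""
    needs = normalize(importance)
    offers = normalize(capability)
    total_importance = sum(needs.values())
    if total_importance == 0:
        return 0.0  # no stated needs, so nothing to score against
    weighted = sum(needs[d] * offers[d] for d in DIMENSIONS)
    return weighted / (MAX_RATING * total_importance)


def rank_options(importance, options):
    """Return (option name, score) pairs, best fit first."""
    scores = {name: fit_score(importance, cap) for name, cap in options.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical project needs and placeholder storage options.
project_needs = {"Data security": 5, "Durability": 4, "Speed": 1}
storage_options = {
    "option_a": {"Data security": 5, "Durability": 5, "Speed": 2},
    "option_b": {"Data security": 2, "Durability": 3, "Speed": 5},
}
for name, score in rank_options(project_needs, storage_options):
    print(f"{name}: {score:.2f}")
```

The same normalized profiles could also be plotted as a radar chart, with one axis per dimension. A weighted average is only one possible design choice: it lets a strong rating on one dimension offset a weak one on another, which may not be appropriate for hard requirements such as data security.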