Note: This describes an RTL project proposed for FY23 as a OneIT Priority Initiative. As the project is defined further in collaboration with partners and stakeholders, this project page will be updated. Contact Research IT for more information: research-it@berkeley.edu
Project Description and Goals
In collaboration with bIT (the Storage & Backup and bConnected teams), RTL will develop a framework and methodology for working with researchers to understand and express their data storage needs. Together, the framework and methodology will provide a systematic way to characterize a research data use case and its requirements. This allows for “one conversation held multiple times” (rather than a series of disparate conversations) and advances the campus storage strategy effort as it relates to research data.
Our goal is to establish a common lens through which we can understand storage characteristics and assess risk to research data, allowing both campus and investigators to place a value on that data.
Project Charter and Schedule
Under development
FY23 Key Measures of Success
- Identify a set of fundamental research data storage characteristics.
- Develop a mechanism for quantifying the importance of each storage characteristic to help researchers select among storage options.
- Align with and support the data storage options dashboard currently being developed by the bConnected team.
- Develop a requirements gathering template that produces a systematic and sharable characterization of the risks and requirements for research data storage.
- Inform other service teams of this project’s outcomes.
- Provide input into the development of storage and backup services, both on campus and system-wide (e.g., the UC Research Data Backup RFP).
Characteristics of research data storage
An important outcome of this project will be a standardized template for characterizing the risks and requirements of research data storage. The template will give campus a comparable, holistic understanding of individual research projects as well as of the broader needs across campus. This framework and the resulting methodology (e.g., a requirements gathering template) will necessarily be multi-dimensional and will likely be expressed as a data visualization, such as a radar chart (also called a spider diagram). An initial draft of the dimensions to be included follows:
- Data security
- Availability
- Access
- Speed
- Durability
- Integrity
- Provenance
- Disaster recovery (e.g., hardware failure, ransomware attack, fire/earthquake/flood, theft)
- Collaboration
- Sharing
- Publication
- Preservation
- Long-term (“cold”) storage
- Retention/deletion
- Fit with research workflows
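As one illustration of how the importance of each characteristic might be quantified, the sketch below rates a researcher's needs and two storage options on a subset of the draft dimensions and computes a weighted fit score. It is a minimal sketch under stated assumptions, not the project's methodology: the 0-5 rating scale, the scoring formula, the names `Profile` and `weighted_fit`, and both example storage options and their ratings are hypothetical.

```python
from dataclasses import dataclass

# A subset of the draft dimensions listed above, kept short for illustration.
DIMENSIONS = ["Data security", "Availability", "Speed", "Durability", "Sharing"]


@dataclass
class Profile:
    """Ratings per dimension on an assumed 0-5 scale (0 = not needed/offered)."""
    name: str
    ratings: dict[str, int]


def weighted_fit(needs: Profile, option: Profile) -> float:
    """Return a score in [0, 1] for how well `option` covers `needs`.

    Assumed formula: each dimension is weighted by how strongly the
    researcher needs it, and an option earns no extra credit for
    exceeding the need. Dimensions missing from a profile count as 0.
    """
    total_need = sum(needs.ratings.get(d, 0) for d in DIMENSIONS)
    if total_need == 0:
        return 1.0  # nothing is required, so any option fits
    covered = sum(
        min(option.ratings.get(d, 0), needs.ratings.get(d, 0))
        for d in DIMENSIONS
    )
    return covered / total_need


# Hypothetical researcher needs and storage options (illustrative values only).
needs = Profile("Example use case", {
    "Data security": 5, "Availability": 3, "Speed": 1,
    "Durability": 4, "Sharing": 2,
})
option_a = Profile("Hypothetical option A", {
    "Data security": 5, "Availability": 3, "Speed": 2,
    "Durability": 2, "Sharing": 1,
})
option_b = Profile("Hypothetical option B", {
    "Data security": 2, "Availability": 4, "Speed": 5,
    "Durability": 3, "Sharing": 5,
})

# Rank the options from best to worst fit for this use case.
ranked = sorted(
    [option_a, option_b],
    key=lambda o: weighted_fit(needs, o),
    reverse=True,
)
for option in ranked:
    print(f"{option.name}: {weighted_fit(needs, option):.2f}")
```

The same per-dimension ratings could also feed a radar chart, with one axis per dimension and one polygon each for the researcher's needs and for every candidate storage option.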