AI in Research IT
Research IT provides a range of machine learning and AI resources to support researchers. We primarily support Berkeley’s high-performance computing clusters but can also help researchers navigate cloud-based solutions and national research infrastructure. We provide consulting services to help researchers access the GPU resources, software environments, and technical support for AI and ML. Whether you're training models, running inference, or exploring LLM-based workflows, Research IT can help direct you to the right resources for your project.
GPU resources
A variety of GPU resources are available on campus, both through Research IT and beyond. For the most demanding compute jobs, researchers can draw on existing campus contracts with cloud providers.
Savio
Savio is a campus-wide cluster designed for state-of-the-art high-performance computing, including AI and machine learning (ML) research.
-
Savio offers a Linux-based computing environment with a high-speed parallel filesystem for data and with CPU and GPU hardware for computing
-
View available hardware
-
GPU access on Savio is limited, especially for the newest GPUs
-
All Principal Investigators at UC Berkeley are eligible for a Faculty Computing Allowance, which provides access to almost all resources on Savio
-
Computing with GPUs for machine learning applications is supported
-
Custom Python environments for PyTorch and TensorFlow are available
-
Using containers with GPUs is also supported
-
Web-based Open OnDemand (OOD) allows you to access HPC resources just by logging in via your browser with your CalNet ID
-
A much easier way to access resources on Savio, especially for new HPC users
-
It brings together file access, command-line access, and job tracking, and allows for interactive jobs via Jupyter, RStudio, and more
-
Savio supports VS Code, a powerful integrated development environment which supports AI-assisted coding agents
-
Savio supports connecting with the VS Code remote extension via SSH, as well as a web-based alternative (Code Server)
-
For example, the GitHub Copilot extension in VS Code provides AI coding support and is free for students and faculty via GitHub Education
-
See documentation on VS Code and on configuring an SSH connection to Savio for VS Code
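As an illustration of the GPU and PyTorch support described above, the following is a minimal sketch of a check you might run inside a job or interactive session to confirm that a GPU is visible. It assumes PyTorch may or may not be installed in the active environment. The function name `describe_gpu` is hypothetical and not part of any Savio tooling.

```python
import importlib.util


def describe_gpu() -> str:
    """Return a one-line summary of the GPUs visible to PyTorch, if any."""
    # Assumption: PyTorch may or may not be installed in the active environment,
    # so check for it before importing.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"

    import torch

    # On a CPU-only node (or without a GPU allocation) CUDA is not available.
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA GPU is visible"

    # List each visible device by name.
    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return "Visible GPUs: " + ", ".join(names)


if __name__ == "__main__":
    print(describe_gpu())
```

Running this at the start of a job is a cheap way to catch a missing GPU allocation or a CPU-only environment before a long training run begins.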
Secure Research Data + Compute (SRDC) Platform
The Secure Research Data and Computing (SRDC) platform provides high-performance computing infrastructure designed for research involving sensitive data.
-
SRDC is located in a secure campus data center with personnel who enforce privacy and security requirements
-
The SRDC HPC cluster includes CPU computing options as well as nodes with GeForce GTX 1080 Ti GPUs
-
SRDC is accessible to all qualified UC Berkeley faculty/PIs under their Faculty Computing Allowance upon consultation
-
Researchers with specific needs may purchase additional nodes to be added to the system
-
Linux and Windows virtual machine environments are also available for workflows requiring a GUI or Windows-specific software
-
SRDC storage allocations are based on project need
-
For projects which involve highly sensitive (P4) data, SRDC offers an alternative to Savio with dedicated security monitoring and compliance support.
BearBorg
Research IT has launched an early beta of our new AI programming environment: BearBorg
-
Combines client-side programming tools with a multi-model, multi-provider LLM proxy
-
Features a JupyterLab environment with JupyterAI for LLM assistance, which can use the code you're working on for context
-
This can help you generate snippets, debug, or brainstorm ideas
-
A more powerful way to work with LLMs than copy-pasting code
-
Beta testers can access JupyterAI, check usage details, and view the documentation pages
-
Integrating AI with your development environment allows for a more collaborative research process where the AI understands the specific project's history and logic.
-
Easy to get set up. Log in with your CalNet ID and you can start chatting or coding without the complexity of identity management. No API keys or separate accounts with each provider needed
-
BearBorg helps standardize the AI environment. By using a managed proxy and versioned containers, researchers can reliably understand the types of responses generated by specific models.
-
What's Next: building a scalable architecture designed to eventually handle the most sensitive (P4) research data with total privacy, expanding model choices, and adding more specialized file-processing tools to help refine data from images, PDFs, and more.
National Research Clusters
Researchers whose computing needs exceed campus resources may consider applying for access to a national HPC cluster. These federally supported platforms provide access to large-scale GPU clusters, often at no cost through allocation-based systems or educational programs.
-
ACCESS (Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support): The ACCESS program can provide researchers with allocations on one of several national HPC clusters. Researchers can request compute time on GPU-equipped clusters across participating institutions to support a range of AI/ML workloads. (ACCESS replaced the former XSEDE program.)
-
NAIRR (National AI Research Resource): The NAIRR pilot program focuses on democratizing access to AI research infrastructure. NAIRR provides access to compute resources, datasets, and educational tools for AI research. The program also offers classroom-focused GPU access for instructors teaching AI/ML courses.
-
NRP/Nautilus (National Research Platform): Nautilus is a distributed cluster offering GPU resources through a Kubernetes-based environment.
-
NRP also offers Jupyter-based GPU access
-
SDSC/HPC@UC: The San Diego Supercomputer Center (SDSC) coordinates UC-wide HPC resources on its Expanse cluster, which includes GPU nodes for AI and ML applications.
Research IT consultants can provide guidance on which resource is the best fit for your computational needs and assist with the allocation request process.
Cloud
Cloud computing offers a flexible and highly scalable way to access GPUs and other computing resources for researchers who have available funding
-
Research IT offers cloud computing consulting
-
Research IT consultants can discuss and recommend cloud computing and data storage options
-
We can also advise on whether an existing campus service may be able to meet your needs
-
UC Berkeley has existing cloud computing contracts with Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure
-
There is special pricing negotiated by UCOP and UC Berkeley
-
All three providers have GPU options available
-
FAQ documents are available for each of the three major cloud services