Status and Service Updates
Savio HPC Services Resumed: Mon, 8/12
We are excited to report that Savio HPC services have resumed. As planned, the Berkeley IT team repaired the automated transfer switch in the data center over the weekend. The data center's power is restored, and the Savio supercluster is back online. Jobs have started running, and HPC services, including Open OnDemand and Globus, are also back in service.
Savio Outage (Data Center Repair): Starting 5 PM, Fri, 8/9
The Berkeley IT team is planning to repair the automated transfer switch in the Earl Warren Hall Data Center. The work is needed to automate the power failover to generators during future power outages. We have scheduled a Savio downtime to accommodate the repair work, which calls for a full power shutdown of the data center. The downtime will start at 5:00 PM on Friday, Aug 9th, and we expect to restore HPC services by 5:00 PM on Monday, Aug 12th. A scheduler reservation is already in place to ensure that no jobs run after 5:00 PM on Friday, Aug 9th. If you plan to submit jobs, please request a walltime short enough for them to complete before the downtime. Otherwise, your jobs will wait in the queue until the cluster is back online.
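As a rough illustration of the walltime advice above, the sketch below computes the longest walltime that would still end before the reservation begins. It is only a sketch under stated assumptions: the function name max_walltime and the 10-minute safety margin are hypothetical, the year 2024 is assumed because the announcement does not state one, and the D-HH:MM:SS output assumes the scheduler accepts Slurm-style time limits.

```python
from datetime import datetime, timedelta


def max_walltime(now: datetime,
                 downtime_start: datetime,
                 margin: timedelta = timedelta(minutes=10)) -> str:
    """Return the longest walltime, as D-HH:MM:SS, that ends before downtime_start.

    `margin` is a hypothetical safety buffer, not a site policy.
    """
    remaining = downtime_start - now - margin
    if remaining <= timedelta(0):
        raise ValueError("no time left before the downtime")
    total_seconds = int(remaining.total_seconds())
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    # Assumed year (2024); the downtime starts at 5:00 PM on Friday, Aug 9th.
    downtime = datetime(2024, 8, 9, 17, 0)
    # Hypothetical submission time, used only for illustration.
    submitted = datetime(2024, 8, 8, 9, 30)
    print(max_walltime(submitted, downtime))
```

The result only guarantees completion if the job starts as soon as it is submitted. A job that waits in the queue has less time available, so a shorter request is safer.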
Savio HPC Open OnDemand Service back online: Mon, 7/15
The Open OnDemand HPC service at https://ood.brc.berkeley.edu/ is back online. We appreciate your patience while we worked through some issues. The service has changed in two ways. We have upgraded Open OnDemand to the latest version, 3.1.7. We have also adopted CILogon for user authentication to eliminate the repetitive login problems you might have experienced. Please select the appropriate institute, in most cases the University of California, Berkeley, on the login page. The command-line tool email_lookup.sh can help clarify which institute you should log in with.
Savio HPC Services are back online except for OOD: Wed, 7/10
The Savio HPC system, now running the new Rocky Linux 8 OS, has been returned to service and is available to users, with the exception of the Open OnDemand (OOD) service. We need more time to configure OOD, so please do not attempt to use it at this time. Please note that the Savio documentation has not yet been fully updated to reflect the changes that come with Rocky Linux 8 (e.g., changes in the software stack and software module farms, and in how to compile user code). Until those updates are complete, we suggest that Savio users refer to the LBNL Science IT documentation for the Lawrencium HPC system, which is similar to, though not exactly the same as, Savio. The pages at https://scienceit-docs.lbl.gov/hpc/rocky8-migration/ , https://scienceit-docs.lbl.gov/hpc/software/software-module-farm/ , and https://scienceit-docs.lbl.gov/hpc/software/module-management/ can serve as a temporary guide to some of the changes on Savio.
Savio Downtime: Fri, 7/5 - Wed, 7/10
As you know, we have been working on upgrading the Savio operating system to Rocky 8. To complete the OS upgrade, we coordinated with the data center group on campus to combine our work with the long-awaited power work needed at the data center. The joint downtime will start at 5 PM on Friday, 7/5. We expect to restore services by the end of Wednesday, 7/10. A scheduler reservation is in place to ensure no jobs run after 5 PM on Friday, 7/5. If you plan to submit jobs, please request a walltime short enough for them to complete before the downtime. Otherwise, your jobs will wait in the queue until Savio is back online.
Savio Scratch File System is Back Up and Running: Mon, 7/1
The Savio /global/scratch parallel file system is back up and available for use.
Savio Scratch File System is Down: Mon, 6/24
The /global/scratch parallel file system started having access problems on Friday, 6/21. The investigation is underway. We apologize for any inconvenience this may cause and will keep you posted about the investigation's status.
News Articles
VIRTUAL-- UC Berkeley Cloud Meetup 028: Secure Research Data + Compute
No researcher wants to wake up to a headline screaming that their research has leaked human subject data onto the Darkweb. But it's not always clear to people how to protect their key research data, and there is much pressing work to do beyond security. Ever...
VIRTUAL-- UC Berkeley Cloud Meetup 027: Pacific Research Platform
This month's Cloud Meetup will focus on the Pacific Research Platform. Launched in 2015 with a grant from the National Science Foundation, the Pacific Research Platform sought to build a high-speed broadband freeway for research in a variety of data-intensive...
Introducing the New Research IT Website
Research IT is very excited to announce the launch of the new Research IT website. The site is now hosted on Open Berkeley to match other campus websites and comes with enhanced security and accessibility features as well as Berkeley branding.
One...