Status and Service Updates

View all status and service updates below 

Issue with job emails: Thurs, 7/28

We have received a number of tickets about not receiving emails when jobs finish. We are looking into this and will provide an update when it is resolved.

/home & /clusterfs are working: Wed, 5/18

/home and /clusterfs are now back to normal. Thank you for your patience while we resolved this issue.

clusterfs degraded performance: Tues, 5/17

We are aware of the degraded performance on home directory and condo storage under /clusterfs and are working to fix it by the end of today.

Working on Viz Node Issues: Wed, 05/11

The Viz node has been experiencing issues since our scheduled downtime last week. This issue also affects Matlab OOD access. We are working to identify and correct this issue, and will keep you updated as this process continues.

Data Transfer Node + Globus Back Online: Tues, 4/26

The DMZ routes on campus are back. Our data transfer node is back online and the Globus service has resumed.

Data Transfer Node Down: Mon, 4/25

The data transfer node is offline again, likely because the DMZ route is down on campus. We will send an update when it is back online.

Scheduled Savio Downtime: Mon, 5/2

We will power down Savio from 8am to 5pm on Monday, 5/2 to apply vendor patches to the SLURM scheduling system. If you submit jobs before the downtime, please request a walltime that lets them complete before it begins; otherwise they will wait in the queue until the cluster is back online.
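As a rough illustration of choosing a walltime that fits before the downtime, the sketch below computes the longest limit that still ends before 8am on 5/2 and formats it in SLURM's D-HH:MM:SS notation. The function name, the 15-minute safety margin, and the year 2022 are assumptions made for the example, not part of the announcement.

```python
from datetime import datetime, timedelta
from typing import Optional


def max_walltime_before(now: datetime,
                        downtime_start: datetime,
                        margin: timedelta = timedelta(minutes=15)) -> Optional[str]:
    """Return the longest walltime (D-HH:MM:SS) that ends before the downtime.

    The margin is a hypothetical safety buffer. Returns None when no time
    is left before the downtime begins.
    """
    remaining = int((downtime_start - now - margin).total_seconds())
    if remaining <= 0:
        return None
    days, rest = divmod(remaining, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


# Assumed dates: downtime starts Monday, 5/2 at 8am (year assumed to be 2022).
downtime = datetime(2022, 5, 2, 8, 0)
submitted = datetime(2022, 4, 30, 20, 0)
limit = max_walltime_before(submitted, downtime)
print(limit)  # prints 1-11:45:00
```

The resulting string can be passed as the value of the `--time` option when submitting a job. A job that requests more than this would be expected to wait in the queue until the cluster is back online.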

No office hours on Wed, 3/23 - Thurs, 3/24

Research IT will not be holding our regular office hours during the week of Spring Break, 3/21 - 3/25. Please get in touch with us at research-it@berkeley.edu for any help in the meantime.

Savio Cluster is back in service: Fri, 1/28

We have solutions in place to fix the /global/scratch file system, so the Savio/Vector clusters are back in service. Please submit a ticket at brc-hpc-help@berkeley.edu if you have any questions or experience continued issues. Thank you for your patience and happy computing!

Working on Scratch Instability: Thurs, 01/27

The global scratch file system on Savio has not been stable since last night. Access to certain folders/files may be sporadic or hanging, and some file operations might give I/O errors. We’re doing emergency maintenance and will keep you updated as we resolve it.

Savio Scratch File System is Back Online: Tues, 1/25

The global scratch file system is back in service. Jobs have started running on Savio. Thank you very much for your understanding and patience while we restored the service.

Working on Scratch Instability: Tues, 01/25

The global scratch file system on Savio has not been stable since about noon today. Access to certain folders/files may be sporadic or hanging, and some file operations might return I/O errors. We are working on this issue and will keep you updated as we resolve it.

No office hours on 1/5 or 1/6

Research IT is participating in a "soft" curtailment the week of Monday, Jan. 3 through Friday, Jan. 7, 2022. We will not be holding office hours during this week.

Closed for curtailment: Thurs, 12/23 - Mon, 01/03

Research IT will be participating in the campus-wide curtailment program from Thursday, Dec. 23, 2021 through Monday, Jan. 3, 2022. Many facilities and services will be closed or operating on modified schedules during this time.

Data Transfer Node is back online: Thurs, 12/2

We are glad to inform you that DTN, the designated Data Transfer Node, is back online. We apologize for any inconvenience this caused.

Data Transfer Node (DTN) is down: Thurs, 12/2

The Data Transfer Node (DTN) is currently down. We plan to go on site around noon to get it fixed and then will post an update.

No office hours on 11/23 or 11/24

Research IT is participating in the campus-wide curtailment during the week of Thanksgiving from Monday, Nov. 22 through Friday, Nov. 26, 2021. We will not be holding office hours this week.

Data Transfer Node is back in service: Tues, 10/26

We are glad to inform you that DTN, the designated Data Transfer Node dtn00.brc.berkeley.edu, is back online. We apologize for any inconvenience over the past few days.

Data Transfer Node is down: Mon, 10/25

While we work to restore the service, you can log in to dtn01.brc.berkeley.edu to transfer data for now. Please contact us with any questions.

Open OnDemand access via eduroam is working: Thurs, 10/21

Users should no longer experience the timeout issues that had been occurring in previous weeks.

New condo storage pricing announced

Condo storage purchase is now available in chunks of 112TB at the estimated cost of $5750. Please contact us if you would like to purchase storage for your research data.

Savio3_gpu partition open to FCA users

The savio3_gpu partition is open to Faculty Computing Allowance users. You can submit jobs to a subset of GPU nodes on Savio. Read further for details on how to use GPU partitions.

Open OnDemand downtime: Fri, 10/15 at 10am

The campus eduroam network currently cannot route to OOD. Please use AirBears2, a full-tunnel VPN, or a wired Ethernet connection. In order to focus our support on Open OnDemand, the JupyterHub server is being officially taken offline. We plan to take a short downtime of approximately 30 minutes, starting at 10:00 AM on Friday, October 15, to complete the transition.

Savio Scratch File System is Back Online: Thurs, 9/30

The global scratch file system is back in service. Jobs have started running on Savio. Thank you very much for your understanding and patience while we restored the service.

Working on scratch instability: Tues, 9/28

The global scratch file system on Savio has not been stable since this morning. Access to certain folders/files may be sporadic or hanging, and some file operations might return I/O errors. We are working on this issue and will keep you updated as we work out solutions.

Savio back online: Mon, 9/20, 10:15am

We have resolved the issue on the scratch parallel file system. The work is complete and jobs have started running on Savio. Thank you for your patience.

Savio scheduled downtime: Mon, 9/20, 8am

A small number of users on the new scratch file system have been impacted by a file system bug that prevents the creation of new files. We plan to have a four-hour downtime to resolve this issue.

Savio scheduled downtime: Fri, 8/27, 9-11am

In order to make a minor change to the current structure of the /global/scratch file system, we are scheduling a brief downtime to relocate all user directories from /global/scratch/[username] to /global/scratch/users/[username].
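Scripts that hard-code the old layout will need their paths updated after this change. The following is a minimal sketch of the mapping from /global/scratch/[username] to /global/scratch/users/[username]; the function name and the example username are hypothetical, and the sketch assumes every top-level entry other than "users" under /global/scratch is a user directory, which may not hold for shared or project folders.

```python
from pathlib import PurePosixPath

OLD_ROOT = PurePosixPath("/global/scratch")
NEW_ROOT = PurePosixPath("/global/scratch/users")


def relocate_scratch_path(path: str) -> str:
    """Map an old-style scratch path to the new users/ layout.

    Paths outside /global/scratch, the scratch root itself, and paths
    already under /global/scratch/users are returned unchanged.
    """
    p = PurePosixPath(path)
    try:
        relative = p.relative_to(OLD_ROOT)
    except ValueError:
        return path  # not on the scratch file system
    if not relative.parts or relative.parts[0] == "users":
        return path  # scratch root, or already in the new layout
    return str(NEW_ROOT / relative)


# "alice" is a made-up username used only for illustration.
print(relocate_scratch_path("/global/scratch/alice/run1/out.dat"))
# prints /global/scratch/users/alice/run1/out.dat
```

Applying the function twice gives the same result as applying it once, so it is safe to run over a list of paths that mixes old and new forms.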

Savio back online: Thurs, 8/12, 5pm

Savio is now back online, and we’re pleased to announce the availability of the new /global/scratch file system, which should alleviate the space shortage issues. Please migrate any critical data to the new system.

Scheduled Savio downtime: Thurs, 8/12, 9am

We are excited to announce an upcoming downtime on Thursday, August 12 at 9am to complete the roll-out of the new /global/scratch file system, which offers a significant upgrade in capability. We ask that you begin migrating any critical data to the new file system and leave any unneeded data behind.

Savio scheduled downtime: Tues, 7/20, 9am

We need to stop scheduling new SLURM jobs for a short period of time to migrate the backend database and implement support for allocation management inside the new MyBRC user portal. Jobs will resume running when we complete the scheduled work.

Savio back online: Tues, 4/20, 3:30pm

The Savio cluster is back online as planned, and jobs have started running. Please contact us via email or at our drop-in office hours if you have any questions.

Savio scheduled downtime: Tues, 4/20, 9am

There is a planned downtime from 9am until the end of the day on Tuesday, April 20, 2021 to prepare space for the new global scratch parallel filesystem.