Botanist finds a bioinformatic home in Savio

October 5, 2017

Before her research led to an appointment as a Research Botanist at The University and Jepson Herbaria at UC Berkeley, Ingrid Jordon-Thaden graduated from the University of Nebraska-Lincoln and the University of Heidelberg, Germany. When she left Heidelberg, she took with her not only a PhD, but also the beginnings of her research on the genetic history of the genus Draba, in the mustard family.

Draba oligosperma, the species that is Jordon-Thaden’s primary focus, is unusual within its genus because it has one of the widest distributions at high altitudes, ranging from Alaska to Arizona and from California to Wyoming. Jordon-Thaden is especially interested in studying Draba oligosperma when it exhibits polyploidy (more than two complete sets of chromosomes) and apomixis (asexual reproduction). By combining population genetics and phylogenetics in her research, Jordon-Thaden studies each plant population’s historical context in hopes of discovering the role of these changes in the adaptation of Draba. What advantages, for example, can a plant with multiple ploidy levels gain in its environment? How might apomictic populations remain stable with little to no genetic variation? Are there environmental factors that could induce apomictic behavior in plants? Can a plant be forced to be apomictic in the lab? These are just some of the questions Jordon-Thaden fields in her study of Draba, which she hopes can have a significant impact on plant breeding practices in the agricultural industry.

Finding a Bioinformatic Home

Having inherited the Draba project from Professor Dr. Marcus A. Koch and Dr. Ihsan Al-Shehbaz, Jordon-Thaden continued her investigation after graduating, moving between research positions at institutions across the United States, from the University of Florida to Bucknell University and UC Berkeley. However, as she brought her work from university to university, Jordon-Thaden ran into a problem: “[Once my appointment at Florida ended,] I was stuck without a bioinformatic home,” she said.

As she tried to find a new home for her computational research, Jordon-Thaden discovered that there were no university or national resources that fit her needs. Most university facilities could not allow her to use their HPC clusters because parts of her work were generated or paid for by other institutions. National resources, like iPlant Collaborative (now CyVerse) and CIPRES, were unable to accommodate her large data files and the high memory usage her algorithms require, or had insufficient wall-clock limits (i.e., her jobs needed to run for more time than the maximum permitted). Jordon-Thaden had no choice but to shelve her research on Draba.

Getting Back on Track

In January 2016, however, Jordon-Thaden was able to restart her research after finding out about Research IT’s Berkeley Research Computing Program (BRC) and its services. As the lab manager for newly hired Assistant Professor Carl Rothfels of Integrative Biology and the University and Jepson Herbaria, Jordon-Thaden was able to benefit from his access to a free Faculty Computing Allowance (FCA) to run computations on Savio, Research IT’s high-performance computing cluster. “We were poking around, trying to look for places to do our bioinformatics, and [Dr. Rothfels] found out that his lab members could have an account [using his FCA],” she explained. “So as soon as we could, we arranged it so I could have access.”

To examine the genetic history of Draba oligosperma, Jordon-Thaden used Savio to perform multiple sequence alignments on the raw data output from Illumina Next-Generation Sequencing, aligning the data based on features in DNA that are similar across multiple specimens of the plant. Running as many as 800 individual plant samples at a time, Jordon-Thaden’s computations typically took about 2 days on the cluster, falling within Savio’s wall-clock limit of 72 hours. “The Savio server usually works fine for most of my analyses,” Jordon-Thaden said.

To analyze and sort the Draba plant samples, Jordon-Thaden runs parsimony analyses and maximum likelihood estimation on the data, creating trees that describe the relationship of the individual plants to each other. She then uses these trees to study each population’s genetic history, genetic diversity, and even migration routes. Jordon-Thaden estimated that even when using Savio continuously and running the same alignments multiple times, she never used more than 30-50% of the FCA in her first year on Savio. BRC domain consultants were able to help make her data processing on the cluster more efficient, further extending the quantity of research supported by Professor Rothfels’ FCA. At this point, “the FCA can easily support 4-6 people doing work like mine,” she said.

Looking Forward

Having used Savio for over a year and a half now, Jordon-Thaden has been able to generate results from this data for 6 distinct publications. Currently, BRC is continuing to assist Jordon-Thaden as she prepares for the next step in her journey beyond Berkeley once her appointment ends in December. As she looks for a new bioinformatic home, Jordon-Thaden has begun testing Jetstream, using an allocation provided by BRC IT Architect and XSEDE Campus Champion Aaron Culich, as a new option for continuing her Draba research. “I’ve started the process of getting access to [Jetstream] and getting my own XSEDE account,” she said. “Now that I’ve met with [Aaron], I know what I need to do.” She has now accepted a new position as the Director of Greenhouses and Botanic Garden in the Department of Botany at the University of Wisconsin-Madison. With the help of Research Data Management consultants, she is already coordinating with IT at UW-Madison to transfer her data seamlessly from Savio to her new HPC home.

Whether you are transitioning into or out of Berkeley, Research IT is happy to connect you to computational and data management resources that fit your project’s requirements, at both the university and national level. To get started, send us an email at