Research Computing Newsletter: August 2016
Fall Semester! The Research Computing team is gearing up for a new academic year and looking forward to interacting with the research community. Be on the lookout for on-campus training opportunities this year on using RedCat (our high-performance computing cluster), Hpcdata (our experimental Hadoop cluster), the Secure Research Environment, data management and storage options, and getting the most out of your file transfers, as well as other training opportunities through XSEDE and the Ohio Supercomputer Center (OSC).
It is our pleasure to announce that Daniel Balagué Guardia has joined our Research Computing team. In his role, Daniel is responsible for extending Research Computing services to our community through active collaboration, and by architecting solutions that leverage the university's investment in its cyberinfrastructure. Daniel holds a Ph.D. in mathematics from the Universitat Autònoma de Barcelona and has held a variety of positions, most recently as an Assistant Adjunct Professor in the Program in Computing (PIC) in the math department at UCLA.
We’re also hiring! Research Computing has an immediate opening for a Cyberinfrastructure Engineer with a heavy networking and systems concentration. The engineer will work directly with research groups, including our Great Lakes Energy Institute, Center for Membrane and Structural Biology, Comprehensive Cancer Center, Institute for Computational Biology, Center for Imaging Research, and Center for Computational Imaging and Personalized Diagnostics, along with our Network Engineering and Security group. The goal is to ensure optimal use of our campus cyberinfrastructure and optimal end-to-end network performance between our campus and external collaborators.
If you are interested, please feel free to contact Roger Bielefeld directly. The official posting is listed under "position 5392" at the university's employment page, http://www.case.edu/finadmin/humres/employment/career.html, where you can also submit an application.
Research Data Storage
Reliable access to data sets is a critical component of the research process. Research Computing leverages the highly scalable Dell Fluid File System network-attached storage appliance to provide best-of-breed technology to the campus. The result? Dynamic storage designed to constantly adapt to our performance needs, helping minimize cost, time, and risk as you focus on moving your lab research forward.
We understand that your lab data is one of our university’s most valuable assets, which makes supporting and protecting the systems that store your data a high priority. Our Research Data Storage (RDS) service leverages our [U]Tech network operations center (NOC) and a Dell service called Copilot, which proactively monitors our storage systems and provides real-time event notification to Research Computing and our Copilot service support team. With its PhoneHome technology, Copilot is able to gather and review data to provide system recommendations and proactively resolve issues before they become problems. This helps Research Computing provide a highly available system that can be used when you need it the most.
When thinking about using storage, it is important to consider the lifecycle of your data. Example questions that you should ask yourself include: How long do you need to keep your data? What kind of access do you need? Are your collaborators outside the university? Have you signed a data use agreement, or are you bound by federal or state rules and regulations for storing and processing data? Do you need high-speed access to the data? What is the change rate for the data being stored? Are you required to retain the data for journal publication or project closeout?
These questions are part of a project Research Data Management plan, which covers planning, collecting, organizing, managing, storing, securing, backing up, preserving, and sharing your data. Research Computing provides information on the importance of data management. We can help you design a plan and storage solution based on your needs through our consultation services. Data management plans are now required for many new grant opportunities and are actively reviewed for elements describing the preservation and sharing of, and access to, your data.
NOTE: A complimentary service is also provided by research service librarians at the Kelvin Smith Library (KSL). KSL does not provide storage solutions, but can help with discipline-specific metadata standards for your data, with securing sensitive data in collaboration with [U]Tech, and with writing a data management plan for a grant.
In addition to consultation services, we have designed our RDS service to fit the needs of many data management plans. RDS is appropriate for critical data that requires more protection than a regular hard drive or consumer RAID storage device can offer. Data is stored on an enterprise-class system, with active monitoring, that can withstand multiple drive failures. The filesystem and RAID volumes are scanned regularly to detect and proactively fix issues. Data is replicated nightly to a secondary system. Snapshots are included for file recovery purposes within a seven-day window.
Although the system is highly reliable and redundant, Research Computing recommends creating a third copy of your data if it is determined to be non-reproducible and highly valued. In addition to the RDS service, Research Computing and [U]Tech have other services, such as backup and archival, that can provide additional protection as needed.
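For labs that choose to keep a third copy themselves, the following is a minimal Python sketch of one way to copy a directory and confirm that every file arrived intact by comparing SHA-256 checksums. It is an illustration only, not a Research Computing tool: the function names (sha256_of, copy_and_verify) and the example paths are hypothetical, and the sketch assumes Python 3.8 or later and a destination you control.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> List[str]:
    """Copy a directory tree, then list relative paths whose checksums differ.

    An empty list means every file in the source has a matching copy.
    """
    # dirs_exist_ok lets the copy refresh an existing destination (Python 3.8+).
    shutil.copytree(source, destination, dirs_exist_ok=True)
    mismatched = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        dst_file = destination / relative
        if not dst_file.is_file() or sha256_of(src_file) != sha256_of(dst_file):
            mismatched.append(relative.as_posix())
    return mismatched
```

A call such as copy_and_verify(Path("/path/to/lab_data"), Path("/path/to/third_copy")), with both paths replaced by your own, returns an empty list when the copy matches the source. Storing the checksums alongside the copy would also let you recheck the data later, which this sketch does not do.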
Research Computing provides a full catalog of services and support options available to the research community. Please go to http://www.case.edu/utech/research-computing for more information.
HPC Bootcamp - Tuesday, Sep 13, 9-5 at Toepfer Room, Adelbert Hall
The first session in our HPC training series will kick off in September. Topics include an introduction to HPC at CWRU, submitting jobs, using MATLAB on RedCat, and much more!
Cleveland Hadoop Users Group - Monday, Sept 12, 5 PM, Cleveland Museum of Natural History
The Cleveland Big Data and Hadoop User Group is proud to announce a Mega-CHUG for September 12, 2016 at the Cleveland Museum of Natural History! Dr. Roger French, from the Case School of Engineering, will be presenting on data processing for solar farms, along with Dr. Evalyn Gates, Director of the Cleveland Museum of Natural History. Dr. Gates wrote the 2009 book "Einstein’s Telescope: The Hunt for Dark Matter and Dark Energy in the Universe." Did we mention dinosaurs?
OSC Workshop - Tuesday, Aug 9, 10-1 at Toepfer Room, Adelbert Hall
This three-hour workshop will provide an introduction to OSC resources and how to access them. Topics include:
Hardware and software available at OSC
Getting allocations and accounts
How to work with the systems available at OSC
For more information and to RSVP click here.