Data Repositories List

When you're ready to deposit your data for sharing and archiving, look into whether there is already a repository in your field where the most likely users of your data would look. The following sites maintain lists of many repositories that accept research data:

The following is a selected list of data repositories available through other institutions. If you know of any other data repositories that should be included, please send the details to the ITS Service Desk. CWRU is not responsible for any of the content of the sites listed here.

Long Term Ecological Research Network

The Long Term Ecological Research (LTER) Network is a collaborative effort involving more than 1800 scientists and students investigating ecological processes over long temporal and broad spatial scales. The Network promotes synthesis and comparative research across sites and ecosystems and among other related national and international research programs.

American Mineralogist Crystal Structure Database

This site is an interface to a crystal structure database that includes every structure published in the American Mineralogist, The Canadian Mineralogist, European Journal of Mineralogy and Physics and Chemistry of Minerals, as well as selected datasets from other journals.

Arts and Humanities Data Service

The Arts and Humanities Data Service (AHDS) is a UK national service aiding the discovery, creation and preservation of digital resources in the arts and humanities. Its collection covers history, archaeology, literature, languages and linguistics, and the visual and performing arts. Funding for the AHDS ceased in 2008; however, links to its partner sites are still active.

Crystallography Open Database

The COD, once finalized, will be a keyword-searchable Web server of crystal structure atomic coordinates, preserving published data as well as unpublished data.

Digital Library for Earth System Education

DLESE is a distributed community effort involving educators, students, and scientists working together to improve the quality, quantity, and efficiency of teaching and learning about the Earth system. In pursuing this mission DLESE provides access to Earth data sets and imagery, including the tools and interfaces that enable their effective use in educational settings.

e-Depot Nederlandse Archeologie

An archive of digital data on archaeological research from the Netherlands.

e-Crystals - Crystal Structure Report Archive

eCrystals - Southampton is the archive for Crystal Structures generated by the Southampton Chemical Crystallography Group and the EPSRC UK National Crystallography Service.

Oak Ridge National Laboratory Distributed Active Archive Center

The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) is a NASA-sponsored source for biogeochemical and ecological data and models useful in environmental research. All of our data sets and model products are free of any costs to you (including shipping).

Inter-University Consortium for Political and Social Research

The Inter-university Consortium for Political and Social Research is an organization of member institutions working together to acquire and preserve social science data, provide open and equitable access to these data, and promote effective data use.

International Food Policy Research Institute

IFPRI provides the following types of agriculture and socio-economic datasets: Geospatial Data, Household and Community-level Surveys, Institution-level Surveys, Regional Data, and Social Accounting Matrices.

National Digital Archive of Datasets

The National Digital Archive of Datasets (NDAD) preserves and provides online access to archived digital datasets and documents from UK central government departments on a wide range of subjects.

National Geoscience Data Repository System

The NGDRS is a system of geoscience data repositories, providing information about their respective holdings accessible through a web-based super catalog.

University Corporation for Atmospheric Research

Climate and atmospheric data from UCAR and other participating institutions.

Publishing Network for Geoscientific & Environmental Data

PANGAEA is a public digital library for science aimed at archiving, publishing and distributing geo-referenced data with special emphasis on environmental, marine and geological basic research.

RRUFF Project

The RRUFF Project is an integrated database of Raman spectra, X-ray diffraction and chemistry data for minerals, with the goal of creating a complete set of high quality spectral data from well characterized minerals.

Scripps Institution of Oceanography Explorer

Data, documents and images from 822 expeditions by the Scripps Institution of Oceanography (SIO) since 1903.

Strasbourg Astronomical Data Center

The CDS is a data center dedicated to the collection and worldwide distribution of astronomical data and related information.

Data Archiving and Networked Services

DANS is responsible for providing permanent access to research material from the humanities and social sciences. The present DANS collection contains the datasets of the Netherlands Historical Data Archive (NHDA), the Steinmetz Archive and the Scientific Statistical Agency (WSA).

British Atmospheric Data Centre

The BADC is the Natural Environment Research Council's (NERC) Designated Data Centre for the Atmospheric Sciences.

National Geoscience Data Centre

A comprehensive collection of information about the subsurface of any given area in Great Britain. The NGDC comprises data gathered or generated by the British Geological Survey in addition to data provided by external organizations.

NERC Earth Observation Data Centre

The NEODC is tasked with the acquisition, archiving and provision of access to remotely sensed data of the surface of the Earth acquired by satellite and airborne sensors.

British Oceanographic Data Centre

BODC holds a wealth of publicly accessible marine data collected using a variety of instruments and samplers and collated from many sources. It handles biological, chemical, physical and geophysical data, and its databanks contain measurements of nearly 10,000 different oceanographic variables.

Antarctic Environmental Data Centre

The AEDC coordinates the management of data collected by UK funded scientists in Antarctica and the Southern Ocean.

United Kingdom Data Archive

The UK Data Archive (UKDA) is a centre of expertise in data acquisition, preservation, dissemination and promotion and is curator of the largest collection of digital data in the social sciences and humanities in the UK.

Centre for Ecology & Hydrology

CEH is a major custodian of environmental data for the UK. We have significant capabilities in data collation and management, and information systems development. We use these skills, together with our data archives, to support large-scale, long-term environmental research.


Ensembl

Ensembl is a joint project between EMBL-EBI and the Sanger Institute to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes.

United States National Virtual Observatory

NVO's objective is to enable new science by greatly enhancing access to data and computing resources. NVO makes it easy to locate, retrieve, and analyze data from archives and catalogs worldwide.

RCSB Protein Data Bank

The Protein Data Bank (PDB) is the single worldwide depository of information about the three-dimensional structures of large biological molecules, including proteins and nucleic acids. These are the molecules of life that are found in all organisms including bacteria, yeast, plants, flies, and mice, and in healthy as well as diseased humans.

National Center for Biotechnology Information

Established as a national resource for molecular biology information, NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information.

European Molecular Biology Laboratory - European Bioinformatics Institute

The EBI is a centre for research and services in bioinformatics. The Institute manages databases of biological data including nucleic acid, protein sequences and macromolecular structures.

Maize Genetics and Genomics Database

MaizeGDB is the community database for biological information about the crop plant Zea mays ssp. mays. Genetic, genomic, sequence, gene product, functional characterization, literature reference, and person/organization contact information are among the data types accessible through this site.

Scholars Digital Library of Analytics

The Scholars Digital Library of Analytics prides itself as an intact repository of data sets for use in research, education, and reference. Included with each set of data is a description of what the data was initially used for, its subject area, and its number of rows and columns.

National Nuclear Data Center

The NNDC collects, evaluates, and disseminates nuclear physics data for basic nuclear research and for applied nuclear technologies. The NNDC is a worldwide resource for nuclear data.

Veterinary Medical Database

The VMDB compiles patient encounter data from nearly all North American veterinary medical colleges. It also lists related databases from the Canine Eye Registration Foundation, Health Information Managers, and the Equine Eye Registration Foundation, as well as a registry of dogs that have passed DNA tests for various genetic disorders.

Geosciences Network

The GEON project is a collaboration among a dozen PI institutions and a number of other partner projects, institutions, and agencies to develop cyberinfrastructure in support of an environment for integrative geoscience research.

Incorporated Research Institutions for Seismology

IRIS is a university research consortium dedicated to exploring the Earth's interior through the collection and distribution of seismographic data. Its collection includes waveform data, channel response data, and event (earthquake) catalogs.

Southern California Earthquake Center

SCEC's mission is to gather data on earthquakes in Southern California and elsewhere, integrate that information into a comprehensive, physics-based understanding of earthquake phenomena, and communicate that understanding to society at large as useful knowledge for reducing earthquake risk.


UNAVCO

The UNAVCO Facility exists to support research investigators in their use of Global Positioning System technology for Earth sciences research. The Facility performs this task in part by archiving GPS data and data products for current and future applications.

Biomedical Informatics Research Network Data Repository

To further promote a collaborative research environment, the BIRN has undertaken the development of the public BIRN Data Repository (BDR) for the biomedical research community. The BDR will give researchers a venue to share and exchange their data with the broader biomedical research community, along with the means to capture, curate, store, query, view, and download imaging and related data.

National Center for Atmospheric Research

Data sets include information collected from research facilities and tools, as well as information from climate and weather models created and compiled by NCAR scientists and those in our science community.

Encyclopedia of DNA Elements

The National Human Genome Research Institute launched ENCODE to carry out a project to identify all functional elements in the human genome sequence. The project is being conducted in three phases: a pilot project phase, a technology development phase and a planned production phase.

The Arabidopsis Information Resource

The Arabidopsis Information Resource collects information and maintains a database of genetic and molecular biology data for Arabidopsis thaliana, a widely used model plant.

Alaska Satellite Facility - Synthetic Aperture Radar Distributed Active Archive Center

The Alaska Satellite Facility downlinks, processes, archives, and distributes SAR data from the European Space Agency's ERS-1 and ERS-2 satellites, NASDA's JERS-1 satellite, and the Canadian Space Agency's RADARSAT-1 satellite.

Goddard Earth Sciences Data and Information Services Center

The GES DISC is the home (archive) of precipitation and atmospheric chemistry and dynamics data and information. It is one of eight NASA Science Mission Directorate DAACs that offer Earth science data, information, and services to research scientists, applications scientists, applications users, and students.

Global Hydrology Resource Center

The GHRC provides both historical and current Earth science data, information, and products from satellite, airborne, and surface-based instruments. The GHRC acquires basic data streams and produces derived products from many instruments spread across a variety of instrument platforms.

National Oceanographic Data Center

NODC maintains and updates a national ocean archive with environmental data acquired from domestic and foreign activities and produces products and research from these data which help monitor global environmental changes. These data include physical, biological and chemical measurements derived from in situ oceanographic observations, satellite remote sensing of the oceans, and ocean model simulations.

Universal Protein Resource

The UniProt consortium aims to support biological research by maintaining a high quality database that serves as a stable, comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and querying interfaces freely accessible to the scientific community.

Atmospheric Radiation Measurement Climate Research Facility Data Archive

The ARM Archive supports the scientific field experiments of the Atmospheric Radiation Measurement (ARM) Program by storing and distributing the large quantities of data collected from these experiments. These data are used to research atmospheric radiation balance and cloud feedback processes, which are critical to the understanding of global climate change.

National Space Science Data Center

The National Space Science Data Center serves as the permanent archive for NASA space science mission data. "Space science" means astronomy and astrophysics, solar and space plasma physics, and planetary and lunar science.

Harvard M.I.T. Data Center

HMDC is the principal distributor of quantitative social science data from major international data consortia for Harvard and MIT.

Purdue Ionomics Information Management System

PiiMS provides integrated workflow control, data storage, and analysis to facilitate high-throughput data acquisition, along with integrated tools for data search, retrieval, and visualization for hypothesis development. PiiMS currently contains data on shoot concentrations of P, Ca, K, Mg, Cu, Fe, Zn, Mn, Co, Ni, B, Se, Mo, Na, As, and Cd in over 60,000 shoot tissue samples of Arabidopsis (Arabidopsis thaliana), including ethyl methanesulfonate, fast-neutron and defined T-DNA mutants, and natural accession and populations of recombinant inbred lines from over 800 separate experiments, representing over 1,000,000 fully quantitative elemental concentrations.

National Snow and Ice Data Center

NSIDC supports "research into our world's frozen realms: the snow, ice, glacier, frozen ground, and climate interactions that make up Earth's cryosphere. Scientific data, whether taken in the field or relayed from satellites orbiting Earth, form the foundation for the scientific research that informs the world about our planet and our climate systems."


Dryad

Dryad is an international repository of data underlying peer-reviewed articles in the basic and applied biosciences, including ecology, biology, and medicine. It comes from the National Evolutionary Synthesis Center (NESCent) and the University of North Carolina Metadata Research Center, in coordination with a large group of journals and societies.