Data repositories
From OAD
- This is a list of repositories and databases for open data.
- Please annotate the entries to indicate the hosting organization, scope, licensing, and usage restrictions (if any). If a repository is open in some respects but not others, please include it with an annotation rather than exclude it.
- Related lists in OAD: Disciplinary repositories (primarily for texts, not data).
Archaeology
- Also see Social sciences.
- Open Context. From the Alexandria Archive Institute.
Biology
- Also see Entrez databases, listed under Multidisciplinary repositories.
- BOND (Biomolecular Object Network Databank). From Unleashed Informatics.
- Databases at EBI. From the European Bioinformatics Institute (EBI). This is a web directory of the EBI databases. Also see the FTP interface.
- Dryad. For data in evolutionary biology and related fields. From NESCent and the UNC Metadata Research Center.
- Molecular Biology Databases. From Shirley Fung. A list of 34 databases with annotations to show their openness under six criteria. Also see her list of 7 databases which comply with the Science Commons Open Access Data Protocol.
- PaleoBiology Database. "We are bringing together taxonomic and distributional information about the entire fossil record of plants and animals." From a large number of researchers at a large number of institutions.
- RCSB Protein Data Bank. From the Research Collaboratory for Structural Bioinformatics (RCSB).
Chemistry
- Also see Entrez databases, listed under Multidisciplinary repositories.
- ChemSpider. Hosted by ChemZoo.
- ChemStar. Maintained by India's National Chemical Laboratory and sponsored by India's Department for Scientific & Industrial Research.
- ChemSynthesis. A database of chemicals and their physical properties.
- ChemXSeer. Hosted by Pennsylvania State University.
- CrystalEye. From the Unilever Cambridge Centre for Molecular Informatics at the University of Cambridge.
- Crystallography Open Database. A joint project of the Mineralogical Society of America, Mineralogical Association of Canada, European Journal of Mineralogy, International Union of Crystallography, and the US National Science Foundation. Data are in the public domain.
- eCrystals. From the Southampton Chemical Crystallography Group and the EPSRC UK National Crystallography Service.
- NMRShiftDB. For "organic structures and their nuclear magnetic resonance (nmr) spectra." Distributed nodes from the EBI, University of Mainz and the Max Planck Institute for Chemical Ecology. Data are licensed under the GNU FDL.
- Open Notebook Science Solubility Challenge. Maintained by Jean-Claude Bradley, Rajarshi Guha, Andrew Lang and Cameron Neylon. A database of non-aqueous solubility measurements with links to lab notebook pages where experiments were recorded. The database can be searched via Web Query or alternate means.
- PubChem. From the U.S. National Center for Biotechnology Information of the National Institutes of Health (NIH).
- WorldWideMolecularMatrix. "An Open collection of information on small molecules." From the University of Cambridge.
- ZINC. "A free database of commercially-available compounds for virtual screening." From the Shoichet Laboratory in the Department of Pharmaceutical Chemistry at the University of California, San Francisco.
Environmental sciences
- Also see PANGAEA, listed under Multidisciplinary repositories.
- British Atmospheric Data Centre (BADC). From the Natural Environment Research Council (NERC). Many datasets are openly accessible but some are restricted.
- California Water CyberInfrastructure. Hydrology data on California's watersheds. From the Berkeley Water Center.
- National Ecological Observatory Network (NEON). A joint project of 50+ US universities and laboratories.
Geology
- GSA Data Repository. From the Geological Society of America.
Geosciences and geospatial data
- Also see PANGAEA, listed under Multidisciplinary repositories.
- Commons of Geographic Data. "This site is intended for any data in any format that can be referenced to location on the earth." From the University of Maine.
- GeoCommons. From FortiusOne.
- GeoNames. A database of placenames, under a CC-BY license. Founded by Marc Wick.
- ShareGeo. Integrating the older GRADE (Geospatial Repository for Academic Deposit and Extraction) repository. From EDINA.
Marine sciences
- Also see PANGAEA, listed under Multidisciplinary repositories.
- SeaDataNet. Funded by the EU and coordinated by Institut Français de Recherche pour l'Exploitation de la Mer (IFREMER).
Medicine
- Also see Entrez databases, listed under Multidisciplinary repositories.
- GenBank. From the U.S. National Center for Biotechnology Information of the National Institutes of Health.
- Gene Expression Omnibus. From the U.S. National Center for Biotechnology Information of the National Institutes of Health.
- Melanoma Molecular Map Project. On melanoma biology and treatment.
Multidisciplinary repositories
- Also see Social sciences.
- 3TU.Datacentre. A consortial data repository for Delft University of Technology, Eindhoven University of Technology and the University of Twente.
- Data Archiving and Networked Services. Dutch research data in the humanities and social sciences. From the Royal Netherlands Academy of Arts and Sciences (KNAW) and the Netherlands Organisation for Scientific Research (NWO).
- Edinburgh DataShare and Edinburgh University Data Library. Two repositories for data produced by research at the University of Edinburgh.
- Entrez databases. A directory of chemical, biochemical, biomedical, and medical databases from the U.S. National Center for Biotechnology Information of the National Institutes of Health.
- PANGAEA. "PANGAEA" stands for "Publishing Network for Geoscientific & Environmental Data". Hosted by the Alfred Wegener Institute for Polar and Marine Research and the University of Bremen's Center for Marine Environmental Sciences. Open to deposits from any scientist. Most datasets are open; some are restricted.
- Public Data Sets on AWS. From Amazon Web Services. The site already hosts OA datasets in biology, chemistry, and economics, and is willing to host them in any field.
- UPSpace. The University of Pretoria Research Repository, South Africa.
Physics
- Blue Obelisk Data Repository. Repository of isotope masses, under the MIT license. From the Blue Obelisk. Described in doi:10.1021/ci050400b.
- DOE Data Explorer. From the US Department of Energy (DOE). Data generated by DOE-sponsored research.
Social sciences
- Also see Multidisciplinary repositories.
- Australian Social Science Data Archive. From the Australian Demographic and Social Research Institute at the Australian National University.
- CESSDA Data Portal. From the Council of European Social Science Data Archives (CESSDA).
- Digital Repositories E-Science Network (DReSNeT). From the UK Engineering & Physical Sciences Research Council (EPSRC). A network of social science repositories for texts and data.
- Economic and Social Science Data Service. From the UK Data Archive (UKDA) and the Institute for Social and Economic Research (ISER) at the University of Essex, and from Manchester Information and Associated Services (MIMAS) and the Cathie Marsh Centre for Census and Survey Research (CCSR) at the University of Manchester. Access to data requires registration.

