The first objective of the JISC-supported Sonex initiative was to identify and analyse deposit opportunities (use cases) for ingest of research papers (and potentially other scholarly work) into repositories. Later on, the project scope widened to include the identification and dissemination of various projects being developed at institutions in relation to the deposit use cases previously analysed. Finally, Sonex was recently asked to extend its analysis of deposit opportunities to research data.

Saturday, 18 December 2010

Sonex at the "Digital Library Research and Open Access: Interoperability Strategies" workshop

After delivering its paper "Handling repository-related interoperability issues" last September at the 2nd workshop in Glasgow, Sonex will be contributing a presentation at the forthcoming "Digital Library Research and Open Access: Interoperability Strategies" one-day event, to be held at the British Academy in London next 4 February.

The Sonex contribution will be part of this workshop, which deals with digital libraries, Open Access repositories and interoperability among them. The conference programme, already available, includes presentations on the DL.org reference model, a policy and quality interoperability survey, the degree of progress of Open Access repositories with regard to interoperability issues in the UK and Europe, and research data library management, among others.

Sunday, 12 December 2010

A preliminary list of discipline-specific projects on research data management

What follows is a preliminary list of discipline-specific projects and initiatives dealing with research data management that are currently running (as of Dec 2010). The list is not comprehensive; it is a sample of ongoing projects, brought together in order to identify potential biases by area in current research data management projects. Should any relevant projects be missing, we’d appreciate a notification so that we can include them as well.

[projects/initiatives listed in alphabetical order]

Project name: ACRID: Advanced Climate Research Infrastructure for Data
Institution/Funder/Manager: U East Anglia, STFC, Met Office, JISC
Project Description: The ACRID Project aims to develop an approach to publishing climate research data in a way that facilitates citing, re-use and the provision of full provenance information for processed data.
Area/Discipline: Climate Science

Project name: ADMIRAL
Institution/Funder/Manager: U Oxford, JISC
Project Description: A data management infrastructure for research across the life sciences
Area/Discipline: Life Sciences

Service/Project name: ADS: Archaeology Data Service
Institution/Funder/Manager: U York, AHRC, JISC, EU (mandated repository for AHRC, NERC)
Service/Project Description: The Archaeology Data Service supports research, learning and teaching with high quality and dependable digital resources. It does this by preserving digital data in the long term, and by promoting and disseminating a broad range of data in archaeology. The ADS promotes good practice in the use of digital data in archaeology, provides technical advice to the research community, and supports the deployment of digital technologies.
ADS is actively engaged with research projects working with partners in all sectors of UK archaeology.
Area/Discipline: Archaeology

Project name: Global Argo Data Repository
Institution/Funder/Manager: NOAA, NODC (National Oceanographic Data Center), GODAE (Global Ocean Data Assimilation Experiment), IFREMER (Institute for Research and Exploitation of the Sea)
Project Description: In the year 2000, a global array of approximately 3,000 free-drifting profiling floats, known as the Argo Ocean Profiling Network, was planned as a major component of the ocean observing system. Argo originated from the need to make climate predictions on both short and long time scales and has led to international participation and collaboration to ensure global coverage.
Centers to handle the data collected by profiling floats have been established in a number of countries. These centers normally handle data from their nationally deployed floats, but sometimes provide that service to other countries or organizations. All Argo data will be publicly available in near real-time via the GTS (Global Telecommunications System) and in scientifically quality-controlled form with a few months' delay.
Area/Discipline: Marine Sciences, Oceanography

Project name: BlueObelisk
Institution/Funder/Manager: Group of chemists/programmers/informaticians
Project Description: The Blue Obelisk Data Repository lists many important chemoinformatics data, such as element and isotope properties, atomic radii, etc., including references to the original literature.
Area/Discipline: Chemoinformatics

Project name: BRIL: Biophysical Repositories in the Lab
Institution/Funder/Manager: CeRch-KCL, JISC
Project Description: The BRIL project aims to enhance the repository facilities at the Randall Division of Cell and Molecular Biophysics at King’s College London by:
- Embedding the repository within the researchers’ day-to-day research and experimental practices
- Allowing data and metadata to be captured in an automated fashion
- Allowing the structure of experimental processes as a whole to be captured, modelled and stored within the repository
- Enhancing browse and access facilities and data exchange facilities to increase interoperability.
Area/Discipline: Biophysics

Project name: CAiRO: Curating Artistic Research Output
Institution/Funder/Manager: U Bristol, DCC, JISC
Project Description: Research data created by the UK’s performance and visual arts departments is often rich, technically complex and amazingly varied in nature. This work may include interconnected multimedia records of a single live event or software which exhibits complex behaviours dependent upon the choices made by a viewer. The CAiRO project, funded as part of the wider JISC Managing Research Data programme, aims to offer data management skills tailored to the special requirements of the arts researcher-practitioner.
Area/Discipline: Creative Arts

Project name: The CEACS Data Library
Institution/Funder/Manager: CEACS Library, Center for Advanced Study in the Social Sciences (CEACS), Instituto Juan March, Madrid, Spain
Project Description: The CEACS Data Library provides support to its research community in conducting quantitative research with primary and secondary data. The Data Library has a collection of over 2,000 secondary research datasets from major data centres. The service supports research data management through a thematic website, one-to-one support and a Dataverse data repository to help with the management, sharing and preservation of the data produced by researchers.
Area/Discipline: Social Sciences

Project name: Data Conservancy: A New Vision for Data-Driven Science
Institution/Funder/Manager: National Science Foundation (NSF), Johns Hopkins University (Lead institution)
Project Description: The Data Conservancy (DC) embraces a shared vision: scientific data curation is a means to collect, organize, validate and preserve data so that scientists can find new ways to address the grand research challenges that face society.
Area/Discipline: Astronomy, Earth Sciences, Life Sciences and Social Sciences

Project name: DataONE
Institution/Funder/Manager: National Science Foundation (NSF)
Project Description: DataONE was conceived to ensure the preservation of, and access to, multi-scale, multi-discipline, and multi-national data about life on earth and the environment that sustains this life. It was recognized from the outset that such data are often difficult to discover, access, integrate and analyze.
Area/Discipline: Earth & Life Sciences

Project name: DataTrain
Institution/Funder/Manager: U Cambridge, ADS, DCC, JISC
Project Description: The DataTrain project aims to build on findings and tools developed in the Incremental project (JISC 07/09 funding strand), to design discipline-focused data-management training modules for post-graduate courses in Archaeology and Social Anthropology at the University of Cambridge.
Area/Discipline: Archaeology, Social Anthropology

Project name: DATUM for Health: Research data management training for health studies
Institution/Funder/Manager: Northumbria U, DCC, JISC
Project Description: This collaborative project seeks to promote research data management skills of postgraduate research students in the health studies discipline through a specially-developed training programme which focuses on qualitative, unstructured research data.
Area/Discipline: Health Sciences

Project name: DMBI: Data Management in Bio-Imaging
Institution/Funder/Manager: The John Innes Centre (BBSRC), Norwich BioScience Institutes, JISC
Project Description: DMBI aims to raise the level of data management/handling for high-throughput bio-imaging, and strengthen the interactions between image data silos, both internally and with partner organisations.
Area/Discipline: Biology/Bio-imaging

Project name: DMP-ESRC: Data management planning for ESRC research data-rich investments
Institution/Funder/Manager: UK Data Archive (UKDA), Economic and Social Research Council (ESRC), Joint Information Systems Committee (JISC)
Project Description: The Data Management Planning (DMP) project aims to increase the data management and sharing capability within the social sciences community.
Area/Discipline: Social Sciences

Project name: DMTpsych: Data Management Training for psychologists
Institution/Funder/Manager: U York, U Sheffield, Sheffield Hallam U, DCC, JISC
Project Description: The aim of DMTpsych is to build capacity and skills within psychology postgraduates relating to research data management. The project builds upon existing research data management materials developed by the Digital Curation Centre (DCC) to create discipline-focused postgraduate training materials that can be embedded into postgraduate research training for the psychological sciences.
Area/Discipline: Psychology

Project name: DRYAD UK
Institution/Funder/Manager: British Library, University of Oxford, JISC
Project Description: Dryad is an international repository of data underlying peer-reviewed articles in the basic and applied biosciences as published by a Consortium of Journals. Dryad UK aims to expand Dryad into the UK by establishing a UK mirror site and extending service to new publishers and disciplines.
Area/Discipline: Biomedical Sciences

Project name: EDgrid Central: Data Repository System for 3-D Full-Scale Earthquake Testing Facility
Institution/Funder/Manager: National Institute for Advanced Industrial Science and Technology, Japan
Project Description: EDgrid Central is a data repository system designed for storing the huge amounts of experimental data produced using a 3-D full-scale earthquake testing facility. In the backend, EDgrid Central provides large storage capacity and implements a data model for the shake tests. The frontend is a portal through which users retrieve the stored data by metadata search and bulk download. The system builds on NEEScentral, developed by the NEES project in the United States, and enhances its search and download functionalities according to the EDgrid users' requirements. EDgrid Central gives facility sites a permanent repository for their shaking table experiments, and it also enables civil engineering researchers to share their data and reports in their daily activities.
Area/Discipline: Geophysics

Project name: EIDCSR: Embedding Institutional Data Curation Services in Research
Institution/Funder/Manager: U Oxford, JISC
Project Description: The Embedding Institutional Data Curation Services in Research (EIDCSR) project aims to address the data management and curation requirements of three collaborating research groups in Oxford, by scoping their requirements and embedding selected elements of the digital curation lifecycle, including policy, workflow, and sustainability solutions, within the research process. The workflows generated by the project are intended to scale to include other research domains, and the outputs should be of use to other research-intensive institutions. The project runs until December 2010.
Area/Discipline: Medical & Life Sciences

Project name: ERIM: Engineering Research Information Management
Institution/Funder/Manager: U Bath, UKOLN, JISC
Project Description: ERIM aims to specify in practical terms how effective data management can be enabled and supported in research projects, in particular to support reuse or, more broadly, what can be thought of as 're-purposing'. The project will look primarily at the engineering research domain.
Area/Discipline: Engineering

Project name: EURO VO: European Virtual Observatory
Institution/Funder/Manager: CNRS, ESO, INAF, U Edinburgh
Project Description: The Virtual Observatory (VO) is an international astronomical community-based initiative. It aims to allow global electronic access to the available astronomical data archives of space and ground-based observatories and other sky survey databases. It also aims to enable data analysis techniques through a coordinating entity that will provide common standards, wide-network bandwidth, and state-of-the-art analysis tools. The EURO-VO project aims at deploying an operational VO in Europe. Its objectives are to support the use of VO tools and services by the scientific community, technology take-up and VO-compliant resource provision, and the building of the technical infrastructure.
Area/Discipline: Astronomy

Project name: FISHnet
Institution/Funder/Manager: Centre for e-Research, King’s College London, JISC
Project Description: Freshwater information sharing network
Area/Discipline: Freshwater Biology

Project name: HALOGEN - History Archaeology Linguistics Onomastics and GENetics
Institution/Funder/Manager: U Leicester, JISC
Project Description: The cross-disciplinary Roots of the British collaboration between scholars in humanities and genetics seeks to interrogate the evidence for the migration and/or continuity of human populations in the British Isles in the distant past. The HALOGEN project will support the data management needs of the researchers involved and thus establish organisational best practice in terms of data management planning and the support of diverse cross-disciplinary research data.
Area/Discipline: Ancient history/Genetics

Project name: I2S2
Institution/Funder/Manager: UKOLN/DCC/Soton/STFC, JISC
Project Description: Infrastructure for integration in structural sciences
Area/Discipline: Chemistry (with a view towards inter-disciplinary application)

Project name: Incremental: A step by step approach to informing, improving, & increasing research data curation practice
Institution/Funder/Manager: Cambridge University Library, Humanities Advanced Technology and Information Institute (HATII) at U Glasgow, DCC, JISC
Project Description: The aim of Incremental is to inform, improve and increase research data curation within UK HEIs, by providing exemplars and resources for others to use. Specific objectives are: (1) to investigate current practices and requirements at each institution; (2) to develop a plan for addressing these requirements; (3) to pilot tools and services at each HEI and then make further adjustments and recommendations; (4) embed the work within each institution; and (5) to deliver resources and findings to the DCC, DPC and JISC for wider dissemination. In addition to resources, the project will seek to provide information about their cost and sustainability.
Area/Discipline: Archaeology, Chemistry, English, Engineering and Medicine

Project name: IODP: Integrated Ocean Drilling Program
Institution/Funder/Manager: National Science Foundation (NSF), Japan’s Ministry of Education, Culture, Sports, Science and Technology (MEXT)
Project Description: IODP is an international marine research program that explores Earth's history and structure recorded in seafloor sediments and rocks, and monitors subseafloor environments. IODP builds upon the earlier successes of the Deep Sea Drilling Project (DSDP) and Ocean Drilling Program (ODP), which revolutionized our view of Earth history and global processes through ocean basin exploration.
The IODP oversees repositories around the world. Samples are distributed according to ODP and IODP policies.
Area/Discipline: Marine Sciences

Project name: MaDaM
Institution/Funder/Manager: Manchester eResearch Centre, JISC
Project Description: Pilot data management infrastructure for biomedical researchers
Area/Discipline: Biomedical Sciences

Project name: Managing Research Data: Gravitational Waves (MRD-GW)
Institution/Funder/Manager: STFC, University of Glasgow, JISC
Project Description: MRD-GW aims to examine the way in which Big Science data is managed, and produce recommendations as appropriate. Gravitational Wave (GW) data generated by the LIGO Scientific Collaboration (LSC) will be used as a case study.
Area/Discipline: Particle physics/Astronomy

Project name: PANGAEA
Institution/Funder/Manager: Alfred Wegener Institute for Polar and Marine Research (AWI), DFG
Project Description: Publishing Network for Geoscientific & Environmental Data
Area/Discipline: Earth Sciences

Project name: PEG-BOARD
Institution/Funder/Manager: School of Geographical Sciences, University of Bristol, JISC
Project Description: Palaeoclimate and environment data generation - building open access to research data
Area/Discipline: Palaeoclimatology

Project name: Quixote
Institution/Funder/Manager: U Cambridge/CSIC
Project Description: The main objective/vision of the Quixote project is to design, test and deploy a modular, open source system of tools that allows computational chemistry data (now sitting in the darkness of individual hard-disks) to be organized, shared, and queried.
Area/Discipline: Quantum Chemistry

Project name: Research Data MANTRA
Institution/Funder/Manager: U Edinburgh/JISC
Project Description: Aims to develop open, online learning materials which reflect best practice in research data management grounded in three disciplinary contexts: social science, clinical psychology, and geoscience. The resulting materials will be embedded in three participating postgraduate programmes and made available through the Transkills programme for use by all postgraduate and early career researchers as well as made available generally through an open license. In addition to web-based 'chapters' that students can work through at their own pace, the course will include video interviews with leading academics about data management challenges, and practical exercises in handling data in four software analysis environments: SPSS, NVivo, R and ArcGIS.
Area/Discipline: Social and political science, Geoscience, Clinical psychology

Project name: SageCite: Citing network models of disease and associated data
Institution/Funder/Manager: UKOLN, U Manchester, British Library, JISC
Project Description: SageCite will develop and test a Citation Framework linking data, methods and publications. The domain of bio-informatics provides a case study, and the project builds on existing infrastructure and tools. Citations of complex network models of disease and associated data will be embedded in leading publications, exploring issues around the citation of data including the compound nature of datasets, description standards and identifiers.
Area/Discipline: Bioinformatics

Project name: ShareGeo Open
Institution/Funder/Manager: EDINA, JISC
Project Description: ShareGeo Open is a spatial data repository that promotes data sharing between creators and users of geospatial data
Area/Discipline: Geography

Project name: SPQR: supporting productive queries for research
Institution/Funder/Manager: KCL, U Edinburgh, Humboldt U Berlin, JISC
Project Description: The overall aim is to investigate the potential of linked data for integrating datasets related to classical antiquity, addressing in particular the specific challenges raised by our material – its incompleteness, uncertainty and fuzziness. We will achieve this by developing mechanisms for breaking data out of silos and exposing it as linked data, using standard ontologies, and in particular the Europeana Data Model, as the semantic “glue” for linking data into a wider network of knowledge. The ultimate objective will be to create a common corpus or “RDF warehouse” of linked Classics data that can be explored, searched and enhanced by further annotations.
Area/Discipline: Classics, Epigraphy and Archaeology

Project name: SUDAMIH
Institution/Funder/Manager: University of Oxford, JISC
Project Description: Supporting data management infrastructure for the Humanities
Area/Discipline: Humanities

Project name: TARDIS
Institution/Funder/Manager: Monash University, Australian National Data Service (ANDS), University of Sydney and some other Australian institutions
Project Description: TARDIS is a multi-institutional collaborative venture that aims to facilitate the archiving and sharing of raw X-ray diffraction images (collectively known as a 'dataset') from the protein crystallography community.
Area/Discipline: Crystallography

Project name: VAMDC Project: Virtual Atomic and Molecular Data Centre
Project Description: VAMDC aims at building an interoperable e-Infrastructure for the exchange of atomic and molecular data. It embraces, on the one hand, scientists from a wide spectrum of disciplines in atomic and molecular (AM) Physics with a strong coupling to the users of their AM data (astrochemistry, atmospheric physics, plasmas) and, on the other hand, scientists and engineers from the ICT community who are used to dealing with the deployment of interoperable e-infrastructure.
Area/Discipline: Astrophysics

Project name: WissGrid: Grid for Science
Institution/Funder/Manager: DFG, U Göttingen, Astrophysikalisches Institut (AIP), Alfred-Wegener-Institut (AWI), Deutsches Elektronen Synchrotron (DESY), Deutsches Klimarechenzentrum GmbH (DKRZ), Konrad-Zuse-Zentrum für Informationstechnik (ZIB), Universitätsmedizin Göttingen (UMG), Niedersächsische Staats- und Universitätsbibliothek (SUB), Technische U Dortmund (UDO), U Heidelberg, U Trier, U Wuppertal
Project Description: WissGrid’s objective is to establish long-term organisational and technical D-Grid structures for the academic world. WissGrid combines the heterogeneous needs from a variety of scientific disciplines and develops concepts for the long-term sustainable use of the organisational and technical grid infrastructure. In this context, the project aims to strengthen the organisational cooperation of scientists in the grid and to lower the entry barriers for new community grids.
Area/Discipline: Astrophysics, High Energy Physics, Climate Research, Medicine

Project name: XYZ Project
Institution/Funder/Manager: U Cambridge/IUCr/BioMed Central/Open Knowledge Foundation, JISC
Project Description: The XYZ Project will create a demonstrator of a new workflow for publishing data in support of full-text. The author prepares data for publication (if possible with validation) in a third-party trusted repository before the paper is submitted to a publisher. Our software will manage the deposition, release to reviewers and dis-embargo, whether for conventional publication or as a data journal.
Area/Discipline: Crystallography

Besides this preliminary set of currently running discipline-specific projects on research data (shortly to be enriched by Sonex with a complementary list of general-purpose projects dealing with research data management), a thorough list of open data repositories for all areas may be found in the data repository section of the Open Access Directory (OAD).
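
As a rough illustration of the area-by-area comparison this list is meant to support, the Area/Discipline fields can be tallied with a few lines of Python. This is only a minimal sketch and not part of the Sonex work: the function name `count_areas` and the four-entry sample (copied from the list above) are our own assumptions, and multi-discipline fields are split on commas only, so an entry such as "Biology/Bio-imaging" would be counted as a single area.

```python
from collections import Counter

# Small sample copied from the list above; in practice the whole
# plain-text list would be pasted or read from a file instead.
SAMPLE = """\
Project name: ADMIRAL
Area/Discipline: Life Sciences

Project name: PANGAEA
Area/Discipline: Earth Sciences

Project name: DataTrain
Area/Discipline: Archaeology, Social Anthropology

Service/Project name: ADS: Archaeology Data Service
Area/Discipline: Archaeology
"""


def count_areas(text):
    """Count how many entries mention each area (hypothetical helper)."""
    counts = Counter()
    for line in text.splitlines():
        # Split at the first colon only, so values containing colons survive.
        label, sep, value = line.partition(":")
        if sep and label.strip() == "Area/Discipline":
            # Assumption: disciplines within one field are comma-separated.
            for area in value.split(","):
                area = area.strip()
                if area:
                    counts[area] += 1
    return counts


area_counts = count_areas(SAMPLE)
for area, n in area_counts.most_common():
    print(f"{n:2d}  {area}")
```

On this sample, Archaeology is counted twice and every other area once. A tally over the full list would still need a decision on how to treat fields joined with "/", "&" or "and".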

Saturday, 11 December 2010

Hrynaszkiewicz & Cockerill "In defence of supplemental data files"

A valuable text on Open Research Data was published last Friday, 10 December, by Iain Hrynaszkiewicz and Matt Cockerill on the BioMed Central blog. International initiatives for data sharing such as DataCite are mentioned in the article, along with plenty of other interesting references.

Another article on Open Data by Iain Hrynaszkiewicz was recently released in BMC Research Notes: "A call for BMC Research Notes contributions promoting best practice in data standardization, sharing and publication".