After delivering its paper "Handling repository-related interoperability issues" last Sep at the 2nd DL.org workshop in Glasgow, Sonex will be contributing a presentation at the forthcoming DL.org "Digital Library Research and Open Access: Interoperability Strategies" one-day event to be held at the British Academy in London next Feb 4th.
The Sonex contribution will be part of this DL.org workshop dealing with digital libraries, Open Access repositories and interoperability among them. The conference programme, already available, includes presentations on the DL.org reference model, the DL.org policy and quality interoperability survey, the degree of progress of Open Access repositories with regard to interoperability issues in the UK and Europe, and research data library management, among others.
Saturday, 18 December 2010
Sunday, 12 December 2010
A preliminary list follows of discipline-specific projects and initiatives dealing with research data management that are currently running (as of Dec 2010). The list is not comprehensive; it is a sample of ongoing projects, brought together in order to identify potential biases by area in current research data management work. Should any relevant projects be missing, we'd appreciate a notification so that we can include them as well.
[projects/initiatives listed in alphabetical order]
Project name: ACRID: Advanced Climate Research Infrastructure for Data
Institution/Funder/Manager: U East Anglia, STFC, Met Office, JISC
Project Description: The ACRID Project aims to develop an approach to publishing climate research data in a way that facilitates citing, re-use and the provision of full provenance information for processed data.
Area/Discipline: Climate Science
Project name: ADMIRAL
Institution/Funder/Manager: U Oxford, JISC
Project Description: A data management infrastructure for research across the life sciences
Area/Discipline: Life Sciences
Service/Project name: ADS: Archaeology Data Service
Institution/Funder/Manager: U York, AHRC, JISC, EU (mandated repository for AHRC, NERC)
Service/Project Description: The Archaeology Data Service supports research, learning and teaching with high quality and dependable digital resources. It does this by preserving digital data in the long term, and by promoting and disseminating a broad range of data in archaeology. The ADS promotes good practice in the use of digital data in archaeology, it provides technical advice to the research community, and supports the deployment of digital technologies.
ADS is actively engaged with research projects working with partners in all sectors of UK archaeology.
Project name: Global Argo Data Repository
Institution/Funder/Manager: NOAA, NODC (National Oceanographic Data Center), GODAE (Global Ocean Data Assimilation Experiment), IFREMER (Institute for Research and Exploitation of the Sea)
Project Description: In the year 2000, a global array of approximately 3,000 free-drifting profiling floats, known as the Argo Ocean Profiling Network, was planned as a major component of the ocean observing system. Argo originated from the need to make climate predictions on both short and long time scales and has led to international participation and collaboration to ensure global coverage.
Centers to handle the data collected by profiling floats have been established in a number of countries. These centers normally handle data from their nationally deployed floats, but sometimes provide that service to other countries or organizations. All Argo data will be publicly available in near real-time via the GTS (Global Telecommunications System) and in scientifically quality-controlled form with a few months delay.
Area/Discipline: Marine Sciences, Oceanography
Project name: BlueObelisk
Institution/Funder/Manager: Group of chemists/programmers/informaticians
Project Description: The Blue Obelisk Data Repository lists many important chemoinformatics data, such as element and isotope properties and atomic radii, including references to the original literature.
Project name: BRIL: Biophysical Repositories in the Lab
Institution/Funder/Manager: CeRch-KCL, JISC
Project Description: The BRIL project aims to enhance the repository facilities at the Randall Division of Cell and Molecular Biophysics at King’s College London by:
- Embedding the repository within the researchers’ day-to-day research and experimental practices
- Allowing data and metadata to be captured in automated fashion
- Allowing the structure of experimental processes as a whole to be captured, modelled and stored within the repository
- Enhancing browse and access facilities and data exchange facilities to increase interoperability.
Project name: CAiRO: Curating Artistic Research Output
Institution/Funder/Manager: U Bristol, DCC, JISC
Project Description: Research data created by the UK’s performance and visual arts departments is often rich, technically complex and amazingly varied in nature. This work may include interconnected multimedia records of a single live event or software which exhibits complex behaviours dependent upon the choices made by a viewer. The CAiRO project, funded as part of the wider JISC Managing Research Data programme, aims to offer data management skills tailored to the special requirements of the arts researcher-practitioner.
Area/Discipline: Creative Arts
Project name: The CEACS Data Library
Institution/Funder/Manager: CEACS Library, Center for Advanced Study in the Social Sciences (CEACS), Instituto Juan March, Madrid, Spain
Project Description: The CEACS Data Library provides support to its research community in conducting quantitative research with primary and secondary data. The Data Library has a collection of over 2,000 secondary research datasets from major data centres. The service supports research data management through a thematic website, one to one support and a Dataverse data repository to help with the management, sharing and preservation of the data produced by researchers.
Area/Discipline: Social Sciences
Project name: Data Conservancy: A New Vision for Data-Driven Science
Institution/Funder/Manager: National Science Foundation (NSF), Johns Hopkins University (Lead institution)
Project Description: The Data Conservancy (DC) embraces a shared vision: scientific data curation is a means to collect, organize, validate and preserve data so that scientists can find new ways to address the grand research challenges that face society.
Area/Discipline: Astronomy, Earth Sciences, Life Sciences and Social Sciences
Project name: DataONE
Institution/Funder/Manager: National Science Foundation (NSF)
Project Description: DataONE was conceived to ensure preservation and access to multi-scale, multi-discipline, and multi-national data about life on earth and the environment that sustains this life. It was recognized from the outset that such data are often difficult to discover, access, integrate and analyze.
Area/Discipline: Earth & Life Sciences
Project name: DataTrain
Institution/Funder/Manager: U Cambridge, ADS, DCC, JISC
Project Description: The DataTrain project aims to build on findings and tools developed in the Incremental project (JISC 07/09 funding strand), to design discipline-focused data-management training modules for post-graduate courses in Archaeology and Social Anthropology at the University of Cambridge.
Area/Discipline: Archaeology, Social Anthropology
Project name: DATUM for Health: Research data management training for health studies
Institution/Funder/Manager: Northumbria U, DCC, JISC
Project Description: This collaborative project seeks to promote research data management skills of postgraduate research students in the health studies discipline through a specially-developed training programme which focuses on qualitative, unstructured research data.
Area/Discipline: Health Sciences
Project name: DMBI: Data Management in Bio-Imaging
Institution/Funder/Manager: The John Innes Centre (BBSRC), Norwich BioScience Institutes, JISC
Project Description: DMBI aims to raise the level of data management/handling for high-throughput bio-imaging, and strengthen the interactions between image data silos, both internally and with partner organisations.
Project name: DMP-ESRC: Data management planning for ESRC research data-rich investments
Institution/Funder/Manager: UK Data Archive (UKDA), Economic and Social Research Council (ESRC), Joint Information Systems Committee (JISC)
Project Description: The Data Management Planning (DMP) project aims to increase the data management and sharing capability within the social sciences community.
Area/Discipline: Social Sciences
Project name: DMTpsych: Data Management Training for psychologists
Institution/Funder/Manager: U York, U Sheffield, Sheffield Hallam U, DCC, JISC
Project Description: The aim of DMTpsych is to build capacity and skills within psychology postgraduates relating to research data management. The project builds upon existing research data management materials developed by the Digital Curation Centre (DCC) to create discipline-focused postgraduate training materials that can be embedded into postgraduate research training for the psychological sciences.
Project name: DRYAD UK
Institution/Funder/Manager: British Library, University of Oxford, JISC
Project Description: Dryad is an international repository of data underlying peer-reviewed articles in the basic and applied biosciences as published by a Consortium of Journals. Dryad UK aims to expand Dryad into the UK by establishing a UK mirror site and extending service to new publishers and disciplines.
Area/Discipline: Biomedical Sciences
Project name: EDgrid Central: Data Repository System for 3-D Full-Scale Earthquake Testing Facility
Institution/Funder/Manager: National Institute for Advanced Industrial Science and Technology, Japan
Project Description: A data repository system called EDgrid Central is designed for storing the large amounts of experimental data produced by a 3-D full-scale earthquake testing facility. EDgrid Central provides large storage capacity and implements a data model for the shake tests in the backend. The frontend is a portal where users retrieve the stored data by metadata search and bulk download. The system builds on NEEScentral, developed by the NEES project in the United States, enhancing its search and download functionalities according to the EDgrid users' requirements. EDgrid Central gives facility sites a permanent repository for shaking table experiments, and it also enables civil engineering researchers to share their data and reports in their daily activities.
Project name: EIDCSR: Embedding Institutional Data Curation Services in Research
Institution/Funder/Manager: U Oxford, JISC
Project Description: The Embedding Institutional Data Curation Services in Research (EIDCSR) project aims to address the data management and curation requirements of three collaborating research groups in Oxford, by scoping their requirements and embedding selected elements of the digital curation lifecycle, including policy, workflow, and sustainability solutions, within the research process. The workflows generated by the project are intended to scale to include other research domains, and the outputs should be of use to other research-intensive institutions. The project runs until Dec 2010.
Area/Discipline: Medical & Life Sciences
Project name: ERIM: Engineering Research Information Management
Institution/Funder/Manager: U Bath, UKOLN, JISC
Project Description: ERIM aims to specify in practical terms how effective data management can be enabled and supported in research projects, in particular to support reuse or, more broadly, what can be thought of as 're-purposing'. The project will look primarily at the engineering research domain.
Project name: EURO VO: European Virtual Observatory
Institution/Funder/Manager: CNRS, ESO, INAF, U Edinburgh
Project Description: The Virtual Observatory (VO) is an international astronomical community-based initiative. It aims to allow global electronic access to the available astronomical data archives of space and ground-based observatories and other sky survey databases. It also aims to enable data analysis techniques through a coordinating entity that will provide common standards, wide-network bandwidth, and state-of-the-art analysis tools. The EURO-VO project aims at deploying an operational VO in Europe. Its objectives are the support of the utilization of the VO tools and services by the scientific community, the technology take-up and VO compliant resource provision and the building of the technical infrastructure.
Project name: FISHnet
Institution/Funder/Manager: Centre for e-Research, King’s College London, JISC
Project Description: Freshwater information sharing network
Area/Discipline: Freshwater Biology
Project name: HALOGEN - History Archaeology Linguistics Onomastics and GENetics
Institution/Funder/Manager: U Leicester, JISC
Project Description: The cross-disciplinary Roots of the British collaboration between scholars in humanities and genetics seeks to interrogate the evidence for the migration and/or continuity of human populations in the British Isles in the distant past. The HALOGEN project will support the data management needs of the researchers involved and thus establish organisational best practice in terms of data management planning and the support of diverse cross-disciplinary research data.
Area/Discipline: Ancient history/Genetics
Project name: I2S2
Institution/Funder/Manager: UKOLN/DCC/Soton/STFC, JISC
Project Description: Infrastructure for integration in structural sciences
Area/Discipline: Chemistry (with a view towards inter-disciplinary application)
Project name: Incremental: A step by step approach to informing, improving, & increasing research data curation practice
Institution/Funder/Manager: Cambridge University Library, Humanities Advanced Technology and Information Institute (HATII) at U Glasgow, DCC, JISC
Project Description: The aim of Incremental is to inform, improve and increase research data curation within UK HEIs, by providing exemplars and resources for others to use. Specific objectives are: (1) to investigate current practices and requirements at each institution; (2) to develop a plan for addressing these requirements; (3) to pilot tools and services at each HEI and then make further adjustments and recommendations; (4) embed the work within each institution; and (5) to deliver resources and findings to the DCC, DPC and JISC for wider dissemination. In addition to resources, the project will seek to provide information about their cost and sustainability.
Area/Discipline: Archaeology, Chemistry, English, Engineering and Medicine
Project name: IODP: Integrated Ocean Drilling Program
Institution/Funder/Manager: National Science Foundation (NSF), Japan’s Ministry of Education, Culture, Sports, Science and Technology (MEXT)
Project Description: IODP is an international marine research program that explores Earth's history and structure recorded in seafloor sediments and rocks, and monitors subseafloor environments. IODP builds upon the earlier successes of the Deep Sea Drilling Project (DSDP) and Ocean Drilling Program (ODP), which revolutionized our view of Earth history and global processes through ocean basin exploration.
The IODP oversees repositories around the world. Samples are distributed according to ODP and IODP policies.
Area/Discipline: Marine Sciences
Project name: MaDaM
Institution/Funder/Manager: Manchester eResearch Centre, JISC
Project Description: Pilot data management infrastructure for biomedical researchers
Area/Discipline: Biomedical Sciences
Project name: Managing Research Data: Gravitational Waves (MRD-GW)
Institution/Funder/Manager: STFC, University of Glasgow, JISC
Project Description: MRD-GW aims to examine the way in which Big Science data is managed, and produce recommendations as appropriate. Gravitational Wave (GW) data generated by the LIGO Scientific Consortium (LSC) will be used as a case-study.
Area/Discipline: Particle physics/Astronomy
Project name: PANGAEA
Institution/Funder/Manager: Alfred Wegener Institute for Polar and Marine Research (AWI), DFG
Project Description: Publishing Network for Geoscientific & Environmental Data
Area/Discipline: Earth Sciences
Project name: PEG-BOARD
Institution/Funder/Manager: School of Geographical Sciences, University of Bristol, JISC
Project Description: Palaeoclimate and environment data generation - building open access to research data
Project name: Quixote
Institution/Funder/Manager: U Cambridge/CSIC
Project Description: The main objective/vision of the Quixote project is to design, test and deploy a modular, open source system of tools that allow computational chemistry data (now sitting in the darkness of individual hard-disks) to be organized, shared, and queried
Area/Discipline: Quantum Chemistry
Project name: Research Data MANTRA
Institution/Funder/Manager: U Edinburgh/JISC
Project Description: Aims to develop open, online learning materials which reflect best practice in research data management grounded in three disciplinary contexts: social science, clinical psychology, and geoscience. The resulting materials will be embedded in three participating postgraduate programmes and made available through the Transkills programme for use by all postgraduate and early career researchers as well as made available generally through an open license. In addition to web-based 'chapters' that students can work through at their own pace, the course will include video interviews with leading academics about data management challenges, and practical exercises in handling data in four software analysis environments: SPSS, NVivo, R and ArcGIS.
Area/Discipline: Social and political science, Geoscience, Clinical psychology
Project name: SageCite: Citing network models of disease and associated data
Institution/Funder/Manager: UKOLN, U Manchester, British Library, JISC
Project Description: SageCite will develop and test a Citation Framework linking data, methods and publications. The domain of bio-informatics provides a case study, and the project builds on existing infrastructure and tools. Citations of complex network models of disease and associated data will be embedded in leading publications, exploring issues around the citation of data including the compound nature of datasets, description standards and identifiers.
Project name: ShareGeo Open
Institution/Funder/Manager: EDINA, JISC
Project Description: ShareGeo Open is a spatial data repository that promotes data sharing between creators and users of geospatial data
Project name: SPQR: supporting productive queries for research
Institution/Funder/Manager: KCL, U Edinburgh, Humboldt U Berlin, JISC
Project Description: The overall aim is to investigate the potential of linked data for integrating datasets related to classical antiquity, in particular addressing the particular challenges raised by our material – its incompleteness, uncertainty and fuzziness. We will achieve this by developing mechanisms for breaking data out of silos and exposing it as linked data, using standard ontologies, and in particular the Europeana Data Model, as the semantic “glue” for linking data into a wider network of knowledge. The ultimate objective will be to create a common corpus or “RDF warehouse” of linked Classics data that can be explored, searched and enhanced by further annotations.
Area/Discipline: Classics, Epigraphy and Archaeology
Project name: SUDAMIH
Institution/Funder/Manager: University of Oxford, JISC
Project Description: Supporting data management infrastructure for the Humanities
Project name: TARDIS
Institution/Funder/Manager: Monash University, Australian National Data Service (ANDS), University of Sydney and some other Australian institutions
Project Description: TARDIS is a multi-institutional collaborative venture that aims to facilitate the archiving and sharing of raw X-ray diffraction images (collectively known as a 'dataset') from the protein crystallography community.
Project name: VAMDC Project: Virtual Atomic and Molecular Data Centre
Institution/Funder/Manager: EU, CNRS, CMSUC, UCL, OU, UNIVIE, UU, KOLN, INAF, QUB, AOB, ISRAN, RFNC-VNIITF, IAO, IVIC, INASAN
Project Description: VAMDC aims at building an interoperable e-Infrastructure for the exchange of atomic and molecular data. It brings together, on the one hand, scientists from a wide spectrum of disciplines in atomic and molecular (AM) Physics with a strong coupling to the users of their AM data (astrochemistry, atmospheric physics, plasmas) and, on the other hand, scientists and engineers from the ICT community who are used to deploying interoperable e-infrastructure.
Project name: WissGrid: Grid for Science
Institution/Funder/Manager: DFG, U Göttingen, Astrophysikalisches Institut (AIP), Alfred-Wegener-Institut (AWI), Deutsches Elektronen Synchrotron (DESY), Deutsches Klimarechenzentrum GmbH (DKRZ), Konrad-Zuse-Zentrum für Informationstechnik (ZIB), Universitätsmedizin Göttingen (UMG), Niedersächsische Staats- und Universitätsbibliothek (SUB), Technische U Dortmund (UDO), U Heidelberg, U Trier, U Wuppertal
Project Description: WissGrid’s objective is to establish long-term organisational and technical D-Grid structures for the academic world. WissGrid combines the heterogeneous needs from a variety of scientific disciplines and develops concepts for the long-term sustainable use of the organisational and technical grid infrastructure. In this context, the project aims to strengthen the organisational cooperation of scientists in the grid and to lower the entry barriers for new community grids.
Area/Discipline: Astrophysics, High Energy Physics, Climate Research, Medicine
Project name: XYZ Project
Institution/Funder/Manager: U Cambridge/IUCr/BioMed Central/Open Knowledge Foundation, JISC
Project Description: The XYZ Project will create a demonstrator of a new workflow for publishing data in support of full-text. The author prepares data for publication (if possible with validation) in a third-party trusted repository before the paper is submitted to a publisher. Our software will manage deposition, release to reviewers and dis-embargo, whether for conventional publication or as a data journal.
Besides this preliminary set of running discipline-specific research data projects (shortly to be enriched by Sonex with a complementary list of general-purpose projects dealing with research data management), a thorough list of open data repositories for all areas may be found in the data repository section of the Open Access Directory (OAD).
Saturday, 11 December 2010
Thursday, 25 November 2010
A SONEX meeting was held last Sat Nov 20th at the JISC Office in Brettenham House, London. The meeting was intended to produce some feedback on the RFC version of the DL.org Technology and Methodology Digital Library Cookbook. SONEX feedback on the featured interoperability solutions mainly focused on enhancing the Sword protocol description in the Cookbook so that it covers the functionality updates in the new version of Sword.
The second half of the SONEX meeting was devoted to a preliminary analysis of the deposit into Open Access repositories of raw research data, produced either as a specific research output or as supplementary material to research publications. Raw data as a further SONEX deposit usecase scenario was already included in the list of issues for the SONEX Birds-of-a-Feather session held at the Open Repositories Workshop (OR2010) last July in Madrid, where it was identified as 'the missing piece in the general deposit picture' at the time.
Some deposit-related projects have been running since Jul 2010 under the JISC Deposit Call (JISCdepo), but so far none of them deals with deposit of research data. However, dataset handling is already being considered as a forthcoming candidate for Sword-based transfer, and preliminary analysis of this new deposit usecase scenario may well be partially carried out under the SONEX umbrella.
Some of the discussed ideas on research data and their deposit via Sword into repositories follow:
- A JISCdepo meeting will be held in early Mar 2011 as an internal coordination event for JISC Deposit Call projects. It's a good opportunity for SONEX to fine-tune analysis of usecase scenarios at running projects, as well as for sharing potential new deposit usecases arising both from the Kultivate project (digital versions of creative works ie non-textual materials) and the research data-based approach.
- Regarding deposit of research data into repositories, the DRYAD international repository of data underlying peer-reviewed articles in the basic and applied biosciences was highlighted as a pioneering implementation of infrastructure for research data filing and preservation. DRYAD acts as a kind of PubMed Central for research data – with an equivalent mandate by a group of 50 journals (so far) to their authors for depositing publication-related research data into this specific repository (besides archiving them in their IR or with the publisher).
- The JISC-funded DRYAD UK project was also discussed. DRYAD UK, currently being developed within the JISC Managing Research Data (JISCMRD) programme, is planning to expand Dryad into the UK by both establishing a UK mirror site and extending service to new publishers and disciplines.
- A JISC Managing Research Data Programme (JISCMRD) International Workshop will be held in Mar 2011 for analysis and evaluation of the outputs and progress of the JISCMRD Programme. There will be a place in the Workshop programme for issues related to research data, such as citation, deposit and metadata/identifier exchange with publishers. SONEX is expected to contribute input on some of those subjects.
- Regarding creation of research data management infrastructure for collection, digital organization, metadata annotation and controlled sharing of datasets, the ADMIRAL project (A Data Management Infrastructure for Research Across the Life sciences) was identified as the main presently running initiative to be followed. DataPac, an idea for a standard data shipping container for submitting research data with identifier and other information in RDF and HTML formats, was mentioned too as a potential complementary infrastructure to ADMIRAL.
In terms of SONEX deposit usecase analysis, deposit of research data poses a double usecase framework, along with a number of open issues:
- R2R usecase scenario (IR to DRYAD, other)
- Publisher to repository usecase scenario
- metadata-related issues – very case-specific and different from metadata standards being used for research papers (previous work on the subject done by JISCMRD MRDonto Group: “Metadata for Datasets: Identifiers and Ontologies”)
- SONEX should definitely NOT get into identification schemas for datasets – DOIs should do for identification purposes
- issue of attached file sizes – should deposit by reference be considered instead/besides binary data transfer?
- At what point along the publication lifecycle should dataset deposit take place? Picturing the process via workflow diagrams would help
- How should Sword deal with this particular deposit usecase?
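The file-size question above (binary transfer versus deposit by reference) can be pictured with a minimal Python sketch. It is an illustration only, not part of Sword or of any SONEX output: the function `plan_deposit`, the `DepositPlan` structure, the 50 MB threshold and the choice of headers are all assumptions made for the example, and no request is actually sent. Small datasets are planned as a binary transfer with a checksum; larger ones are replaced by a short Atom entry that points at the dataset and carries its DOI.

```python
import hashlib
import xml.etree.ElementTree as ET
from dataclasses import dataclass

ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical cut-off: above this size we send a reference, not the bytes.
MAX_BINARY_BYTES = 50 * 1024 * 1024


@dataclass
class DepositPlan:
    mode: str       # "binary" or "by-reference"
    headers: dict   # HTTP headers the deposit request would carry
    body: bytes     # request body (the file itself, or a small Atom entry)


def plan_deposit(filename, payload, dataset_url, doi,
                 max_binary_bytes=MAX_BINARY_BYTES):
    """Plan (but do not send) a dataset deposit to a repository endpoint."""
    if len(payload) <= max_binary_bytes:
        # Binary transfer: ship the file with a checksum so the receiving
        # repository can verify it. Header choice is an assumption here.
        headers = {
            "Content-Type": "application/octet-stream",
            "Content-Disposition": f"attachment; filename={filename}",
            "Content-MD5": hashlib.md5(payload).hexdigest(),
        }
        return DepositPlan("binary", headers, payload)

    # Deposit by reference: a small Atom entry identifying the dataset by
    # its DOI and linking to where the data can be fetched from.
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = "https://doi.org/" + doi
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = filename
    ET.SubElement(entry, f"{{{ATOM_NS}}}link",
                  {"rel": "related", "href": dataset_url})
    body = ET.tostring(entry, encoding="utf-8")
    headers = {"Content-Type": "application/atom+xml;type=entry"}
    return DepositPlan("by-reference", headers, body)
```

Calling `plan_deposit("obs.csv", data, "https://example.org/data/obs.csv", "10.1234/example")` (a made-up URL and DOI) would return a plan whose `mode` shows which route was chosen. Keeping the decision separate from the actual HTTP call makes it easy to discuss where in the publication workflow each route would apply.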
Some interesting examples of international initiatives dealing with dataset management are also being examined by SONEX, such as:
- PANGAEA [Germany]: Publishing Network for Geoscientific & Environmental Data, see example dataset with attached DOI
- [Dutch] NARCIS (National Academic Research and Collaborations Information System) FAQ page contains info on handling datasets.
Further references on submission of research data to repositories:
- Research Remix blog, by Heather Piwowar
- JISC XYZ project for publishing data in support of full-text
- "A Scientist and the Web": Peter Murray-Rust's blog - Archive for the 'data' Category
- "SPARC Digital Repositories meeting includes session on open data" (blog post at IASSIST website, 2010/11/10)
- 'Embedding Institutional Data Curation Services in Research' (EIDCSR) project blog
- 6th International Data Curation Conference (Chicago, IL, Dec 6-8th, 2010) - Programme
- "Riding the Wave: How Europe can gain from the rising tide of scientific data": Final Report of the High Level Expert Group on Scientific Data: a submission to the European Commission (Oct 2010)
Tuesday, 26 October 2010
According to DL.org, "the Cookbook is aimed at collecting and describing a portfolio of best practices and pattern solutions to common challenges faced when it comes to developing large-scale interoperable Digital Library systems. The current version of the Cookbook should not be considered neither authoritative nor final but rather as a 'work in progress' with the aim of enhancing it through external feedback".
After the Sonex work was presented last September at the 2nd DL.org workshop held in Glasgow, Sonex reached an agreement with DL.org to provide technical feedback on the Cookbook regarding the usecase scenarios identified by Sonex.
Wednesday, 6 October 2010
The 10th REBIUN Workshop on Digital Projects will be held Oct 7-8, 2010 in Valencia, Spain. Among the technical presentations scheduled for the workshop, there is one on Sonex called 'Interoperabilidad y Repositorios: el Grupo de Trabajo SONEX' (Interoperability and Repositories: the SONEX Workgroup).
In the Sonex presentation, to be delivered on Thu Oct 7th, the main Sonex worklines will be discussed, as well as incipient implementations of Sonex usecase scenarios in Spanish Institutional repositories.
On Sep 29th, BioMed Central Update announced the BMC Shared Support Membership, a new kind of low-cost membership for sharing article processing fees between institutions and their research teams. The main point of interest from a Sonex point of view is that this new type of BMC membership includes a feature for Automated Article Deposit into repositories via Sword, by which "any article published [in BMC open access journals] will be automatically deposited into the Shared Support Member's institutional repository". This extends to further institutions the automated deposit arrangement announced last Apr 29th (http://sonexworkgroup.blogspot.com/2010/04/biomed-central-partners-with-mit.html).
See BMC Automated Article Deposit for further information on this deposit service and the BMC customers entitled to it, and the BMC Member list by country to check for potential institutional users.
Wednesday, 15 September 2010
On 8th September 2010 the JISC-funded Kultur project group gathered for a meeting at the JISC Offices in London, to carry out some post-project discussions and to look to the future with the Kultivate project. During the meeting William Nixon, University of Glasgow, presented "Minding your P's and Q's: Enrich-ing Enlighten at the University of Glasgow" on their work at the Enrich project and the enhancement of Enlighten Institutional Repository, while Richard Jones from Symplectic (and SONEX) presented "A whirlwind tour of repository deposit technology and use cases". This latter presentation covered his work at Symplectic and the Symplectic Repository Tools deposit technology (c.f. the CRIS to Repository use case), as well as the current state of the SWORD 1.3 standard and the future of SWORD through version 2.0. He also then presented some slides on SONEX describing the key identified use cases and suggestions on the way that Creative and Applied arts might engage with the SONEX process.
Some key realisations from this meeting for SONEX are that:
1) The deposit use cases in Creative and Applied Arts may not be significantly different from the use cases in STM, but the devil will be in the details
2) CRIS systems are being used to some degree in Arts Institutions, and undoubtedly there is work which will be considered research in these fields, but automatic acquisition of content for these systems is virtually impossible, because ...
3) There are no comprehensive or even substantial Creative and Applied Arts data sources online, because ...
4) The publishing lifecycle for the Creative and Applied Arts is not only significantly different to STM but also non-standard across the discipline. It was suggested, for example, that YouTube and Vimeo were likely to be some of the largest repositories of research outputs from these fields.
It is hoped that if the 4th JISCdepo project goes ahead it should be easier for SONEX to engage in this field. In the meantime, any people working in Creative and Applied Arts should feel very welcome to contact SONEX members with a view to understanding the variations in the standard deposit use cases which would meet their needs.
Tuesday, 14 September 2010
The paper 'Handling Repository-Related Interoperability Issues: The Sonex Workgroup' was presented last week at the 2nd DL.org workshop held in Glasgow in conjunction with the 14th European Conference on Digital Libraries (ECDL2010, Sep 6-10, 2010). The DL.org workshop was scheduled under the title "Making Digital Libraries Interoperable: Challenges and Approaches" and it featured several presentations by DL.org working groups on the DL Reference Model, such as DL content, functionality, users, architecture, quality & policy (see programme). There was also an invited talk by Alex Wade of MS Research, "Digital Library Interoperability: An Industrial Perspective", which summarized the most recent MS developments in the area of DL interoperability (Zentity, Article Authoring Add-in for Word, DepositMO Project, MS Academic Search or the WorldWide Telescope among others).
The Sonex presentation was delivered on Fri Sep 9th by Peter Burnhill and Pablo de Castro during Day I of the workshop. Since the Sonex approach to interoperability is quite pragmatic in scope, it fitted in well alongside DL.org's more theoretically founded model. The complementary approaches of the two initiatives may in fact offer prospects for further collaboration between them after this DL.org workshop.
Friday, 3 September 2010
Further talks were also held at RF2010 on CRIS systems and IRs. A lot of universities already have CRIS systems running, and some voices in the community are starting to wonder whether CRIS systems might eventually replace institutional repositories as an "entrance door" to the institutional research output. Projects like RePosit (see the presentation by Sara Molloy, Queen Mary University of London, for more info) are on the contrary exploring ways for batch ingestion of contents flowing from CRIS systems into a currently low-populated array of repositories.
Quite a number of other subjects were amusingly dealt with by other speakers, such as timestamping the web through the Memento project as presented by Herbert van de Sompel (LANL), Topic Models by Michael Fourman (University of Edinburgh Informatics Dept), or Repositories and data at Closing Keynote by Kevin Ashley (DCC).
RF2010 also had its traditional 20-slide-20-secs-per-slide Pecha Kucha sessions once again. There was a Pecha Kucha on the work by Sonex on Fri Sep 3rd, and projects like Enlighten, Jorum, Open Access Repository Junction, ERA, ShareGeo and some others were represented in this lightning-speed presentation format as well.
On Sep 1st a SHERPA RoMEO API workshop was also held by Peter Millington, Jane H. Smith and colleagues from Nottingham at the e-Science Institute facilities in Edinburgh as a RepoFringe pre-event. As presented last July at the Open Repositories Conference in Madrid, major improvements to the RoMEO service are being worked on, and this workshop was an opportunity to get feedback from RoMEO API users on its performance, along with suggestions on possible enhancements for version 3, currently in its final stages of development (due Autumn 2010). There were also interesting presentations from outside the UK on the implementation of RoMEO mirrors such as SHERPA RoMEO deutsch in Germany, and service internationalisation was extensively discussed during the meeting (Portugal and Spain were identified as potential areas for development of specific interfaces). An online survey on the RoMEO API had previously been distributed among the workshop delegates, and its results were discussed and analysed in fruitful breakout sessions.
From a Sonex perspective, the RoMEO service fits into the Sonex proposal for a distributed national- or regional-level automatic ingest system based on an array of brokers dealing with publisher- or funder-driven ingest of contents into a network of national or regional institutional repositories. From this point of view, RoMEO, like other general-purpose services such as OpenDOAR or the broker itself, is one of the pieces of infrastructure required for this approach to become real. Find more information about this Sonex proposal in the Sonex paper 'Handling Repository-Related Interoperability Issues: the SONEX Workgroup' to be presented in Glasgow later this month at the 2nd DL.org workshop "Making Digital Libraries interoperable: challenges and approaches".
Tuesday, 3 August 2010
See below an analysis of several CRIS-IR integration possibilities for creating an institutional research information infrastructure that will live up to the challenge posed by future research assessment exercises.
Considerations on the role of the CERIF standard were intentionally left out of the picture, as some debate is still taking place on whether or not it should be the base standard for CRIS-IR integration. Most implementations available have until now chosen CERIF-based integration strategies to tackle the issue, but from ad-hoc light-CERIF versions to entirely non-CERIF solutions, there's still a high level of diversity in the way institutions are facing this challenge. At the same time, CERIF4REF is being steadily developed at KCL, and the CERIF architecture is also being gradually brought into new versions of EPrints.
A variety of research information system implementation usecases for RAE/REF purposes was also shown at Peter Burnhill's (Sonex/EDINA) "Repository Update UK" presentation at JISC/CNI meeting last month: from IRs being used as REF-gateways to the challenge it poses in terms of open access availability of contents, a whole set of issues arise as IRs undergo enhancement for fulfilling their new role.
Monday, 19 July 2010
The following projects have been selected in the jiscDEPO call as of today, with an extra one still to come:
- DepositMO: Modus Operandi for Repository Deposits. Developed by teams from the University of Southampton (Lead Institution) and Edinburgh University, in close liaison with Microsoft, the DepositMO project aims to create a repository deposit workflow connecting the user’s computer desktop, especially popular apps such as MS Office, with digital repositories based on EPrints and DSpace. A first DepositMO presentation was delivered by David Tarrant (U Southampton) at OR2010.
- RePosit: positing a new kind of deposit. The RePosit Project seeks to increase uptake of a web-based repository deposit tool embedded in a researcher-facing publications management system. Institutions involved in RePosit are University of Leeds (Chair), Keele University, Queen Mary University of London, University of Exeter and University of Plymouth, with close connection to Symplectic Ltd as commercial partner.
- DURA: Direct User Repository Access. The DURA project, led by the University of Cambridge with Mendeley Ltd and Symplectic Ltd as consultant firms, aims to embed institutional deposit into the academic workflow at almost no cost to the researcher, by using Mendeley and Symplectic tools to allow researchers to synchronise their personal research collections with institutional systems.
The Sonex workgroup will be supplying its conceptual framework on deposit usecases to these projects and contributing to their coordination via the jiscDEPO project blog planet to be available shortly.
Friday, 2 July 2010
Here are some topics to be discussed along the session:
- The growing number of deposit-related initiatives and events should be properly summarized, classified and advertised somewhere: the Sonex website could widen its present coverage in order to play that role, especially in the US & Canada ("outside Europe" would probably be more accurate, with the Berlin 8 Open Access conference 2010 being held in Beijing next Oct), so as to keep an eye on progress wherever it may take place. Some ideas are already available.
- Main classes of deposit-related initiatives: Publisher-driven & CRIS transfers. Is the Sonex classification thorough enough? Are there any other possible groups that weren't accounted for and left under the 'Other' general section? Are all classes being adequately covered by some ongoing deposit-related project? What about e-Research repositories (datasets + software)? Could they be the [Sonex] missing piece of the institutional research systems integration jigsaw? Input on the issue by a representative of some related initiative attending the BoF could help.
- Common challenges in publisher-driven deposit initiatives. Re-usable procedures: NLM DTDs. The filtering strategy. SWORD endpoint (scarce) implementation and how OpenDOAR/ROAR may help. Author and institution persistent identifiers. Processing of citations. Everything being developed at the same time doesn't make things easier.
- CERIF as a spreading standard for CRIS/IR integration. Different ways for achieving the objective, and how the REF affects the whole environment. Hybrid CRIS/IRs: an alternative procedure.
Thursday, 1 July 2010
Topics for sessions include: Cloud Computing; Innovation in Learning and Teaching; Open Data Policies; Shared Services; Repositories; Digital Content and Institutional Planning; Resource Discovery; Digital Preservation; e-Science (see meeting programme).
Peter Burnhill, from EDINA National Data Centre Edinburgh and a member of the Sonex workgroup, will deliver a presentation on "Repositories Update UK" at the Repositories update session on Conference Day 2, which also includes a talk on "Repositories Update US" by Sandy Payette, DuraSpace. The presentation by Peter Burnhill will shortly be available here.
Tuesday, 8 June 2010
Some comments below on the outcome of discussions at CRIS2010:
- All across Europe and beyond, CERIF is spreading as an increasingly accepted standard for building Current Research Information Systems (CRIS) both at national and institutional levels. Previously existing databases and management systems at HEIs are frequently undergoing adaptation to CERIF.
- CERIF-based National Research Information Systems for research management and assessment (such as NARCIS in The Netherlands, Frida in Norway, U-GOV in Italy or the USDA-CRIS in the United States) usually rely on CERIF-compliant Institutional CRISs for supplying the underlying institutional information. This often leads to a two-way strategy for infrastructure development, where National and Institutional Systems are simultaneously being built.
- As a consequence, there is an increasing number of CERIF-compliant National Research Information Systems in operation, and more of them are in progress (eg DeGóis in Portugal or SEMAT in Iran).
- At institutional level there is also a growing trend towards adoption of CERIF-based solutions, either developed inhouse or based on CERIF-compliant commercial CRISs: there are already several examples of ePrints being upgraded to PURE (presently the most successful of such commercial solutions) in the UK and elsewhere. This trend leads to a variety of resources available at institutional level depending on the adopted strategy: some institutions have plain CRIS systems, others work with CRIS/IR integrated solutions and finally there are also some universities running CERIF-based enhanced-IRs.
- The Common European Research Information Format (CERIF) is by no means a closed standard at this point, but it benefits from interaction with existing National, Subject and Institutional Research Information Systems in order to "epitaxially" enrich its description features for providing solutions to various system needs.
- A wide array of commercial solutions is presently flourishing around the area of institutional research system implementation or enhancement, such as Atira PURE, Avedas Converis or Symplectic Repository Tools to mention just some examples.
- There are interoperability issues still to be tackled at various points of CRIS/OAR and CRIS/CRIS integration, but remarkable progress is underway, both from publicly-funded international projects and from private companies.
- Since the soundest current example of an Author ID standard, the Dutch DAI, was driven by institutional integration purposes, CERIF & euroCRIS initiatives could possibly bring new momentum to solving pending Author ID issues, as an author identifier is a basic requirement for the operation of both National and Institutional Research Information Systems.
- Despite the fact that "everything being seemingly developed at the same time doesn't make things easier" (quote from Sonex BoF at OR10 preliminary list of issues), important progress is clearly being made worldwide in the field of research information system implementation. An integrated research system development strategy from planning bodies, particularly in institutional environments, may therefore be useful for adapting to the rapidly changing landscape.
Wednesday, 26 May 2010
- Workshop on CRIS, CERIF and Institutional Repositories: Maximising the Benefit of Research Information for Researchers, Research Managers, Entrepreneurs and the Public (Istituto di ricerche sulla Popolazione e le Politiche Sociali, IRPPS, Consiglio Nazionale delle Ricerche, CNR, Rome, Italy, May 10-11, 2010).
- CRIS2010: Connecting Science with Society: The Role of Research Information in a Knowledge-Based Society (10th International Conference on Current Research Information Systems, Aalborg, Denmark, June 2-5, 2010).
Monday, 24 May 2010
|PEER Project||STM-Assoc/ESF/Max Planck G/UGöttingen/INRIA/SURF/ UBielefeld||EU||Julia Wallace (STM)/Foudil Bretel (INRIA)|
|Open Access Repository Junction (OA-RJ)||EDINA||UK||Theo Andrew (EDINA)|
|BMC Deposit into DSpace@MIT||BMC/MIT||UK/US||Matthew Cockerill (BMC)|
|National & Institutional CRIS/IR integration initiatives|
|NARCIS||Royal Netherlands Academy of Arts and Sciences (KNAW)||NL||Elly Dijk (KNAW)|
|Enrich: Repository and Research System Integration||University of Glasgow||UK||William Nixon (U Glasgow)|
|CRISPool||University of St. Andrews||UK||Anna Clements (U St. Andrews)|
|TCD Systems Integration||TCD||IE||Niamh Brennan (TCD)|
|U-GOV||CINECA Consorzio Interuniversitario||IT||Nicola Bertazzoni (CINECA)|
|CRIStin||University Centre for Information Technology||NO||Anne Asserson (U Bergen, UiB)|
|Aramis||State Secretariat for Education and Research (SER)||CH||Beat Sottas (SER)|
|USDA-CRIS||US Dept Agriculture. National Institute of Food and Agriculture||US||Carolyn Deckers, Juanita Hammond, Teresa Bailey (USDA)|
|RCAAP/DeGóis Integration||UMIC/FCCN/FCT||PT||Eloy Rodrigues (UMinho)|
|RIS/IR Integration at UPC||Universitat Politècnica de Catalunya (UPC)||ES||Jordi Serrano, Toni Prieto (UPC)|
|CRIS/OAR Interoperability Project||KE/DTU||DK||Mikael K. Elbæk (DTU), Mogens Sandfær (DTIC)|
|CCLRC Corporate Data Repository (CDR)||Council for the Central Laboratory of the Research Councils (CCLRC)||UK||E. Grabczewski (CCLRC)|
|SEMAT||Iranian Research Institute for Information Science & Technology (Irandoc)||IR||Omid Fatemi (Irandoc)|
|Article Authoring Add-in||MS Research||US||Lee Dirks/Alex Wade (MS Research)|
|Repository tools||Symplectic Ltd||UK||Richard Jones (Symplectic Ltd)|
|Pure||Atira||DK||Bo Alroe (Atira)|
|Converis||Avedas||DE||Rudolf Weiss (Avedas)|
|Enovation Solutions||Enovation||IE||Gavin Henrick (Enovation)|
|SWORD Project||UKOLN/JISC||UK||Adrian Stevenson (UKOLN), Julie Allinson (U York)|
|York Digital Library - Integration for the Next Generation (YODL-ING)||University of York/University of Leeds||UK||Julie Allinson (University of York)|
|EasyDeposit – SWORD deposit tool creator||University of Auckland||NZ||Stuart Lewis (U Auckland)|
Friday, 21 May 2010
Over the last year the Sonex workgroup has been devoted to the analysis of Scholarly Output Notification and Exchange, that is, of potential deposit processes into repositories for scholarly publications from various sources and related interoperability issues. Several relevant usecases have been selected in order to follow their implementation in institutional environments. After the recent publication of the Deposit call by JISC (Feb’10), Sonex was assigned the new role of providing support and eventual coordination for selected bids. We are therefore inviting colleagues taking part in ongoing or future deposit-related projects to debate different approaches, common problems and opportunities for avoiding redundancies among them.
Tuesday, 11 May 2010
Andy McGregor from JISC set the scene for the event, giving us a little background on JISC involvement, and talking about different approaches that could be taken to integration, such as the use of CERIF or of Linked Data for the sharing of information. He then passed us over to Simon Kerridge from ARMA, who discussed in a bit more detail what a CRIS is; he also gave us some better terminology that we might prefer to use: RMAS (Research Management and Administration System) and ERA (Electronic Research Administration). The briefing paper that accompanies the event tells us that "by communicating research information more effectively ... the process of sharing data becomes more efficient, duplication of effort is reduced and information becomes more accurate", and this clearly drives the purpose of the day. Particularly, there is no intention here to merge CRIS and Repositories - the two communities have sufficiently different use cases that this is unlikely to happen - but simply to enhance communication between them in the correct way.
Anna Clements then introduced the CRIS that they use at St Andrews, while William Nixon and Valorie McCutcheon from the University of Glasgow presented Enlighten. Particularly, Enlighten is an interesting case as it is based on the EPrints software, and started life as an institutional repository in around 2003, but has now grown into a fully fledged publications management system. The presentations were then wrapped up by Jackie Knowles, from the Welsh Repository Network (the event organisers), who gave us an insight into things that went well and things that didn't during development of CRIS and Repository systems at institutions around the country. The ones that stuck for me were:
- Don't overcomplicate your requirements
- Don't develop DIY solutions which turn into single points of failure (i.e. ensure they are robust against staff changes)
- Ensure that your requirements are well specified and met; she cites an unfortunate and extreme tale of a team who lost their jobs after failing to successfully implement a system which had no formal requirements in the first place!
The afternoon of the event was given over to discussion among delegates, and this observer did not attend due to his position as representing a supplier - the event coordinators felt that without the suppliers present the conversation would be more candid. The results of those discussions should be made available soon, and we'll link them when they are. Meanwhile, I therefore represented Symplectic in the exhibition stall, alongside Avedas, EPrints, Atira, ARMA, ThomsonReuters, IDEATE and DuraSpace; it was busy for much of the afternoon, which I think shows a clear interest in this space at this time.
Sunday, 9 May 2010
The following issues, among others, were discussed at the meeting (see event programme for contributions):
- Why a CRIS? The perspective from the repository and research management communities
- The ideal CRIS: a view from euroCRIS
- DIY Success: Case study from the University of Glasgow - How repository and research management systems have been successfully integrated
- Where did it all go wrong?: Case study on how repository and research management systems have not been so successfully integrated
Tweets about the event were saved, and presentations are already available online as well. Finally, Richard Jones from the Sonex workteam attended the seminar at Leeds Met and will also be delivering a brief report on the main issues dealt with at the event.
Thursday, 29 April 2010
In order to make it easier for MIT authors to submit articles to DSpace@MIT, the MIT Libraries worked with BioMed Central to set up an automatic feed of MIT articles, using a version of the Simple Web-service Offering Repository Deposit (SWORD) protocol. The SWORD protocol allows the institutional repository to receive newly published articles from any of BioMed Central's 200+ journals as soon as they are published, without the need for any effort on the part of the author and streamlining the deposit process for the repository administrator.
In describing the importance of the SWORD integration, Matthew Cockerill, BioMed Central's Managing Director said, "Campus open access policies are hugely important, but the effort involved in compliance can be a major obstacle to their success. That is why we think that automated deposit has an important role to play. We hope that this pioneering work by BioMed Central in collaboration with MIT Libraries will encourage other institutions to work with us to establish similar automated feeds, and we encourage other publishers to adopt a similar approach".
Read more at BMC Press release.
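To make the automated feed described above a little more concrete, here is a minimal Python sketch of what a publisher-side SWORD 1.3-style deposit request could look like. The press release only says that "a version of" SWORD is used, so this is not the actual BMC/MIT implementation: the function name, collection URL, credentials, zip payload and the choice of the METS DSpace SIP packaging are all assumptions made for illustration. The request is built but deliberately not sent.

```python
import base64
import urllib.request

# Packaging identifier often used for DSpace METS packages in SWORD 1.3 setups
# (assumed here; the real feed may use a different package format).
METS_DSPACE_SIP = "http://purl.org/net/sword-types/METSDSpaceSIP"


def build_sword_deposit(collection_url, package_bytes, filename,
                        username, password, on_behalf_of=None):
    """Build (but do not send) an HTTP POST resembling a SWORD 1.3 deposit.

    All argument values are hypothetical; nothing here is taken from the
    BioMed Central or DSpace@MIT configuration.
    """
    credentials = base64.b64encode(
        f"{username}:{password}".encode("utf-8")
    ).decode("ascii")
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": METS_DSPACE_SIP,
        "Authorization": f"Basic {credentials}",
    }
    if on_behalf_of:
        # Mediated deposit: the publisher deposits on behalf of an author.
        headers["X-On-Behalf-Of"] = on_behalf_of
    return urllib.request.Request(
        collection_url, data=package_bytes, headers=headers, method="POST"
    )


if __name__ == "__main__":
    # Hypothetical endpoint and payload, for illustration only.
    request = build_sword_deposit(
        "http://repository.example.org/sword/deposit/collection-1",
        b"PK\x03\x04",  # placeholder bytes standing in for a zipped article
        "article.zip",
        "publisher",
        "secret",
        on_behalf_of="author1",
    )
    print(request.get_method(), request.full_url)
    for name, value in request.header_items():
        if name.lower() != "authorization":
            print(f"{name}: {value}")
```

In a live feed the publisher would send such a request for each newly published article (for instance with `urllib.request.urlopen`), and the repository would answer with an Atom entry describing the deposited item. Note that `urllib` normalises header capitalisation, which is harmless because HTTP header names are case-insensitive.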
Wednesday, 21 April 2010
Richard is Head of Repository Systems at Symplectic Ltd, and is responsible for integrating their research and publications management system (Symplectic Elements) with a variety of digital repository platforms.
Prior to joining Symplectic, Richard built and deployed repository systems for three large universities: the University of Edinburgh, the University of Bergen, and Imperial College London. He also spent some time as a research engineer at Hewlett-Packard Laboratories, working with cloud services and content management systems.
Richard is a founder member of the DSpace Committer Group, although he is now much less active in that community than he would like. He plays an active role in open standards development; he was on the technical committee defining the Open Archives Initiative Object Re-use and Exchange (OAI-ORE) standard, and has recently taken up the technical lead for the SWORD standard, in which he has been involved since near its inception. He is also chair of the Developer Focus group, part of DevCSI representing developers in and around higher education. He has written numerous articles on repository development and Open Access, as well as a book concentrating explicitly on Institutional Repositories.
Thursday, 15 April 2010
Contact person at TCD: Niamh Brennan
Wednesday, 14 April 2010
Some meeting conclusions from a Sonex viewpoint (general conclusions summarized by Theo in a post at the OARJ blog):
- The PEER project should be considered as a key reference for OA-RJ, at least at its initial stages, for there are important similarities between the two projects. NPG has also taken part in the PEER project, which dealt mainly with publishers depositing authors' manuscripts into an array of IRs, and warns about the risk of redundancy at this stage. However, publishers as deposit agents is just one of the OA-RJ lines of work, so the overlap between the two projects should be only partial. Nevertheless, wherever previous PEER developments can be reused for OA-RJ purposes, a strong effort should be made to ensure this is done. Sonex may be of great help in achieving some degree of cooperation between the two projects.
- The Sonex workgroup may eventually support the OA-RJ project in tackling some of the basic issues at design stage (such as the multiple copy vs one copy+multiple links dilemma or the way deposited items may be kept at the OA-RJ deposit until, and even after, receiving notification from target IRs of the item going live). These points were dealt with at the meeting, but there are still questions remaining and new issues are likely to show up as the project develops. If Sonex succeeds in organising the proposed Deposit meeting (initially set for Oct'2010) on ongoing deposit initiatives worldwide, it may also be a good opportunity for discussing different approaches to the same objectives among members of the represented projects.
• CRIS2010: Connecting Science with Society (Aalborg, Denmark, Jun 2-5, 2010)
• Learning how to play nicely: Repositories and CRIS (Leeds Metropolitan University, Leeds, UK, May 7, 2010)
• Repository Multiple Deposit meeting (London, UK, Apr 8, 2010)
• Readiness for REF (R4R) Workshop (King's College London, Mar 23, 2010)
• OpenAIRE Inaugural Conference (Athens, Greece, Jan 13-14, 2010)
• JISC Deposit Show-and-Tell Barcamp (University College London, Oct 12, 2009)
Repository Handshake strand - Action plan
• Edinburgh workgroup meeting (EDINA, Edinburgh, June 10, 2009)
Notes on meeting
Repository Handshake (actor-based) use case scenarios: a summary
Notes from whiteboard
• OAI6 Workshop on Innovations in Scholarly Communication (CERN, Geneva, June 17-19, 2009)
The Repository Handshake - a followup
• Copenhagen workgroup meeting (DTU, Copenhagen, Aug 12, 2009)
Notes on meeting
• Madrid workgroup meeting (CSIC, Madrid, Nov 2, 2009)
Notes on meeting
• Informal workgroup meeting in Cambridge (Cambridge, UK, Mar 21, 2010)
• JISC-EDINA’s Open Access Repository Junction (OA-RJ) Project
• PEER Project
• Knowledge Exchange ‘CRIS/OAR Interoperability Project’ for defining a CERIF-based metadata exchange format between CRIS and Institutional Repositories
• SWORD: Simple Web-service Offering Repository Deposit
Sword2 final report
• JISC-Heriot-Watt University’s JournalTOCsAPI project
• JISC-The Deposit Plait
• JISC-EIDeR (Enhanced ingest to digital e-research repositories)
• EU OpenAIRE