Name: 2017 GBIF Ebbe Nielsen Challenge
Start: 2017-06-15T05:15:00.000-04:00
End: 2017-09-05T17:00:00.000-04:00
Location: 2017 GBIF Ebbe Nielsen Challenge

Summary

The 2017 GBIF Ebbe Nielsen Challenge will award a total of €14,000 to developers and data scientists who create tools capable of liberating species records from open data repositories for scientific discovery and reuse.

Background

This year's Challenge will seek to leverage the growth of open data policies among scientific journals and research funders, which require researchers to make the data underlying their findings publicly available. Adoption of these policies represents an important first step toward increasing openness, transparency and reproducibility across all scientific domains, including biodiversity-related research.

To abide by these requirements, researchers often deposit datasets in public open-access repositories. Potential users are then able to find and access the data through repositories as well as data aggregators like OpenAIRE and DataONE. Many of these datasets are already structured in tables that contain the basic elements of biodiversity information needed to build species occurrence records: scientific names, dates, and geographic locations, among others.

However, the practices adopted by most repositories, funders and journals do not yet encourage the use of standardized formats, which significantly limits the interoperability and reuse of these datasets. As a result, the wider reuse of data that many open data policies imply, if not state outright, falls short, even in cases where open licensing designations (like those provided through Creative Commons) seem to encourage it.

The Challenge

The 2017 GBIF Ebbe Nielsen Challenge seeks submissions that repurpose these datasets and adapt them into the Darwin Core Archive format (DwC-A), the interoperable and reusable standard that powers the publication of almost 800 million species occurrence records from the nearly 1,000 worldwide institutions now active in the GBIF network.

The 2017 Ebbe Nielsen Challenge tasks developers and data scientists with creating web applications, scripts or other tools that automate the discovery and extraction of relevant biodiversity data from open data repositories. Such tools might generate datasets ready for publication on GBIF.org by:

Automating searches of open data available in public repositories
Effectively mining the information needed to generate checklists, species occurrence and sampling-event datasets (e.g. scientific names, date and location of occurrence, among others) from datasets in these repositories
Mapping datasets’ column headings and/or contents to standardized Darwin Core terms
Routinely converting the reformatted data into the Darwin Core Archive format, ready for publication through GBIF.org
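
The mapping step in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Darwin Core term names (scientificName, eventDate, decimalLatitude, decimalLongitude) are real, but the synonym table and the function names are hypothetical, and a real tool would need a far richer mapping plus content checks. The sketch covers the column renaming only; a complete Darwin Core Archive also needs a meta.xml descriptor and dataset metadata packaged alongside the data file.

```python
import csv
import io

# Hypothetical synonym table: normalized source heading -> Darwin Core term.
# The term names are real Darwin Core terms; the synonyms are assumptions.
SYNONYMS = {
    "species": "scientificName",
    "scientific name": "scientificName",
    "taxon": "scientificName",
    "date": "eventDate",
    "collection date": "eventDate",
    "lat": "decimalLatitude",
    "latitude": "decimalLatitude",
    "lon": "decimalLongitude",
    "long": "decimalLongitude",
    "longitude": "decimalLongitude",
}


def normalize(heading):
    """Lower-case a heading and collapse underscores, hyphens and extra spaces."""
    return " ".join(heading.replace("_", " ").replace("-", " ").lower().split())


def map_headers(headers):
    """Return {source heading: Darwin Core term} for the headings we recognize.

    If two source headings map to the same term, only the first is kept.
    """
    mapping = {}
    for heading in headers:
        term = SYNONYMS.get(normalize(heading))
        if term is not None and term not in mapping.values():
            mapping[heading] = term
    return mapping


def to_darwin_core(csv_text):
    """Rewrite a CSV string so recognized columns carry Darwin Core term names.

    Unrecognized columns are dropped from the output.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    mapping = map_headers(reader.fieldnames or [])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(mapping.values()), lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({term: row[src] for src, term in mapping.items()})
    return out.getvalue()


sample = "Species,Collection_Date,LAT,Long,notes\nAedes aegypti,2014-05-01,-3.1,-60.0,trap 4\n"
print(to_darwin_core(sample))
```

Dropping unrecognized columns keeps the sketch short; a tool aimed at publication would more likely report them so that a person can decide whether they map to another Darwin Core term.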

Resources and reference material

Background on Darwin Core and Darwin Core Archives

Examples of datasets manually harvested and published from open-data repositories

Global compendium of Aedes aegypti and Ae. albopictus occurrence

Kraemer MUG, Sinka ME, Duda KA, Mylne A, Shearer FM, Brady OJ, Messina JP, Barker CM, Moore CG, Carvalho RG, Coelho GE, Van Bortel W, Hendrickx G, Schaffner F, Wint GRW, Elyazar IRF, Teng H, Hay SI (2015) Data from: The global compendium of Aedes aegypti and Ae. albopictus occurrence. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.47v3c.2
Originally published in
Kraemer MUG, Sinka ME, Duda KA et al. (2015) The global distribution of the arbovirus vectors Aedes aegypti and Ae. albopictus. eLife 4:e08347 http://dx.doi.org/10.7554/eLife.08347
Kraemer MUG, Sinka ME, Duda KA et al. (2015) The global compendium of Aedes aegypti and Ae. albopictus occurrence. Scientific Data 2(7): 150035. http://dx.doi.org/10.1038/sdata.2015.35
On new GBIF.org: Global compendium of Aedes albopictus occurrence
https://demo.gbif.org/dataset/33614778-513a-4ec0-814d-125021cca5fe
On new GBIF.org: Global compendium of Aedes aegypti occurrence
https://demo.gbif.org/dataset/d4eb19bc-fdce-415f-9a61-49b036009840

LTER sampling-event dataset, Bird census at the beach of Doñana Natural Space

On DataONE: https://search.dataone.org/#view/knb-lter-europe-deims.13610.15384
On LTER-Europe: https://data.lter-europe.net/deims/dataset/2a0762f2-4630-11e3-aeb9-005056ab003f
On GBIF Spain IPT: http://www.gbif.es/ipt/resource?r=donana
On new GBIF.org: https://demo.gbif.org/dataset/9a57e938-3616-4f8c-985a-c9b66e7a1347

Open-data repositories and aggregators

The following list is not by any means exhaustive. We welcome suggestions on other relevant services to highlight for prospective Challenge entrants.

Extra credit

Keeping the 2016 Ebbe Nielsen Challenge in mind, GBIF is particularly interested in tools that address data biases and fill gaps by mobilizing occurrences from under-represented geographies, taxa, time periods, or thematic areas like vectors of human disease or alien and invasive species.

GBIF is also eager to see tools capable of converting open-access repository datasets into the quantitative 'sampling-event' format recently supported in the Darwin Core standard. Such datasets can capture richer information like species abundance, presence/absence, level of effort, and standard sampling methodologies and protocols.
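
To make the sampling-event idea concrete, here is a minimal sketch of splitting a flat survey table into an event table and a linked occurrence table. The output field names (eventID, eventDate, locationID, samplingProtocol, scientificName, individualCount, occurrenceStatus) are Darwin Core terms; the input column names (site, date, protocol, species, count), the way the event identifier is built, and the function name are assumptions made for illustration only.

```python
def split_sampling_events(rows):
    """Split flat survey rows into (events, occurrences) linked by eventID.

    Each input row is assumed to be a dict with the hypothetical keys
    'site', 'date', 'protocol', 'species' and 'count'.
    """
    events = {}
    occurrences = []
    for row in rows:
        # Assumption: one sampling event per site and date.
        event_id = f"{row['site']}:{row['date']}"
        events.setdefault(event_id, {
            "eventID": event_id,
            "eventDate": row["date"],
            "locationID": row["site"],
            "samplingProtocol": row["protocol"],
        })
        count = int(row["count"])
        occurrences.append({
            "eventID": event_id,
            "scientificName": row["species"],
            "individualCount": count,
            # A zero count in a structured survey is recorded as an absence.
            "occurrenceStatus": "present" if count > 0 else "absent",
        })
    return list(events.values()), occurrences


survey = [
    {"site": "beach-1", "date": "2012-03-04", "protocol": "transect count",
     "species": "Calidris alba", "count": "12"},
    {"site": "beach-1", "date": "2012-03-04", "protocol": "transect count",
     "species": "Larus fuscus", "count": "0"},
    {"site": "beach-1", "date": "2012-04-01", "protocol": "transect count",
     "species": "Calidris alba", "count": "7"},
]
events, occurrences = split_sampling_events(survey)
print(len(events), "events,", len(occurrences), "occurrences")
```

Keeping the zero-count row as an explicit absence is what distinguishes this format from a plain occurrence list: the absence is only meaningful because the event records where, when and how the sampling was done.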

Special thanks to the Swedish Research Council for its support of the 2017 Ebbe Nielsen Challenge.

Eligibility

Summarized from the Official Rules: the Challenge is open to individuals, teams of individuals, companies and their employees, and governmental agencies and their employees. The Challenge is NOT open to:

Members of the GBIF Secretariat
Individuals currently under an external contract issued by the GBIF Secretariat
Members of the GBIF Science Committee
Heads of Delegation to GBIF

Requirements

Submissions will consist of three main elements:

Entry details, including the names of all team members; identification of a lead team representative; and the objective of the Submission
Narrative description, which explains the approach taken in the Submission; identifies which open data repository (or repositories) the Submission uses and why; estimates the number of datasets that could be mobilized by applying the Submission to the repository; and discusses any additional data processing, quality assessment or quality control steps required prior to publishing the output through GBIF.org.
Example dataset, liberated from an open data repository and prepared for formatting and publishing as a Darwin Core Archive.

In addition, Submissions must:

Attribute and credit data originators, metadata authors, and others involved in the preparation of the original datasets
Produce outputs that are usable by and clearly documented for others, so that resulting datasets can be used without contacting the original authors

Prizes

€14,000 in prizes

First Prize

1 winner

Second Prize

1 winner

Runners-up

3 winners

How to enter

Register for the Challenge. Registrants must either create a ChallengePost account or log in with an existing ChallengePost account. There is no charge for creating a ChallengePost account, and doing so will ensure that you receive updates and can access the “Enter a Submission” page. Note that all team members must also create a ChallengePost account in order to be added to a Submission.
Familiarize yourself with the Darwin Core standard as well as methods for publishing data through the GBIF network or accessing data through GBIF.org, GBIF web services, or other tools like rgbif.
Explore relevant public open data repositories and aggregators and their APIs as well as open-source data access and publishing tools (R packages, etc.)
Produce an application, script or other tool for liberating species records from open data repositories for scientific discovery and reuse
Submit a narrative description detailing your Submission, along with any relevant technical requirements or implementation details. Consider including a video, slides and/or a sample output from the tool.
Complete and enter all of the required fields on the “Enter a Submission” page of the Challenge Website (each a “Submission”) by the end of the Challenge Submission Period, that is, by 23:00 CEST (UTC+2) on 5 Sept 2017.
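
As a possible starting point for the familiarization step above, the sketch below builds a request to the GBIF occurrence search web service and reads the total number of matching records. The endpoint and the scientificName and limit parameters belong to the public GBIF API; the function names are hypothetical, and the code is an illustration rather than a recommended client (rgbif and similar packages wrap the same service).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public GBIF occurrence search endpoint.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def occurrence_search_url(scientific_name, limit=5):
    """Build an occurrence search URL for a scientific name."""
    query = urlencode({"scientificName": scientific_name, "limit": limit})
    return GBIF_OCCURRENCE_SEARCH + "?" + query


def count_occurrences(scientific_name):
    """Return the total number of matching records (requires network access)."""
    # limit=0 asks for no records, only the paging information with the count.
    with urlopen(occurrence_search_url(scientific_name, limit=0)) as response:
        return json.load(response)["count"]


print(occurrence_search_url("Aedes aegypti", limit=1))
```

A check like count_occurrences can help an entrant judge whether records in a candidate repository dataset would fill a gap or largely duplicate what GBIF already holds.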

Judges

Roderic Page
Professor of Taxonomy, University of Glasgow | Chair, GBIF Science Committee

Alexandre Antonelli
Professor in Biodiversity & Systematics, University of Gothenburg

Brenda Daly
Information Systems Manager, SANBI: South African National Biodiversity Institute

Rob Guralnick
Associate Curator, University of Florida

Ana Cláudia Mendes Malhado
Co-coordinator LACOS21 | Lecturer, Federal University of Alagoas, Brazil

Anabela Plos
Node Manager, GBIF Argentina | Museo Argentino de Ciencias Naturales (CONICET) and Sistema Nacional de Datos Biológicos (SNDB-MinCyT)

Amy Zanne
Associate Professor of Biology, George Washington University

Judging Criteria

Universality and Scale
Is the Submission reusable and interoperable across different public data repositories and aggregator systems? Based on these choices, approximately how many datasets might the Submission be expected to liberate?
Innovation
How creative is the Submission?
Functionality
How well does the Submission work? Does it have a working prototype? Does it respect and maintain existing licences?

Questions? Email the hackathon manager

