Legume Occurrences Working Group

Assembling a global, expert-verified species occurrence dataset for family Leguminosae

Coordinators: Edeline Gagnon (University of Guelph, Canada), Joe Miller (Global Biodiversity Information Facility, Denmark) and Jens Ringelberg (University of Edinburgh, UK).


The Legume Occurrences Working Group has three main goals: 1) to promote communication between researchers interested in legume occurrence data; 2) to provide resources and advice about the assembly, georeferencing, and quality control of occurrence datasets; and 3) to present an up-to-date list of available quality-controlled legume occurrence datasets. To advance these goals, we have organised several online meetings for researchers interested in legume occurrence data, and we regularly update this page.

If you have any questions or comments, or would like to help run the working group, please get in touch with us!

Available occurrence datasets

We aim to keep this list up to date, but please let us know if any occurrence datasets are missing.


  • Edeline Gagnon’s GitHub page contains multiple R scripts to assemble and clean occurrence data from various sources. These scripts were used to clean the Solanaceae Source dataset for the genus Solanum (1,169 spp.), analysed in Gagnon et al. 2023.
  • The Legume Occurrences Working Group Zenodo page features a custom R script and protocol for downloading and cleaning occurrence data. This script has been used to assemble the occurrence datasets analysed in Gagnon et al. 2019, Ringelberg et al. 2020, and Ringelberg et al. 2023.
  • The GitHub page of Yagos Barros-Souza contains R scripts used to assemble and clean occurrence data from the Brazilian campos rupestres (see Barros-Souza & Borges 2022).
  • Georeferencing Best Practices by Arthur Chapman and John Wieczorek offers theoretical background and methods for georeferencing descriptive localities. The document updates the best practices, recommendations, and common terms and technologies developed and refined since the publication of the same authors’ 2006 Guide to Best Practices for Georeferencing.
  • The Georeferencing Quick Reference Guide by Paula Zermoglio, Arthur Chapman, John Wieczorek, Maria Celeste Luna and David Bloom provides a citable protocol in the form of a practical how-to guide with rules and procedures for determining the shapes of geospatial features and using their outcomes as the basis for georeferencing.
  • The Georeferencing Calculator Manual by David Bloom, John Wieczorek and Paula Zermoglio lays out instructions for the Georeferencing Calculator. This browser-based tool works both online and offline, helping users georeference descriptive localities using the point-radius method based on the theory given in Georeferencing Best Practices.
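
As a rough illustration of the point-radius method mentioned above, the Python sketch below represents a georeference as a coordinate pair plus an uncertainty radius in metres, and checks whether two georeferences of the same locality are compatible, meaning that their circles overlap. It is not taken from the Georeferencing Calculator or from any of the scripts listed here: the names PointRadius, is_plausible, haversine_m and circles_overlap are hypothetical, and the distance calculation assumes a spherical Earth, which is only an approximation.

```python
import math
from dataclasses import dataclass

# Mean Earth radius in metres (assumption: spherical Earth approximation).
EARTH_RADIUS_M = 6_371_000.0


@dataclass(frozen=True)
class PointRadius:
    """A georeference: a point in decimal degrees plus an uncertainty radius."""
    lat: float            # decimal latitude, -90 to 90
    lon: float            # decimal longitude, -180 to 180
    uncertainty_m: float  # radius of the uncertainty circle, in metres


def is_plausible(g: PointRadius) -> bool:
    """Basic sanity check: coordinates in range and a positive radius."""
    return (-90.0 <= g.lat <= 90.0
            and -180.0 <= g.lon <= 180.0
            and g.uncertainty_m > 0.0)


def haversine_m(a: PointRadius, b: PointRadius) -> float:
    """Great-circle distance in metres between the two centre points."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def circles_overlap(a: PointRadius, b: PointRadius) -> bool:
    """True if the two uncertainty circles touch or intersect."""
    return haversine_m(a, b) <= a.uncertainty_m + b.uncertainty_m


if __name__ == "__main__":
    # Two made-up georeferences, one degree of longitude apart on the equator.
    first = PointRadius(lat=0.0, lon=0.0, uncertainty_m=60_000.0)
    second = PointRadius(lat=0.0, lon=1.0, uncertainty_m=60_000.0)
    print(round(haversine_m(first, second)), circles_overlap(first, second))
```

Two georeferences whose circles do not overlap cannot both describe the same locality, so a check of this kind can flag records that deserve a second look during quality control.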