Interim Register of Marine and Nonmarine Genera (IRMNG)
Frequently Asked Questions
What is the purpose of IRMNG?
IRMNG, the Interim Register of Marine and Nonmarine Genera, exists to provide a machine- and human-queryable system that can answer some basic questions about organisms based on the genus component
(or, in around 50% of cases, the genus+species component) of their scientific name.
Such questions in the first instance comprise:
- What is the correct spelling (orthography) and authorship of the relevant genus name?
- Where does this taxon (i.e. species / genus) fit in a taxonomic hierarchy (i.e., attribution to family wherever possible, plus relevant higher taxa)?
- Is this a known extant or fossil taxon?
- Is this a known marine or nonmarine taxon?
- Is this genus name unique or non-unique (i.e., a homonym) either within a taxonomic group, or across multiple taxonomic groups or codes?
- What genus and/or species names are spelled similarly to my input (search) term (in the event that no names match the input term exactly)?
Additional questions that may also be answerable in a subset of cases:
- Nomenclatural and taxonomic information regarding this taxon, e.g.
  - Nomenclatural: is the genus name validly published or a nomen nudum, a replacement for a previously published name, a published misspelling, etc.
  - Taxonomic: is this genus currently considered a synonym of another (valid) genus name (and if so, which name)
- (Partial) list of species names for this genus – including nomenclatural and taxonomic information in some cases
- Citation of the place of publication of a genus name (e.g. as given in Nomenclator Zoologicus or elsewhere)
- Link to additional information on the same name held in other databases, for example in WoRMS (World Register of Marine Species) and BioNames at this time.
When IRMNG was initially conceived in 2006, this information was not available in a comprehensive, internally consistent form in any other system, although at some future point it may be (for example via the proposed Global Names Architecture or its components).
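The "spelled similarly" question above can be illustrated with a minimal sketch. It uses Python's standard `difflib` module as a stand-in for whatever matching method IRMNG itself applies, which this FAQ does not describe. The name list, the function `suggest_names`, and the cutoff value are all hypothetical and chosen only for illustration.

```python
from difflib import get_close_matches

# Hypothetical sample of genus names; a real lookup would query the database.
GENUS_NAMES = ["Wagneria", "Wagnerina", "Sardina", "Sardinella", "Sardinops"]


def suggest_names(term, names=GENUS_NAMES, limit=3, cutoff=0.8):
    """Return the exact match if present, otherwise similarly spelled names."""
    if term in names:
        return [term]
    # difflib ranks candidates by a similarity ratio between 0 and 1;
    # the cutoff (an assumed value) discards weak matches.
    return get_close_matches(term, names, n=limit, cutoff=cutoff)


print(suggest_names("Wagneria"))   # exact match
print(suggest_names("Wagnerai"))   # transposed letters, near matches only
```

An exact match short-circuits the search, so near matches are only offered when no name matches the input term exactly, as described in the list above.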
How does IRMNG differ from other, apparently comparable biodiversity databases?
IRMNG aspires to maximize coverage at the level of genus across all groups, i.e. animals, plants, algae, protists, fungi, prokaryotes, and viruses, both extant and fossil,
and also to include flags to indicate extant/fossil and basic habitat status as above.
Other comparable biodiversity databases are either limited to a single taxonomic group
(e.g. Index Nominum Genericorum for plants, Index Fungorum for fungi, Catalog of Fishes for fish, etc.), or to taxa of a particular type (e.g. Paleo Database for fossils,
World Register of Marine Species for marine species only), or to taxa from a particular geographic region (e.g. Fauna Europaea, Australian Plant Checklist, Species 2000 New Zealand).
The most comparable broad-spectrum initiative (though excluding fossils) is probably the Catalogue of Life; however, this omits much detail at genus level (such as genus authorities and genus level
synonyms) and, in addition, presently aims for completeness at species level, so proceeds much more slowly towards completion.
Another noteworthy compilation, that of Nomenclator Zoologicus, has excellent coverage of many zoological genus names from approximately 1758 to 2004, but omits family allocation, habitat flags, and consideration
of current taxonomic validity in all but a few cases.
The Thomson Reuters Index to Organism Names (ION) lists a large number of animal names (over 4 million) but includes many duplicates and name variants that are not flagged as such. The Global Names Index from the Global Names initiative includes even more names (18 million, from all groups), but again has significant duplication problems and lacks the extant/fossil and habitat flags that are a key feature of IRMNG.
What is the significance of "Interim" in the IRMNG context?
In the context of IRMNG, "Interim" indicates that this is largely a first-pass compilation of data from a wide range of sources, which may contain some internal inconsistencies and data errors
and has not yet received the degree of scrutiny and validation found in more authoritative, single-group sources. These aspects of IRMNG should improve over time; in the first instance, however,
it is deemed desirable to have a system with the range of IRMNG available for use in the interim rather than wait for all residual taxonomic or data issues to be resolved, or for the appearance of an equivalent compilation from other sources.
How has IRMNG been populated?
For obvious reasons, IRMNG draws heavily on pre-existing genus level compilations which, in a number of cases, have been generously made available to the project by their respective compilers.
In approximate order of incorporation, the major sources utilised to date have been as follows:
- Parker, S.P. (ed.), 1982. Synopsis and Classification of Living Organisms. McGraw-Hill, New York. [Print source] (Initial family and higher level classification – 6,800 family names)
- The Taxonomicon & Systema Naturae 2000 online compilation, 2006 version, courtesy Dr. S. Brands, Netherlands (112,000 genus names plus additional 2,300 family names) – current web address: Taxonomicon
- Catalogue of Life 2006 version, incorporating contributions from over 40 GSDs (Global species databases) plus ITIS, the Integrated Taxonomic Information System, courtesy Catalogue of Life partnership (36,000 additional genus names, 2,100 additional families, 1,282,000 species names) – current web address (latest version): Catalogue of Life
- Museum Victoria KEmu database (Oct 2006) (9,000 additional genus names, 900 additional families, 56,000 additional species names)
- Sepkoski, J.J., 2002. A compendium of fossil marine animal genera. Bulletins of American Paleontology, 364. Ithaca, NY (27,000 additional genus names, no families but sorted by order). Available online
- Benton, M. (ed.), 1993. The Fossil Record 2. Chapman & Hall, London. (2,900 additional fossil + extant families). Spreadsheet version available online
- Index Nominum Genericorum (2007 version) for plant genera, courtesy Dr. E. Farr (35,000 additional plant genus names, 400 additional families) – current web address: Index Nominum Genericorum
- Aphia databases maintained at VLIZ, Belgium (supporting European Register of Marine Species and 17 other region or taxon-specific databases), 2006 version, courtesy ERMS editors (3,300 additional genus names, 120 additional families, 45,000 additional species names)
- Australian Faunal Directory (October 2007 version) (9,800 additional genus names, 190 additional families, 55,000 additional species names) – current web address: Australian Faunal Directory
- Pre-publication (as at 2007) Species 2000 New Zealand compilation, courtesy Dr. D. Gordon (1,800 additional genus names, 54 additional families, 10,000 additional species names)
- List of Names with Standing in Prokaryotic Nomenclature (2008 version), courtesy Dr. J-P. Euzéby (all taxonomic allocations checked, plus 450 additional prokaryote genus names, 77 additional families) – current web address: List of Names with Standing in Prokaryotic Nomenclature
- Nomenclator Zoologicus (2006 electronic version) (205,000 additional genus names, 440 additional families) – current web address: Nomenclator Zoologicus
- Melville, R.V. & Smith, J.D.D., (eds). Official Lists and Indexes of Names and Works in Zoology. ICZN, London. (Approx. 50% of taxonomic status information on generic names from relevant ICZN Opinions uploaded to IRMNG, covering 1,800 genera)
- Index Fungorum, electronic database and nomenclator for fungi (2009 version) (all taxonomic allocations checked, plus 1,800 additional genus names, 150 additional family names) – current web address: Index Fungorum
- GBIF taxonomy, May 2010 (incorporating Catalogue of Life 2009, Paleobiology Database and numerous other sources not otherwise consulted): upgraded taxonomic placement for 46,000 genera not previously placed to family level – current web address: GBIF
- Hallan Biology Catalog (2012 version): additional 208,000 species names, 9,400 genus names and 700 families – current web address: Biology Catalog
- World Register of Marine Species (2013 version): additional 226,000 species names, 3,400 genus names and 1,000 families – current web address: World Register of Marine Species
- ION/BioNames data (2014 and 2015 versions): additional 15,300 animal genus names plus some higher taxa, most published 2005-2014 – current web addresses: Index to Organism Names and BioNames
In addition, a wide range of print sources, more recent updates (e.g. current fish taxonomy from FishBase, courtesy Dr. N. Bailly) and smaller electronic compilations, including CAAB (Codes for Australian Aquatic Biota) and others,
contribute the balance of current IRMNG holdings.
From the above list it is also clear that other major sources exist which could potentially be utilized in IRMNG, including IPNI, uBio,
the Paleobiology Database and more, but these have not yet been incorporated, mainly for reasons of time.
How complete is IRMNG at this time?
This is difficult to answer exactly, since no reliable estimates of total numbers of extant and fossil, valid names and synonyms exist at genus, family or species level;
however, drawing on a range of sources and rough approximations, the author estimates that there may be a total of 6.5–7m published species names to date, of which approximately 2.2m are valid
(the latter increasing at around 25,000 per year); 500,000 published genus names, of which perhaps 250,000 are valid (increasing at around 2,500 per year); and perhaps 30,000 published family names, of which maybe
17,000 are valid, for both extant and fossil taxa. On the basis of these approximations, IRMNG currently includes "most" valid published family names plus a small subset of synonyms, "most" published genus names,
both valid and non-valid (485,000 of approx. 500,000, i.e. around 97%), and a subset of species names only at this time (1.9m out of perhaps 6.5–7m, or a little over 28%, though the figure rises to around 50% if synonyms are excluded).
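The coverage percentages quoted above follow directly from the stated estimates. As a minimal sketch, the arithmetic can be reproduced as follows; the variable and function names are illustrative only, and the inputs are the approximate figures given in the text rather than exact counts.

```python
# Approximate figures quoted in the text above.
GENUS_HELD = 485_000
GENUS_PUBLISHED = 500_000
SPECIES_HELD = 1_900_000
SPECIES_PUBLISHED_RANGE = (6_500_000, 7_000_000)


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


genus_coverage = percent(GENUS_HELD, GENUS_PUBLISHED)
# Dividing by the larger estimate gives the lower bound, and vice versa.
species_coverage_low = percent(SPECIES_HELD, SPECIES_PUBLISHED_RANGE[1])
species_coverage_high = percent(SPECIES_HELD, SPECIES_PUBLISHED_RANGE[0])

print(f"Genus names held: {genus_coverage:.0f}%")
print(f"Species names held: {species_coverage_low:.0f}-{species_coverage_high:.0f}%")
```

The species figure is a range because the denominator is itself an estimate (6.5–7m); the "a little over 28%" in the text sits near the middle of that range.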
How many homonyms / non-unique genus names are in IRMNG?
One important function of IRMNG is to indicate, at least as far as the data already held allow, whether a particular genus name is unique or whether it occurs in multiple instances,
either between or within the same taxonomic groups. Currently there are almost 69,000 genus-level homonyms (around 29,000 separate names) included in IRMNG, representing around 15% of all names or approx.
one name in every 7 (this figure also includes nomina nuda plus a small number of misspellings that accidentally coincide with a different, correctly spelled name).
The name with the largest number of instances is probably Wagneria, with 12 listed instances in zoology and 2 in botany. At most one instance can be valid in either
domain; the remainder are invalid, and a subset of these may be either synonyms (for example replaced by subsequently published new names) or orthographic variants/misspellings of otherwise valid names.
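The distinction between homonym instances and separate homonymous names can be made concrete with a small sketch. The records below are invented placeholders, not IRMNG data, and the tuple layout and the function name `homonym_summary` are assumptions made for illustration; the sketch simply groups records by genus name and counts names that occur more than once.

```python
from collections import Counter

# Hypothetical records: (genus name, authority, group). Not real IRMNG data.
RECORDS = [
    ("Aus", "Smith, 1900", "Animalia"),
    ("Aus", "Jones, 1950", "Plantae"),
    ("Bus", "Lee, 1920", "Animalia"),
    ("Cus", "Kim, 1930", "Animalia"),
    ("Cus", "Roe, 1960", "Animalia"),
    ("Cus", "Poe, 1980", "Fungi"),
]


def homonym_summary(records):
    """Count homonymous names and the record instances that share them."""
    counts = Counter(name for name, _authority, _group in records)
    homonyms = {name: n for name, n in counts.items() if n > 1}
    instances = sum(homonyms.values())
    return {
        "separate_names": len(homonyms),       # distinct names used more than once
        "instances": instances,                # all records bearing such a name
        "share_of_records": instances / len(records),
    }


print(homonym_summary(RECORDS))
```

In this toy data, "Aus" and "Cus" are the two separate homonymous names and together account for five instances, mirroring the way the text reports almost 69,000 instances across around 29,000 separate names.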
What editing is required for IRMNG compilation?
The majority of name data (taxon names and authorities) are imported into IRMNG from the relevant data sources without modification, except in the case of database errors apparent from cross-comparison
with external sources and a limited amount of authority normalization to produce a consistent "house style" (including expansion of botanical authors for genera when supplied in abbreviated form, to match
the format used in Index Nominum Genericorum).
Family attribution may be adjusted from that given with incoming data where a more recent, authoritative source is available, and editorial input may be required to decide which source to follow
in instances where opinion is divided. Missing data (such as authorities, nomenclatural/taxonomic comments, and habitat and extant/fossil flags) are frequently added from a variety of supplementary sources
and in these cases, editorial decisions are sometimes required as to which instance of a genus name is involved (often self-evident, but sometimes not so). Editorial decisions are also required to
decide whether two highly similar names and cited authorities in different base datasets represent the same or different genus publication instances. For example, some animal names may also be represented as plants,
plants or protists as fungi, corals as sponges, etc., particularly in early literature; where such cases are detected, a decision is then made either to retain both records as separate instances or to combine
them into a single record for IRMNG.
Editorial input is also involved in determining the status of names from some of the less authoritative sources as either genuine new instances, or as misspellings of names already on the list,
in which case a note is added together with a pointer to the name variant deemed to be the correct spelling.
What gaps remain to be filled in IRMNG?
IRMNG can be deemed complete (at genus level) when:
- all published genus names to date are included (a moving target of course);
- all name variants not yet "verified" from appropriate trusted sources (i.e., Nomenclator-grade compilations) are either verified from other sources e.g. primary literature, or assessed to be misspellings of "verified" names already on the list;
- all genus names are assigned to actual families rather than "placeholders" such as "Mollusca (unallocated)";
- the higher taxonomic categories are all filled (e.g., no gap between family and class, or between order and phylum);
- all genera have an assigned (and perhaps, separately verified) status flag for extant/fossil and marine/nonmarine status (or both as applicable); and
- the taxonomic status of all generic names is known (i.e. valid or non-valid; if a synonym, what is the current valid name).
Progress according to these various metrics is shown below, as at March 2016.
- Genus names held
As detailed above, currently IRMNG holds some 485,000 of an estimated 500,000 published genus names to date (the latter increasing at perhaps 2,500 per year),
indicating that at present some 15,000 genus names are possibly missing (although this figure could vary by perhaps +/- 10,000 according to the basis of the estimates used).
- Verified versus Unverified Names
Approximately 19,000 of the 485,000 genus names in IRMNG are "unverified" from appropriate authoritative sources at this time.
Experience suggests that perhaps 50% of these will turn out to be database errors in sources used to construct IRMNG, and the remaining 50% "good" new names verifiable from additional sources.
- Genera assigned to "placeholder" rather than actual families
At present, approximately 112,000 of the 485,000 genus names in IRMNG are allocated to "placeholders" at family level (example: "Mollusca (awaiting allocation)") rather than true families.
Mechanisms to address this deficiency are currently being investigated.
- Families assigned to "placeholder" rather than actual orders
At present, approximately 970 of the 23,100 family names in IRMNG are allocated to "placeholders" at order level (example: "Mollusca (awaiting allocation)") rather than true orders. This deficiency is being corrected over time.
- Completeness of flagging (extant/fossil/both, marine/nonmarine/both) at genus level
Approx. 397,000 of the 485,000 genus names in IRMNG (82%) are currently allocated an extant/fossil status flag and 88,000 are not, while
423,000 (87%) are allocated a marine/nonmarine status flag and 62,000 are not.
- Assessment of current taxonomic status of generic names
This is a low priority for IRMNG at this time; however, at present some 221,000 of the 485,000 genus names (46%) are flagged as currently valid, and 103,000 (22%) are flagged as non-valid
synonyms and are pointed to the relevant valid name instance, leaving 33% without valid/non-valid flags at this time.
How is IRMNG currently maintained, and what are its long term development plans?
For the initial 10 years of its existence (2006 to early 2016), IRMNG was developed and maintained by Tony Rees with the support of CSIRO Marine and Atmospheric Research in Australia,
assisted by financial contributions from OBIS Australia, GBIF and the Atlas of Living Australia, and in-kind contributions (in the form of taxonomic data) from multiple sister projects.
In 2016, IRMNG hosting was taken over by VLIZ, the Flanders Marine Institute in Belgium, which has agreed to continue the ongoing development of IRMNG in conjunction with the WoRMS and OBIS data systems
which are also hosted there. Co-hosting of IRMNG with other related data systems at VLIZ is anticipated to result in benefits to both IRMNG content and that of other VLIZ systems, as well as utilizing
a standardized IT infrastructure and set of web interfaces across multiple data systems which can continue to be developed further according to future user and editor needs.