Annals of Human Genetics

This is a canonical data source description.
The current dataSourceVersion described by this documentation is 1. The dataSource name for this data is Annals of Human Genetics.

Coverage: Annals of Eugenics and Annals of Human Genetics, from first publication (October 1925) to August 1, 2022
Size: 4,009 articles
Copyright: Copyright Wiley Journals
License: Wiley Journals TDM License
Credits: C.H. Pence and Nicola Bertoldi

How we got it #

PDFs for these articles were downloaded directly from Wiley by querying its text and data mining (TDM) API.
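
As an illustration of what such a download might look like, here is a minimal Python sketch. It is not the script used to build this data source. The endpoint URL and the `Wiley-TDM-Client-Token` header reflect our understanding of Wiley's public TDM documentation and should be checked against it; the function names `tdm_url`, `build_request`, and `download_pdf` are hypothetical, as is any DOI you pass in.

```python
import urllib.parse
import urllib.request

# Assumed base URL of the Wiley TDM article endpoint (verify against Wiley's docs).
TDM_BASE = "https://api.wiley.com/onlinelibrary/tdm/v1/articles/"


def tdm_url(doi: str) -> str:
    """Build the request URL for one article, percent-encoding the DOI."""
    return TDM_BASE + urllib.parse.quote(doi, safe="")


def build_request(doi: str, token: str) -> urllib.request.Request:
    """Attach the TDM client token as a request header (assumed header name)."""
    return urllib.request.Request(
        tdm_url(doi),
        headers={"Wiley-TDM-Client-Token": token},
    )


def download_pdf(doi: str, token: str, dest: str) -> None:
    """Fetch the PDF for `doi` and write the raw bytes to `dest`."""
    with urllib.request.urlopen(build_request(doi, token)) as resp:
        with open(dest, "wb") as out:
            out.write(resp.read())
```

Calling `download_pdf(doi, token, "article.pdf")` with a valid token would save a single article; a bulk download would loop over a list of DOIs and should respect whatever rate limits the license imposes.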

Processing #

  • Bibliographic Information: Crossref
  • PMIDs, PMCIDs, and PubMed Manuscript IDs: PubMed scraping
  • Full text:
    • For Annals of Eugenics: the disclaimer placed at the front of every article was removed, and OCR was re-run
    • For Annals of Human Genetics: text was extracted directly from PDFs
  • Keywords and Tags: “Subject” tags saved in Crossref have been preserved as tags.
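
The Crossref steps above can be sketched roughly as follows. This is an illustrative example only, not the import code for this data source: it assumes a work record shaped like the `message` object returned by the Crossref REST API (fields such as `title`, `container-title`, `author`, `issued`, and `subject`), and the function name `summarize_crossref_work`, the output field names, and the sample record are all invented for the example.

```python
def summarize_crossref_work(message: dict) -> dict:
    """Pick bibliographic fields and subject tags out of a Crossref work record."""
    # Author names arrive as separate "given" and "family" parts.
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in message.get("author", [])
    ]
    # "issued" holds nested date parts: [[year, month, day]].
    date_parts = message.get("issued", {}).get("date-parts") or [[None]]
    year = date_parts[0][0] if date_parts[0] else None
    return {
        "doi": message.get("DOI"),
        "title": (message.get("title") or [None])[0],
        "journal": (message.get("container-title") or [None])[0],
        "authors": authors,
        "year": year,
        # "subject" entries are carried over unchanged as tags.
        "tags": list(message.get("subject", [])),
    }


# Invented sample record, for illustration only.
SAMPLE = {
    "DOI": "10.0000/example.0001",
    "title": ["An example article"],
    "container-title": ["Annals of Human Genetics"],
    "author": [{"given": "Ada", "family": "Example"}],
    "issued": {"date-parts": [[1960, 5]]},
    "subject": ["Genetics (clinical)", "Genetics"],
}

record = summarize_crossref_work(SAMPLE)
print(record["title"], record["year"], record["tags"])
```

Records with no `subject` field simply end up with an empty tag list in this sketch.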

Changelog #

  • Data Source Version 1 (2021-08-29): initial content import.