Mining scientific articles is hard when many of them are inaccessible behind paywalls. The Public Library of Science (PLOS) is a non-profit Open Access science publisher whose journals include the single largest journal, PLOS ONE, and all PLOS articles are freely available to read and re-use. allofplos is a Python package for maintaining a constantly growing collection of PLOS's 230,000+ articles. It also efficiently parses these article files into Python data structures. This article covers how allofplos keeps your article collection up to date and how to use it to access common article metadata and fuel your meta-research, with actual use cases from inside PLOS.

Keywords: text and data mining, metascience, open access, science publishing, scientific articles, XML