Title: Accessing United States Bulk Patent Data with patentpy and patentr

Abstract: The United States Patent and Trademark Office (USPTO) provides publicly accessible bulk data files containing information for all patents from 1976 onward. However, the format of these files changes over time and is memory-inefficient, which can pose issues for individual researchers. Here, we introduce the patentpy and patentr packages for the Python and R programming languages. They allow users to programmatically fetch bulk data from the USPTO website and access it locally in a cleaned, rectangular format. Research depending on United States patent data would benefit from the use of patentpy and patentr. We describe package implementation and quality control mechanisms, and present use cases highlighting simple yet effective applications of this software.
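To make "cleaned, rectangular format" concrete, the following is a minimal sketch of flattening nested patent-like records into one row per patent. It does not use patentpy or patentr and does not reproduce their API. The XML layout, the element names, the `to_rows` and `to_csv` helpers, and the column names are all assumptions invented for illustration, and the real USPTO bulk files are structured differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified record layout for illustration only.
# This is NOT the real USPTO bulk-file schema.
SAMPLE_XML = """
<patents>
  <patent>
    <number>0000001</number>
    <title>Example widget</title>
    <issue-date>2020-01-07</issue-date>
    <inventor>A. Person</inventor>
    <inventor>B. Person</inventor>
  </patent>
  <patent>
    <number>0000002</number>
    <title>Example process</title>
    <issue-date>2020-01-07</issue-date>
    <inventor>C. Person</inventor>
  </patent>
</patents>
"""

# Assumed column names for the rectangular output.
COLUMNS = ["number", "title", "issue_date", "inventors"]


def to_rows(xml_text):
    """Flatten each <patent> element into one dict (one row per patent)."""
    root = ET.fromstring(xml_text)
    rows = []
    for patent in root.findall("patent"):
        rows.append({
            "number": patent.findtext("number", default=""),
            "title": patent.findtext("title", default=""),
            "issue_date": patent.findtext("issue-date", default=""),
            # Repeated child elements are joined into a single cell.
            "inventors": ";".join(
                (inv.text or "") for inv in patent.findall("inventor")
            ),
        })
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = to_rows(SAMPLE_XML)
csv_text = to_csv(rows)
print(csv_text)
```

Joining repeated fields such as inventors into one delimited cell keeps the table at one row per patent; a real pipeline might instead split them into a separate table.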
Subjects: Digital Libraries (cs.DL)
Cite as: arXiv:2107.08481 [cs.DL]
  (or arXiv:2107.08481v1 [cs.DL] for this version)

Submission history

From: Raoul Wadhwa
[v1] Sun, 18 Jul 2021 16:10:41 GMT (150kb,D)
