Practitioners, critics, and popularizers of new methods of data-driven research treat the concept of “data cleaning” as integral to such work without remarking on the oddly domestic image it makes—as though a corn straw broom were to be incorporated, Rube-Goldberg-like, into the design of the Large Hadron Collider. In reality, “data cleaning” is often the most opaque part of data-intensive research.
- Curating Menus in Laying the Foundation 16 March 2016
- Understanding How menus.nypl.org Handles Data Updates 11 October 2014
- When a Woman Collects Menus 16 April 2014
- Borrow a Cup of Sugar? Or Your Data Analysis Tools? — More work with NYPL's open data, Part Three 10 January 2014
- Refining the Problem — More work with NYPL's open data, Part Two 19 August 2013
- What IS on the menu? — More work with NYPL's open data, Part One 08 August 2013
- Organizing historical menus: a data curation experiment 21 June 2013
A reference resource for those interested in using the open data from the New York Public Library's menu transcription project.
The New York Public Library Rare Book Division holds over 45,000 historical menus. About half of these were collected and curated by Frank E. Buttolph between 1900 and 1921. The menus date from the 1850s to the present and come from restaurants, railroad and steamship companies, and a range of other organizations.
Beginning in 2011, menus from the NYPL's collection were digitized and transcribed with the help of thousands of volunteers. Through the NYPL’s What’s on the Menu? project, volunteers looked at digitized copies of the menus and typed in the many pieces of information included on each one, such as restaurant names, locations, dishes, prices, and dates.
The What's on the Menu? project makes all the data from its crowdsourced transcriptions available via bulk downloads and via an application programming interface (API). The current data set includes around 400,000 data points from the transcription project, along with the library's metadata on the more than 17,000 menus digitized so far.
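As a rough illustration of working with the bulk download, the sketch below counts how often each transcribed dish name appears after light normalization. It assumes the download includes a CSV file of dishes with a `name` column; that column name, the helper functions, and the sample rows are illustrative assumptions, not a description of the project's actual schema.

```python
import csv
import io
from collections import Counter


def normalize(name):
    """Collapse runs of whitespace and lowercase the text so that
    trivially different transcriptions count as the same dish."""
    return " ".join(name.split()).lower()


def count_dish_names(csv_file, column="name"):
    """Count normalized values of `column` in an open CSV file.

    `column` is an assumed header name; adjust it to match the
    headers in the file you actually downloaded.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if value:  # skip rows with an empty or missing name
            counts[normalize(value)] += 1
    return counts


# Made-up sample rows standing in for a downloaded file.
sample = io.StringIO(
    "id,name\n"
    "1,Coffee\n"
    "2,coffee \n"
    "3,Consomme  printaniere\n"
)
counts = count_dish_names(sample)
for dish, n in counts.most_common():
    print(dish, n)
```

To run this against a real file, replace `sample` with something like `open("Dish.csv", newline="", encoding="utf-8")`, where the file name is again an assumption. Lowercasing and whitespace folding are only a first pass; they will not reconcile spelling variants or abbreviations.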
Katie is Humanities Librarian for English at Emory University's Robert W. Woodruff Library. She has published on food in Faulkner, labor at Waffle House, collaboration in the academy, and data curation in the humanities. She has a PhD from the Graduate Institute for the Liberal Arts at Emory University and was previously the managing editor of Southern Spaces and the Coordinator for Digital Research at the University of Pennsylvania.
is Assistant Dean for Digital Humanities Research at the University of Maryland Libraries and an Associate Director of the Maryland Institute for Technology in the Humanities (MITH). He works to foster digital projects that involve close collaboration between librarians, archivists, and other digital humanities researchers. As part of this work, he has written, spoken, and consulted about the strategic opportunities and challenges of doing digital humanities work within the institutional and cultural structures of academic research libraries.