September 23, 2014

Update Wed Sept 24

The author of the NYTimes interactive graphic of restaurant ratings pointed me to his data, which I have also uploaded to CartoDB as an open data set and visualization. He collected the data by scraping the city’s servers (Department of Health and Mental Hygiene) over several days. His data set has 12,290 restaurants, whereas the data set I geocoded from NYC’s Open Data Portal has ~24,957.

I have struggled to find a public database of all NYC restaurants with their geocoded addresses. There is a Quora question on the topic, but no great answer. Dan Kozikowski of FirstMark Capital has a great post from 2 years ago with a heatmap of restaurant density, but sadly he didn’t have the raw data anymore. The NYTimes has an interactive graphic of geocoded NYC Health Department restaurant ratings, but I couldn’t get the data from the author or from the graphic (retrieved).

So, I am open sourcing a geocoded version of NYC’s Health Department Restaurant Ratings database. To get this data set I:

  • Downloaded all the health ratings from NYC’s Open Data Portal
  • Subsetted to just the business name, building, street, and zipcode
  • Uploaded to a Google Sheet
  • Created and ran an Apps Script to geocode each address and save the lat / lon
  • It isn’t perfect: a few restaurants were geocoded outside of NYC and a few couldn’t be geocoded
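
For anyone who would rather do the subsetting and sanity-checking steps outside of a spreadsheet, here is a rough Python sketch. It is not the Apps Script used for this data set, and it does not call any geocoding service. The column names (DBA, BUILDING, STREET, ZIPCODE), the de-duplication step, and the bounding box numbers are all assumptions for illustration, so adjust them to match the file you actually download.

```python
import csv
import io

# Approximate bounding box around New York City.
# These numbers are an assumption for illustration, not an official boundary.
NYC_BOUNDS = {
    "min_lat": 40.47, "max_lat": 40.93,
    "min_lon": -74.27, "max_lon": -73.68,
}


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_address(row):
    """Join building, street, and zipcode into one string for a geocoder.

    The column names are assumed; rename them to match your export.
    """
    street = " ".join(
        part for part in (row["BUILDING"].strip(), row["STREET"].strip()) if part
    )
    return f"{street}, New York, NY {row['ZIPCODE'].strip()}"


def unique_restaurants(rows):
    """Keep one entry per (name, address) pair.

    Assumes the export can repeat a restaurant across several rows.
    """
    seen = set()
    result = []
    for row in rows:
        key = (row["DBA"].strip(), build_address(row))
        if key not in seen:
            seen.add(key)
            result.append({"name": key[0], "address": key[1]})
    return result


def in_nyc(lat, lon):
    """Return True if a geocoded point falls inside the rough NYC box."""
    return (
        NYC_BOUNDS["min_lat"] <= lat <= NYC_BOUNDS["max_lat"]
        and NYC_BOUNDS["min_lon"] <= lon <= NYC_BOUNDS["max_lon"]
    )
```

After geocoding, running each lat / lon pair through `in_nyc` is a cheap way to flag the restaurants that landed outside the city so they can be reviewed by hand.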

Here is the Google Sheet

Here is the CartoDB visualization

