Thanks in large part to the NSA, we are increasingly aware of the extent to which our digital lives are being tracked, recorded and analyzed. But it is easy to forget that our seemingly analogue activities can leave just as significant a digital footprint as our use of digital services.
Citibike is a fine example. Over the past 10 months I made 268 trips on Citibike, and like all Citibike trips, each of these was meticulously recorded and stored away online by the Citi service. Unfortunately Citibike does not currently have an API, nor any export functionality, making it difficult for anyone (NSA or otherwise) to explore their data.
Enter Kimono Labs, to the rescue!
Kimono recently released Authenticated API creation. This new tool makes it incredibly simple to scrape data from behind log-in portals. In less than 5 minutes I set one up to extract my Citibike trips.
With the raw data in hand, I turned to CartoDB to visualize it. CartoDB is the best product I have found for visualizing geo-temporal data. I teamed up with Andrew Hill, developer evangelist at CartoDB, to make a beautiful moving map of my life zipping around NYC.
Looking at the trips and parsing the data, it becomes clear that de Blasio could easily figure out things like where I live and work, whether I lost my job, and whether I was dating someone (or having an affair, if I was listed as married in the census).
Animation of My Roommate Bay Gross’ Trips
Static Maps of Our Trips
Some Fun Facts
- 324 miles traveled (that’s the equivalent of biking from New York to Boston and halfway back again)
- 1,963 minutes spent biking (that’s over 31 hours on my tush!)
- Average distance traveled per trip: 1.26 miles (max: 3.68 min: 0.27)
- Average trip time: 7mins and 49secs
- Average trip speed: 10.57 mph
- Weekday morning commute average speed: 12.10mph (oh shit, I’m late!)
- Weekday evening commute average speed: 8.97mph
- Weekday average speed: 10.87mph
- Weekend average speed: 9.68mph
- Average speed vs. Google estimate: 1.81% faster
- Average arrival time to work: 8:42 (28.8% of the time I arrive after 9am and 10.58% of the time I arrive before 8am)
- Evening weekday trips that begin outside our office start on average at 7:36pm (31.57% start after 8pm)
- Weekday trips that finish at the station closest to my apartment end on average at 8:50pm (31.03% end after 10pm)
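Numbers like the speed averages above can be derived from nothing more than a start time, an end time and a distance per trip. Here is a minimal sketch of that arithmetic, not the script I actually ran: the `Trip` struct, the three sample trips and the helper names are all made up for illustration, and it assumes the average is the mean of per-trip speeds (dividing total distance by total time would give a different number).

```ruby
require "time"

# Hypothetical trip record: start/end timestamps plus a distance in miles.
Trip = Struct.new(:start_time, :end_time, :miles)

# Made-up sample data, NOT real trips. Real values would come from the
# scraped trip history plus a distance lookup.
trips = [
  Trip.new(Time.parse("2014-03-03 08:30:00"), Time.parse("2014-03-03 08:36:00"), 1.2),
  Trip.new(Time.parse("2014-03-03 19:40:00"), Time.parse("2014-03-03 19:50:00"), 1.5),
  Trip.new(Time.parse("2014-03-08 14:00:00"), Time.parse("2014-03-08 14:12:00"), 1.8)
]

# Trip duration in hours (subtracting two Time objects yields seconds).
def hours(trip)
  (trip.end_time - trip.start_time) / 3600.0
end

# Speed of a single trip in miles per hour.
def mph(trip)
  trip.miles / hours(trip)
end

# Mean of the per-trip speeds (assumption: this is how "average speed" is defined).
def average_speed(trips)
  return 0.0 if trips.empty?
  trips.sum { |t| mph(t) } / trips.size
end

# Time#wday is 0 for Sunday and 6 for Saturday, so 1..5 covers weekdays.
weekday, weekend = trips.partition { |t| (1..5).cover?(t.start_time.wday) }
# Crude stand-in for "morning commute": any weekday trip starting before noon.
morning = weekday.select { |t| t.start_time.hour < 12 }

puts format("Weekday average speed: %.2f mph", average_speed(weekday))
puts format("Weekday morning average speed: %.2f mph", average_speed(morning))
puts format("Weekend average speed: %.2f mph", average_speed(weekend))
```

The same `partition` and `select` pattern extends to the other buckets in the list, such as evening commutes or trips ending at a particular station.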
In the graph below you can easily see how much faster I bike during morning commute relative to other trips, how my weekend rides tend to start later in the day, and also the handful of late night bikes home.
- Set up an API with Kimono
- Name each of the four fields: Start_Station, Start_Time, End_Station, End_Time
- Get the API credentials and add to the Ruby script
- Sign up for a Google API account, turn on the Google Distance Matrix API, and add the API key to the Ruby script
- Download a copy of the station JSON feed to geocode the station names
- Run this ruby script as
- Upload the CSV to CartoDB
- Create linestrings: UPDATE table_name SET the_geom = ST_MakeLine(cdb_latlng(start_station_lat::numeric, start_station_lon::numeric), cdb_latlng(end_station_lat::numeric, end_station_lon::numeric))
- Download and upload the OSM street data
- Load Andrew’s SQL functions into the SQL Editor
- Snap the linestrings to roads: UPDATE table_name SET the_geom = axh_blend_lines(the_geom)
- Generate points for a Torque map: WITH a AS (SELECT (axh_linetime_to_points(the_geom, start_time, end_time, 20)).* FROM table_name) SELECT geom as the_geom, when_at FROM a
- Run ‘Table from Query’ in the interface to create a table for the Torque map
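The geocoding and CSV steps above can be sketched roughly as follows. This is an illustration rather than the actual script: the JSON key names (`stationBeanList`, `stationName`, `latitude`, `longitude`) are my assumption about the station feed's layout, the stations and the trip are invented, and the `start_station` and `end_station` column names are hypothetical. The lat/lon and time column names match the ones used in the SQL steps.

```ruby
require "json"
require "csv"

# Tiny stand-in for the downloaded station feed. The key names are an
# assumption about the feed's layout; adjust them to the real file.
stations_json = <<~JSON
  {"stationBeanList": [
    {"stationName": "Station A", "latitude": 40.72, "longitude": -74.0},
    {"stationName": "Station B", "latitude": 40.73, "longitude": -73.99}
  ]}
JSON

# One invented trip using the four field names from the Kimono step.
trips = [
  { "Start_Station" => "Station A", "Start_Time" => "2014-03-03 08:30:00",
    "End_Station" => "Station B", "End_Time" => "2014-03-03 08:36:00" }
]

# Build a lookup table: station name => [latitude, longitude].
def station_lookup(json_text)
  stations = JSON.parse(json_text).fetch("stationBeanList")
  stations.each_with_object({}) do |s, lookup|
    lookup[s.fetch("stationName")] = [s.fetch("latitude"), s.fetch("longitude")]
  end
end

# Attach coordinates to each trip and return the result as CSV text.
def trips_to_csv(trips, lookup)
  CSV.generate do |csv|
    csv << %w[start_station start_time start_station_lat start_station_lon
              end_station end_time end_station_lat end_station_lon]
    trips.each do |t|
      start_ll = lookup[t["Start_Station"]]
      end_ll   = lookup[t["End_Station"]]
      # Skip trips whose stations are missing from the feed (e.g. renamed stations).
      next if start_ll.nil? || end_ll.nil?
      csv << [t["Start_Station"], t["Start_Time"], *start_ll,
              t["End_Station"], t["End_Time"], *end_ll]
    end
  end
end

csv_text = trips_to_csv(trips, station_lookup(stations_json))
puts csv_text
```

Matching on the exact station name is the simplest possible join. If the names in the trip history and the feed differ in spelling or punctuation, some normalization would be needed before the lookup.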
Thanks: A big thanks to Andrew Hill for doing things in PostGIS I have no idea about, and to Bay Gross for his edits.