A log of lessons learned and complications that are still being resolved:

FourSquare

  1. Frankenstorm check-ins. Many users checked in to the Frankenstorm Apocalypse event on FourSquare:

     "text" : "I'm at Frankenstorm Apocalypse- Hurricane Sandy (New York, NY) w/ 287 others http://t.co/hlxfg1R6",
     "source" : "<a href=\"http://foursquare.com\" rel=\"nofollow\">foursquare</a>",
     "id_str" : "262592956303814657",
     "place" : {
         "country_code" : "US",
         "place_type" : "city",
         "full_name" : "Queens, NY",
         "name" : "Queens",
         "country" : "United States",
         "id" : "b6ea2e341ba4356f",
     "coordinates" : {
         "type" : "Point",
         "coordinates" : [
             -73.79179716,
             40.78953415
         ]
     },
     "created_at" : ISODate("2012-10-28T16:33:34Z"),
     "user" : {
         "screen_name" : "BrittlynGleeson",
         "id_str" : "33439309",
    

In the (unique) keyword collection, 1,421 users checked in to this event. This should be all of them, because these tweets would show up in the keyword collection.
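
A rough sketch of how a count like this could be reproduced, assuming the tweets have been loaded as Python dicts shaped like the sample document above. The helper names and the substring match on the event name are assumptions for illustration, not the query that was actually run:

```python
def is_foursquare(tweet):
    """True if the tweet's source field points at foursquare.com."""
    return "foursquare.com" in tweet.get("source", "")


def users_checked_in(tweets, event_name):
    """Return the set of user id_str values with a FourSquare check-in
    whose text mentions event_name (case-insensitive substring match)."""
    needle = event_name.lower()
    return {
        t["user"]["id_str"]
        for t in tweets
        if is_foursquare(t) and needle in t.get("text", "").lower()
    }
```

Calling `len(users_checked_in(tweets, "Frankenstorm Apocalypse"))` counts each user once, even if they checked in several times.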

  2. In the keyword collection alone, there are 13,027 geo-tagged check-ins from 8,575 users. These are very valuable data points, but they cannot be fully trusted when they represent very popular events such as the Frankenstorm Apocalypse event above.
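
One possible way to separate out the less trustworthy check-ins is to read the "w/ N others" count from the tweet text and set aside geo-tagged check-ins above some crowd size. This is only a sketch: the function names, the regular expression, and the `max_others=50` cutoff are assumptions, and the text pattern is inferred from the single sample above.

```python
import re

# Matches e.g. "w/ 287 others" or "w/ 1,204 others" (pattern assumed from the sample).
OTHERS_RE = re.compile(r"w/ (\d[\d,]*) others?\b")


def crowd_size(text):
    """Number of 'others' reported in a check-in text, or 0 if absent."""
    match = OTHERS_RE.search(text)
    return int(match.group(1).replace(",", "")) if match else 0


def split_by_trust(tweets, max_others=50):
    """Split geo-tagged tweets into (trusted, suspect) lists.

    A tweet is 'suspect' when its text reports more than max_others
    people at the same venue; max_others is a hypothetical threshold.
    Tweets without coordinates are skipped.
    """
    trusted, suspect = [], []
    for tweet in tweets:
        if not tweet.get("coordinates"):
            continue
        if crowd_size(tweet.get("text", "")) > max_others:
            suspect.append(tweet)
        else:
            trusted.append(tweet)
    return trusted, suspect
```

The suspect list is kept rather than discarded, so that the popular-event check-ins can still be inspected separately.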