Well, this is interesting. We’ve got a new graph on oceansofnyc.com/stats/ that shows the “first-sighting rate” for each day, compared to what we’d expect given the TLC data.

The first-sighting rate is the fraction of recent sightings that turned out to be brand-new Oceans, i.e., not already in the catalog. Early on, it was near 100% — almost every sighting added a new vehicle. As the catalog has filled in, the rate has dropped, because more and more sightings are duplicates of Oceans we’ve already logged.
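To make the definition concrete, here is a minimal sketch of the calculation. It is an illustration, not the site's actual code: the function name, the use of string vehicle IDs, and the choice to count a repeat within the same batch as a duplicate are all assumptions.

```python
from typing import Iterable, Set


def first_sighting_rate(sightings: Iterable[str], catalog: Set[str]) -> float:
    """Fraction of sightings whose vehicle was not already in the catalog.

    `sightings` is a sequence of hypothetical vehicle IDs in the order they
    were reported; `catalog` holds the IDs logged before this batch.
    """
    seen = set(catalog)  # copy so the caller's catalog is not modified
    total = 0
    new = 0
    for vehicle_id in sightings:
        total += 1
        if vehicle_id not in seen:
            new += 1
            seen.add(vehicle_id)  # a later repeat in this batch is a duplicate
    return new / total if total else 0.0


# Example with made-up IDs: "A" is already cataloged, "B" and "C" are new.
rate = first_sighting_rate(["A", "B", "A", "C"], {"A"})
```

In this made-up example, two of the four sightings are first-time finds, so the rate is 0.5.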

The interesting move is comparing the actual first-sighting rate to a theoretical rate computed from the NYC TLC dataset. If we had perfect, uniform random sampling of every Ocean on the road, what fraction of new sightings would we expect to be first-time finds? That’s the baseline.
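One simple way to think about that baseline: if every Ocean on the road is equally likely to be the next one spotted, the chance that the next sighting is a first-time find is just the share of the fleet not yet in the catalog. The sketch below assumes exactly that; the function name and the example numbers are hypothetical, and the real calculation on the stats page may differ.

```python
def expected_first_sighting_rate(fleet_size: int, cataloged: int) -> float:
    """Expected first-sighting rate under uniform random sampling.

    `fleet_size` is the total number of Oceans (assumed to come from the TLC
    data); `cataloged` is how many of them are already in the catalog.
    """
    if fleet_size <= 0:
        raise ValueError("fleet_size must be positive")
    if not 0 <= cataloged <= fleet_size:
        raise ValueError("cataloged must be between 0 and fleet_size")
    # Each vehicle is equally likely, so the chance of a new one is the
    # uncataloged share of the fleet.
    return (fleet_size - cataloged) / fleet_size


# Made-up numbers: 2,000 vehicles in total, 500 already cataloged.
baseline = expected_first_sighting_rate(2000, 500)
```

With these made-up numbers, three quarters of the fleet is still uncataloged, so the baseline is 0.75.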

Where the actual rate sits relative to that baseline tells us something about how far our real-world sightings are from that uniform-sampling ideal.

The graph is live on the stats page. It’s also the basis for Ocean Points, which scale inversely with the recent first-sighting rate: rarer finds, more points.
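As a rough illustration of "scale inversely", here is one possible shape for such a formula. This is purely a sketch: the function name, the base value, and the floor that prevents division by zero are all assumptions, not the actual Ocean Points rule.

```python
def ocean_points(recent_rate: float, base_points: float = 1.0,
                 min_rate: float = 0.01) -> float:
    """Hypothetical points for a first sighting, inverse to the recent rate.

    `recent_rate` is the recent first-sighting rate (0.0 to 1.0). The
    `min_rate` floor is an assumption that keeps the result finite when
    first sightings have become extremely rare.
    """
    if not 0.0 <= recent_rate <= 1.0:
        raise ValueError("recent_rate must be between 0 and 1")
    return base_points / max(recent_rate, min_rate)


# When half of recent sightings are new, a find is worth 2x the base;
# when only one in ten is new, it is worth 10x.
common_find = ocean_points(0.5)
rare_find = ocean_points(0.1)
```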