Heatmaps Over Time of Average Rents in the US

Published 4/19/2016

Living in the San Francisco Bay Area gives you access to some of the most interesting and capable people on the planet.

If you work in tech you come to the Bay Area; otherwise you are a nobody or a has-been in the eyes of the industry. This talent pool drives greater competition among tech companies, and in turn that competition for talent drives up wages. Once word gets out that tech companies are paying $100k+ for entry-level software developer positions, you start to see the logic behind the astronomical movements in average rental prices in the Bay Area.

Experiencing this phenomenon firsthand inspired me to build an app that displays a heatmap of average rental prices within the US. I think this type of visualization is especially useful in real estate because it condenses large amounts of data and shows how regions on a map compare in affordability.

Gathering the Data

The ideal source of rental data would let the app visualize average rental rates over several years for areas across the entire country. craigslist.org is a strong candidate given that Alexa ranks it #13 in popularity within the US.

While craigslist is a good source for current rental rates throughout the US, it doesn't provide archived historical data. Fortunately, the Internet Archive's Wayback Machine has archived craigslist feeds going back to 1998! I have published the craigslist-wayback-apa package on npm; it requests and parses RSS feeds from the craigslist apartments/housing rentals section that were archived by the Wayback Machine.

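var collector = require('craigslist-wayback-apa');

// Node-style callback: report the error if there is one, otherwise print the parsed posts.
var cb = function(err, posts) {
   if (err) {
      console.error(err);
      return;
   }
   console.log(posts);
};

// Request the archived apartment posts for the San Francisco Bay Area.
collector.get({
   city: 'sfbay'
}, cb);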

Providing only the name of the city returns a JSON list of all archived posts for the given area, e.g.

{
   date: '2006-07-16T12:47:23-07:00',
   title: 'Fantastic view of the city.  Beautiful hardwood floors.  One car parki (glen park) $2600 2bd',
   location: 'glen park,sfbay',
   price: '2600',
   bedroom: '2',
   url: 'http://sfbay.craigslist.org/sfc/apa/182667311.html'
}, {
   date: '2006-07-16T12:47:13-07:00',
   title: 'Potrero Hill Furnished 2 br, 2 ba, w/ parking and use of gym and pool. (potrero hill) $4000 2bd',
   location: 'potrero hill,sfbay',
   price: '4000',
   bedroom: '2',
   url: 'http://sfbay.craigslist.org/sfc/apa/182668193.html'
}, {
   date: '2006-07-16T12:47:02-07:00',
   title: 'SOMA condo, 24-hour doorman, just remodeled (SOMA / south beach) $4900 2bd',
   location: 'SOMA,south beach,sfbay',
   price: '4900',
   bedroom: '2',
   url: 'http://sfbay.craigslist.org/sfc/apa/182669668.html'
},
...

Aggregating

To display a heatmap that allows a comparison of average prices over time, the list of posts needs to be grouped by location and creation date.

Location normalization should account not only for cities but also for neighborhoods within metropolitan areas. For every post, check whether any of its location tokens match a known set of locations, then calculate a new average rent for that location and creation date. The results list one average rent per location and date, e.g.

{
   "neighborhood": "Russian Hill",
   "count": 46,
   "average": 4275.91304347826,
   "year": 2012,
   "region": "sfbay"
}, {
   "neighborhood": "Russian Hill",
   "count": 6,
   "average": 4075,
   "year": 2013,
   "region": "sfbay"
}, {
   "neighborhood": "Russian Hill",
   "count": 5,
   "average": 4749,
   "year": 2014,
   "region": "sfbay"
},
...

See rent-heatmap/data/index.js for matching and aggregation sample code.
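
The matching and averaging described above could look roughly like the following. This is a minimal sketch and not the code in the repository: KNOWN_LOCATIONS and aggregate are hypothetical names, and it assumes posts shaped like the craigslist-wayback-apa output shown earlier, grouped by year.

```javascript
// Hypothetical set of known locations (lowercase). A real list would cover
// every neighborhood and city the app supports.
var KNOWN_LOCATIONS = ['glen park', 'potrero hill', 'soma', 'russian hill'];

// Group posts by matched location and posting year, keeping a running sum
// and count per group so the average can be computed at the end.
function aggregate(posts, region) {
   var groups = {};

   posts.forEach(function(post) {
      var price = parseFloat(post.price);
      // Take the year straight from the ISO date string so the result
      // does not depend on the local time zone.
      var year = parseInt(String(post.date).slice(0, 4), 10);
      if (isNaN(price) || isNaN(year)) {
         return; // skip posts without a usable price or date
      }

      // Location tokens are comma separated, e.g. 'SOMA,south beach,sfbay'.
      var tokens = String(post.location).split(',').map(function(token) {
         return token.trim().toLowerCase();
      });

      tokens.forEach(function(token) {
         if (KNOWN_LOCATIONS.indexOf(token) === -1) {
            return; // not a location we know how to place on the map
         }
         var key = token + '|' + year;
         if (!groups[key]) {
            groups[key] = { neighborhood: token, count: 0, sum: 0, year: year, region: region };
         }
         groups[key].count += 1;
         groups[key].sum += price;
      });
   });

   // Turn each running sum into an average.
   return Object.keys(groups).map(function(key) {
      var group = groups[key];
      return {
         neighborhood: group.neighborhood,
         count: group.count,
         average: group.sum / group.count,
         year: group.year,
         region: group.region
      };
   });
}
```

Keeping a sum and a count per group, rather than a stored average, means new posts can be folded in one at a time without revisiting earlier ones.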

Mapping

I decided to use Leaflet to render the basemap because it is open-source and mobile-friendly, and leaflet-providers to select a tile layer provider that lets the heatmap coloring stand out. heatmap.js provides a Leaflet plugin, which leaves us with the straightforward task of feeding the plugin the aggregated rental rates and location coordinates.
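
As a rough illustration of that feeding step, the sketch below turns aggregated averages into the { max, data } object that the heatmap.js Leaflet plugin accepts through setData. The COORDINATES lookup and the toHeatmapData name are hypothetical, the coordinates are approximate, and the field names assume the overlay was configured with latField 'lat', lngField 'lng' and valueField 'value'.

```javascript
// Hypothetical coordinate lookup keyed by neighborhood name (approximate values).
var COORDINATES = {
   'Russian Hill': { lat: 37.80, lng: -122.42 },
   'Glen Park': { lat: 37.73, lng: -122.43 }
};

// Build the data object for the heatmap layer from aggregated averages.
// Entries without known coordinates are skipped.
function toHeatmapData(averages, coordinates) {
   var points = [];

   averages.forEach(function(entry) {
      var position = coordinates[entry.neighborhood];
      if (!position) {
         return;
      }
      points.push({ lat: position.lat, lng: position.lng, value: entry.average });
   });

   // The highest average becomes the top of the color scale.
   var max = points.reduce(function(highest, point) {
      return Math.max(highest, point.value);
   }, 0);

   return { max: max, data: points };
}

// In the browser, with a HeatmapOverlay instance named heatmapLayer:
//    heatmapLayer.setData(toHeatmapData(averages, COORDINATES));
```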

The demo provides an interactive timeline that displays a circle on each year where data is available for a given area. Clicking a circle in the timeline loads the dataset for that year. As the map is panned, it automatically fetches data for the area closest to the center of the map.
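
One way to pick the area closest to the map center is a great-circle distance comparison. The sketch below is an assumption about how that could work, not the demo's actual code: the REGIONS list, its approximate coordinates and the closestRegion name are all hypothetical.

```javascript
// Hypothetical region centers (approximate coordinates).
var REGIONS = [
   { region: 'sfbay', lat: 37.77, lng: -122.42 },
   { region: 'losangeles', lat: 34.05, lng: -118.24 },
   { region: 'newyork', lat: 40.71, lng: -74.01 }
];

function toRadians(degrees) {
   return degrees * Math.PI / 180;
}

// Haversine great-circle distance in kilometers between two { lat, lng } points.
function distanceKm(a, b) {
   var earthRadiusKm = 6371;
   var dLat = toRadians(b.lat - a.lat);
   var dLng = toRadians(b.lng - a.lng);
   var h = Math.pow(Math.sin(dLat / 2), 2) +
      Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) *
      Math.pow(Math.sin(dLng / 2), 2);
   return 2 * earthRadiusKm * Math.asin(Math.sqrt(h));
}

// Return the region whose center is nearest to the given point,
// or null when the list is empty.
function closestRegion(center, regions) {
   var best = null;
   var bestDistance = Infinity;

   regions.forEach(function(candidate) {
      var distance = distanceKm(center, candidate);
      if (distance < bestDistance) {
         best = candidate;
         bestDistance = distance;
      }
   });

   return best;
}

// In the browser, with a Leaflet map instance named map:
//    map.on('moveend', function() {
//       var center = map.getCenter();
//       var area = closestRegion({ lat: center.lat, lng: center.lng }, REGIONS);
//       // fetch the dataset for area.region here
//    });
```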

Source: https://github.com/csbrandt/rent-heatmap
Demo: https://csbrandt.cloudant.com/rent-heatmap/_design/rent-heatmap/index.html