This is the first in a series of posts about a side project I’ve been developing and maintaining for the past few years. This introduction will be accompanied by subsequent posts dealing with the technical, commitment, and emotional issues associated with running a project on the side. Read part two here.
Back in 2009, I moved down to Glassboro, New Jersey to be close to work. I had spent that summer contributing to OpenStreetMap and wanted to spend more time working with local data. In searching for a new home, I wanted to use property information and other points-of-interest data to get (re)acclimated to the area.
At the time, I recall web GIS falling into two categories: Google Maps mashups or ArcIMS/ArcGIS Server desktop-on-the-web GIS. Typically, the map mashups were tailored to a specific region or data set. The clunky web viewers set up by county and regional governments had valuable data, but an absolutely abysmal user experience.
So, in the summer of 2009, I set out to make a web site with some local GIS data for Gloucester County. I originally planned to pull in information from OSM and other sources to paint a picture of the community, highlighting what would be most useful to someone considering moving to or visiting the area. I set it up on Dreamhost shared hosting, bought GlassboroMap.com that fall, and began building a web map, starting with property information for the County.
Upset with the clunkiness of ArcIMS and early ArcGIS Server (remember the web ADF?), I wanted to make something as slick as those Google Maps mashups, but more polished and richer in features and data. Being a self-styled GIS geek, I dove into getting MapServer and TileCache up and running on shared hosting, and getting the tiles inserted into a Google Maps-driven, full-page map of the region. I spent a lot of hours coming up with a kludge to get the necessary server-side software running on shared hosting, which wasn’t the easiest of tasks. I also spent time working on getting a really cool looking web map with the property data overlaid on top.
Eventually I had to put the site aside, as my first child was born that October and I had just started in the Geography department that September. Coming back to the project after a few months helped me learn my first lesson from this project.
Lesson One: Your technical achievements may only matter to you.
While I was proud of my ability to shoehorn the site onto shared hosting, I realized that the majority of my traffic was coming from Google and directly visiting an individual property’s page, bypassing my map entirely. I was initially upset – I put the majority of my effort into the map and other GIS users gave me kudos, but ultimately the users of my site didn’t care. Hell, they likely didn’t even know the front page map existed.
It made me realize my own hypocrisy – I set out to solve a problem, namely getting information about the area into the hands of those looking for answers. Why should I be upset that they got the information, but not in the manner I assumed they would? It also made me think more critically about the map – the parcels were unlabeled and finding an address was difficult. If you knew where the property was and could recognize it from the air, you could easily get to the information. Just because I enjoyed panning around the map, marveling at a tiled parcel layer, did not necessarily mean that it was the best way for the public to get access to the data.
It wasn’t until 2011 that I started to work on the site again in earnest. I had used the site as a proof-of-concept for several other projects, but only once the web logs showed that there was some traffic going directly to the property pages did I realize how useful this site could be.
New Jersey’s assessment data is not easy to work with. Raw data is delivered as zipped flat files, and (at the time) the only site that had individual records accessible through the web was a spartan application with no frills. Details on each property were accessed using a long query string, which kept most of the pages out of the search engines’ indexes.
Returning to the work I started for GlassboroMap, I bought NJParcels.com in the summer of 2011 and extended that work to apply state-wide. Growth was slow; for the first year, I had virtually no traffic to the new site. The indexing and traffic to GlassboroMap.com just sort of happened, so I needed to spend some more time working on how to drive traffic to the site, or at least convince Google to crawl it.
Lesson Two: Be persistent, but not stubborn.
I had to try something different. The front page of NJ Parcels had no maps whatsoever. Instead, I had a search form and a series of links, allowing search engines to crawl through municipalities, blocks, and lots, eventually drilling down to the individual property page. (This too would get tweaked to ensure the best balance between human and machine readability.) I also learned more about SEO, creating sitemaps and other methods to get traffic to the site. I had to change a lot about how I thought the web works (or should work) and adapt.
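As a rough illustration of that crawlable structure (not the site’s actual code), here is a minimal Python sketch that turns municipality, block, and lot identifiers into short hierarchical paths and lists them in a sitemap. The URL layout, the `example.com` base address, and the function names are all assumptions made up for this example; only the sitemap namespace comes from the sitemaps.org protocol.

```python
from urllib.parse import quote
from xml.etree import ElementTree as ET

# Placeholder base address, assumed for illustration only.
BASE_URL = "https://example.com"
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def property_path(municipality, block, lot):
    """Build a short, hierarchical path instead of a long query string.

    The /municipality/block/lot layout is a hypothetical scheme.
    """
    parts = (municipality, block, lot)
    return "/" + "/".join(quote(str(part), safe="") for part in parts)


def build_sitemap(paths, base_url=BASE_URL):
    """Return a sitemap XML string with one <url><loc> entry per path."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path in paths:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url + path
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Made-up identifiers, not real parcel records.
    sample = [property_path("0806", "12.01", "3")]
    print(build_sitemap(sample))
```

The point of the sketch is that every property gets a stable address a crawler can reach by following plain links, with the sitemap acting as a second route to the same pages.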
By 2013, I was receiving AdSense checks approximately every other month. It was a nice reward for having spent a lot of time over the past few years working on the site. However, I very rarely spoke about NJ Parcels unless what I could contribute to the conversation was directly relevant. That was partly because non-technical people were confused by why I worked on it (“This is fun to you?”), and partly because some found making all of the property information – already public information – accessible “creepy”. I was also hesitant to talk about it for fear that the traffic would dry up and this small stream of extra income would disappear.
Lesson Three: Don’t be afraid to fail.
In October of 2013, Google changed its search ranking algorithm and my traffic dried up overnight.
I was considerably bummed and half considered taking the site down. Throughout 2013, I’d occasionally get emails from privacy-minded people asking to be removed from the site (more on this in a later post) and thought it might be time to pull the plug. Instead, I experimented some more with the site, making incremental changes and hoping for the best.
Eventually the traffic returned and continued to grow. While I was grateful that the traffic was back and increasing, I faced the same problems as before, and they only got worse.
First, the site was haphazardly updated. While other public records sites exist now, many are not kept up to date. NJ Parcels was definitely guilty of that. I started work on semi-automated processes to update the site, but was limited by another constraint – shared hosting.
While Dreamhost’s shared hosting had served me well for the previous 10 years’ worth of small websites and other side projects, NJ Parcels definitely pushed the limits. I was also hampered by having a limited development environment on the server. I finally implemented some version control and started the initial work of setting up a testing and a production environment. With the increased traffic, I realized I would likely need to move off of shared hosting and start spending some money on more robust hosting. I formed an LLC in the summer of 2014 and started making plans to move the site over to Amazon Web Services. By that October, I had received a notice from Dreamhost that it was now time to move, as I had pushed the shared hosting and MySQL to their limits. Thankfully, I was nearly ready to migrate and spent several nights after work getting everything over onto a dedicated instance on AWS.
Lesson Four: Enjoy what you do and have fun.
Now that I was on AWS, I had the space to continue growing, but more importantly I had the freedom to set the site up the way I wanted it developed.
Personally, I hate working with MySQL. Migrating to PostgreSQL enabled me to have more control over my data and have real spatial data support. Now, when I wanted to change the site and work with the data, I could do so and easily push it out to the public. No more crunching away on my laptop, exporting to a CSV or something and loading it into MySQL. (For example, the “Adjacent Properties” list on each property’s page was “pre-rendered” in PostgreSQL at home and then exported as a 10 million record lookup table for use in MySQL.)
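To give a sense of what “pre-rendering” an adjacency lookup table involves, here is a toy Python sketch. It is not how the site did it: the real work was done with PostgreSQL’s spatial support, where a spatial predicate (for example, PostGIS’s `ST_Touches`) would normally do the comparison. This stand-in assumes parcels are simple rings of `(x, y)` vertices and treats two parcels as adjacent only when they share an identical boundary edge, which real parcel data often will not satisfy. All names and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def boundary_edges(ring):
    """Return the undirected edges of a closed ring of (x, y) vertices."""
    ring = list(ring)
    # Pair each vertex with the next one, wrapping back to the first.
    return {frozenset((a, b)) for a, b in zip(ring, ring[1:] + ring[:1])}


def adjacency_table(parcels):
    """Build (parcel_id, neighbor_id) rows for parcels sharing an edge.

    parcels maps a parcel id to its list of (x, y) vertices.
    Simplifying assumption: neighbors share an identical edge.
    """
    edge_owners = defaultdict(set)
    for parcel_id, ring in parcels.items():
        for edge in boundary_edges(ring):
            edge_owners[edge].add(parcel_id)

    rows = set()
    for owners in edge_owners.values():
        for a, b in combinations(sorted(owners), 2):
            # Store both directions so a lookup by either id works.
            rows.add((a, b))
            rows.add((b, a))
    return sorted(rows)


if __name__ == "__main__":
    # Three made-up unit squares: A and B share a side, C stands alone.
    parcels = {
        "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
        "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
        "C": [(3, 0), (4, 0), (4, 1), (3, 1)],
    }
    for row in adjacency_table(parcels):
        print(row)
```

Computing the pairs once and storing them as plain rows is what makes the result portable: the finished table can be loaded into a database with no spatial functions at all and queried by parcel id.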
Now that I had the right development environment, I could really have fun with the site. I recently added a list of Broadband Providers to the property detail page, so that you can determine the type of Internet service (DSL, Fiber, Cable) available to the property. I was able to download the data from the National Broadband Map, import it into PostgreSQL and get it published within the span of a weeknight. It was fun doing it, and now house hunters can tell if they’re going to get stuck with Comcast or not when researching a house using my site.
I’ve probably done more work on the site within the last year, mainly because it’s become more fun than before. Cloud services make it so easy to do system administration work that even more mundane tasks are enjoyable.
What makes the fun even more rewarding is that the site has had considerable traffic over 2015. 9.7 million pageviews, over 3 million users, still just 1 developer.
When I taught GIS in higher education, I always stressed to my students to do work that could be used in a portfolio. Whenever possible, I incorporated final projects into the courses, encouraging them to tackle an interesting problem using the skills they learned over the semester. I’ve also spoken to them about this project in particular. I’ve heard excuses as to why they couldn’t do something similar – “you know programming”, or “but you’re good at this, I’m not.” Those are the same excuses I’ve used to talk myself out of taking on a big project. But you’ve got to keep trying if you want to succeed. Did I know about SEO, systems administration, database administration, cloud technology, etc. when I was a senior in college? Of course not. I had some experience with Perl and that was about it. I learned everything else by digging into projects and increasing my knowledge by doing.
Find something you’re passionate about, and see what’s out there. My master’s is in urban planning and I’ve always had a fascination with real estate. Take what’s out there and try making something new. Your first attempts are going to suck, they’re going to fail, and that’s completely okay. You learn, you adapt, you grow.
Throughout this coming year, I am going to continue adding features to the site, as well as conducting some research and publishing some interactive visualizations of the findings. I need to get back into web mapping and statistics, feeling somewhat deficient in both lately, so I’m going to try developing some new things slightly outside of my comfort zone. Hopefully they will be successful projects, and even if they aren’t, I’ll have learned from the process, which is the most valuable asset of all.