Over the course of 2009, I got involved with OpenStreetMap. If you haven’t used OSM, I suggest you check it out. It’s being updated and used throughout the world, from mapping campuses in New Jersey to aiding the relief efforts in Haiti.
During 2009, I noticed that the State of Georgia had land use data on OSM, and I started to look into how Georgia got so lucky. OSM relies on user contributions, so some savvy user must have added all of those polygons to the map. I contacted that user, Liber, to find out more. He pointed me to some of the methods he and others have used to import GIS data into OpenStreetMap. I was unsatisfied with the existing software, so I looked into the OSM API and wrote my own code to export directly from ArcGIS into the .osm file format.
ExportToOSM.py is my crack at programming an export utility. I wanted something that would export multipolygons from ArcGIS as OSM multipolygon relations and would produce a file free of redundant nodes. I used an earlier version of my script to export the buildings on Rowan’s campus. After fixing a few issues – namely the multipart polygons (take a look at Evergreen Hall, still need to punch in the interior courtyard as a doughnut hole) – I began developing a plan to export NJ’s 2002 Land Use data to OSM.
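To make those two goals concrete, here is a minimal sketch of the idea. It is not the actual ExportToOSM.py code: the function name `polygons_to_osm`, the input layout (a dict with "outer" and "inner" rings of lon/lat pairs), and the 7-decimal rounding are all assumptions made for illustration. It reuses one node per distinct coordinate and writes polygons with holes as multipolygon relations, using negative placeholder ids for new objects as JOSM does.

```python
import xml.etree.ElementTree as ET


def polygons_to_osm(polygons, tags):
    """Build an <osm> element from polygons shaped like
    {"outer": ring, "inner": [ring, ...]} (a hypothetical input layout).
    Each ring is a closed list of (lon, lat) pairs."""
    nodes, ways, relations = [], [], []
    node_ids = {}   # formatted (lon, lat) -> node id, so shared vertices are reused
    counter = [0]

    def new_id():
        # Negative ids mark objects that do not exist on the server yet.
        counter[0] -= 1
        return str(counter[0])

    def node_for(lon, lat):
        key = ("%.7f" % lon, "%.7f" % lat)
        if key not in node_ids:
            node_ids[key] = new_id()
            nodes.append(ET.Element("node", id=node_ids[key], lat=key[1], lon=key[0]))
        return node_ids[key]

    def way_for(ring):
        way = ET.Element("way", id=new_id())
        for lon, lat in ring:
            ET.SubElement(way, "nd", ref=node_for(lon, lat))
        ways.append(way)
        return way

    def add_tags(element, extra=None):
        for key, value in dict(extra or {}, **tags).items():
            ET.SubElement(element, "tag", k=key, v=value)

    for polygon in polygons:
        outer = way_for(polygon["outer"])
        inners = [way_for(ring) for ring in polygon.get("inner", [])]
        if not inners:
            add_tags(outer)          # simple polygon: tag the way itself
            continue
        relation = ET.Element("relation", id=new_id())
        ET.SubElement(relation, "member", type="way", ref=outer.get("id"), role="outer")
        for inner in inners:
            ET.SubElement(relation, "member", type="way", ref=inner.get("id"), role="inner")
        add_tags(relation, {"type": "multipolygon"})
        relations.append(relation)

    # Write nodes first, then ways, then relations.
    root = ET.Element("osm", version="0.6", generator="export-sketch")
    root.extend(nodes + ways + relations)
    return root
```

The result can be saved with `ET.ElementTree(root).write("out.osm", encoding="utf-8", xml_declaration=True)`. Two squares that share an edge come out with six nodes rather than eight, which is the kind of redundancy I wanted to avoid.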
The 2002 Land Use data is a complex set of data: over 800,000 polygons mapping land uses as small as an acre. Some of the land use classifications needed reclassing or removal before import. Several classifications would be defined in OSM as “wetland”; these polygons were merged together. Other classifications, such as “residential”, seemed redundant, as the LU data does not include names of areas. Still others, such as “Transportation/Utilities”, do not have a comparable “landuse” tag in OSM. These polygons were deleted from the import data.
After removing LU polygons that I thought were too specific or would create duplicates in existing OSM data, I was left with approximately 500,000 polygons. This was still a problem, as the sheer complexity of the data would have been a burden to upload and manage. The LU data also had many superfluous vertices, which is not as much of an issue in ArcGIS as it is in OSM. I wanted to simplify the polygons, both to reduce the node count and to make the polygons easier to edit. I used the POINT_REMOVE method of polygon simplification, with a tolerance of 25 feet. That seemed like a logical choice to me, since 25 feet is narrower than the width of many roadways. Existing features traced from Yahoo! imagery are often drawn at a coarser scale than what I was importing, so the simplification wouldn’t have a noticeable impact on the look of the data once plotted on the map.
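For readers without ArcGIS, the effect of a point-removal simplification can be approximated with the classic Douglas-Peucker algorithm, which POINT_REMOVE is generally described as being based on. The sketch below is a stand-in under that assumption, not ArcGIS's implementation; the function names are made up for illustration, and coordinates are assumed to be in a projected system measured in feet so that the 25-foot tolerance is meaningful.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) pairs)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment, e.g. the start and end of a closed ring.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance=25.0):
    """Douglas-Peucker: drop vertices that lie within `tolerance` of the
    simplified line. The first and last points are always kept."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        worst, index = 0.0, None
        for i in range(first + 1, last):
            d = _point_segment_distance(points[i], points[first], points[last])
            if d > worst:
                worst, index = d, i
        if index is not None and worst > tolerance:
            # The farthest vertex matters: keep it and check both halves.
            keep[index] = True
            stack.append((first, index))
            stack.append((index, last))
    return [p for p, kept in zip(points, keep) if kept]
```

For example, `simplify([(0, 0), (50, 10), (100, 0)])` drops the middle vertex because it sits only 10 feet off the straight line, while a 60-foot spike in the same position would be kept. Unlike the ArcGIS tool, this sketch simplifies each ring on its own, so it does nothing to keep the shared boundaries of neighboring polygons consistent.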
I also buffered the major roadways through the state by approximately 50 feet. I used these buffers to erase the land use polygons, breaking the polygons into more manageable chunks. This step was needed because, after the simplification and the dissolve by land use type, I had one forest polygon stretching from the Delaware Bay to the Raritan Bay.
I also wrote a script that iterates through all the polygons, assigning them to groups based on adjacency with other polygons. This generated about 9,000 groups out of 200,000 polygons. I then modified the script to combine the groups into “supergroups”, each with roughly 35,000 vertices. Uploads to OSM are limited to 50,000 items, whether those are nodes or ways. By capping each supergroup at around 35,000 vertices, I left enough room for the ways and relations. This produced 75 supergroups, which I then exported into individual feature classes. I then ran my ExportToOSM.py script on each one through a batch script, which left me with 75 .osm files for upload.
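The grouping step can be sketched roughly as follows. This is an illustration under stated assumptions rather than the script I ran: it treats two polygons as adjacent when they share at least one identical vertex, it takes polygons as a plain dict of id to vertex list, and it packs groups into supergroups with a simple first-fit-decreasing pass. The function names are hypothetical.

```python
from collections import defaultdict


def group_by_adjacency(polygons):
    """polygons: {polygon_id: [(x, y), ...]}. Returns a list of sets of ids,
    one set per cluster of polygons connected through shared vertices."""
    parent = {pid: pid for pid in polygons}

    def find(pid):
        # Union-find lookup with path halving.
        while parent[pid] != pid:
            parent[pid] = parent[parent[pid]]
            pid = parent[pid]
        return pid

    first_owner = {}   # vertex -> first polygon seen using it
    for pid, ring in polygons.items():
        for vertex in ring:
            other = first_owner.setdefault(vertex, pid)
            if other != pid:
                parent[find(pid)] = find(other)   # merge the two clusters

    groups = defaultdict(set)
    for pid in polygons:
        groups[find(pid)].add(pid)
    return list(groups.values())


def pack_supergroups(groups, polygons, budget=35000):
    """Combine groups into supergroups of at most `budget` vertices each
    (first-fit decreasing). A single group larger than the budget still
    gets its own supergroup and has to be handled separately."""
    sized = sorted(
        ((sum(len(polygons[pid]) for pid in group), group) for group in groups),
        key=lambda item: item[0],
        reverse=True,
    )
    bins = []   # each entry is [vertex_total, set_of_polygon_ids]
    for size, group in sized:
        for entry in bins:
            if entry[0] + size <= budget:
                entry[0] += size
                entry[1] |= group
                break
        else:
            bins.append([size, set(group)])
    return [ids for _, ids in bins]
```

The vertex total here counts shared and closing vertices more than once, so it overestimates the number of nodes that would be uploaded. That errs on the safe side of the 50,000-item limit.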
An issue that I ran into was that ways are limited to 2,000 nodes. I found that out the hard way when one of my uploads failed. Of course, that was after the upload had already added 31,000 nodes to OSM without any ways. I manually broke up the large ways (one was more than 12,000 nodes long) using JOSM. I have to give JOSM credit: it handled editing of very large datasets beautifully. I had used bulk_upload.py for the majority of the uploads, but I ended up using JOSM to upload some of the .osm files.
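Splitting could also be done before upload rather than by hand. The following is a minimal sketch of that idea, not something from my script: `split_way` is a hypothetical helper that cuts a list of node references into pieces of at most 2,000 nodes, repeating the node at each cut so that consecutive pieces stay connected.

```python
def split_way(node_refs, max_nodes=2000):
    """Split a way's node list into connected pieces of at most max_nodes."""
    if max_nodes < 2:
        raise ValueError("a way needs at least two nodes")
    if len(node_refs) <= max_nodes:
        return [list(node_refs)]
    pieces = []
    start = 0
    while start < len(node_refs) - 1:
        end = min(start + max_nodes, len(node_refs))
        pieces.append(list(node_refs[start:end]))
        start = end - 1   # reuse the last node as the first node of the next piece
    return pieces
```

For instance, `split_way([1, 2, 3, 4, 5], max_nodes=3)` returns `[[1, 2, 3], [3, 4, 5]]`. When the original way was the outline of an area, the resulting pieces no longer form a closed way on their own, so they would need to be collected as "outer" members of a multipolygon relation that carries the tags.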
Feel free to contact me with any questions. I hope this will help someone looking to do a bulk upload to OpenStreetMap. I also hope my script will help ArcGIS users feel comfortable contributing their GIS data to OSM.