Driving Log 3.0: KML to GeoPackage Conversion


I’ve been maintaining my driving log since 2010, and a few years ago I began storing it in a single KML file and exporting it to KMZ. This was a big improvement over manually managing a bunch of GPX and KML files, as I’d been doing before. However, the KML/KMZ method ended up having a number of flaws, which I go through in the sections below.

Based on these experiences, I’ve decided to revisit my prior decision to use KML for my driving log data, and to use GeoPackage instead.

Driving Log and Output Files

When I refer to my driving log, I’m referring to my canonical driving data—that is, the main copy of all of my driving data that acts as the single “source of truth” for where I’ve driven. For the past few years, this was a single KML file.

Because different software or use cases prefer different file formats, I also use some output files. These are exported from the driving log—so changes to the canonical data should be reflected in the output files the next time they’re exported, but changes to the output files do not affect the driving log data. Usually, this is a KMZ file so that I can open it in Google Earth, but I’ve also exported to other formats such as GeoJSON depending on what I need it for.

Diagram showing a driving log file, flowing into an Export step, flowing into an output file.

The driving log holds the canonical driving data, and output files are exported from it.

GeoPackage vs. KML

GeoPackage is a standard for storing geographic data in an SQLite database. Since I’m storing a lot of data (in this case, coordinates and metadata for driving tracks), a database makes sense for my driving log. Since it’s SQLite, the database is an easy-to-manage single file that I can store locally.
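Because a GeoPackage is an ordinary SQLite file, Python's built-in sqlite3 module can open it. The sketch below lists the layers registered in the standard gpkg_contents table; the function name is made up for illustration, and the file path is whatever your own GeoPackage is called.

```python
import sqlite3


def list_gpkg_layers(path):
    """Return (table_name, data_type) for each layer registered in a GeoPackage."""
    con = sqlite3.connect(path)
    try:
        # gpkg_contents is the GeoPackage table that catalogs every layer in the file.
        return con.execute(
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        ).fetchall()
    finally:
        con.close()
```

A vector layer such as a set of driving tracks would show up with a data_type of "features".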

KML is an XML-based geographic data format, and is natively used by Google Earth and Google Maps. As an XML file, the KML format can be opened in any text editor and read directly without further decoding. (KML also supports a zipped KMZ format, which requires unzipping before it can be opened in a text editor.)
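As a small illustration of the unzip step, a KMZ can be read with Python's zipfile module. This is a minimal sketch that assumes the archive contains at least one .kml file and that it is UTF-8 encoded; the function name is hypothetical.

```python
import zipfile


def read_kml_from_kmz(kmz_path):
    """Return the text of the first .kml file found inside a KMZ archive."""
    with zipfile.ZipFile(kmz_path) as archive:
        kml_names = [n for n in archive.namelist() if n.lower().endswith(".kml")]
        if not kml_names:
            raise ValueError("no .kml file found in archive")
        # Assumes the KML is UTF-8 encoded, which is the usual case.
        return archive.read(kml_names[0]).decode("utf-8")
```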

Human Readability

One of the reasons I originally chose KML over GeoPackage was that KML was human-readable (and human-editable) in a text editor, while GeoPackage was a binary format that required special software to open. I initially thought that I’d need to edit coordinates manually, or that I’d want to search for particular coordinates within my driving log, so a human-readable format seemed logical.

Ultimately, I never really ended up reading coordinates from my KML driving log directly, and editing coordinates manually was so tedious that I would always write a Python script for editing them instead. If I did end up needing to read coordinates, it’s easy enough to convert GeoPackage geometries to human-readable coordinate strings with QGIS or geopandas. Having a human-readable XML file such as KML didn’t turn out to be as important as I originally thought, and I no longer consider it to be a requirement for my driving log.
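To give a sense of what that decoding involves, here is a standard-library-only sketch (not the QGIS or geopandas route) that turns one GeoPackage geometry blob into a coordinate string. It assumes the blob holds a plain 2D LineString in standard GeoPackage binary form; other geometry types, such as MultiLineString, would need more handling. The function name is hypothetical.

```python
import struct

# Envelope sizes in bytes, keyed by the envelope indicator in the flags byte.
ENVELOPE_BYTES = {0: 0, 1: 32, 2: 48, 3: 48, 4: 64}


def gpkg_linestring_to_text(blob):
    """Decode a GeoPackage geometry blob holding a 2D LineString into 'lon,lat' pairs."""
    if blob[:2] != b"GP":
        raise ValueError("not a GeoPackage geometry blob")
    flags = blob[3]
    envelope_indicator = (flags >> 1) & 0b111
    # The header is 8 bytes (magic, version, flags, srs_id), then the optional envelope.
    wkb = blob[8 + ENVELOPE_BYTES[envelope_indicator]:]
    order = "<" if wkb[0] == 1 else ">"  # WKB byte-order marker: 1 = little endian
    geom_type, count = struct.unpack_from(order + "II", wkb, 1)
    if geom_type != 2:
        raise ValueError("only 2D LineString (WKB type 2) is handled here")
    coords = struct.unpack_from(f"{order}{2 * count}d", wkb, 9)
    return " ".join(f"{x},{y}" for x, y in zip(coords[0::2], coords[1::2]))
```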

Editing Geometry and Attributes

Google Earth has limited abilities to edit KML/KMZ files. It’s possible to create and move geometries, and edit names and descriptions. However, doing complex geometric operations such as closing gaps in adjacent tracks isn’t feasible, nor is editing custom attributes.

I was able to work around the track gap issue by using subfolders in my KML driving log file; these signaled that tracks within a subfolder should be treated as a single line when I exported an output file to KMZ. Of course, this meant that I now needed a separate output file to see the combined tracks, when a main reason for using KML in the first place had been to open my driving log directly in Google Earth without exporting an output file.
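The subfolder convention can be sketched roughly as follows. This is not the original update_kml.py; it assumes a KML 2.2 file in which a "subfolder" is a Folder nested directly inside another Folder and each track is a Placemark with a LineString. The function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def merged_subfolder_tracks(kml_text):
    """Map each subfolder name to one coordinate list joined from its LineStrings."""
    root = ET.fromstring(kml_text)
    merged = {}
    # Subfolders are Folder elements nested directly inside another Folder.
    for folder in root.findall(".//kml:Folder/kml:Folder", KML_NS):
        name = folder.findtext("kml:name", default="", namespaces=KML_NS)
        points = []
        for coords in folder.iterfind(".//kml:LineString/kml:coordinates", KML_NS):
            for triple in (coords.text or "").split():
                lon, lat = triple.split(",")[:2]
                point = (float(lon), float(lat))
                # Skip a point that duplicates the end of the previous track.
                if not points or points[-1] != point:
                    points.append(point)
        merged[name] = points
    return merged
```

Tracks are joined in document order, so this only closes gaps correctly if the tracks in a subfolder are already stored in driving order.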

This also led to needing a pretty complicated update_kml.py script. It had to handle importing new tracks from GPX, merging them with existing tracks, writing a new KML driving log file (without track gaps closed), and writing a new KMZ output file (with track gaps closed). Since Google Earth couldn’t edit custom attributes in a KML file, I also had to write an update_attributes.py script, which allowed me to edit attributes within my driving log by date range.

Diagram showing input files (GPX) flowing into an Import GPX step, flowing into a Merge Tracks step. The merge track step flows to both an Export KML and an Export KMZ step. The Export KMZ step creates an output file (KMZ). The Export KML step creates a driving log (KML), which also flows back into the Merge Tracks step. A container labeled 'update_kml.py' encloses the Import GPX, Merge Tracks, Export KML, and Export KMZ steps. Inside another container named 'update_attributes.py' is an Update Attributes step; this step both takes input from and gives output to the driving log (KML).

Process flow for KML driving log with KMZ output

So if I needed to add tracks, I’d use update_kml.py to do so. If I needed to edit the shape of tracks or merge tracks with subfolders, I’d have to open the KML driving log, make my changes, save them to the correct file and folder[1], and re-run update_kml.py. If I needed to update attributes, I’d have to run update_attributes.py, and then also re-run update_kml.py.

Meanwhile, switching to the database-based GeoPackage format simplified my process immensely.

Diagram showing input files (GPX) flowing into an import_gpx.py container with an Import GPX step, flowing into a driving log (GeoPackage), flowing into an export_kmz.py container with an Export KMZ step, flowing into an output file (KMZ).

Process flow for GeoPackage driving log with KMZ output

QGIS can directly edit GeoPackage database files, including merging tracks (features) and updating all attributes. Since I can do all of my driving log editing in QGIS, I only need an import from GPX script, and an export to KMZ script.
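The reading half of a GPX import script can be quite small. The sketch below is an illustration rather than the actual import script: it assumes GPX 1.1 input and returns plain coordinate lists, leaving the step of writing them into the GeoPackage to a library such as geopandas. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_gpx_tracks(gpx_text):
    """Return one list of (lon, lat) tuples per <trkseg> in a GPX 1.1 document."""
    root = ET.fromstring(gpx_text)
    tracks = []
    for segment in root.iterfind(".//gpx:trkseg", GPX_NS):
        points = [
            (float(pt.get("lon")), float(pt.get("lat")))
            for pt in segment.iterfind("gpx:trkpt", GPX_NS)
        ]
        if len(points) >= 2:  # a line needs at least two points
            tracks.append(points)
    return tracks
```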

Improved Speed

Another problem with the old KML to KMZ process flow was that it required creating two XML files: a refreshed KML driving log and a KMZ export file. These XML files were relatively large (tens of thousands of records, each with hundreds to thousands of longitude/latitude pairs), so it took a while to generate both of them. On the other hand, editing or appending to the GeoPackage driving log is a database operation, so it’s nearly instant. The GeoPackage process only needs to generate a single XML file for the export, cutting this time in half. In practice, I save even more time since I don’t have to refresh an XML file every time I edit geometry or update an attribute; I now only need to generate XML when I actually want to update the output KMZ file.
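To show why an attribute edit becomes a single database operation, here is a sketch of a date-range update in SQL. The table name driving_tracks and the columns start_time and vehicle are hypothetical, and the example runs against a plain SQLite table. On a real GeoPackage with a spatial index, the index triggers may call spatial SQL functions that plain sqlite3 doesn't provide, so in practice an edit like this may be better made through QGIS, GDAL, or geopandas.

```python
import sqlite3


def set_attribute_for_dates(con, start_date, end_date, vehicle):
    """Set a (hypothetical) vehicle attribute on all tracks within a date range."""
    with con:  # commits on success, rolls back on error
        cur = con.execute(
            "UPDATE driving_tracks SET vehicle = ? "
            "WHERE date(start_time) BETWEEN ? AND ?",
            (vehicle, start_date, end_date),
        )
    return cur.rowcount  # number of tracks that were updated
```

Only the matching rows are touched, rather than a whole XML file being rewritten.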

GeoPackage also supports spatial indexing, which KML doesn’t. When I’m creating custom maps with my driving data (such as for my year in travel posts), I’m doing a lot of panning and zooming. Without spatial indexes in a KMZ file, QGIS has to look at every one of the tens of thousands of driving tracks every time I change my view to determine whether the track contains points within my viewing window, leading to tracks visibly popping in over a few seconds. With spatial indexes, QGIS can effectively do a database query on the extents of the window to determine which tracks are visible, so it only needs to deal with those tracks. While it’s only a few seconds, it makes a huge difference to how fast the user interface feels.
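The kind of window query a spatial index makes possible looks roughly like this. GeoPackage's R-tree extension names its index table rtree_<table>_<column>; the table name driving_tracks and geometry column geom used here are hypothetical, and this sketches the idea rather than what QGIS actually executes.

```python
import sqlite3


def tracks_in_window(con, min_lon, min_lat, max_lon, max_lat):
    """Return ids of tracks whose bounding box overlaps the viewing window."""
    rows = con.execute(
        "SELECT id FROM rtree_driving_tracks_geom "
        "WHERE maxx >= ? AND minx <= ? AND maxy >= ? AND miny <= ? ORDER BY id",
        (min_lon, max_lon, min_lat, max_lat),
    )
    return [row[0] for row in rows]
```

The R-tree answers this from the stored bounding boxes, so tracks outside the window never need their full geometry read.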

Results

With no need for human-readable text files, and with KML requiring an export step, using KML for my canonical driving log data no longer held any advantages over GeoPackage. On the other hand, GeoPackage gave me easier attribute editing, easier insertion of data, a simpler and faster workflow, and better integration with QGIS. I thus made the decision to migrate my canonical file from KML to GeoPackage.


  1. Google Earth always defaults to saving a file to the last place you used it to save any file, not the place that you opened the current file from. Thus, it’s pretty easy to accidentally save the edited KML file to the wrong place.
