Imagine we're setting out to create our very own map. Now, you might be wondering, "What's a map, really?" It's like your trusty guide on a piece of paper, showing you the way around towns and countries.
So, there I was, super curious about how Google Maps works its magic. It all begins with coordinates. Ever wondered how they're figured out? There’s a neat link I stumbled upon that explains just that: "coordinates-where-do-they-come-from?" Definitely worth a look!
Now, when you're building something cool like this, there are a few things to think about. What problem are you solving with your software? What steps will you take to build it? What resources do you need? And how about the data you're using - what does that look like? For example, map data might look something like this:
{"_id":172539,"_lat":52.5652055,"_lon":13.3355015,"_timestamp":"2018-07-05T10:13:09.000+05:30","_version":12,
"tag":[{"_k":"ele","_v":"39.1"}]}
{"_id":172540,"_lat":52.5647252,"_lon":13.3364064,"_timestamp":"2010-09-07T02:39:18.000+05:30","_version":7}
{"_id":172541,"_lat":52.5656387,"_lon":13.3364104,"_timestamp":"2017-09-29T07:05:25.000+05:30","_version":3}
{"_id":172542,"_lat":52.5660003,"_lon":13.3375554,"_timestamp":"2009-03-03T19:43:01.000+05:30","_version":3}
{"_id":172543,"_lat":52.5663124,"_lon":13.3394369,"_timestamp":"2009-12-20T07:02:13.000+05:30","_version":4}
{"_id":172544,"_lat":52.5666157,"_lon":13.3432342,"_timestamp":"2022-05-06T01:05:24.000+05:30","_version":10,
"tag":[{"_k":"crossing","_v":"marked"},{"_k":"crossing:island","_v":"yes"},{"_k":"crossing_ref","_v":"zebra"},{"_k":"highway","_v":"crossing"},{"_k":"tactile_paving","_v":"yes"}]}
Here's a fun fact: Geospatial data is huge! I mean, trying to fit even a 50MB compressed file (like PBF/OSM) into Python is like stuffing a giant teddy bear into a backpack – it's a tight squeeze!
The data I’m talking about comes from Open Street Map. You can grab it from their site. But here's a pro tip: don't jump straight into writing a Python script to turn XML into JSON. That's a rocky road. Instead, why not use Apache Spark with the "databricks:spark-xml" package? It's like having a Swiss Army knife for data transformation!
Think of a path or road in a map as a collection of dots connected in a line. Each dot, or 'node', has its own ID, something like this:
{"_id":307227870,"_timestamp":"2014-10-11T03:39:15.000+05:30","_version":1,
"nd":[{"_ref":3123207254},{"_ref":3123207257},{"_ref":3123207273},
{"_ref":3123207275},{"_ref":3123207295},{"_ref":3123207289},
{"_ref":3123207309},{"_ref":3123207301},{"_ref":3123207254}],
"tag":[{"_k":"building","_v":"yes"}]}
this is representation of a path or road. It's essentially an array where 'ref' indicates a node's ID under the "nd" object. Each node might look something like this each node is going look like
{"_id":172540,
"_lat":52.5647252,
"_lon":13.3364064,
"_timestamp":"2010-09-07T02:39:18.000+05:30","_version":7}
It may also have a tag, specifying its attribute.
"tag":[{"_k":"crossing","_v":"marked"}
But here's a challenge: these data sets are enormous, and they don't play nice with ordinary computer memory. You could use something like PostgresGIS, but that’s kind of missing the point for what we’re doing.
So, what did I do? I picked a few waypoints and decided to map them out using an adjacency list. It’s like drawing a map with dots and connecting them with lines, making it easier to see how everything links together. This way, you can navigate using cool methods like two-way BFS or A-Star algorithms. Just a heads up, though – if we stick to the original OSM indexes, we're going to need a cheat sheet to remember all those coordinates, which is a bit of a memory hog.
A neat workaround? Encoding coordinates as binary indices, just like in that coordinates article I mentioned. It's a bit like translating a secret code to get the latitude and longitude. This saves on some heavy lifting but beware – it’s not perfect, especially when you have complex structures like multi-layered flyovers.
So, I thought, why not make our own mini map? Something like this
where each letter is a point on our map. Each point has its own story, like:
{"FF":[{"_ref":"A"},{"_ref":"B"},{"_ref":"C"},{"_ref":"D"}]}
{"ZF":[{"_ref":"D"},{"_ref":"E"},{"_ref":"U"}]}
{"FQ":[{"_ref":"K"},{"_ref":"L"},{"_ref":"T"},{"_ref":"U"}]}
{"RF":[{"_ref":"A"},{"_ref":"J"},{"_ref":"I"},{"_ref":"K"}]}
{"OP":[{"_ref":"I"},{"_ref":"H"},{"_ref":"F"},{"_ref":"E"}]}
Just a sample way point. and this is how it will look from "Node Point of view".
Where each of the red Dot will denote a node. with some properties.
{"A" :
{
"Lat":12.34,
"Lng":-78.99,
"Importance":1 // for Contraction Hierarchies.
....
} }
Now, imagine we want to find the shortest path from A to U. The real trick is simplifying the map. We’ll clear out the clutter, focusing on the essentials, and use something called Contraction Hierarchies. It’s like decluttering your room so you can move around more easily
TBH unless you won't see your PC getting freezes you wont appreciate Contraction Hierarchies😅.
In our implementation we have set the importance (Priority) based on the number of node, placing shortcut on indirect connected node. Now once we have applied the Contraction Hierarchies.
Map will transform from
To something like ..
yup, that all we need to do to get our own GUI-less map.
REST implementation is yet to be done, to support cool curl request. Will implement it someday.
I made it look easy but I'm still the champ😅.