mehrdadn / SOTA-Py
- суббота, 1 июля 2017 г. в 03:12:39
A discrete-time Python-based solver for the Stochastic On-Time Arrival routing problem
SOTA-Py is a Python-based solver for the policy- and path-based "SOTA" problems, using the algorithm(s) described in Tractable Pathfinding for the Stochastic On-Time Arrival Problem (also in the corresponding arXiv preprint) and previous works referenced therein.
What is the SOTA problem? Read on...
It's the reliable routing problem:
How do you travel from point A to point B in T time under traffic?
For example, you might have a meeting in San Jose at 3pm, and one to reach in San Francisco at 4pm.
Or you might need to get from your house to the airport in less than 1 hour.
No. It doesn't let you specify a time budget. It only lets you specify a departure or arrival time, but not both.
What it (probably) gives you is the path with the least expected (average) time to your destination.
No. That would only be the case if traffic was perfectly predictable.
If you don't have a lot of time, you might need to take a riskier path (e.g. a highway), otherwise you might have no chance of reaching your destination on time. But if you have a lot of time, you might take a safer path (like side roads) that no one uses, to avoid suddenly getting stuck in the middle of, say, a highway, due to traffic.
That means your time budget can affect your route.
It is the case of the SOTA problem where you decide which road to take based on how much time you have left. You'd probably need a navigation device for this, since there are too many possibilities in the "policy" to print on paper.
This is what you'd prefer to do, because it can potentially give better results depending on whether you get lucky/unlucky with traffic.
This is a dynamic-programming problem, because the probability of reaching your destination on time is just the maximum probability of reaching it from each road at the next intersection.
It is the SOTA problem in the case where you statically decide on the entire path before you depart.
You can just print out a map for this on paper, the old-fashioned way.
This is—counterintuitively!—a much tougher problem than finding the policy.
Even though the solution looks simpler (it's just a path rather than a policy),
it's much harder to compute.
Why? Intuitively, it's because after you travel a bit,
you won't necessarily be on the most optimal path anymore,
so you can't make that assumption to simplify the problem initially.
By contrast, in the policy-based scenario, you always assume that your future actions are optimal,
so you have an optimal subproblem structure to exploit.
The (unhelpful) ultra-short version is that Dijkstra's algorithm is used for policy-based SOTA and A* is used for path-based SOTA.
The (more helpful) short version is:
For the long version, please see the paper linked above, and others referenced inside. The paper should (hopefully) be quite easy to follow and understand, especially as far as research papers go.
Note that the pre-processing algorithms from the paper (such as arc-potentials) are not implemented, but they should be far easier to implement than the pathfinding algorithms themselves.
This code models the travel time across every road as a mixture of Gaussian distributions (GMM) ("censored" to strictly positive travel times). It discretizes the distributions and solves the discrete problem in discrete-time.
Obviously, realistic travel times are not normally distributed. But that's the model of the data I had. Getting good traffic data is hard, and encoding data efficiently is also hard. If you don't like the current model, you'd have to change the code to accommodate other models.
The road network and traffic data is assumed to be a concatentation of JSON objects, each as follows:
{
"id": [10000, 1],
"startNodeId": [1000, 0],
"endNodeId": [1001, 0],
"geom": { "points": [
{"lat": 37.7, "lon": -122.4},
{"lat": 37.8, "lon": -122.5}
]},
"length": 12,
"speedLimit": 11.2,
"lanes": 1,
"hmm": [
{"mode": "go", "mean": 1.2, "cov": 1.5, "prob": 0.85},
{"mode": "stop", "mean": 7, "cov": 0.1, "prob": 1.5E-1}
]
}
Note the following:
This code isn't intended to finish any job for you. It's certainly not production-quality. It's just meant to help any researchers working on this topic get started and/or cross-check their algorithm correctness.
Given that it's not meant to be used in any production, I don't plan on actively maintaining it unless I encounter bugs (or if I see enough interest from others).
There's no short "getting started" code example, sorry. The main startup file is basically a (very) long example.
It's pretty self-explanatory:
python Main.py --source=65280072.0 --dest=65345534.0 --budget=1800 --network="data/sf.osm.json"
The time discretization interval is automatically chosen to be the globally minimum travel time across any edge in the network, since it should be as large as possible (for speed) and smaller than the travel time of every edge. You would need to change this in the code for greater accuracy.
Note that a time budget that is too high can cause the pathfinding algorithm to thrash exponentially, because
the reliability of every path reaches 100% as your time budget increases, and the algorithm ends up
trying all of them.
However, realistically, you would not need to run this algorithm for very high time budgets.
A classical path would already be reliable enough.
Note that you (obviously) need both a map and traffic data to run this code. Unfortunately I can't release the dataset I used in the paper, but I have a sample map from OpenStreetMap, and the code attempts to naively fill in missing traffic data, so that should be good enough to get started.
Here's an example of what one can get in 15 seconds on my machine. The code runs in two phases:
Please refer to the license file.
For attribution, a reference to the aforementioned article (which this code is based on) would be kindly appreciated.
If you find a bug, have questions, would like to contribute, or the like, feel free to open a GitHub issue/pull request/etc.
For private inquiries (e.g. commercial licensing requests), you can find my contact information if you search around (e.g. see the paper linked above).