traveling salesman

The Traveling Salesman Problem (TSP) asks: what is the shortest route to visit a given set of cities and then return to the starting point? For example, if you wanted to visit the 15,112 largest cities in Germany in the shortest distance possible, which route do you take?

Solving such a TSP is currently very computationally expensive. If the shortest route could be determined and proved to be the shortest without checking every other route this would constitute a solution to one of six remaining Millennium Prize problems.

The TSP is one embodiment of a class of problems described with the shorthand P and NP. Proving shortest routes to a TSP can be calculated without checking all routes effectively proves P=NP, which is a way of saying "solutions to what we thought were hard computational problems actually have an easier compute solution." This has a much broader impact than simply route checking.

Peter Norvig, Google's head of R&D, wrote a fantastic notebook that builds up Python code for solving a TSP. He explains how computation time required to check every conceivable route increases exponentially as the number of cities increases linearly. Presently, shortest routes are proven to be so only by checking every route, so it can take dozens of computing-years to solve ten thousand city maps. Only one single hundred-thousand city map has ever been solved.

Another excellent Jupyter notebook written by Mortada Mehyar in 2015 for all of the Tesla supercharger stations is here: The Traveling Tesla Salesman

what would a study of only perfect TSP solutions reveal?

If we stepped around the route calculation challenge and studied only known solutions, what patterns might emerge?

The reason this question is interesting is this is very nearly the first time in history where it has been possible to ask it. 50-100 years ago, there weren't collections of large, solved TSPs. But now there are libraries of them representing (collectively) hundreds of computing-years of work.

This is the known solution to the Germany TSP (file: d15112.tsp). Exclusive of the path length, there are two qualities (that I can think of) we might consider analyzing:

  • What is the angle formed at the city by the inbound and outbound connection?

  • Did each segment connect the next-nearest neighbor, or 2ⁿᵈ-nearest, 3ʳᵈ-nearest, nᵗʰ-nearest?

The distributions of angle and nᵗʰ-nearest neighbor for each of the 15,112 connections for the known-shortest route for this map follow.

Qualitatively, the following features are apparent:

  • There is a dearth of acute angles <30°

  • There are a geometrically increasing number of instances with increasing angle until arriving at ~90°

  • There is no discernible pattern of angle abundance between 90°-180°

Qualitatively, the following features are apparent:

  • The best solution to d15112.tsp does not result from always connecting the nearest-neighbor. This is why so-called "Greedy" TSP algorithms, which are computationally very fast, don't deliver the shortest route.

  • There appears to be an exponentially-decreasing distribution of connections following shortest-to-longer

  • The shortest route requires connecting at least one instance of the 35ᵗʰ and 176ᵗʰ nearest cities!

If you are curious which connection bypasses 175 nearer cities to connect the 176ᵗʰ nearest city, here it is highlighted in red. It's the coastal city that must skip 175 inland cities to go out into the North Sea and pick up Heligoland.

Current working hypothesis: Shrink-wrapping a TSP may produce the optimal path. The conditions are:

  • As they bend and move, the black path lines never collide with one another, only cities

  • The formula for the curve bending inward may simply begin as a soap bubble film minimizing ℓcosθ, or it may adopt one of dozens of other curve equations: catenary, bézier, parabola, cissoid, conchoid, brachistochrone, etc.

  • It may prove fruitful to change from one curve to another mid-process. Or, adjust the curve parameters with time as the shrink-wrap proceeds. (12/11/2019)