Traveling salesman problem: mlrose / 2opt
I have a lot of points in Leuven (my hometown) and I am kind of obsessed with Pokemon Go. So I got hold of the coordinates of all the stops and gyms in Leuven.
Data and script is on Github.
What I was trying first was using mlrose (link to mlrose-hiive as that one has a few minor updates for the traveling salesman problem, notably the ‘six’ import issue).
The code was basically this:
dist_list = [
(x, y, distance(coordinates[x], coordinates[y]))
for x, y in combinations(range(len(coordinates)), r=2)
]fitness_dists = mlrose.TravellingSales(distances=dist_list)
problem_fit = mlrose.TSPOpt(length=len(coordinates), fitness_fn=fitness_dists, maximize=False)
best_state, best_fitness = mlrose.genetic_alg(problem_fit)
print("Best length after optimization:", best_fitness)
write_gps_file(coordinates[best_state, :], 'Leuven')
However, there was something wrong with this one… The genetic algorithm kept on running. So I reduced the max_iters from infinite to 20.
The result:
Length of default route: 6.882879081290284
Best length after optimization: 35394.49401502668
Time taken: 0:01:28.207838
That’s bad I think. The result is even worse. So likely I’m doing something wrong (any help would be welcome 😉).
I was thinking about optimizing this one, but the result was just too bad… There should be a better way.
I found a 2opt solution at StackOverflow. After a few improvements I ended up with this code:
def main_2opt(route):
def _swap_i_j():
new_route = route.copy()
new_route[i:j] = route[j - 1:i - 1:-1]
return new_route
route_distance = cost(route)
improved = math.inf
while improved > 0:
improved = 0
for i in range(1, len(route) - 2):
for j in range(i + 1, len(route)):
if j - i == 1:
continue
new_route = _swap_i_j()
new_distance = cost(new_route)
if new_distance < route_distance:
route = new_route
route_distance = new_distance
improved += 1
return route
The main changes with the one in the SO answer are that I make a copy()
of the data instead of a reference. The reason is that when I use the original code new_route = route[:]
, I ended up with a very similar route as above:
Length of default route: 6.882879081290284
Cost of route after 2opt: 7.781628384312426
Time taken: 0:00:08.219998
So, my guess is that the genetic algorithm is suffering from the same behaviour. Maybe something changed in numpy where the copy is actually just a view on the same data?
When I use copy()
on the np.array
, I get:
Length of default route: 6.882879081290284
Cost of route after 2opt: 0.5736902976055145
Time taken: 0:00:24.115002
A MUCH cleaner solution, don’t you think?
There might be a reason why mlrose is as bad as the 2opt solutions first try. And I have a gut feeling it has something to do with the copying and/or referencing of the data.
Just not going to dig into it right now…