Traveling salesman problem: mlrose / 2opt

3 min readJan 16, 2022

I have a lot of points in Leuven (my hometown) and I am kind of obsessed with Pokemon Go. So I got hold of the coordinates of all the stops and gyms in Leuven.

Data and script is on Github.

What I was trying first was using mlrose (link to mlrose-hiive as that one has a few minor updates for the traveling salesman problem, notably the ‘six’ import issue).

The code was basically this:

dist_list = [
    (x, y, distance(coordinates[x], coordinates[y]))
    for x, y in combinations(range(len(coordinates)), r=2)
]fitness_dists = mlrose.TravellingSales(distances=dist_list)
problem_fit = mlrose.TSPOpt(length=len(coordinates), fitness_fn=fitness_dists, maximize=False)

best_state, best_fitness = mlrose.genetic_alg(problem_fit)

print("Best length after optimization:", best_fitness)

write_gps_file(coordinates[best_state, :], 'Leuven')

However, there was something wrong with this one… The genetic algorithm kept on running. So I reduced the max_iters from infinite to 20.

The result:

Length of default route: 6.882879081290284
Best length after optimization: 35394.49401502668
Time taken: 0:01:28.207838

That’s bad I think. The result is even worse. So likely I’m doing something wrong (any help would be welcome 😉).

I was thinking about optimizing this one, but the result was just too bad… There should be a better way.

I found a 2opt solution at StackOverflow. After a few improvements I ended up with this code:

def main_2opt(route):
    def _swap_i_j():
        new_route = route.copy()
        new_route[i:j] = route[j - 1:i - 1:-1]
        return new_route

    route_distance = cost(route)

    improved = math.inf
    while improved > 0:
        improved = 0
        for i in range(1, len(route) - 2):
            for j in range(i + 1, len(route)):
                if j - i == 1:
                    continue
                new_route = _swap_i_j()
                new_distance = cost(new_route)
                if new_distance < route_distance:
                    route = new_route
                    route_distance = new_distance
                    improved += 1

    return route

The main changes with the one in the SO answer are that I make a copy() of the data instead of a reference. The reason is that when I use the original code new_route = route[:] , I ended up with a very similar route as above:

Length of default route: 6.882879081290284
Cost of route after 2opt:  7.781628384312426
Time taken: 0:00:08.219998

So, my guess is that the genetic algorithm is suffering from the same behaviour. Maybe something changed in numpy where the copy is actually just a view on the same data?

When I use copy() on the np.array, I get:

Length of default route: 6.882879081290284
Cost of route after 2opt:  0.5736902976055145
Time taken: 0:00:24.115002

A MUCH cleaner solution, don’t you think?

There might be a reason why mlrose is as bad as the 2opt solutions first try. And I have a gut feeling it has something to do with the copying and/or referencing of the data.
Just not going to dig into it right now…

Traveling salesman problem: mlrose / 2opt

Written by Steven Van Ingelgem