30 June 2024

Most of the optimization models we work with involve finding the best combination of discrete decision variables: for example, allocating people to teams, designing warehouses, or scheduling staff.

A frequent question is, "Why not just look at all possible combinations and pick the best one?"

The answer is, sometimes, that's exactly what we do. In some situations, enumerating all feasible combinations of the variables is the easiest method for solving a problem.

But, in most situations, the number of combinations is too large. Many orders of magnitude too large. That's when we need more sophisticated methods, like mixed integer linear programming and constraint programming.

In this series of articles, we look at a simple situation that requires deciding the best order for positioning devices in a rack. We use four methods for solving this problem:

• Model 1. Enumerate all possible position orders.
• Model 2. Search randomly for a specified time.
• Model 3. Constraint programming using OR-Tools.
• Model 4. Mixed integer linear programming using Pyomo.

Along the way, we see how a problem's size can quickly escalate to a colossal magnitude. We also demonstrate how, contrary to popular belief, that magnitude is not necessarily a barrier to finding a good solution.

The models described in this series of articles are built in Python.

The files are available on GitHub.

## Situation

We have a computer server room that houses network devices. The devices are arranged as a stack in a rack, with a vertical distance of 0.1m from device to device. We need to use cables to connect each device to one or more other devices in the rack. Each connection must be between the same ports in the devices (i.e., the cables must run vertically). The number of connection ports is not a limiting factor, as we can add more modules to each device.

For example, Figure 1 shows a rack of eight devices, each with four modules, with the following connection requirements:

• Device A ↔ Device B: 2 cables.
• Device A ↔ Device C: 1 cable.
• Device A ↔ Device F: 3 cables.
• Device A ↔ Device H: 1 cable.
• Device B ↔ Device D: 1 cable.
• etc.
• Device G ↔ Device A: 1 cable.
• Device H ↔ Device B: 1 cable.

In this example, we've arranged the devices in alphabetical order from top to bottom. The 23 cables have a combined length of 7.3m (e.g., 2x0.1m from Device A to Device B, plus 1x0.2m from Device A to Device C, etc.). For convenience, we'll measure cable length in decimetres, so this arrangement has a total length of 73 decimetres.

To make the server room as tidy as possible, and reduce cabling costs, our objective is to minimize the total length of cable needed to make the required connections. That is, in what order should we position the devices to minimize the total cable length?
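The length calculation is simple enough to sketch in a few lines of Python. The snippet below uses only the five connections listed above plus the two closing the list (the full data set, which gives the 73 dm total, is not reproduced here), with devices A to H indexed 0 to 7 and device d placed at position d (alphabetical order):

```python
# Adjacent devices are 1 dm apart, so a cable between rack positions
# p and q has length |p - q| dm. With devices in alphabetical order,
# device d sits at position d.
partial = (2 * abs(0 - 1)    # A <-> B: 2 cables
           + 1 * abs(0 - 2)  # A <-> C: 1 cable
           + 3 * abs(0 - 5)  # A <-> F: 3 cables
           + 1 * abs(0 - 7)  # A <-> H: 1 cable
           + 1 * abs(1 - 3)  # B <-> D: 1 cable
           + 1 * abs(6 - 0)  # G <-> A: 1 cable
           + 1 * abs(7 - 1)) # H <-> B: 1 cable
print(partial)  # → 40 dm from these seven connections alone
```

The elided connections account for the remaining 33 dm of the 73 dm total.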

## Source for this situation

This situation is based on the Cabling Problem model by Håkan Kjellerstrand, using the CPMpy library to represent the situation as a constraint programming problem. That model is based on earlier models by Dennis Yurichev, using simulated annealing and Z3.

## Model 1 design

We need to decide the position of each device in the rack. The length of cabling between each pair of connected devices, and hence the total cable length, can then be calculated.

With 8 devices, there are 8! = 40,320 possible permutations of device positions. That is, any of the 8 devices could be in the top position, any of the other 7 devices could be second from top, 6 devices in the next position, etc., down to the last device in the bottom position. Note that we're interested in permutations, rather than combinations, because the order of the devices matters.

We call each permutation of the device order a "case". Given that it should be quick to evaluate each case, forty thousand cases is a manageable number. Therefore, our first modelling method is to enumerate all possible cases and simply pick the case with the shortest total cable length.

## Model 1 implementation

### Data structure

We start with defining the data. For consistency, we use the same type of data definition as used in Håkan Kjellerstrand's Cabling Problem. The data for 8 devices is shown in Figure 2. This is a Python file called data_08.py, which we'll import directly into our Python code.
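A sketch of what such a data file might look like is shown below. The exact field names and layout in data_08.py follow Håkan Kjellerstrand's format, which we don't reproduce here; the names and the connection list below are illustrative assumptions only, using a subset of the connections from Figure 1:

```python
# data_08.py -- hypothetical sketch of the data file layout
# (field names are assumptions; the real file follows Kjellerstrand's format).
num_devices = 8

# Each required connection as (device_i, device_j, number_of_cables),
# with devices indexed 0..7 corresponding to A..H.
connections = [
    (0, 1, 2),  # A <-> B: 2 cables
    (0, 2, 1),  # A <-> C: 1 cable
    (0, 5, 3),  # A <-> F: 3 cables
    (0, 7, 1),  # A <-> H: 1 cable
    (1, 3, 1),  # B <-> D: 1 cable
    # ... remaining connections elided
]
```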

### Python program

The code for enumerating all cases is straightforward. After importing the data, we calculate the number of permutations and then loop over all of them, noting the best solution found so far. We also count the number of times a solution of that length is found. For optimal solutions, this is the number of alternative optima.
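The core of that loop can be sketched as follows. This is not the GitHub code itself, just a minimal illustration of the method, using an assumed connection list (a subset of the Figure 1 connections) and itertools.permutations to generate the cases:

```python
import itertools
import math

NUM_DEVICES = 8
# Assumed data: (device_i, device_j, number_of_cables), devices 0..7 = A..H.
# This is only part of the full 8-device data set.
CONNECTIONS = [(0, 1, 2), (0, 2, 1), (0, 5, 3), (0, 7, 1), (1, 3, 1)]

def cable_length(order, connections):
    """Total cable length (in dm) for a given top-to-bottom device order."""
    position = {device: slot for slot, device in enumerate(order)}
    return sum(cables * abs(position[i] - position[j])
               for i, j, cables in connections)

best_length = math.inf
best_order = None
num_best = 0  # how many cases achieve the best length found so far

for order in itertools.permutations(range(NUM_DEVICES)):
    length = cable_length(order, CONNECTIONS)
    if length < best_length:
        best_length, best_order, num_best = length, order, 1
    elif length == best_length:
        num_best += 1

print(best_length, best_order, num_best)
```

Because the loop visits every permutation, num_best at the end is the number of alternative optima.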

### Model 1 result for 8 devices

On our test PC, Model 1's code takes less than 0.1 seconds to run, evaluating the 40,320 cases at a rate of 477,129 cases per second. Since we evaluate all possible cases, we are guaranteed to have found an optimal solution.

One of the optimal solutions for 8 devices is shown in Figure 4, and some of Model 1's output is shown in Figure 5. The optimal solution has a total cable length of 46, meaning 4.6m, which is substantially better than the solution of 7.3m shown in Figure 1.

An advantage of enumerating all cases is that we can identify alternative optima. For our 8 device data set, there are 6 alternative optima, each with a total length of 4.6m.

### What if we have more devices?

If we have only 8 devices, then the problem is solved. Enumerating all possible cases is sufficient for this size problem. But what if we have more devices?

The number of cases equals the factorial of the number of devices. The problem with the factorial function is that it grows rather quickly:

• For 8 devices, there are 8! = 40,320 cases. At the rate of almost half a million cases per second, it takes Model 1 about 0.1 seconds to evaluate all cases.
• For 12 devices, there are 12! = 479,001,600 cases – which would take about 17 minutes to evaluate.
• For 16 devices, there are 16! = 20,922,789,888,000 cases – which would take more than 500 days to evaluate.
• For 20 devices, there are 20! = 2,432,902,008,176,640,000 cases – which would take about 160,000 years to evaluate.
• For 24 devices, there are 24! = 620,448,401,733,239,439,360,000 cases – evaluating that many cases would take about three times the current age of the universe.
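These estimates are easy to reproduce. A quick sketch, assuming we maintain the measured rate of 477,129 cases per second:

```python
import math

RATE = 477_129  # cases per second, as measured for Model 1 on our test PC
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

for n in (8, 12, 16, 20, 24):
    cases = math.factorial(n)       # number of permutations of n devices
    years = cases / RATE / SECONDS_PER_YEAR
    print(f"{n:>2} devices: {cases:>28,} cases, about {years:,.3g} years")
```

For 24 devices this gives roughly 4 x 10^10 years, which is about three times the current age of the universe.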

That's assuming we can maintain a rate of almost half a million cases per second. But cases with more devices take longer to evaluate, so the time to evaluate the larger data sets would be even longer than indicated above. For example, with 24 devices, Model 1 evaluates 216,212 cases per second. Therefore, evaluating all cases would take more than six times the current age of the universe – though the difference hardly seems to matter.

The extremely rapid growth in the number of cases is a type of combinatorial explosion, where a small increase in the number of items produces a large increase in the number of combinations or permutations that exist. Combinatorial explosions are very common in models that involve finding the best combination of discrete decision variables.

Note that we don't claim Model 1 is the fastest way to enumerate all cases in this situation. No doubt it is possible to write a faster program. For example, our program is single-threaded while, based on our recent experience, a multi-threaded approach might be 10 times faster. But, given the rate at which the number of cases increases, a faster program (and/or a faster computer) soon runs into the same problem: The number of cases grows too quickly to be enumerated in a reasonable time.

### Best solution for each data set, given up to one hour run time

We've created additional data sets for 9 to 24 devices by adding devices and connections to the 8 device data set. Each data set includes the previous data set as a subset, so the shortest total length must increase from data set to data set.

Since we can't wait for hundreds of days, or billions of years, we allow a more reasonable time limit of 1 hour. The result of running each data set for up to 1 hour is shown in Figure 6.

The dark orange points represent data sets that complete evaluating all enumerated cases within 1 hour, meaning that they are optimal solutions. The lighter orange points are the best solution found at the 1 hour run time limit.

In this situation, Model 1's enumeration method works well up to 12 devices. Beyond that it works less and less well as the number of devices increases. For the larger data sets, Model 1 evaluates a tiny proportion of all possible device orders. Worse, it evaluates the device orders in only one specific part of the solution space, so it may be a long way from finding an optimal solution – or even a good solution. We need a better method for handling more than 12 devices.

## Next steps

Enumerating all possible cases works well, in this situation, if we have only a few devices – up to 12 devices if we wait up to 1 hour, or 13 devices if we are prepared to wait a few hours. If we have more devices, then there are simply too many permutations to enumerate and evaluate in a reasonable time.

An alternative to attempting to enumerate all cases is to use a search method to find good solutions. There are many ways to search the solution space. The simplest way is to generate random device orders. This method has the advantage of sampling from all parts of the solution space, unlike Model 1's enumeration method, which is limited to a specific part of the solution space.

So, in the next article, we try a random search method – to see if it works better than enumeration, especially for the larger data sets.

## Conclusion

In this article, we attempt to enumerate and evaluate all possible cases for the positioning of devices in a rack.

For a small number of devices, the enumeration method is sufficiently fast, and the solution space sufficiently small, that we can evaluate all possible cases. Consequently, for a small number of devices, Model 1 finds optimal solutions in a reasonable time.

But for a larger number of devices, the time to enumerate all cases rapidly gets well out of hand. Therefore, the best we can do is evaluate a tiny proportion of the total cases. Consequently, the best solutions we find for larger data sets are likely to be far from optimal.

In the next article, we explore a different approach to solving the same problem – using a random search method, to see if it performs better for larger problems.