
21 November 2023


We continue our article series looking at a common modelling issue: The solution is optimal, but not practical.

In the first article of this series, we made the assertion that we can implicitly mimic rotation of the items by sorting the item dimensions so that the width is always the longer side. Therefore, we didn't need to explicitly represent the orientation of the items in our earlier models.

This insight may not be obvious, so in this article we explicitly model rotation of the items using either/or constraints. That is, we want to represent items that are in either portrait or landscape orientation, with the model choosing only one of the orientations for each product.

The problem is that, in a mathematical programming model, all constraints are always evaluated. So, we can't have a condition like:

   If orientation is portrait
      then use constraint for portrait
      else use constraint for landscape

Instead, we need to create a pair of constraints that behave the way we want, while both are always evaluated. This is done using a binary variable, which acts as a switch to decide which constraint is "active", and some \(BigM\) parameters that change each constraint's right-hand-side in a very specific way.

Our objectives in this article are to:

  • Illustrate how to formulate and implement either/or constraints in an integer programming model, using \(BigM\) constraints.
  • See how our model behaves when explicitly including either/or constraints.
  • Demonstrate that our dimension sorting assertion is correct.

Articles in this series

The articles in this "Optimal but not practical" series are:

  • First attempt. Model 1: Non-linear model that finds optimal solutions for small data sets, but not for the full data set.
  • Linear, partial data set. Model 2: Linearized model using a small sub-set of the full data. Finds local optima.
  • Full data set. Model 3: Extends Model 2 to use the full data set. Finds globally optimal solutions, but requires a large amount of memory.
  • Column generation. Model 4: Variation of Model 2, with randomly-generated columns. May find globally optimal solutions, though not guaranteed to do so.
  • Either/or BigM. Models 5a and 5b: Explicitly model rotation of the items using BigM constraints. About three times the size of Model 3, so difficult to solve.
  • Either/or Disjunction. Model 5c: Explicitly models rotation of the items using Pyomo's Generalized Disjunctive Programming (GDP) extension.
  • Virtual machine. We set up a Google Compute Engine with 128 GB of RAM to run Model 3 on data sets larger than we can run on our local modelling computer.

Download the model

The models described in this series of articles are built in Python using the Pyomo library.

The files are available on GitHub.

Situation

The situation is the same as for the previous models in this series.

Either/or BigM constraints

Representing "if" conditions is a common modelling requirement. We describe some general techniques for modelling logical conditions in our article series Logic conditions as constraints. More techniques are described in the academic article Transformation and linearization techniques in optimization.

Modelling specific logic conditions can be difficult. Fortunately, there is a standard technique for modelling either/or constraints, as described in Section 7.3 of the AIMMS Optimization Modeling Handbook, Either-or constraints.

In the AIMMS situation, they have a pair of constraints where either one or the other should apply:

\begin{alignat}{1} &\sum_{j \in J} \text{a}_{1, j} \times x_{j} \quad &\le &\quad \text{b}_1 \tag{1}\\ &\sum_{j \in J} \text{a}_{2, j} \times x_{j} \quad &\le &\quad \text{b}_2 \tag{2} \end{alignat}

To achieve the desired effect, they modify Equations (1) and (2) by creating a binary variable \(y\) and associated constants \(\text{M}_1\) and \(\text{M}_2\), which are sufficiently large upper bounds on the activity of the constraints. The bounds are chosen to be as tight as possible, while still guaranteeing that the left-hand side of constraint \(i\) is always smaller than \(\text{b}_i + \text{M}_i\). This is called a \(BigM\) technique because the large constants are typically represented using the symbol \(\text{M}\).

Therefore, the constraints are rewritten as:

\begin{alignat}{1} &\sum_{j \in J} \text{a}_{1, j} \times x_{j} \quad &\le &\quad \text{b}_1 + \text{M}_1 \times y \tag{3}\\ &\sum_{j \in J} \text{a}_{2, j} \times x_{j} \quad &\le &\quad \text{b}_2 + \text{M}_2 \times (1 - y) \tag{4}\\ \end{alignat}

This formulation leads to two potential outcomes, depending on the value of the binary variable \(y\):

  • If \(y\) equals \(0\), then the RHS of Equation (3) becomes \(\text{b}_1\) and the RHS of Equation (4) becomes \(\text{b}_2 + \text{M}_2\). As a result, Equation (3) is the same as Equation (1) and the RHS of Equation (4) is large enough that it is always non-binding (effectively, it isn't there).
  • If \(y\) equals \(1\), then the RHS of Equation (3) becomes \(\text{b}_1 + \text{M}_1\) and the RHS of Equation (4) becomes \(\text{b}_2\). As a result, the RHS of Equation (3) is large enough that it is always non-binding (effectively, it isn't there) and Equation (4) is the same as Equation (2).

The result is effectively that if \(y\) equals \(0\), then the model applies Equation (1). If \(y\) equals \(1\), then the model applies Equation (2).
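As a minimal illustration – a sketch only, with hypothetical coefficients and bounds, not part of the models in this series – Equations (3) and (4) can be written in Pyomo as:

import pyomo.environ as pyo

# Hypothetical data for a two-constraint either/or example
a1 = {1: 2, 2: 3}     # Coefficients of constraint (1)
a2 = {1: 4, 2: 1}     # Coefficients of constraint (2)
b1, b2 = 10, 12       # Right-hand-side values
M1, M2 = 50, 50       # Sufficiently large bounds on each constraint's activity

Model = pyo.ConcreteModel()
Model.J = pyo.Set(initialize = [1, 2])
Model.x = pyo.Var(Model.J, within = pyo.NonNegativeReals)
Model.y = pyo.Var(within = pyo.Binary)   # Switch: 0 activates Equation (3), 1 activates Equation (4)

Model.EitherConstraint = pyo.Constraint(expr = sum(a1[j] * Model.x[j] for j in Model.J) <= b1 + M1 * Model.y)         # Equation (3)
Model.OrConstraint     = pyo.Constraint(expr = sum(a2[j] * Model.x[j] for j in Model.J) <= b2 + M2 * (1 - Model.y))   # Equation (4)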

Formulation for Model 5

We start with Model 3, which finds globally optimal solutions. The formulation for Model 5 is almost the same as for Model 3. The only difference is that we replace each of the width and length constraints with a pair of constraints using the either/or technique described above. All going well, Model 5 should find the same globally optimal solutions (or equivalent alternative optima).

In Model 3 we have a constraint that requires an allocated stock product's width to be at least the width of the item it is applied to. That is, we cut an item from a stock product that is larger than, or the same size as, the item's width. We have a similar requirement for the lengths. The Model 3 width constraint is:

\begin{alignat}{1} &\sum_{c=1}^t Allocation_{i, c} \times \text{CandidateWidth}_{c} \quad &\ge \quad \text{Width}_i \tag{5} &\begin{array}{l} \forall \ i \in \{1 \ldots m\} \\ \end{array} \\ \end{alignat}

But if the item can be rotated (or, equivalently, the stock product can be rotated), then we need a pair of constraints:

  • Portrait. The width of the stock product must be >= the width of the item in portrait orientation.
  • Landscape. The width of the stock product must be >= the width of the item in landscape orientation (i.e. the length in portrait orientation).

We want the item to be in either portrait or landscape orientation – it cannot be both. So, we introduce a binary variable as a switch that represents either portrait or landscape orientation for each item. We then need to adjust the constraints' right-hand-sides (RHS) so that one constraint of the pair is potentially binding while the other cannot bind, and vice-versa. This is done by subtracting a carefully chosen \(BigM\) constant from the RHS.

Importantly, one of the pair of constraints refers to the item's width, while the other constraint refers to the item's length – because we rotate the item, so its width becomes its length, and vice versa.

When using a \(BigM\) constant, a common approach is to use the same value for all constraints. While this is simple, it is usually not a good idea because, as the AIMMS document says, "The bounds are chosen such that they are as tight as possible, while still guaranteeing that the left-hand side of constraint \(i\) is always smaller than \(\text{b}_i + \text{M}_i\)". Using the same value for all \(i\) is not as tight as possible. To illustrate the difference, we create two versions of Model 5:

  • Model 5a. Use a single value of \(BigM\) for all item constraints.
  • Model 5b. Use a value of \(BigM_i\) tailored to each item constraint.

A difference from the AIMMS example is that our constraints are expressed as \(\ge\) rather than \(\le\), so our \(BigM\) values need to be negative rather than positive.

Applying the technique to our constraints, in Model 5a we transform our one width constraint into a pair of constraints using a \(BigM\) constant and a binary rotation switch variable \(Rotate_i\) for each item:

\begin{alignat}{1} &\sum_{c=1}^t Allocation_{i, c} \times \text{CandidateWidth}_{c} \quad &\ge &\quad \text{Width}_i + \text{BigM} \times Rotate_i \tag{6} &\begin{array}{l} \forall \ i \in \{1 \ldots m\} \\ \end{array} \\ &\sum_{c=1}^t Allocation_{i, c} \times \text{CandidateWidth}_{c} \quad &\ge &\quad \text{Length}_i + \text{BigM} \times (1 - Rotate_i) \quad \tag{7} &\begin{array}{l} \forall \ i \in \{1 \ldots m\} \\ \end{array} \\ \end{alignat}

Note that the RHS of Equation (7), for the rotated item (\(Rotate_i = 1\)), refers to \(\text{Length}_i\) rather than \(\text{Width}_i\), because we're modelling rotated items. We also need a similar pair of constraints for the length.
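To illustrate with hypothetical numbers: suppose an item has \(\text{Width}_i = 300\) and \(\text{Length}_i = 450\), and \(\text{BigM} = -1006\). If \(Rotate_i = 1\), then the RHS of Equation (6) becomes \(300 - 1006 = -706\), so that constraint can never bind, while the RHS of Equation (7) becomes \(450\), which is the binding requirement for the rotated item. If \(Rotate_i = 0\), the roles are reversed.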

For Model 5b, we do the same thing, except we have an indexed \(BigM_i\) for each item's constraint.

Implementation of Model 5

When enumerating the candidate stock product sizes, we need to include all combinations of not only width by length, but also width by width and length by length. This is because the dimensions of each item could be in any sort order (unlike for Model 3), so we need to ensure that all possible combinations are considered. As a result, Model 5 has three times as many candidates as Model 3 – this is a key reason why we sorted the dimensions in the item data for Model 3, to keep the model size as small as possible.
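As a rough sketch of that enumeration – the data structures and names here are assumptions for illustration, not the model's actual code:

import itertools

# Hypothetical item dimensions, in whatever order they appear in the unsorted data
items = [(300, 200), (150, 400), (250, 250)]

widths  = [w for w, l in items]
lengths = [l for w, l in items]

# Model 3 sorts each item's dimensions, so width-by-length combinations are sufficient.
# Model 5 leaves the dimensions unsorted, so an item's width or length could match either
# candidate dimension, and we also need width-by-width and length-by-length combinations.
candidates = set()
candidates.update(itertools.product(widths, lengths))    # Width by length
candidates.update(itertools.product(widths, widths))     # Width by width
candidates.update(itertools.product(lengths, lengths))   # Length by length

candidate_sizes = sorted(candidates)   # Each tuple is a (candidate width, candidate length)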

In Model 3, the width and length constraints are written in Python/Pyomo as a direct translation of Equation (5) for the width and a similar constraint for the length of each item:

def rule_LBWidth(Model, i):   # Width of allocated product must be at least width of each item it is allocated to
    return sum(Model.Allocation[i, c] * Model.CandidateWidth[c] for c in Model.Candidate) >= Model.Width[i]
Model.MinWidth = pyo.Constraint(Model.Item, rule = rule_LBWidth)

def rule_LBLength(Model, i):   # Length of allocated product must be at least length of each item it is allocated to
    return sum(Model.Allocation[i, c] * Model.CandidateLength[c] for c in Model.Candidate) >= Model.Length[i]
Model.MinLength = pyo.Constraint(Model.Item, rule = rule_LBLength)

Model 5a

In Model 5a, we have pairs of width and length constraints: a direct translation of Equations (6) and (7) for the width, and a similar pair of constraints for the length:

def rule_LBWidth1(Model, i):   # Width of allocated product must be at least width of each item it is allocated to (original width|length order)
    return sum(Model.Allocation[i, c] * Model.CandidateWidth[c] for c in Model.Candidate) >= Model.Width[i] + (Model.BigM * Model.Rotate[i])
Model.MinWidth1 = pyo.Constraint(Model.Item, rule = rule_LBWidth1)

def rule_LBLength1(Model, i):   # Length of allocated product must be at least length of each item it is allocated to (original width|length order)
    return sum(Model.Allocation[i, c] * Model.CandidateLength[c] for c in Model.Candidate) >= Model.Length[i] + (Model.BigM * Model.Rotate[i])
Model.MinLength1 = pyo.Constraint(Model.Item, rule = rule_LBLength1)

def rule_LBWidth2(Model, i):   # Width of allocated product must be at least width of each item it is allocated to (rotated)
    return sum(Model.Allocation[i, c] * Model.CandidateWidth[c] for c in Model.Candidate) >= Model.Length[i] + (Model.BigM * (1 - Model.Rotate[i]))
Model.MinWidth2 = pyo.Constraint(Model.Item, rule = rule_LBWidth2)

def rule_LBLength2(Model, i):   # Length of allocated product must be at least length of each item it is allocated to (rotated)
    return sum(Model.Allocation[i, c] * Model.CandidateLength[c] for c in Model.Candidate) >= Model.Width[i] + (Model.BigM * (1 - Model.Rotate[i]))
Model.MinLength2 = pyo.Constraint(Model.Item, rule = rule_LBLength2)

Model 5a has a single \(\text{BigM}\) value for all constraints. We set the magnitude of \(\text{BigM}\) equal to the largest width or length across all items; in our data for 100 items, that maximum equals 1,006. As noted above, because our constraints are \(\ge\) constraints, the value is applied as a negative number.
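A sketch of how that value might be defined – assuming the item dimensions are held in the Model.Width and Model.Length parameters used in the constraints above, and noting that this is one possible way to set it up rather than the model's exact code:

# Single BigM for Model 5a: the magnitude is the largest width or length over all items,
# negated because the constraints in Equations (6) and (7) are >= constraints
BigMValue = -max(max(pyo.value(Model.Width[i]) for i in Model.Item),
                 max(pyo.value(Model.Length[i]) for i in Model.Item))
Model.BigM = pyo.Param(initialize = BigMValue)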

Model 5b

Similarly, for Model 5b, we have pairs of width and length constraints. The only difference is that each item's constraints use an individual \(\text{BigM}_i\) (note the subscript) rather than a universal \(\text{BigM}\). That is, on the RHS of the constraints, we replace Model.BigM with Model.BigM[i].

We can set the magnitude of each \(\text{BigM}_i\) equal to the absolute difference between that item's width and length (again applied as a negative value) – this is the smallest (tightest) magnitude that achieves the outcome we want.
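A sketch of the indexed version and one of the modified constraints – again, the naming is an assumption rather than the model's exact code:

def rule_BigM(Model, i):   # Tightest magnitude for each item is |width - length|, negated because the constraints are >=
    return -abs(pyo.value(Model.Width[i]) - pyo.value(Model.Length[i]))
Model.BigM = pyo.Param(Model.Item, initialize = rule_BigM)

def rule_LBWidth1(Model, i):   # As in Model 5a, but with the item-specific BigM[i]
    return sum(Model.Allocation[i, c] * Model.CandidateWidth[c] for c in Model.Candidate) >= Model.Width[i] + (Model.BigM[i] * Model.Rotate[i])
Model.MinWidth1 = pyo.Constraint(Model.Item, rule = rule_LBWidth1)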

Solutions

Test with 20 items

Model 5 has more than three times the number of variables that Model 3 has – because it has three times the number of candidate stock product sizes, plus additional binary variables to switch between portrait and landscape orientation for each item. Model 5 also has more constraints, as we've turned each dimension constraint into a pair of constraints. This is an issue because Model 3 with 100 items is already close to the maximum size we can solve on our computer, due to the model's memory requirements.

So, to test Models 5a and 5b, we compare the results with Model 3 using a smaller data set of 20 items and 6 stock products. Importantly, we give Model 3 data where the dimensions of each item are sorted so that the width is always >= the length. For Models 5a and 5b we randomly sort the dimensions of each item, so they are effectively unsorted.

The results, using the HiGHS solver, are:

  • Model 3. Starts with 8,400 variables, reduced to 3,127 by presolve. Solves to optimality in less than 1 second.
  • Model 5a. Starts with 25,220 variables, reduced to 10,237 by presolve. Solves to optimality in 14 seconds.
  • Model 5b. Starts with 25,220 variables, reduced to 10,160 by presolve. Solves to optimality in 10 seconds.

Models 5a and 5b both find the same solution as Model 3. This demonstrates that the either/or constraints are working as expected. We can further demonstrate that Models 5a and 5b work correctly by solving them using the sorted data we used for Model 3 – they both find the same solution, irrespective of how the dimensions are sorted.

The solution times are interesting. Models 5a and 5b both take substantially longer than Model 3. Given that they have more variables and more constraints, this isn't surprising. We note that Model 5b is faster than Model 5a. This is also expected, as Model 5b has tighter \(BigM_i\) values, which usually leads to an easier model for the solver and hence faster solve times.

Full 100 items

Running the models with the full 100 items and 6 stock products (with the dimensions sorted for Model 3, but unsorted for Models 5a and 5b) produces the following results:

  • Model 3. Starts with 1,010,000 variables, reduced to 339,057 by presolve. Solves to optimality in 8 minutes.
  • Model 5a. Starts with 3,030,100 variables, reduced to 1,041,708 by presolve. Solves to optimality in 1.5 hours.
  • Model 5b. Starts with 3,030,100 variables, reduced to 2,019,960 by presolve. Solves to optimality in 11.0 hours.

All three models find the same optimal solution, as expected. Again, this demonstrates that sorting the item dimensions is equivalent to including either/or constraints that explicitly model item rotation.

Model 3 uses a peak of about 7.5 GB of RAM. As expected, Model 5a and Model 5b both use about three times as much memory, with a peak of 20 GB of RAM, reflecting the larger number of variables and constraints.

But the run times are not as expected. After presolve, Model 5b has twice as many variables as Model 5a, and it takes seven times longer to solve. In this case, having tighter \(BigM_i\) values performs poorly.

The issue with Model 5b becomes clearer when we look at where the HiGHS solver spends its time. For both Model 3 and Model 5a, almost all the time is in the presolve phase – about 90% of the time for Model 3, and 97% of the time for Model 5a. This is unusual as, for most models, the presolve phase is a small part of the solving process. For Model 5b, only 16% of the time is spent in presolve. For some reason, HiGHS's presolve eliminates far fewer variables in Model 5b, resulting in a model that is much larger and takes much longer to solve compared with Model 5a.

Having tight \(BigM_i\) values, defined individually for each instance of a constraint instead of a single value for all constraints, is usually a good idea. However, as shown in this example, that isn't always true. The only way to be sure is to try both approaches.

In any case, our either/or constraints work correctly, albeit with reduced solving performance. If we can shortcut the formulation by, for example, sorting the data, then all good. Otherwise, the technique we've described for formulating either/or constraints can be useful.

Next steps

Applying the \(BigM\) constraint technique allows us to formulate either/or constraints, so that we can explicitly model having the items in portrait or landscape orientation.

An alternative way to formulate either/or constraints is to use the Generalized Disjunctive Programming (GDP) part of the Pyomo library. Disjunctive programming is a powerful tool to, as the documentation says, "bridge high-level propositional logic and algebraic constraints".

That is, using notation specific to the GDP library, we specify that the solver should use one or other pair of constraints. We don't need to formulate exactly how that condition is implemented, as the GDP library takes care of the details for us.
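As a flavour of what that looks like – a minimal sketch only, with names and structure that are our assumptions rather than Model 5c's actual code – the orientation choice for each item could be written as a disjunction and then transformed before solving:

from pyomo.gdp import Disjunct, Disjunction

def rule_Portrait(d, i):   # Constraints that apply when item i is in portrait orientation (a length constraint would be added here too)
    Model = d.model()
    d.MinWidth = pyo.Constraint(expr = sum(Model.Allocation[i, c] * Model.CandidateWidth[c] for c in Model.Candidate) >= Model.Width[i])

def rule_Landscape(d, i):   # Constraints that apply when item i is rotated to landscape
    Model = d.model()
    d.MinWidth = pyo.Constraint(expr = sum(Model.Allocation[i, c] * Model.CandidateWidth[c] for c in Model.Candidate) >= Model.Length[i])

Model.Portrait = Disjunct(Model.Item, rule = rule_Portrait)
Model.Landscape = Disjunct(Model.Item, rule = rule_Landscape)

def rule_Orientation(Model, i):   # Exactly one orientation applies to each item
    return [Model.Portrait[i], Model.Landscape[i]]
Model.Orientation = Disjunction(Model.Item, rule = rule_Orientation)

pyo.TransformationFactory('gdp.bigm').apply_to(Model)   # Convert the disjunctions to BigM constraints before solving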

So, in the next article, we implement a variation of Model 5 using Generalized Disjunctive Programming.

Conclusion

In this article, we apply a \(BigM\) constraint technique to implement either/or conditions on pairs of constraints, instead of implicitly mimicking the same effect by sorting the data. The \(BigM\) constraint technique is fairly simple, and it works as expected.

Model 5 has two variants, 5a and 5b, which explore the effect of having a single \(BigM\) value for all constraints or having individual \(BigM_i\) values for each constraint. Usually, we expect the tighter individual \(BigM_i\) values to perform better – as Model 5b does with small data sets. But with larger data sets, the performance of Model 5b degrades drastically. This is unexpected, and unusual – it appears to be due to a peculiarity in the way the solver handles our model. Although having individual \(BigM_i\) values is usually better, our experience with this model demonstrates that it isn't always better.

In the next article, we implement either/or constraints using the Generalized Disjunctive Programming (GDP) part of the Pyomo library.

If you would like to know more about this model, or you want help with your own models, then please contact us.

References

This series of articles addresses a question asked on Reddit:

Require improved & optimized solution, 14 September 2023.