|
On the Multi-Criteria Nature of Real-World Problems
Real-life is rarely characterized by a position on the real number line, and even is frequently too impoverished in descriptive ability. Consider the apparently harmless question: “How do you like this article?”. If we ask you to rate it with an integer in the range [0 , 10], then (we claim, unless you feel it sings ‘zero’ from every corner), you will find it quite difficult, and feel compelled to qualify your answer (e.g. “Well, I think it’s a 5, but the list of references is better than that suggests, and the attempts at humor much worse, …”). If instead we provide a standard paper review form, which asks for ratings in perhaps ten distinct categories, you will feel that your responses give a much fairer reflection, and may even find it easier to fill out this form than distil your feelings into a single number. A more particular problem with asking for a single-value rating is this: suppose IEEE members have, about a year from now, rated, from 0 (poor) to 10 (excellent) every article that appeared in the first volume of the IEEE Neural Networks Society Newsletter. Suppose further that five of these articles have a mean rating of 9.5. Which of these is the ‘best’? Immediately it is clear that the response ‘they are equally good’ is unlikely to be correct, but the key point is that the single-value ratings do not help with this decision. If instead we had several ratings for each article, each rating for a different criterion, there would be much more information available to help discriminate between them. This would have allowed a rater to say ‘well, although I’d give it a rating of 9 out of 10 if I had to stick to one number, this multi-criteria system enables me to point out that the list of references is not really complete, and the overall impact it is likely to have is not quite at the level of other articles I have read’.
Thus characterized, we now have a vector of scores for each article, and the special multi-criteria concept of domination can come into play. One excellent article which scores (10, 8, 9, 10) respectively on a list of four criteria can be said to dominate another article whose scores are, respectively, (9, 8, 9, 10); the first article is at least as good as the second in every respect, and better in one respect (the first attribute). Crucially, however, we also have the concept of non-domination. Quite different articles may have the profiles (8,10,8,9) and (10,8,7,10) respectively. Neither of these dominates the other, and we say that they are non-dominated with respect to each other. It turns out that we now have a similar difficulty to the single-objective one in which we may require the ‘best’ of several articles with the same single-value score. The difficulty now is to choose the so-called ‘best’ of a set of articles which are all non-dominated with respect to each other.
However, the difficulty is not quite the same, for two increasingly revealing reasons. And to see this, we will move back into the world of optimization and consider that we are interested in, say, the design of a telecommunications network, or a factory production schedule, or a VLSI chip. First, in the single-scalar-value scenario, optimization methods are far more likely to present us with only one, or very few, ‘optimized’ designs. Even if there are several, these are highly likely to be very similar to each other, so it is essentially a single design which scores better than any other design that the optimizer could find in that trial run. Hence, this difficulty does not often arise in single-objective approaches. Second, when this arises in modern multi-criteria approaches, it is not a problem at all, but an opportunity. In fact, returning a set of non-dominated solutions is a priceless result of the optimization method, in which most likely there will be several very distinct solutions among the non-dominated set. This collection itself conveys a great deal of useful information about the space of possible workable designs to the owner of the problem.
For example, whereas the result of single-objective optimization may be a single design for an Olympic Games athletics schedule which minimizes well the degree to which medal ceremonies interfere with events happening at the same time, this solution may be quite awful from the athletes’ viewpoint, forcing, for example, successive heats of events such as the 100m, 200m, and 400m to be too close together for comfort. Especially for athletes involved in more than one of these events, performances will be severely hampered and the world-record toll of this Games would be sharply reduced. However, a proper multi-objective optimizer will give the problem solver a good range of quite distinct alternatives.
See Figure 1 – this shows a hypothetical picture of the space of good solutions in our Olympic Games example. The horizontal axis indicates the number of conflicts (overlaps in time) between athletics events and medal ceremonies, increasing to the right. We clearly aim to minimize this. The vertical axis is meant to indicate the degree to which two tiring events (involving one or more athletes performing in both of them) are uncomfortably close together in time. This increases as we go up, and we clearly want to minimize this too. The red and blue circles represent the non-dominated set of solutions returned by a multi-objective optimizer. In fact we will pretend that these are the best solutions possible (i.e. they are the same non-dominated set we would find if we searched the space of solutions exhaustively), and in this case they would represent what we call the Pareto front, and each of them is a Pareto-optimal point. The name comes from Vilfredo Pareto, an Italian economist and sociologist, known for, among other things, his ground-breaking work on the application of mathematical ideas in economics. This was crystallized in Pareto (1906), which set out the concept of a Pareto optimum.
Illustrating salient concepts of Multi-objective optimization on a contrived athletics events scheduling problem.
Now, note the topmost/leftmost blue circle in Figure 1. This corresponds to what we described earlier as the result from our single-objective optimizer. It scores very well at minimizing conflicts between athletics events and medal ceremonies, in fact it is the best possible solution in this sense, however note that it scores very poorly from the atheletes’ viewpoint; perhaps, for example, it involves all four successive stages of the 400m in one day.
This single objective optimization approach will probably have been trying to minimize , where m tallies the conflicts on the horizontal axis and d represents the degree of discomfort on the vertical axis, and and are positive weights, or penalty coefficients, roughly reflecting the respective importance of these two objectives. Notice that every distinct choice for the penalty coefficients corresponds to a different version of the optimization problem, and correspondingly has a different optimum. In fact, if we say , for some a, then we define a specific line with a specific slope, and all points in design space which fall on this line achieve the same single-objective quality. The purple line segments in figure 1 correspond to (parts of) two such lines, for two different values of a, for the same set of penalty coefficients. So, with the aid of figure 1 we can see that the optimum value with regard to a particular choice of weights corresponds to the smallest value of a for which a line is tangential to the Pareto front. Whatever choice of weights led to the purple lines, we can see that the extremal point we have discussed (the topmost/leftmost blue circle) is the optimum of that particular single objective function.
What is worth mentioning at this point is that all of the other Pareto solutions in this example are suboptimal for this set of weights, So, a simulated annealing approach which happened to find the optimum, for example, will have discarded any of the other Pareto optimal points it may or may not have met on the way. Had we a different set of weights instead, our earlier optimum would now be suboptimal, and a good optimizer would find another point along the Pareto front, corresponding to the tangent which relates to this choice of weights (e.g. perhaps this new set of weights will correspond to the slope of the orange line in figure 1. However, it is also worth noting that none of the red points in figure 1 is the optimal solution to any such penalty function. Such ‘concave’ regions in a Pareto front are very common, and may be very hard for single-objective optimizers to happen upon (and keep), although they may correspond to ideal tradeoffs between the objectives from the problem-solver’s viewpoint.
|