Google OR-Tools v9.14
a fast and portable software suite for combinatorial optimization
Loading...
Searching...
No Matches
pricing.h
Go to the documentation of this file.
1// Copyright 2010-2025 Google LLC
2// Licensed under the Apache License, Version 2.0 (the "License");
3// you may not use this file except in compliance with the License.
4// You may obtain a copy of the License at
5//
6// http://www.apache.org/licenses/LICENSE-2.0
7//
8// Unless required by applicable law or agreed to in writing, software
9// distributed under the License is distributed on an "AS IS" BASIS,
10// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
11// See the License for the specific language governing permissions and
12// limitations under the License.
13
14#ifndef OR_TOOLS_GLOP_PRICING_H_
15#define OR_TOOLS_GLOP_PRICING_H_
16
17#include <cmath>
18#include <random>
19#include <string>
20
21#include "absl/log/check.h"
22#include "absl/random/bit_gen_ref.h"
23#include "absl/random/random.h"
25#include "ortools/util/bitset.h"
26#include "ortools/util/stats.h"
27
28namespace operations_research {
29namespace glop {
30
31// Maintains a set of elements in [0, n), each with an associated value and
32// allows to query the element of maximum value efficiently.
33//
34// This is optimized for use in the pricing step of the simplex algorithm.
35// Basically at each simplex iterations, you want to:
36//
37// 1/ Get the candidate with the maximum value. The number of candidates
38// can be close to n, or really small. You also want some randomization if
39// several elements have an equivalent (maximum) value.
40//
41// 2/ Update the set of candidate and their values, where the number of update
42// is usually a lot smaller than n. Note that in some corner cases, there are
43// two "updates" phases, so a position can be updated twice.
44//
45// The idea is to be faster than O(num_candidates) per GetMaximum(), most of the
46// time. All updates should be in O(1) with as little overhead as possible. The
47// algorithm here dynamically maintain the top-k (for k=32) with best effort and
48// use it instead of doing a O(num_candidates) scan when possible.
49//
50// Note that when O(num_updates) << n, this can have a huge effect. A basic O(1)
51// per update, O(num_candidates) per maximum query was taking around 60% of the
52// total time on graph40-80-1rand.pb.gz ! with the top-32 algo coded here, it is
53// around 3%, and the number of "fast" GetMaximum() that hit the top-k heap on
54// the first 120s of that problem was 250757 / 255659. Note that n was 282624 in
55// this case, which is not even the biggest size we can tackle.
56//
57// Note(user): This could be moved to util/ as a general class if someone wants
58// to reuse it, it is however tuned for use in Glop pricing step and might
59// becomes even more specific in the future.
60template <typename Index>
62 public:
63 // To simplify the APIs, we take a random number generator at construction.
64 explicit DynamicMaximum(absl::BitGenRef random) : random_(random) {}
65
66 // Prepares the class to hold up to n candidates with indices in [0, n).
67 // Initially no indices is a candidate.
69
70 // Returns the index with the maximum value or Index(-1) if the set is empty
71 // and there is no possible candidate. If there are more than one candidate
72 // with the same maximum value, this will return a random one (not always
73 // uniformly if there is a large number of ties).
75
76 // Removes the given index from the set of candidates.
77 void Remove(Index position);
78
79 void SetRandom(absl::BitGenRef random) { random_ = random; }
80
81 // Adds an element to the set of candidate and sets its value. If the element
82 // is already present, this updates its value. The value must be finite.
83 void AddOrUpdate(Index position, Fractional value);
84
85 // Optimized version of AddOrUpdate() for the dense case. If one knows that
86 // there will be O(n) updates, it is possible to call StartDenseUpdates() and
87 // then use DenseAddOrUpdate() instead of AddOrUpdate() which is slighlty
88 // faster.
89 //
90 // Note that calling AddOrUpdate() will still works fine, but will cause an
91 // extra test per call.
93 void DenseAddOrUpdate(Index position, Fractional value);
94
95 // Returns the current size n that was used in the last ClearAndResize().
96 void Clear() { ClearAndResize(Index(0)); }
97 Index Size() const { return values_.size(); }
98
99 // Returns some stats about this class if they are enabled.
100 std::string StatString() const { return stats_.StatString(); }
101
102 private:
103 // Adds an elements to the set of top elements.
104 void UpdateTopK(Index position, Fractional value);
105
106 // Returns a random element from the set {best} U {equivalent_choices_}.
107 // If equivalent_choices_ is empty, this just returns best.
108 Index RandomizeIfManyChoices(Index best);
109
110 // For tie-breaking.
111 absl::BitGenRef random_;
112 std::vector<Index> equivalent_choices_;
113
114 // Set of candidates and their value.
115 // Note that if is_candidate_[index] is false, values_[index] can be anything.
117 Bitset64<Index> is_candidate_;
118
119 // We maintain the top-k current candidates for a fixed k. Note that not all
120 // entries in tops_ are necessary up to date since we don't remove elements.
121 // There can even be duplicate elements inside if Update() add an element
122 // already inside. This is fine, since tops_ will be recomputed as soon as we
123 // can't get the true maximum from there.
124 //
125 // The invariant is that:
126 // - All elements > threshold_ are in tops_.
127 // - All elements not in tops have a value <= threshold_.
128 // - elements == threshold_ can be in or out.
129 //
130 // In particular, the threshold only increase until the heap becomes empty and
131 // is recomputed from scratch by GetMaximum().
132 struct HeapElement {
133 HeapElement() = default;
134 HeapElement(Index i, Fractional v) : index(i), value(v) {}
135
136 Index index;
137 Fractional value;
138
139 // We want a min-heap: tops_.top() actually represents the k-th value, not
140 // the max.
141 double operator<(const HeapElement& other) const {
142 return value > other.value;
143 }
144 };
145 Fractional threshold_;
146 std::vector<HeapElement> tops_;
147
148 // Statistics about the class.
149 struct QueryStats : public StatsGroup {
150 QueryStats()
151 : StatsGroup("PricingStats"),
152 get_maximum("get_maximum", this),
153 heap_size_on_hit("heap_size_on_hit", this),
154 random_choices("random_choices", this) {}
155 TimeDistribution get_maximum;
156 IntegerDistribution heap_size_on_hit;
157 IntegerDistribution random_choices;
158 };
159 QueryStats stats_;
160};
161
162template <typename Index>
164 tops_.clear();
165 threshold_ = -kInfinity;
166 values_.resize(n);
167 is_candidate_.ClearAndResize(n);
168}
169
170template <typename Index>
172 is_candidate_.Clear(position);
173}
174
175template <typename Index>
177 // This disable tops_ until the next GetMaximum().
178 tops_.clear();
179 threshold_ = kInfinity;
180}
181
182template <typename Index>
184 Fractional value) {
185 DCHECK(!std::isnan(value));
186 DCHECK(tops_.empty());
187 is_candidate_.Set(position);
188 values_[position] = value;
189}
190
191template <typename Index>
193 Fractional value) {
194 DCHECK(!std::isnan(value));
195 is_candidate_.Set(position);
196 values_[position] = value;
197 if (value >= threshold_) UpdateTopK(position, value);
198}
199
200template <typename Index>
201inline Index DynamicMaximum<Index>::RandomizeIfManyChoices(Index best) {
202 if (equivalent_choices_.empty()) return best;
203 equivalent_choices_.push_back(best);
204 stats_.random_choices.Add(equivalent_choices_.size());
205
206 return equivalent_choices_[std::uniform_int_distribution<int>(
207 0, equivalent_choices_.size() - 1)(random_)];
208}
209
210template <typename Index>
212 SCOPED_TIME_STAT(&stats_);
213 Fractional best_value = -kInfinity;
214 Index best_position(-1);
215 equivalent_choices_.clear();
216
217 // Optimized version if the maximum is in tops_ already.
218 //
219 // We do two things here:
220 // 1/ Filter tops_ to only contain valid entries. This is because we never
221 // remove element, so the value of one of the element in tops might have
222 // decreased now. Note that we leave threshold_ untouched, so it
223 // can actually be lower than the minimum of the element in tops.
224 // 2/ Get the maximum of the valid elements.
225 if (!tops_.empty()) {
226 int new_size = 0;
227 for (const HeapElement e : tops_) {
228 // The two possible sources of "invalidity".
229 if (!is_candidate_[e.index]) continue;
230 if (values_[e.index] != e.value) continue;
231
232 tops_[new_size++] = e;
233 if (e.value >= best_value) {
234 if (e.value == best_value) {
235 equivalent_choices_.push_back(e.index);
236 continue;
237 }
238 equivalent_choices_.clear();
239 best_value = e.value;
240 best_position = e.index;
241 }
242 }
243 tops_.resize(new_size);
244 if (new_size != 0) {
245 stats_.heap_size_on_hit.Add(new_size);
246 return RandomizeIfManyChoices(best_position);
247 }
248 }
249
250 // We need to iterate over all the candidates.
251 threshold_ = -kInfinity;
252 DCHECK(tops_.empty());
253 const auto values = values_.const_view();
254 for (const Index position : is_candidate_) {
255 const Fractional value = values[position];
256
257 // TODO(user): Add a mode when we do not maintain the TopK for small sizes
258 // (like n < 1000) ? The gain might not be worth the extra code though.
259 if (value < threshold_) continue;
260 UpdateTopK(position, value);
261
262 if (value >= best_value) {
263 if (value == best_value) {
264 equivalent_choices_.push_back(position);
265 continue;
266 }
267 equivalent_choices_.clear();
268 best_value = value;
269 best_position = position;
270 }
271 }
272
273 return RandomizeIfManyChoices(best_position);
274}
275
276template <typename Index>
277inline void DynamicMaximum<Index>::UpdateTopK(Index position,
278 Fractional value) {
279 // Note that this should only be called when an update is required.
280 DCHECK_GE(value, threshold_);
281
282 // We use a compile time size of the form 2^n - 1 to have a full binary heap.
283 //
284 // TODO(user): Adapt the size depending on the problem size? Note sure it is
285 // worth it. To experiment more.
286 constexpr int k = 31;
287 static_assert(((k + 1) & k) == 0, "k + 1 should be a power of 2.");
288
289 // Simply grow the vector until we hit a size of k.
290 if (tops_.size() < k) {
291 tops_.emplace_back(position, value);
292 if (tops_.size() == k) {
293 std::make_heap(tops_.begin(), tops_.end());
294 threshold_ = tops_[0].value;
295 }
296 return;
297 }
298
299 // If the value is equal, we randomly replace it. Having some randomness can
300 // also be important to increase the chance of keeping the true maximum in the
301 // top k set.
302 //
303 // TODO(user): use proper probability by counting the number of ties seen and
304 // replacing a random minimum element to get an uniform distribution? Note
305 // that it will never be truly uniform since once the top k structure is
306 // constructed, we will reuse it as much as possible, so it will be biased
307 // towards elements already inside.
308 if (value == tops_[0].value) {
309 if (absl::Bernoulli(random_, 0.5)) {
310 tops_[0].index = position;
311 }
312 return;
313 }
314
315 // The code below is basically a custom implementation of this. It is however
316 // only slighlty faster for such a small heap. So it might not be completely
317 // worth it.
318 if (/*DISABLES CODE*/ (false)) {
319 std::pop_heap(tops_.begin(), tops_.end());
320 tops_.back() = HeapElement(position, value);
321 std::push_heap(tops_.begin(), tops_.end());
322 threshold_ = tops_[0].value;
323 return;
324 }
325
326 // To not have to do std::pop_heap() and then std::push_heap(), we code our
327 // own update. Note that we exploit the fact that k is of the form 2^n - 1 to
328 // save one test per update.
329 int i = 0;
330 DCHECK_EQ(tops_.size(), k);
331 constexpr int limit = k / 2;
332 for (; i < limit;) {
333 const int left_child = 2 * i + 1;
334 const int right_child = left_child + 1;
335 const Fractional l_value = tops_[left_child].value;
336 const Fractional r_value = tops_[right_child].value;
337 if (l_value > r_value) {
338 if (value <= r_value) break;
339 tops_[i] = tops_[right_child];
340 i = right_child;
341 } else {
342 if (value <= l_value) break;
343 tops_[i] = tops_[left_child];
344 i = left_child;
345 }
346 }
347 tops_[i] = HeapElement(position, value);
348 threshold_ = tops_[0].value;
349 DCHECK(std::is_heap(tops_.begin(), tops_.end()));
350}
351
352} // namespace glop
353} // namespace operations_research
354
355#endif // OR_TOOLS_GLOP_PRICING_H_
StatsGroup(absl::string_view name)
Definition stats.h:138
std::string StatString() const
Returns some stats about this class if they are enabled.
Definition pricing.h:100
void DenseAddOrUpdate(Index position, Fractional value)
Definition pricing.h:183
void Remove(Index position)
Removes the given index from the set of candidates.
Definition pricing.h:171
void Clear()
Returns the current size n that was used in the last ClearAndResize().
Definition pricing.h:96
DynamicMaximum(absl::BitGenRef random)
To simplify the APIs, we take a random number generator at construction.
Definition pricing.h:64
void AddOrUpdate(Index position, Fractional value)
Definition pricing.h:192
void SetRandom(absl::BitGenRef random)
Definition pricing.h:79
constexpr Fractional kInfinity
Infinity for type Fractional.
Definition lp_types.h:87
In SWIG mode, we don't want anything besides these top-level includes.
#define SCOPED_TIME_STAT(stats)
Definition stats.h:421