Google OR-Tools v9.11
a fast and portable software suite for combinatorial optimization
Loading...
Searching...
No Matches
dynamic_partition.h
Go to the documentation of this file.
1// Copyright 2010-2024 Google LLC
2// Licensed under the Apache License, Version 2.0 (the "License");
3// you may not use this file except in compliance with the License.
4// You may obtain a copy of the License at
5//
6// http://www.apache.org/licenses/LICENSE-2.0
7//
8// Unless required by applicable law or agreed to in writing, software
9// distributed under the License is distributed on an "AS IS" BASIS,
10// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
11// See the License for the specific language governing permissions and
12// limitations under the License.
13
14// TODO(user): refine this toplevel comment when this file settles.
15//
16// Two dynamic partition classes: one that incrementally splits a partition
17// into more and more parts; one that incrementally merges a partition into less
18// and less parts.
19//
20// GLOSSARY:
21// The partition classes maintain a partition of N integers 0..N-1
22// (aka "elements") into disjoint equivalence classes (aka "parts").
23//
24// SAFETY:
25// Like vector<int> crashes when used improperly, these classes are not "safe":
26// most of their methods may crash if called with invalid arguments. The client
27// code is responsible for using this class properly. A few DCHECKs() will help
28// catch bugs, though.
29
30#ifndef OR_TOOLS_ALGORITHMS_DYNAMIC_PARTITION_H_
31#define OR_TOOLS_ALGORITHMS_DYNAMIC_PARTITION_H_
32
33#include <cstdint>
34#include <string>
35#include <vector>
36
37#include "absl/log/check.h"
38#include "absl/types/span.h"
39
40namespace operations_research {
41
42// Partition class that supports incremental splitting, with backtracking.
43// See http://en.wikipedia.org/wiki/Partition_refinement .
44// More precisely, the supported edit operations are:
45// - Refine the partition so that a subset S (typically, |S| <<< N)
46// of elements are all considered non-equivalent to any element in ¬S.
47// Typically, this should be done in O(|S|).
48// - Undo the above operations (backtracking).
49//
50// TODO(user): rename this to BacktrackableSplittingPartition.
52 public:
53 // Creates a DynamicPartition on n elements, numbered 0..n-1. Start with
54 // the trivial partition (only one subset containing all elements).
55 explicit DynamicPartition(int num_elements);
56
57 // Ditto, but specify the initial part of each elements. Part indices must
58 // form a dense integer set starting at 0; eg. [2, 1, 0, 1, 1, 3, 0] is valid.
59 explicit DynamicPartition(const std::vector<int>& initial_part_of_element);
60
61 // Accessors.
62 int NumElements() const { return element_.size(); }
63 int NumParts() const { return part_.size(); }
64
65 // To iterate over the elements in part #i:
66 // for (int element : partition.ElementsInPart(i)) { ... }
67 //
68 // ORDERING OF ELEMENTS INSIDE PARTS: the order of elements within a given
69 // part is volatile, and may change with Refine() or UndoRefine*() operations,
70 // even if the part itself doesn't change.
71 struct IterablePart;
72 IterablePart ElementsInPart(int i) const;
73
74 int PartOf(int element) const;
75 int SizeOfPart(int part) const;
76 int ParentOfPart(int part) const;
77
78 // A handy shortcut to ElementsInPart(PartOf(e)). The returned IterablePart
79 // will never be empty, since it contains at least i.
80 IterablePart ElementsInSamePartAs(int i) const;
81
82 // Returns a fingerprint of the given part. While collisions are possible,
83 // their probability is quite low. Two parts that have the same size and the
84 // same fingerprint are most likely identical.
85 // Also, two parts that have the exact same set of elements will *always*
86 // have the same fingerprint.
87 uint64_t FprintOfPart(int part) const;
88
89 // Refines the partition such that elements that are in distinguished_subset
90 // never share the same part as elements that aren't in that subset.
91 // This might be a no-op: in that case, NumParts() won't change, but the
92 // order of elements inside each part may change.
93 //
94 // ORDERING OF PARTS:
95 // For each i such that Part #i has a non-trivial intersection with
96 // "distinguished_subset" (neither empty, nor the full Part); Part #i is
97 // stripped out of all elements that are in "distinguished_subset", and
98 // those elements are sent to a newly created part, whose parent_part = i.
99 // The parts newly created by a single Refine() operations are sorted
100 // by parent_part.
101 // Example: a Refine() on a partition with 6 parts causes parts #1, #3 and
102 // #4 to be split: the partition will now contain 3 new parts: part #6 (with
103 // parent_part = 1), part #7 (with parent_part = 3) and part #8 (with
104 // parent_part = 4).
105 //
106 // TODO(user): the graph symmetry finder could probably benefit a lot from
107 // keeping track of one additional bit of information for each part that
108 // remains unchanged by a Refine() operation: was that part entirely *in*
109 // the distinguished subset or entirely *out*?
110 void Refine(absl::Span<const int> distinguished_subset);
111
112 // Undo one or several Refine() operations, until the number of parts
113 // becomes equal to "original_num_parts".
114 // Prerequisite: NumParts() >= original_num_parts.
115 void UndoRefineUntilNumPartsEqual(int original_num_parts);
116
117 // Converts the current partition to a string, like "3 | 1 2 | 0 4 5". Within
118 // each part, elements are sorted. And if sort_parts_lexicographically=true,
119 // the parts are sorted lexicographically instead of by their natural order.
120 std::string DebugString(bool sort_parts_lexicographically) const;
121
122 // ADVANCED USAGE:
123 // All elements (0..n-1) of the partition, sorted in a way that's compatible
124 // with the hierarchical partitioning:
125 // - All the elements of any given part are contiguous.
126 // - Elements of a part P are always after elements of part Parent(P).
127 // - The order remains identical (and the above property holds) after any
128 // UndoRefine*() operation.
129 // Note that the order does get changed by Refine() operations.
130 // This is a reference, so it'll only remain valid and constant until the
131 // class is destroyed or until Refine() get called.
132 const std::vector<int>& ElementsInHierarchicalOrder() const {
133 return element_;
134 }
135
136 private:
137 // A DynamicPartition instance maintains a list of all of its elements,
138 // 'sorted' by partitions: elements of the same subset are contiguous
139 // in that list.
140 std::vector<int> element_;
141
142 // The reverse of elements_[]: element_[index_of_[i]] = i.
143 std::vector<int> index_of_;
144
145 // part_of_[i] is the index of the part that contains element i.
146 std::vector<int> part_of_;
147
148 struct Part {
149 // This part holds elements[start_index .. end_index-1].
150 // INVARIANT: end_index > start_index.
151 int start_index; // Inclusive
152 int end_index; // Exclusive
153
154 // The Part that this part was split out of. See the comment at Refine().
155 // INVARIANT: part[i].parent_part <= i, and the equality holds iff part[i]
156 // has no parent.
157 int parent_part; // Index into the part[] array.
158
159 // The part's fingerprint is the XOR of all fingerprints of its elements.
160 // See FprintOfInt32() in the .cc.
161 uint64_t fprint;
162
163 Part() : start_index(0), end_index(0), parent_part(0), fprint(0) {}
164 Part(int start_index, int end_index, int parent_part, uint64_t fprint)
165 : start_index(start_index),
166 end_index(end_index),
167 parent_part(parent_part),
168 fprint(fprint) {}
169 };
170 std::vector<Part> part_; // The disjoint parts.
171
172 // Used temporarily and exclusively by Refine(). This prevents Refine()
173 // from being thread-safe.
174 // INVARIANT: tmp_counter_of_part_ contains only 0s before and after Refine().
175 std::vector<int> tmp_counter_of_part_;
176 std::vector<int> tmp_affected_parts_;
177};
178
180 std::vector<int>::const_iterator begin() const { return begin_; }
181 std::vector<int>::const_iterator end() const { return end_; }
182 std::vector<int>::const_iterator begin_;
183 std::vector<int>::const_iterator end_;
184
185 int size() const { return end_ - begin_; }
186
187 IterablePart() = default;
188 IterablePart(const std::vector<int>::const_iterator& b,
189 const std::vector<int>::const_iterator& e)
190 : begin_(b), end_(e) {}
191
192 // These typedefs allow this iterator to be used within testing::ElementsAre.
193 typedef int value_type;
194 typedef std::vector<int>::const_iterator const_iterator;
195};
196
197// Partition class that supports incremental merging, using the union-find
198// algorithm (see http://en.wikipedia.org/wiki/Disjoint-set_data_structure).
200 public:
201 // At first, all nodes are in their own singleton part.
203 explicit MergingPartition(int num_nodes) { Reset(num_nodes); }
204 void Reset(int num_nodes);
205
206 int NumNodes() const { return parent_.size(); }
207
208 // Complexity: amortized O(Ackermann⁻¹(N)) -- which is essentially O(1) --
209 // where N is the number of nodes.
210 //
211 // Return value: If this merge caused a representative node (of either node1
212 // or node2) to stop being a representative (because only one can remain);
213 // this method returns that removed representative. Otherwise it returns -1.
214 //
215 // Details: a smaller part will always be merged onto a larger one.
216 // Upons ties, the smaller representative becomes the overall representative.
217 int MergePartsOf(int node1, int node2); // The 'union' of the union-find.
218
219 // Get the representative of "node" (a node in the same equivalence class,
220 // which will also be returned for any other "node" in the same class).
221 // The complexity if the same as MergePartsOf().
222 int GetRootAndCompressPath(int node);
223
224 // Specialized reader API: prunes "nodes" to only keep at most one node per
225 // part: any node which is in the same part as an earlier node will be pruned.
226 void KeepOnlyOneNodePerPart(std::vector<int>* nodes);
227
228 // Output the whole partition as node equivalence classes: if there are K
229 // parts and N nodes, node_equivalence_classes[i] will contain the part index
230 // (a number in 0..K-1) of node #i. Parts will be sorted by their first node
231 // (i.e. node 0 will always be in part 0; then the next node that isn't in
232 // part 0 will be in part 1, and so on).
233 // Returns the number K of classes.
234 int FillEquivalenceClasses(std::vector<int>* node_equivalence_classes);
235
236 // Dump all components, with nodes sorted within each part and parts
237 // sorted lexicographically. Eg. "0 1 3 4 | 2 5 | 6 7 8".
238 std::string DebugString();
239
240 // Advanced usage: sets 'node' to be in its original singleton. All nodes
241 // who may point to 'node' as a parent will remain in an inconsistent state.
242 // This can be used to reinitialize a MergingPartition that has been sparsely
243 // modified in O(|modifications|).
244 // CRASHES IF USED INCORRECTLY.
245 void ResetNode(int node);
246
247 int NumNodesInSamePartAs(int node) {
248 return part_size_[GetRootAndCompressPath(node)];
249 }
250
251 // FOR DEBUGGING OR SPECIAL "CONST" ACCESS ONLY:
252 // Find the root of the union-find tree with leaf 'node', i.e. its
253 // representative node, but don't use path compression.
254 // The amortized complexity can be as bad as log(N), as opposed to the
255 // version using path compression.
256 int GetRoot(int node) const;
257
258 private:
259 // Along the upwards path from 'node' to its root, set the parent of all
260 // nodes (including the root) to 'parent'.
261 void SetParentAlongPathToRoot(int node, int parent);
262
263 std::vector<int> parent_;
264 std::vector<int> part_size_;
265
266 // Used transiently by KeepOnlyOneNodePerPart().
267 std::vector<bool> tmp_part_bit_;
268};
269
270// A subset of the API of DynamicPartition without backtrack support. The
271// Refine() here is about twice as fast, but we have limited query support until
272// a batch ComputeElementsByPart() is called.
274 public:
275 explicit SimpleDynamicPartition(int num_elements)
276 : part_of_(num_elements, 0),
277 size_of_part_(num_elements > 0 ? 1 : 0, num_elements) {}
278
279 int NumElements() const { return part_of_.size(); }
280 int NumParts() const { return size_of_part_.size(); }
281 int PartOf(int element) const { return part_of_[element]; }
282 int SizeOfPart(int part) const { return size_of_part_[part]; }
283
284 void Refine(absl::Span<const int> distinguished_subset);
285
286 // This is meant to be called once after a bunch of Refine(). The returned
287 // Span<> points into the given buffer which is re-initialized. To handle
288 // strongly typed int, we actually call T(int) as we fill the buffer.
289 template <typename T>
290 std::vector<absl::Span<const T>> GetParts(std::vector<T>* buffer);
291
292 private:
293 std::vector<int> part_of_;
294 std::vector<int> size_of_part_;
295
296 // Temp data. Always empty or all zero.
297 std::vector<int> temp_to_clean_;
298 std::vector<int> temp_data_by_part_;
299};
300
301// *** Implementation of inline methods of the above classes. ***
302
304 int i) const {
305 DCHECK_GE(i, 0);
306 DCHECK_LT(i, NumParts());
307 return IterablePart(element_.begin() + part_[i].start_index,
308 element_.begin() + part_[i].end_index);
309}
310
311inline int DynamicPartition::PartOf(int element) const {
312 DCHECK_GE(element, 0);
313 DCHECK_LT(element, part_of_.size());
314 return part_of_[element];
315}
316
317inline int DynamicPartition::SizeOfPart(int part) const {
318 DCHECK_GE(part, 0);
319 DCHECK_LT(part, part_.size());
320 const Part& p = part_[part];
321 return p.end_index - p.start_index;
322}
323
324inline int DynamicPartition::ParentOfPart(int part) const {
325 DCHECK_GE(part, 0);
326 DCHECK_LT(part, part_.size());
327 return part_[part].parent_part;
328}
329
334
335inline uint64_t DynamicPartition::FprintOfPart(int part) const {
336 DCHECK_GE(part, 0);
337 DCHECK_LT(part, part_.size());
338 return part_[part].fprint;
339}
340
341inline int MergingPartition::GetRoot(int node) const {
342 DCHECK_GE(node, 0);
343 DCHECK_LT(node, NumNodes());
344 int child = node;
345 while (true) {
346 const int parent = parent_[child];
347 if (parent == child) return child;
348 child = parent;
349 }
350}
351
352inline void MergingPartition::SetParentAlongPathToRoot(int node, int parent) {
353 DCHECK_GE(node, 0);
354 DCHECK_LT(node, NumNodes());
355 DCHECK_GE(parent, 0);
356 DCHECK_LT(parent, NumNodes());
357 int child = node;
358 while (true) {
359 const int old_parent = parent_[child];
360 parent_[child] = parent;
361 if (old_parent == child) return;
362 child = old_parent;
363 }
364}
365
366inline void MergingPartition::ResetNode(int node) {
367 DCHECK_GE(node, 0);
368 DCHECK_LT(node, NumNodes());
369 parent_[node] = node;
370 part_size_[node] = 1;
371}
372
373template <typename T>
374inline std::vector<absl::Span<const T>> SimpleDynamicPartition::GetParts(
375 std::vector<T>* buffer) {
376 const int num_elements = part_of_.size();
377 const int num_parts = size_of_part_.size();
378 buffer->resize(num_elements);
379
380 std::vector<absl::Span<const T>> result(num_parts);
381 if (result.empty()) return result;
382
383 // Compute start of each part in buffer.
384 std::vector<int>& starts = temp_data_by_part_;
385 starts.resize(num_parts, 0);
386 for (int i = 1; i < num_parts; ++i) {
387 starts[i] = starts[i - 1] + size_of_part_[i - 1];
388 }
389
390 // Fill result.
391 for (int i = 0; i < num_parts; ++i) {
392 result[i] = absl::MakeSpan(&(*buffer)[starts[i]], size_of_part_[i]);
393 }
394
395 // Copy elements in order and at their place.
396 for (int element = 0; element < num_elements; ++element) {
397 (*buffer)[starts[part_of_[element]]++] = T(element);
398 }
399 starts.clear();
400 return result;
401}
402} // namespace operations_research
403
404#endif // OR_TOOLS_ALGORITHMS_DYNAMIC_PARTITION_H_
const std::vector< int > & ElementsInHierarchicalOrder() const
std::string DebugString(bool sort_parts_lexicographically) const
IterablePart ElementsInSamePartAs(int i) const
void UndoRefineUntilNumPartsEqual(int original_num_parts)
void Refine(absl::Span< const int > distinguished_subset)
IterablePart ElementsInPart(int i) const
*** Implementation of inline methods of the above classes. ***
void KeepOnlyOneNodePerPart(std::vector< int > *nodes)
int FillEquivalenceClasses(std::vector< int > *node_equivalence_classes)
MergingPartition()
At first, all nodes are in their own singleton part.
std::vector< absl::Span< const T > > GetParts(std::vector< T > *buffer)
void Refine(absl::Span< const int > distinguished_subset)
int64_t b
Definition table.cc:45
In SWIG mode, we don't want anything besides these top-level includes.
int nodes
int value_type
These typedefs allow this iterator to be used within testing::ElementsAre.
IterablePart(const std::vector< int >::const_iterator &b, const std::vector< int >::const_iterator &e)
std::vector< int >::const_iterator begin() const
std::vector< int >::const_iterator end() const
std::vector< int >::const_iterator const_iterator