Google OR-Tools v9.11
a fast and portable software suite for combinatorial optimization
Loading...
Searching...
No Matches
perfect_matching.h
Go to the documentation of this file.
1// Copyright 2010-2024 Google LLC
2// Licensed under the Apache License, Version 2.0 (the "License");
3// you may not use this file except in compliance with the License.
4// You may obtain a copy of the License at
5//
6// http://www.apache.org/licenses/LICENSE-2.0
7//
8// Unless required by applicable law or agreed to in writing, software
9// distributed under the License is distributed on an "AS IS" BASIS,
10// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
11// See the License for the specific language governing permissions and
12// limitations under the License.
13
14// Implementation of the Blossom V min-cost perfect matching algorithm. The
15// main source for the algo is the paper: "Blossom V: A new implementation
16// of a minimum cost perfect matching algorithm", Vladimir Kolmogorov.
17//
18// The Algorithm is a primal-dual algorithm. It always maintains a dual-feasible
19// solution. We recall some notations here, but see the paper for more details
20// as it is well written.
21//
22// TODO(user): This is a work in progress. The algo is not fully implemented
23// yet. The initial version is closer to Blossom IV since we update the dual
24// values for all trees at once with the same delta.
25
26#ifndef OR_TOOLS_GRAPH_PERFECT_MATCHING_H_
27#define OR_TOOLS_GRAPH_PERFECT_MATCHING_H_
28
29#include <cstdint>
30#include <functional>
31#include <memory>
32#include <string>
33#include <vector>
34
35#include "absl/base/attributes.h"
41
42namespace operations_research {
43
44class BlossomGraph;
45
46// Given an undirected graph with costs on each edges, this class allows to
47// compute a perfect matching with minimum cost. A matching is a set of disjoint
48// pairs of nodes connected by an edge. The matching is perfect if all nodes are
49// matched to each others.
51 public:
52 // TODO(user): For now we ask the number of nodes at construction, but we
53 // could automatically infer it from the added edges if needed.
55 explicit MinCostPerfectMatching(int num_nodes) { Reset(num_nodes); }
56
57 // Resets the class for a new graph.
58 //
59 // TODO(user): Eventually, we may support incremental Solves(). Or at least
60 // memory reuse if one wants to solve many problems in a row.
61 void Reset(int num_nodes);
62
63 // Adds an undirected edges between the two given nodes.
64 //
65 // For now we only accept non-negative cost.
66 // TODO(user): We can easily shift all costs if negative costs are needed.
67 //
68 // Important: The algorithm supports multi-edges, but it will be slower. So it
69 // is better to only add one edge with a minimum cost between two nodes. In
70 // particular, do not add both AddEdge(a, b, cost) and AddEdge(b, a, cost).
71 // TODO(user): We could just presolve them away.
72 void AddEdgeWithCost(int tail, int head, int64_t cost);
73
74 // Solves the min-cost perfect matching problem on the given graph.
75 //
76 // NOTE(user): If needed we could support a time limit. Aborting early will
77 // not provide a perfect matching, but the algorithm does maintain a valid
78 // lower bound on the optimal cost that gets better and better during
79 // execution until it reaches the optimal value. Similarly, it is easy to
80 // support an early stop if this bound crosses a preset threshold.
81 enum Status {
82 // A perfect matching with min-cost has been found.
84
85 // There is no perfect matching in this graph.
87
88 // The costs are too large and caused an overflow during the algorithm
89 // execution.
91
92 // Advanced usage: the matching is OPTIMAL and was computed without
93 // overflow, but its OptimalCost() does not fit on an int64_t. Note that
94 // Match() still work and you can re-compute the cost in double for
95 // instance.
97 };
98 ABSL_MUST_USE_RESULT Status Solve();
99
100 // Returns the cost of the perfect matching. Only valid when the last solve
101 // status was OPTIMAL.
102 int64_t OptimalCost() const {
103 DCHECK(optimal_solution_found_);
104 return optimal_cost_;
105 }
106
107 // Returns the node matched to the given node. In a perfect matching all nodes
108 // have a match. Only valid when the last solve status was OPTIMAL.
109 int Match(int node) const {
110 DCHECK(optimal_solution_found_);
111 return matches_[node];
112 }
113 const std::vector<int>& Matches() const {
114 DCHECK(optimal_solution_found_);
115 return matches_;
116 }
117
118 private:
119 std::unique_ptr<BlossomGraph> graph_;
120
121 // Fields used to report the optimal solution. Most of it could be read on
122 // the fly from BlossomGraph, but we prefer to copy them here. This allows to
123 // reclaim the memory of graph_ early or allows to still query the last
124 // solution if we later allow re-solve with incremental changes to the graph.
125 bool optimal_solution_found_ = false;
126 int64_t optimal_cost_ = 0;
127 int64_t maximum_edge_cost_ = 0;
128 std::vector<int> matches_;
129};
130
131// Class containing the main data structure used by the Blossom algorithm.
132//
133// At the core is the original undirected graph. During the algorithm execution
134// we might collapse nodes into so-called Blossoms. A Blossom is a cycle of
135// external nodes (which can be blossom nodes) of odd length (>= 3). The edges
136// of the cycle are called blossom-forming eges and will always be tight
137// (i.e. have a slack of zero). Once a Blossom is created, its nodes become
138// "internal" and are basically considered merged into the blossom node for the
139// rest of the algorithm (except if we later re-expand the blossom).
140//
141// Moreover, external nodes of the graph will have 3 possible types ([+], [-]
142// and free [0]). Free nodes will always be matched together in pairs. Nodes of
143// type [+] and [-] are arranged in a forest of alternating [+]/[-] disjoint
144// trees. Each unmatched node is the root of a tree, and of type [+]. Nodes [-]
145// will always have exactly one child to witch they are matched. [+] nodes can
146// have any number of [-] children, to which they are not matched. All the edges
147// of the trees will always be tight. Some examples below, double edges are used
148// for matched nodes:
149//
150// A matched pair of free nodes: [0] === [0]
151//
152// A possible rooted tree: [+] -- [-] ==== [+]
153// \
154// [-] ==== [+] ---- [-] === [+]
155// \
156// [-] === [+]
157//
158// A single unmatched node is also a tree: [+]
159//
160// TODO(user): For now this class does not maintain a second graph of edges
161// between the trees nor does it maintains priority queue of edges.
162//
163// TODO(user): For now we use CHECKs in many places to facilitate development.
164// Switch them to DCHECKs for speed once the code is more stable.
166 public:
167 // Typed index used by this class.
169 DEFINE_INT_TYPE(EdgeIndex, int);
171
172 // Basic constants.
173 // NOTE(user): Those can't be constexpr because of the or-tools export,
174 // which complains for constexpr DEFINE_INT_TYPE.
176 static const EdgeIndex kNoEdgeIndex;
178
179 // Node related data.
180 // We store the edges incident to a node separately in the graph_ member.
181 struct Node {
182 explicit Node(NodeIndex n) : parent(n), match(n), root(n) {}
183
184 // A node can be in one of these 4 exclusive states. Internal nodes are part
185 // of a Blossom and should be ignored until this Blossom is expanded. All
186 // the other nodes are "external". A free node is always matched to another
187 // free node. All the other external node are in alternating [+]/[-] trees
188 // rooted at the only unmatched node of the tree (always of type [+]).
189 bool IsInternal() const {
190 DCHECK(!is_internal || type == 0);
191 return is_internal;
192 }
193 bool IsFree() const { return type == 0 && !is_internal; }
194 bool IsPlus() const { return type == 1; }
195 bool IsMinus() const { return type == -1; }
196
197 // Is this node a blossom? if yes, it was formed by merging the node.blossom
198 // nodes together. Note that we reuse the index of node.blossom[0] for this
199 // blossom node. A blossom node can be of any type.
200 bool IsBlossom() const { return !blossom.empty(); }
201
202 // The type of this node. We use an int for convenience in the update
203 // formulas. This is 1 for [+] nodes, -1 for [-] nodes and 0 for all the
204 // others.
205 //
206 // Internal node also have a type of zero so the dual formula are correct.
207 int type = 0;
208
209 // Whether this node is part of a blossom.
210 bool is_internal = false;
211
212 // The parent of this node in its tree or itself otherwise.
213 // Unused for internal nodes.
215
216 // Itself if not matched, or this node match otherwise.
217 // Unused for internal nodes.
219
220 // The root of this tree which never changes until a tree is disassambled by
221 // an Augment(). Unused for internal nodes.
223
224 // The "delta" to apply to get the dual for nodes of this tree.
225 // This is only filled for root nodes (i.e unmatched nodes).
227
228 // See the formula in Dual() used to derive the true dual of this node.
229 // This is the equal to the "true" dual for free exterior node and internal
230 // node.
232
233#ifndef NDEBUG
234 // The true dual of this node. We only maintain this in debug mode.
236#endif
237
238 // Non-empty for Blossom only. The odd-cycle of blossom nodes that form this
239 // blossom. The first element should always be the current blossom node, and
240 // all the other nodes are internal nodes.
241 std::vector<NodeIndex> blossom;
242
243 // This allows to store information about a new blossom node created by
244 // Shrink() so that we can properly restore it on Expand(). Note that we
245 // store the saved information on the second node of a blossom cycle (and
246 // not the blossom node itself) because that node will be "hidden" until the
247 // blossom is expanded so this way, we do not need more than one set of
248 // saved information per node.
249#ifndef NDEBUG
251#endif
253 std::vector<NodeIndex> saved_blossom;
254 };
255
256 // An undirected edge between two nodes: tail <-> head.
257 struct Edge {
259 : pseudo_slack(c),
260#ifndef NDEBUG
261 slack(c),
262#endif
263 tail(t),
264 head(h) {
265 }
266
267 // Returns the "other" end of this edge.
269 DCHECK(n == tail || n == head);
270 return NodeIndex(tail.value() ^ head.value() ^ n.value());
271 }
272
273 // AdjustablePriorityQueue interface. Note that we use std::greater<> in
274 // our queues since we want the lowest pseudo_slack first.
276 int GetHeapIndex() const { return pq_position; }
277 bool operator>(const Edge& other) const {
278 return pseudo_slack > other.pseudo_slack;
279 }
280
281 // See the formula is Slack() used to derive the true slack of this edge.
283
284#ifndef NDEBUG
285 // We only maintain this in debug mode.
287#endif
288
289 // These are the current tail/head of this edges. These are changed when
290 // creating or expanding blossoms. The order do not matter.
291 //
292 // TODO(user): Consider using node_a/node_b instead to remove the "directed"
293 // meaning. I do need to think a bit more about it though.
296
297 // Position of this Edge in the underlying std::vector<> used to encode the
298 // heap of one priority queue. An edge can be in at most one priority queue
299 // which allow us to share this amongst queues.
300 int pq_position = -1;
301 };
302
303 // Creates a BlossomGraph on the given number of nodes.
304 explicit BlossomGraph(int num_nodes);
305
306 // Same comment as MinCostPerfectMatching::AddEdgeWithCost() applies.
308
309 // Heuristic to start with a dual-feasible solution and some matched edges.
310 // To be called once all edges are added. Returns false if the problem is
311 // detected to be INFEASIBLE.
312 ABSL_MUST_USE_RESULT bool Initialize();
313
314 // Enters a loop that perform one of Grow()/Augment()/Shrink()/Expand() until
315 // a fixed point is reached.
316 void PrimalUpdates();
317
318 // Computes the maximum possible delta for UpdateAllTrees() that keeps the
319 // dual feasibility. Dual update approach (2) from the paper. This also fills
320 // primal_update_edge_queue_.
322
323 // Applies the same dual delta to all trees. Dual update approach (2) from the
324 // paper.
326
327 // Returns true iff this node is matched and is thus not a tree root.
328 // This cannot live in the Node class because we need to know the NodeIndex.
329 bool NodeIsMatched(NodeIndex n) const;
330
331 // Returns the node matched to the given one, or n if this node is not
332 // currently matched.
333 NodeIndex Match(NodeIndex n) const;
334
335 // Adds to the tree of tail the free matched pair(head, Match(head)).
336 // The edge is only used in DCHECKs. We duplicate tail/head because the
337 // order matter here.
338 void Grow(EdgeIndex e, NodeIndex tail, NodeIndex head);
339
340 // Merges two tree and augment the number of matched nodes by 1. This is
341 // the only functions that change the current matching.
342 void Augment(EdgeIndex e);
343
344 // Creates a Blossom using the given [+] -- [+] edge between two nodes of the
345 // same tree.
346 void Shrink(EdgeIndex e);
347
348 // Expands a Blossom into its component.
349 void Expand(NodeIndex to_expand);
350
351 // Returns the current number of matched nodes.
352 int NumMatched() const { return nodes_.size() - unmatched_nodes_.size(); }
353
354 // Returns the current dual objective which is always a valid lower-bound on
355 // the min-cost matching. Note that this is capped to kint64max in case of
356 // overflow. Because all of our cost are positive, this starts at zero.
357 CostValue DualObjective() const;
358
359 // This must be called at the end of the algorithm to recover the matching.
360 void ExpandAllBlossoms();
361
362 // Return the "slack" of the given edge.
363 CostValue Slack(const Edge& edge) const;
364
365 // Returns the dual value of the given node (which might be a pseudo-node).
366 CostValue Dual(const Node& node) const;
367
368 // Display to VLOG(1) some statistic about the solve.
369 void DisplayStats() const;
370
371 // Checks that there is no possible primal update in the current
372 // configuration.
374
375 // Tests that the dual values are currently feasible.
376 // This should ALWAYS be the case.
377 bool DebugDualsAreFeasible() const;
378
379 // In debug mode, we maintain the real slack of each edges and the real dual
380 // of each node via this function. Both Slack() and Dual() checks in debug
381 // mode that the value computed is the correct one.
383
384 // Returns true iff this is an external edge with a slack of zero.
385 // An external edge is an edge between two external nodes.
386 bool DebugEdgeIsTightAndExternal(const Edge& edge) const;
387
388 // Getters to access node/edges from outside the class.
389 // Only used in tests.
390 const Edge& GetEdge(int e) const { return edges_[EdgeIndex(e)]; }
391 const Node& GetNode(int n) const { return nodes_[NodeIndex(n)]; }
392
393 // Display information for debugging.
394 std::string NodeDebugString(NodeIndex n) const;
395 std::string EdgeDebugString(EdgeIndex e) const;
396 std::string DebugString() const;
397
398 private:
399 // Returns the index of a tight edge between the two given external nodes.
400 // Returns kNoEdgeIndex if none could be found.
401 //
402 // TODO(user): Store edges for match/parent/blossom instead and remove the
403 // need for this function that can take around 10% of the running time on
404 // some problems.
405 EdgeIndex FindTightExternalEdgeBetweenNodes(NodeIndex tail, NodeIndex head);
406
407 // Appends the path from n to the root of its tree. Used by Augment().
408 void AppendNodePathToRoot(NodeIndex n, std::vector<NodeIndex>* path) const;
409
410 // Returns the depth of a node in its tree. Used by Shrink().
411 int GetDepth(NodeIndex n) const;
412
413 // Adds positive delta to dual_objective_ and cap at kint64max on overflow.
414 void AddToDualObjective(CostValue delta);
415
416 // In the presence of blossoms, the original tail/head of an arc might not be
417 // up to date anymore. It is important to use these functions instead in all
418 // the places where this can happen. That is basically everywhere except in
419 // the initialization.
420 NodeIndex Tail(const Edge& edge) const {
421 return root_blossom_node_[edge.tail];
422 }
423 NodeIndex Head(const Edge& edge) const {
424 return root_blossom_node_[edge.head];
425 }
426
427 // Returns the Head() or Tail() that does not correspond to node. Node that
428 // node must be one of the original index in the given edge, this is DCHECKed
429 // by edge.OtherEnd().
430 NodeIndex OtherEnd(const Edge& edge, NodeIndex node) const {
431 return root_blossom_node_[edge.OtherEnd(node)];
432 }
433
434 // Same as OtherEnd() but the given node should either be Tail(edge) or
435 // Head(edge) and do not need to be one of the original node of this edge.
436 NodeIndex OtherEndFromExternalNode(const Edge& edge, NodeIndex node) const {
437 const NodeIndex head = Head(edge);
438 if (head != node) {
439 DCHECK_EQ(node, Tail(edge));
440 return head;
441 }
442 return Tail(edge);
443 }
444
445 // Returns the given node and if this node is a blossom, all its internal
446 // nodes (recursively). Note that any call to SubNodes() invalidate the
447 // previously returned reference.
448 const std::vector<NodeIndex>& SubNodes(NodeIndex n);
449
450 // Just used to check that initialized is called exactly once.
451 bool is_initialized_ = false;
452
453 // The set of all edges/nodes of the graph.
456
457 // Identity for a non-blossom node, and its top blossom node (in case of many
458 // nested blossom) for an internal node.
460
461 // The current graph incidence. Note that one EdgeIndex should appear in
462 // exactly two places (on its tail and head incidence list).
464
465 // Used by SubNodes().
466 std::vector<NodeIndex> subnodes_;
467
468 // The unmatched nodes are exactly the root of the trees. After
469 // initialization, this is only modified by Augment() which removes two nodes
470 // from this list each time. Note that during Shrink()/Expand() we never
471 // change the indexing of the root nodes.
472 std::vector<NodeIndex> unmatched_nodes_;
473
474 // List of tight_edges and possible shrink to check in PrimalUpdates().
475 std::vector<EdgeIndex> primal_update_edge_queue_;
476 std::vector<EdgeIndex> possible_shrink_;
477
478 // Priority queues of edges of a certain types.
481 std::vector<Edge*> tmp_all_tops_;
482
483 // The dual objective. Increase as the algorithm progress. This is a lower
484 // bound on the min-cost of a perfect matching.
485 CostValue dual_objective_ = CostValue(0);
486
487 // Statistics on the main operations.
488 int64_t num_grows_ = 0;
489 int64_t num_augments_ = 0;
490 int64_t num_shrinks_ = 0;
491 int64_t num_expands_ = 0;
492 int64_t num_dual_updates_ = 0;
493};
494
495} // namespace operations_research
496
497#endif // OR_TOOLS_GRAPH_PERFECT_MATCHING_H_
void AddEdge(NodeIndex tail, NodeIndex head, CostValue cost)
Same comment as MinCostPerfectMatching::AddEdgeWithCost() applies.
void ExpandAllBlossoms()
This must be called at the end of the algorithm to recover the matching.
void DisplayStats() const
Display to VLOG(1) some statistic about the solve.
CostValue ComputeMaxCommonTreeDualDeltaAndResetPrimalEdgeQueue()
DEFINE_INT_TYPE(NodeIndex, int)
Typed index used by this class.
const Edge & GetEdge(int e) const
CostValue Slack(const Edge &edge) const
Return the "slack" of the given edge.
NodeIndex Match(NodeIndex n) const
void DebugUpdateNodeDual(NodeIndex n, CostValue delta)
int NumMatched() const
Returns the current number of matched nodes.
BlossomGraph(int num_nodes)
Creates a BlossomGraph on the given number of nodes.
std::string NodeDebugString(NodeIndex n) const
Display information for debugging.
static const EdgeIndex kNoEdgeIndex
const Node & GetNode(int n) const
std::string EdgeDebugString(EdgeIndex e) const
bool DebugEdgeIsTightAndExternal(const Edge &edge) const
static const NodeIndex kNoNodeIndex
CostValue Dual(const Node &node) const
Returns the dual value of the given node (which might be a pseudo-node).
void Expand(NodeIndex to_expand)
Expands a Blossom into its component.
bool NodeIsMatched(NodeIndex n) const
ABSL_MUST_USE_RESULT bool Initialize()
DEFINE_INT_TYPE(CostValue, int64_t)
void Grow(EdgeIndex e, NodeIndex tail, NodeIndex head)
static const CostValue kMaxCostValue
void AddEdgeWithCost(int tail, int head, int64_t cost)
@ INFEASIBLE
There is no perfect matching in this graph.
@ OPTIMAL
A perfect matching with min-cost has been found.
const std::vector< int > & Matches() const
Edge edge
int index
In SWIG mode, we don't want anything besides these top-level includes.
int64_t delta
Definition resource.cc:1709
int head
int tail
An undirected edge between two nodes: tail <-> head.
CostValue pseudo_slack
See the formula is Slack() used to derive the true slack of this edge.
bool operator>(const Edge &other) const
NodeIndex OtherEnd(NodeIndex n) const
Returns the "other" end of this edge.
Edge(NodeIndex t, NodeIndex h, CostValue c)
CostValue slack
We only maintain this in debug mode.
CostValue dual
The true dual of this node. We only maintain this in debug mode.
bool is_internal
Whether this node is part of a blossom.