Google OR-Tools v9.11
A fast and portable software suite for combinatorial optimization

operations_research::DenseIntDuplicateRemover Class Reference
#include <duplicate_remover.h>
Public Member Functions

- DenseIntDuplicateRemover(int n)
- template<class IntContainer> void RemoveDuplicates(IntContainer* container)
- template<class IntContainer> void AppendAndLazilyRemoveDuplicates(int x, IntContainer* container)
- template<> void Append(int x, std::vector<int>* container)
- template<> void Append(int x, google::protobuf::RepeatedField<int>* container)
- template<> void Truncate(size_t new_size, std::vector<int>* container)
- template<> void Truncate(size_t new_size, google::protobuf::RepeatedField<int>* container)
This class offers an alternative to gtl::linked_hash_set<>. Its memory usage can be kept at O(num_distinct_values) at any time if you use AppendAndLazilyRemoveDuplicates(). In fact, unit tests verify that the average number of elements kept is ≤ 1.5 * num_distinct_values, making it comparable to a flat_hash_set<int> (whose overhead factor is ~1.68).
Usage pattern:

```cpp
// One instance of this can handle many sets on the same [0, n) domain.
int N = 100'000;
DenseIntDuplicateRemover deduper(N);  // Uses N/8 bytes of memory.
std::vector<int> values;  // Your container. Could be RepeatedField<int>.
for (int x : ...) {
  deduper.AppendAndLazilyRemoveDuplicates(x, &values);  // O(1) amortized.
}
deduper.RemoveDuplicates(&values);  // O(values.size())
```
Definition at line 51 of file duplicate_remover.h.
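For concreteness, here is a self-contained variant of the pattern above with a made-up input stream; the include path is an assumption based on the usual OR-Tools source layout. With N = 100'000, the internal bitmask costs N/8 = 12,500 bytes and can be reused across many sets on the same domain.

```cpp
#include <cstdio>
#include <initializer_list>
#include <vector>

#include "ortools/algorithms/duplicate_remover.h"  // Assumed path.

int main() {
  const int N = 100'000;  // All values must lie in [0, N).
  operations_research::DenseIntDuplicateRemover deduper(N);

  std::vector<int> values;
  for (int x : {5, 17, 5, 99999, 17, 0, 5}) {  // Made-up input stream.
    deduper.AppendAndLazilyRemoveDuplicates(x, &values);  // O(1) amortized.
  }
  deduper.RemoveDuplicates(&values);  // Each value now appears exactly once.

  for (int x : values) std::printf("%d\n", x);
  return 0;
}
```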
operations_research::DenseIntDuplicateRemover::DenseIntDuplicateRemover(int n)  [inline, explicit]

Definition at line 53 of file duplicate_remover.h.
template<>
void operations_research::DenseIntDuplicateRemover::Append(int x, std::vector<int>* container)  [inline]

Definition at line 123 of file duplicate_remover.h.
template<>
void operations_research::DenseIntDuplicateRemover::Append(int x, google::protobuf::RepeatedField<int>* container)  [inline]

Definition at line 117 of file duplicate_remover.h.
template<class IntContainer>
void operations_research::DenseIntDuplicateRemover::AppendAndLazilyRemoveDuplicates(int x, IntContainer* container)
ALGORITHM: To remain stateless, we must call RemoveDuplicates() often enough that the size of the container remains O(num_distinct_elements), but not so often that we lose the O(1) amortized time per append. We therefore randomize: each time we append an element, we call RemoveDuplicates() with probability 1/k, where k is the current size of the container. The added expected complexity is O(k) * 1/k = O(1), yet the cleanup is eventually guaranteed to run. See the unit tests that verify these claims. As an important optimization, since drawing a pseudo-random number is expensive, we only draw one every kCheckPeriod appends, and compensate by multiplying the probability by the same factor. A sketch of this scheme follows below.
Definition at line 89 of file duplicate_remover.h.
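To make the scheme concrete, here is a minimal, self-contained sketch of the lazy-removal idea described above; it is not the library's actual code. The value of kCheckPeriod, the choice of RNG, and the bitmask-based RemoveDuplicates() are all assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Sketch only: illustrates the randomized lazy-removal scheme described
// above. Names and constants are assumptions, not OR-Tools internals.
class LazyDeduperSketch {
 public:
  explicit LazyDeduperSketch(int n) : seen_((n + 7) / 8, 0) {}

  void AppendAndLazilyRemoveDuplicates(int x, std::vector<int>* container) {
    container->push_back(x);
    // Drawing a pseudo-random number is expensive, so only draw one every
    // kCheckPeriod appends, and scale the probability up to compensate.
    if (++appends_since_check_ < kCheckPeriod) return;
    appends_since_check_ = 0;
    const size_t k = container->size();
    // Compact with probability kCheckPeriod / k: the expected extra cost is
    // O(k) * kCheckPeriod / k = O(kCheckPeriod) = O(1) amortized per append.
    std::uniform_int_distribution<size_t> dist(0, k - 1);
    if (dist(rng_) < kCheckPeriod) RemoveDuplicates(container);
  }

  // Stable, O(container->size()) deduplication using a bitmask over [0, n).
  void RemoveDuplicates(std::vector<int>* container) {
    size_t out = 0;
    for (const int x : *container) {
      uint8_t& byte = seen_[x >> 3];
      const uint8_t bit = 1 << (x & 7);
      if (byte & bit) continue;   // Duplicate: an earlier copy was kept.
      byte |= bit;
      (*container)[out++] = x;    // Keep the first occurrence, in order.
    }
    container->resize(out);
    // Clear the mask for the next call, touching only the kept values.
    for (size_t i = 0; i < out; ++i) seen_[(*container)[i] >> 3] = 0;
  }

 private:
  static constexpr size_t kCheckPeriod = 32;  // Assumed value.
  std::vector<uint8_t> seen_;  // One bit per value in [0, n): n/8 bytes.
  std::mt19937 rng_;
  size_t appends_since_check_ = 0;
};
```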
template<class IntContainer>
void operations_research::DenseIntDuplicateRemover::RemoveDuplicates(IntContainer* container)
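A hypothetical standalone call, assuming all values lie in [0, n) and that the first occurrence of each value is the one kept (as the lazy-append pattern above implies):

```cpp
operations_research::DenseIntDuplicateRemover deduper(/*n=*/10);
std::vector<int> v = {3, 1, 3, 7, 1};
deduper.RemoveDuplicates(&v);  // v now holds {3, 1, 7} under the
                               // first-occurrence assumption above.
```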
template<>
void operations_research::DenseIntDuplicateRemover::Truncate(size_t new_size, std::vector<int>* container)  [inline]

Definition at line 135 of file duplicate_remover.h.
template<>
void operations_research::DenseIntDuplicateRemover::Truncate(size_t new_size, google::protobuf::RepeatedField<int>* container)  [inline]

Definition at line 129 of file duplicate_remover.h.