cds  2.3.2
cds::container::FeldmanHashSet< GC, T, Traits > Class Template Reference

Hash set based on multi-level array. More...

#include <cds/container/impl/feldman_hashset.h>

Inheritance diagram for cds::container::FeldmanHashSet< GC, T, Traits >:
cds::intrusive::FeldmanHashSet< GC, T, Traits >

Public Types

typedef GC gc
 Garbage collector.
 
typedef T value_type
 type of value stored in the set
 
typedef Traits traits
 Traits template parameter, see feldman_hashset::traits.
 
typedef base_class::hash_accessor hash_accessor
 Hash accessor functor.
 
typedef base_class::hash_type hash_type
 Hash type deduced from hash_accessor return type.
 
typedef base_class::hash_comparator hash_comparator
 hash compare functor based on opt::compare and opt::less option setter
 
typedef traits::item_counter item_counter
 Item counter type.
 
typedef traits::allocator allocator
 Element allocator.
 
typedef traits::node_allocator node_allocator
 Array node allocator.
 
typedef traits::memory_model memory_model
 Memory model.
 
typedef traits::back_off back_off
 Backoff strategy.
 
typedef traits::stat stat
 Internal statistics type.
 
typedef gc::template guarded_ptr< value_typeguarded_ptr
 Guarded pointer.
 
typedef feldman_hashset::level_statistics level_statistics
 Level statistics.
 

Public Member Functions

 FeldmanHashSet (size_t head_bits=8, size_t array_bits=4)
 Creates empty set. More...
 
 ~FeldmanHashSet ()
 Destructs the set and frees all data.
 
template<typename Q >
bool insert (Q const &val)
 Inserts new element. More...
 
template<typename Q , typename Func >
bool insert (Q const &val, Func f)
 Inserts new element. More...
 
template<typename Q , typename Func >
std::pair< bool, bool > update (Q const &val, Func func, bool bInsert=true)
 Updates the element. More...
 
template<typename... Args>
bool emplace (Args &&... args)
 Inserts data of type value_type created in-place from std::forward<Args>(args)... More...
 
bool erase (hash_type const &hash)
 Deletes the item from the set. More...
 
template<typename Func >
bool erase (hash_type const &hash, Func f)
 Deletes the item from the set. More...
 
bool erase_at (iterator const &iter)
 Deletes the item pointed by iterator iter. More...
 
guarded_ptr extract (hash_type const &hash)
 Extracts the item with specified hash. More...
 
template<typename Func >
bool find (hash_type const &hash, Func f)
 Finds an item by it's hash. More...
 
bool contains (hash_type const &hash)
 Checks whether the set contains hash. More...
 
guarded_ptr get (hash_type const &hash)
 Finds an item by it's hash and returns the item found. More...
 
void clear ()
 Clears the set (non-atomic) More...
 
bool empty () const
 Checks if the set is empty. More...
 
size_t size () const
 Returns item count in the set.
 
stat const & statistics () const
 Returns const reference to internal statistics.
 
size_t head_size () const
 Returns the size of head node.
 
size_t array_node_size () const
 Returns the size of the array node.
 
void get_level_statistics (std::vector< feldman_hashset::level_statistics > &stat) const
 Collects tree level statistics into stat. More...
 

Static Public Attributes

static constexpr size_t const c_nHazardPtrCount = base_class::c_nHazardPtrCount
 Count of hazard pointers required.
 
static constexpr size_t const c_hash_size = base_class::c_hash_size
 The size of hash_type in bytes, see feldman_hashset::traits::hash_size for explanation.
 

Thread-safe iterators

typedef base_class::iterator iterator
 Bidirectional iterator. More...
 
typedef base_class::const_iterator const_iterator
 bidirectional const iterator type
 
typedef base_class::reverse_iterator reverse_iterator
 bidirectional reverse iterator type
 
typedef base_class::const_reverse_iterator const_reverse_iterator
 bidirectional reverse const iterator type
 
iterator begin ()
 Returns an iterator to the beginning of the set.
 
const_iterator begin () const
 Returns an const iterator to the beginning of the set.
 
const_iterator cbegin ()
 Returns an const iterator to the beginning of the set.
 
iterator end ()
 Returns an iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
 
const_iterator end () const
 Returns a const iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
 
const_iterator cend ()
 Returns a const iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
 
reverse_iterator rbegin ()
 Returns a reverse iterator to the first element of the reversed set.
 
const_reverse_iterator rbegin () const
 Returns a const reverse iterator to the first element of the reversed set.
 
const_reverse_iterator crbegin ()
 Returns a const reverse iterator to the first element of the reversed set.
 
reverse_iterator rend ()
 Returns a reverse iterator to the element following the last element of the reversed set. More...
 
const_reverse_iterator rend () const
 Returns a const reverse iterator to the element following the last element of the reversed set. More...
 
const_reverse_iterator crend ()
 Returns a const reverse iterator to the element following the last element of the reversed set. More...
 

Additional Inherited Members

- Protected Types inherited from cds::intrusive::FeldmanHashSet< GC, T, Traits >
typedef GC gc
 Garbage collector.
 
typedef T value_type
 type of value stored in the set
 
typedef Traits traits
 Traits template parameter, see feldman_hashset::traits.
 
typedef traits::hash_accessor hash_accessor
 Hash accessor functor.
 
typedef base_class::hash_type hash_type
 Hash type deduced from hash_accessor return type.
 
typedef traits::disposer disposer
 data node disposer
 
typedef base_class::hash_comparator hash_comparator
 hash compare functor based on traits::compare and traits::less options
 
typedef traits::item_counter item_counter
 Item counter type.
 
typedef traits::node_allocator node_allocator
 Array node allocator.
 
typedef traits::memory_model memory_model
 Memory model.
 
typedef traits::back_off back_off
 Backoff strategy.
 
typedef traits::stat stat
 Internal statistics type.
 
typedef gc::template guarded_ptr< value_typeguarded_ptr
 Guarded pointer.
 
typedef feldman_hashset::level_statistics level_statistics
 Level statistics.
 
typedef implementation_defined iterator
 bidirectional iterator type
 
typedef implementation_defined const_iterator
 bidirectional const iterator type
 
typedef implementation_defined reverse_iterator
 bidirectional reverse iterator type
 
typedef implementation_defined const_reverse_iterator
 bidirectional reverse const iterator type
 
- Protected Member Functions inherited from cds::intrusive::FeldmanHashSet< GC, T, Traits >
 FeldmanHashSet (size_t head_bits=8, size_t array_bits=4)
 Creates empty set. More...
 
 ~FeldmanHashSet ()
 Destructs the set and frees all data.
 
bool insert (value_type &val)
 Inserts new node. More...
 
template<typename Func >
bool insert (value_type &val, Func f)
 Inserts new node. More...
 
std::pair< bool, bool > update (value_type &val, bool bInsert=true)
 Updates the node. More...
 
bool unlink (value_type const &val)
 Unlinks the item val from the set. More...
 
bool erase (hash_type const &hash)
 Deletes the item from the set. More...
 
template<typename Func >
bool erase (hash_type const &hash, Func f)
 Deletes the item from the set. More...
 
bool erase_at (iterator const &iter)
 Deletes the item pointed by iterator iter. More...
 
guarded_ptr extract (hash_type const &hash)
 Extracts the item with specified hash. More...
 
template<typename Func >
bool find (hash_type const &hash, Func f)
 Finds an item by it's hash. More...
 
bool contains (hash_type const &hash)
 Checks whether the set contains hash. More...
 
guarded_ptr get (hash_type const &hash)
 Finds an item by it's hash and returns the item found. More...
 
void clear ()
 Clears the set (non-atomic) More...
 
bool empty () const
 Checks if the set is empty. More...
 
size_t size () const
 Returns item count in the set.
 
stat const & statistics () const
 Returns const reference to internal statistics.
 
void get_level_statistics (std::vector< feldman_hashset::level_statistics > &stat) const
 Collects tree level statistics into stat. More...
 
iterator begin ()
 Returns an iterator to the beginning of the set. More...
 
const_iterator begin () const
 Returns an const iterator to the beginning of the set.
 
const_iterator cbegin ()
 Returns an const iterator to the beginning of the set.
 
iterator end ()
 Returns an iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
 
const_iterator end () const
 Returns a const iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
 
const_iterator cend ()
 Returns a const iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.
 
reverse_iterator rbegin ()
 Returns a reverse iterator to the first element of the reversed set.
 
const_reverse_iterator rbegin () const
 Returns a const reverse iterator to the first element of the reversed set.
 
const_reverse_iterator crbegin ()
 Returns a const reverse iterator to the first element of the reversed set.
 
reverse_iterator rend ()
 Returns a reverse iterator to the element following the last element of the reversed set. More...
 
const_reverse_iterator rend () const
 Returns a const reverse iterator to the element following the last element of the reversed set. More...
 
const_reverse_iterator crend ()
 Returns a const reverse iterator to the element following the last element of the reversed set. More...
 
- Static Protected Attributes inherited from cds::intrusive::FeldmanHashSet< GC, T, Traits >
static constexpr size_t const c_nHazardPtrCount = 2
 Count of hazard pointers required.
 
static constexpr size_t const c_hash_size = base_class::c_hash_size
 The size of hash_type in bytes, see feldman_hashset::traits::hash_size for explanation.
 

Detailed Description

template<class GC, typename T, class Traits = feldman_hashset::traits>
class cds::container::FeldmanHashSet< GC, T, Traits >

Hash set based on multi-level array.

Source:

  • [2013] Steven Feldman, Pierre LaBorde, Damian Dechev "Concurrent Multi-level Arrays: Wait-free Extensible Hash Maps"

[From the paper] The hardest problem encountered while developing a parallel hash map is how to perform a global resize, the process of redistributing the elements in a hash map that occurs when adding new buckets. The negative impact of blocking synchronization is multiplied during a global resize, because all threads will be forced to wait on the thread that is performing the involved process of resizing the hash map and redistributing the elements. FeldmanHashSet implementation avoids global resizes through new array allocation. By allowing concurrent expansion this structure is free from the overhead of an explicit resize, which facilitates concurrent operations.

The presented design includes dynamic hashing, the use of sub-arrays within the hash map data structure; which, in combination with perfect hashing, means that each element has a unique final, as well as current, position. It is important to note that the perfect hash function required by our hash map is trivial to realize as any hash function that permutes the bits of the key is suitable. This is possible because of our approach to the hash function; we require that it produces hash values that are equal in size to that of the key. We know that if we expand the hash map a fixed number of times there can be no collision as duplicate keys are not provided for in the standard semantics of a hash map.

FeldmanHashSet is a multi-level array which has an internal structure similar to a tree:

feldman_hashset.png

The multi-level array differs from a tree in that each position on the tree could hold an array of nodes or a single node. A position that holds a single node is a dataNode which holds the hash value of a key and the value that is associated with that key; it is a simple struct holding two variables. A dataNode in the multi-level array could be marked. A markedDataNode refers to a pointer to a dataNode that has been bitmarked at the least significant bit (LSB) of the pointer to the node. This signifies that this dataNode is contended. An expansion must occur at this node; any thread that sees this markedDataNode will try to replace it with an arrayNode; which is a position that holds an array of nodes. The pointer to an arrayNode is differentiated from that of a pointer to a dataNode by a bitmark on the second-least significant bit.

FeldmanHashSet multi-level array is similar to a tree in that we keep a pointer to the root, which is a memory array called head. The length of the head memory array is unique, whereas every other arrayNode has a uniform length; a normal arrayNode has a fixed power-of-two length equal to the binary logarithm of a variable called arrayLength. The maximum depth of the tree, maxDepth, is the maximum number of pointers that must be followed to reach any node. We define currentDepth as the number of memory arrays that we need to traverse to reach the arrayNode on which we need to operate; this is initially one, because of head.

That approach to the structure of the hash set uses an extensible hashing scheme; the hash value is treated as a bit string and rehash incrementally.

Note
Two important things you should keep in mind when you're using FeldmanHashSet:
  • all keys must be fixed-size. It means that you cannot use std::string as a key for FeldmanHashSet. Instead, for the strings you should use well-known hashing algorithms like SHA1, SHA2, MurmurHash, CityHash or its successor FarmHash and so on, which converts variable-length strings to fixed-length bit-strings, and use that hash as a key in FeldmanHashSet.
  • FeldmanHashSet uses a perfect hashing. It means that if two different keys, for example, of type std::string, have identical hash then you cannot insert both that keys in the set. FeldmanHashSet does not maintain the key, it maintains its fixed-size hash value.

The set supports bidirectional thread-safe iterators.

Template parameters:

  • GC - safe memory reclamation schema. Can be gc::HP, gc::DHP or one of RCU type
  • T - a value type to be stored in the set
  • Traits - type traits, the structure based on feldman_hashset::traits or result of feldman_hashset::make_traits metafunction. Traits is the mandatory argument because it has one mandatory type - an accessor to hash value of T. The set algorithm does not calculate that hash value.

There are several specializations of FeldmanHashSet for each GC. You should include:

  • <cds/container/feldman_hashset_hp.h> for gc::HP garbage collector
  • <cds/container/feldman_hashset_dhp.h> for gc::DHP garbage collector
  • <cds/container/feldman_hashset_rcu.h> for RCU type. RCU specialization has a slightly different interface.

Member Typedef Documentation

◆ iterator

template<class GC , typename T , class Traits = feldman_hashset::traits>
typedef base_class::iterator cds::container::FeldmanHashSet< GC, T, Traits >::iterator

Bidirectional iterator.

The set supports thread-safe iterators: you may iterate over the set in multi-threaded environment. It is guaranteed that the iterators will remain valid even if another thread deletes the node the iterator points to: Hazard Pointer embedded into the iterator object protects the node from physical reclamation.

Note
Since the iterator object contains hazard pointer that is a thread-local resource, the iterator should not be passed to another thread.

Each iterator object supports the following interface:

  • dereference operators:
    value_type [const] * operator ->() noexcept
    value_type [const] & operator *() noexcept
  • pre-increment and pre-decrement. Post-operators is not supported
  • equality operators == and !=. Iterators are equal iff they point to the same cell of the same array node. Note that for two iterators it1 and it2, the conditon it1 == it2 does not entail &(*it1) == &(*it2)
  • helper member function release() that clears internal hazard pointer. After release() the iterator points to nullptr but it still remain valid: further iterating is possible.

During iteration you may safely erase any item from the set; erase_at() function call doesn't invalidate any iterator. If some iterator points to the item to be erased, that item is not deleted immediately but only after that iterator will be advanced forward or backward.

Note
It is possible the item can be iterated more that once, for example, if an iterator points to the item in array node that is being splitted.

Constructor & Destructor Documentation

◆ FeldmanHashSet()

template<class GC , typename T , class Traits = feldman_hashset::traits>
cds::container::FeldmanHashSet< GC, T, Traits >::FeldmanHashSet ( size_t  head_bits = 8,
size_t  array_bits = 4 
)
inline

Creates empty set.

Parameters
head_bits- 2head_bits specifies the size of head array, minimum is 4.
array_bits- 2array_bits specifies the size of array node, minimum is 2.

Equation for head_bits and array_bits:

sizeof(hash_type) * 8 == head_bits + N * array_bits

where N is multi-level array depth.

Member Function Documentation

◆ clear()

template<class GC , typename T , class Traits = feldman_hashset::traits>
void cds::container::FeldmanHashSet< GC, T, Traits >::clear ( )
inline

Clears the set (non-atomic)

The function unlink all data node from the set. The function is not atomic but is thread-safe. After clear() the set may not be empty because another threads may insert items.

◆ contains()

template<class GC , typename T , class Traits = feldman_hashset::traits>
bool cds::container::FeldmanHashSet< GC, T, Traits >::contains ( hash_type const &  hash)
inline

Checks whether the set contains hash.

The function searches the item by its hash and returns true if it is found, or false otherwise.

◆ crend()

template<class GC , typename T , class Traits = feldman_hashset::traits>
const_reverse_iterator cds::container::FeldmanHashSet< GC, T, Traits >::crend ( )
inline

Returns a const reverse iterator to the element following the last element of the reversed set.

It corresponds to the element preceding the first element of the non-reversed container. This element acts as a placeholder, attempting to access it results in undefined behavior.

◆ emplace()

template<class GC , typename T , class Traits = feldman_hashset::traits>
template<typename... Args>
bool cds::container::FeldmanHashSet< GC, T, Traits >::emplace ( Args &&...  args)
inline

Inserts data of type value_type created in-place from std::forward<Args>(args)...

Returns true if inserting successful, false otherwise.

◆ empty()

template<class GC , typename T , class Traits = feldman_hashset::traits>
bool cds::container::FeldmanHashSet< GC, T, Traits >::empty ( ) const
inline

Checks if the set is empty.

Emptiness is checked by item counting: if item count is zero then the set is empty. Thus, the correct item counting feature is an important part of the set implementation.

◆ erase() [1/2]

template<class GC , typename T , class Traits = feldman_hashset::traits>
bool cds::container::FeldmanHashSet< GC, T, Traits >::erase ( hash_type const &  hash)
inline

Deletes the item from the set.

The function searches hash in the set, deletes the item found, and returns true. If that item is not found the function returns false.

◆ erase() [2/2]

template<class GC , typename T , class Traits = feldman_hashset::traits>
template<typename Func >
bool cds::container::FeldmanHashSet< GC, T, Traits >::erase ( hash_type const &  hash,
Func  f 
)
inline

Deletes the item from the set.

The function searches hash in the set, call f functor with item found, and deltes the element from the set.

The Func interface is

struct functor {
void operator()( value_type& item );
};

If hash is not found the function returns false.

◆ erase_at()

template<class GC , typename T , class Traits = feldman_hashset::traits>
bool cds::container::FeldmanHashSet< GC, T, Traits >::erase_at ( iterator const &  iter)
inline

Deletes the item pointed by iterator iter.

Returns true if the operation is successful, false otherwise.

The function does not invalidate the iterator, it remains valid and can be used for further traversing.

◆ extract()

template<class GC , typename T , class Traits = feldman_hashset::traits>
guarded_ptr cds::container::FeldmanHashSet< GC, T, Traits >::extract ( hash_type const &  hash)
inline

Extracts the item with specified hash.

The function searches hash in the set, unlinks it from the set, and returns a guarded pointer to the item extracted. If hash is not found the function returns an empty guarded pointer.

The item returned is reclaimed by garbage collector GC when returned guarded_ptr object to be destroyed or released.

Note
Each guarded_ptr object uses the GC's guard that can be limited resource.

Usage:

my_set theSet;
// ...
{
my_set::guarded_ptr gp( theSet.extract( 5 ));
if ( gp ) {
// Deal with gp
// ...
}
// Destructor of gp releases internal HP guard
}

◆ find()

template<class GC , typename T , class Traits = feldman_hashset::traits>
template<typename Func >
bool cds::container::FeldmanHashSet< GC, T, Traits >::find ( hash_type const &  hash,
Func  f 
)
inline

Finds an item by it's hash.

The function searches the item by hash and calls the functor f for item found. The interface of Func functor is:

struct functor {
void operator()( value_type& item );
};

where item is the item found.

The functor may change non-key fields of item. Note that the functor is only guarantee that item cannot be disposed during the functor is executing. The functor does not serialize simultaneous access to the set's item. If such access is possible you must provide your own synchronization schema on item level to prevent unsafe item modifications.

The function returns true if hash is found, false otherwise.

◆ get()

template<class GC , typename T , class Traits = feldman_hashset::traits>
guarded_ptr cds::container::FeldmanHashSet< GC, T, Traits >::get ( hash_type const &  hash)
inline

Finds an item by it's hash and returns the item found.

The function searches the item by its hash and returns the guarded pointer to the item found. If hash is not found the function returns an empty guarded_ptr.

Note
Each guarded_ptr object uses one GC's guard which can be limited resource.

Usage:

my_set theSet;
// ...
{
my_set::guarded_ptr gp( theSet.get( 5 ));
if ( theSet.get( 5 )) {
// Deal with gp
//...
}
// Destructor of guarded_ptr releases internal HP guard
}

◆ get_level_statistics()

template<class GC , typename T , class Traits = feldman_hashset::traits>
void cds::container::FeldmanHashSet< GC, T, Traits >::get_level_statistics ( std::vector< feldman_hashset::level_statistics > &  stat) const
inline

Collects tree level statistics into stat.

The function traverses the set and collects statistics for each level of the tree into feldman_hashset::level_statistics struct. The element of stat[i] represents statistics for level i, level 0 is head array. The function is thread-safe and may be called in multi-threaded environment.

Result can be useful for estimating efficiency of hash functor you use.

◆ insert() [1/2]

template<class GC , typename T , class Traits = feldman_hashset::traits>
template<typename Q >
bool cds::container::FeldmanHashSet< GC, T, Traits >::insert ( Q const &  val)
inline

Inserts new element.

The function creates an element with copy of val value and then inserts it into the set.

The type Q should contain as minimum the complete hash for the element. The object of value_type should be constructible from a value of type Q. In trivial case, Q is equal to value_type.

Returns true if val is inserted into the set, false otherwise.

◆ insert() [2/2]

template<class GC , typename T , class Traits = feldman_hashset::traits>
template<typename Q , typename Func >
bool cds::container::FeldmanHashSet< GC, T, Traits >::insert ( Q const &  val,
Func  f 
)
inline

Inserts new element.

The function allows to split creating of new item into two part:

  • create item with key only
  • insert new item into the set
  • if inserting is success, calls f functor to initialize value-fields of val.

The functor signature is:

void func( value_type& val );

where val is the item inserted. User-defined functor f should guarantee that during changing val no any other changes could be made on this set's item by concurrent threads. The user-defined functor is called only if the inserting is success.

◆ rend() [1/2]

template<class GC , typename T , class Traits = feldman_hashset::traits>
reverse_iterator cds::container::FeldmanHashSet< GC, T, Traits >::rend ( )
inline

Returns a reverse iterator to the element following the last element of the reversed set.

It corresponds to the element preceding the first element of the non-reversed container. This element acts as a placeholder, attempting to access it results in undefined behavior.

◆ rend() [2/2]

template<class GC , typename T , class Traits = feldman_hashset::traits>
const_reverse_iterator cds::container::FeldmanHashSet< GC, T, Traits >::rend ( ) const
inline

Returns a const reverse iterator to the element following the last element of the reversed set.

It corresponds to the element preceding the first element of the non-reversed container. This element acts as a placeholder, attempting to access it results in undefined behavior.

◆ update()

template<class GC , typename T , class Traits = feldman_hashset::traits>
template<typename Q , typename Func >
std::pair<bool, bool> cds::container::FeldmanHashSet< GC, T, Traits >::update ( Q const &  val,
Func  func,
bool  bInsert = true 
)
inline

Updates the element.

The operation performs inserting or replacing with lock-free manner.

If the val key not found in the set, then the new item created from val will be inserted into the set iff bInsert is true. Otherwise, if val is found, it is replaced with new item created from val and previous item is disposed. In both cases func functor is called.

The functor Func signature:

struct my_functor {
void operator()( value_type& cur, value_type * prev );
};

where:

  • cur - current element
  • prev - pointer to previous element with such hash. prev is nullptr if cur was just inserted.

The functor may change non-key fields of the item; however, func must guarantee that during changing no any other modifications could be made on this item by concurrent threads.

Returns std::pair<bool, bool> where first is true if operation is successful, i.e. the item has been inserted or updated, second is true if the new item has been added or false if the item with key equal to val already exists.


The documentation for this class was generated from the following file:

cds 2.3.2 Developed by Maxim Khizhinsky aka khizmax and other contributors 2007 - 2017
Autogenerated Sun Dec 31 2017 12:10:20 by Doxygen 1.8.13