cds
2.2.0

Intrusive hash set based on multilevel array. More...
#include <cds/intrusive/impl/feldman_hashset.h>
Public Types  
typedef GC  gc 
Garbage collector.  
typedef T  value_type 
type of value stored in the set  
typedef Traits  traits 
Traits template parameter, see feldman_hashset::traits .  
typedef traits::hash_accessor  hash_accessor 
Hash accessor functor.  
typedef base_class::hash_type  hash_type 
Hash type deduced from hash_accessor return type.  
typedef traits::disposer  disposer 
data node disposer  
typedef base_class::hash_comparator  hash_comparator 
hash compare functor based on traits::compare and traits::less options  
typedef traits::item_counter  item_counter 
Item counter type.  
typedef traits::node_allocator  node_allocator 
Array node allocator.  
typedef traits::memory_model  memory_model 
Memory model.  
typedef traits::back_off  back_off 
Backoff strategy.  
typedef traits::stat  stat 
Internal statistics type.  
typedef gc::template guarded_ptr< value_type >  guarded_ptr 
Guarded pointer.  
typedef feldman_hashset::level_statistics  level_statistics 
Level statistics.  
typedef implementation_defined  iterator 
bidirectional iterator type  
typedef implementation_defined  const_iterator 
bidirectional const iterator type  
typedef implementation_defined  reverse_iterator 
bidirectional reverse iterator type  
typedef implementation_defined  const_reverse_iterator 
bidirectional reverse const iterator type  
Public Member Functions  
FeldmanHashSet (size_t head_bits=8, size_t array_bits=4)  
Creates empty set. More...  
~FeldmanHashSet ()  
Destructs the set and frees all data.  
bool  insert (value_type &val) 
Inserts new node. More...  
template<typename Func >  
bool  insert (value_type &val, Func f) 
Inserts new node. More...  
std::pair< bool, bool >  update (value_type &val, bool bInsert=true) 
Updates the node. More...  
bool  unlink (value_type const &val) 
Unlinks the item val from the set. More...  
bool  erase (hash_type const &hash) 
Deletes the item from the set. More...  
template<typename Func >  
bool  erase (hash_type const &hash, Func f) 
Deletes the item from the set. More...  
bool  erase_at (iterator const &iter) 
Deletes the item pointed by iterator iter . More...  
guarded_ptr  extract (hash_type const &hash) 
Extracts the item with specified hash . More...  
template<typename Func >  
bool  find (hash_type const &hash, Func f) 
Finds an item by it's hash . More...  
bool  contains (hash_type const &hash) 
Checks whether the set contains hash . More...  
guarded_ptr  get (hash_type const &hash) 
Finds an item by it's hash and returns the item found. More...  
void  clear () 
Clears the set (nonatomic) More...  
bool  empty () const 
Checks if the set is empty. More...  
size_t  size () const 
Returns item count in the set.  
stat const &  statistics () const 
Returns const reference to internal statistics.  
void  get_level_statistics (std::vector< feldman_hashset::level_statistics > &stat) const 
Collects tree level statistics into stat . More...  
Threadsafe iterators  
iterator  begin () 
Returns an iterator to the beginning of the set. More...  
const_iterator  begin () const 
Returns an const iterator to the beginning of the set.  
const_iterator  cbegin () 
Returns an const iterator to the beginning of the set.  
iterator  end () 
Returns an iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.  
const_iterator  end () const 
Returns a const iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.  
const_iterator  cend () 
Returns a const iterator to the element following the last element of the set. This element acts as a placeholder; attempting to access it results in undefined behavior.  
reverse_iterator  rbegin () 
Returns a reverse iterator to the first element of the reversed set.  
const_reverse_iterator  rbegin () const 
Returns a const reverse iterator to the first element of the reversed set.  
const_reverse_iterator  crbegin () 
Returns a const reverse iterator to the first element of the reversed set.  
reverse_iterator  rend () 
Returns a reverse iterator to the element following the last element of the reversed set. More...  
const_reverse_iterator  rend () const 
Returns a const reverse iterator to the element following the last element of the reversed set. More...  
const_reverse_iterator  crend () 
Returns a const reverse iterator to the element following the last element of the reversed set. More...  
Static Public Attributes  
static constexpr size_t const  c_nHazardPtrCount = 2 
Count of hazard pointers required.  
static constexpr size_t const  c_hash_size = base_class::c_hash_size 
The size of hash_type in bytes, see feldman_hashset::traits::hash_size for explanation.  
Intrusive hash set based on multilevel array.
[From the paper] The hardest problem encountered while developing a parallel hash map is how to perform a global resize, the process of redistributing the elements in a hash map that occurs when adding new buckets. The negative impact of blocking synchronization is multiplied during a global resize, because all threads will be forced to wait on the thread that is performing the involved process of resizing the hash map and redistributing the elements. FeldmanHashSet
implementation avoids global resizes through new array allocation. By allowing concurrent expansion this structure is free from the overhead of an explicit resize, which facilitates concurrent operations.
The presented design includes dynamic hashing, the use of subarrays within the hash map data structure; which, in combination with perfect hashing, means that each element has a unique final, as well as current, position. It is important to note that the perfect hash function required by our hash map is trivial to realize as any hash function that permutes the bits of the key is suitable. This is possible because of our approach to the hash function; we require that it produces hash values that are equal in size to that of the key. We know that if we expand the hash map a fixed number of times there can be no collision as duplicate keys are not provided for in the standard semantics of a hash map.
FeldmanHashSet
is a multilevel array which has a structure similar to a tree:
The multilevel array differs from a tree in that each position on the tree could hold an array of nodes or a single node. A position that holds a single node is a dataNode
which holds the hash value of a key and the value that is associated with that key; it is a simple struct holding two variables. A dataNode
in the multilevel array could be marked. A markedDataNode
refers to a pointer to a dataNode
that has been bitmarked at the least significant bit (LSB) of the pointer to the node. This signifies that this dataNode
is contended. An expansion must occur at this node; any thread that sees this markedDataNode
will try to replace it with an arrayNode
; which is a position that holds an array of nodes. The pointer to an arrayNode
is differentiated from that of a pointer to a dataNode
by a bitmark on the secondleast significant bit.
FeldmanHashSet
multilevel array is similar to a tree in that we keep a pointer to the root, which is a memory array called head
. The length of the head
memory array is unique, whereas every other arrayNode
has a uniform length; a normal arrayNode
has a fixed poweroftwo length equal to the binary logarithm of a variable called arrayLength
. The maximum depth of the tree, maxDepth
, is the maximum number of pointers that must be followed to reach any node. We define currentDepth
as the number of memory arrays that we need to traverse to reach the arrayNode
on which we need to operate; this is initially one, because of head
.
That approach to the structure of the hash set uses an extensible hashing scheme; the hash value is treated as a bit string and rehash incrementally.
FeldmanHashSet
:std::string
as a key for FeldmanHashSet
. Instead, for the strings you should use wellknown hashing algorithms like SHA1, SHA2, MurmurHash, CityHash or its successor FarmHash and so on, which converts variablelength strings to fixedlength bitstrings, and use that hash as a key in FeldmanHashSet
.FeldmanHashSet
uses a perfect hashing. It means that if two different keys, for example, of type std::string
, have identical hash then you cannot insert both that keys in the set. FeldmanHashSet
does not maintain the key, it maintains its fixedsize hash value.The set supports bidirectional threadsafe iterators.
Template parameters:
GC
 safe memory reclamation schema. Can be gc::HP
, gc::DHP
or one of RCU typeT
 a value type to be stored in the setTraits
 type traits, the structure based on feldman_hashset::traits
or result of feldman_hashset::make_traits
metafunction. Traits
is the mandatory argument because it has one mandatory type  an accessor to hash value of T
. The set algorithm does not calculate that hash value.There are several specializations of FeldmanHashSet
for each GC
. You should include:

inline 
Creates empty set.
head_bits  2^{head_bits} specifies the size of head array, minimum is 4. 
array_bits  2^{array_bits} specifies the size of array node, minimum is 2. 
Equation for head_bits
and array_bits:
where N
is multilevel array depth.

inline 
Returns an iterator to the beginning of the set.
The set supports threadsafe iterators: you may iterate over the set in multithreaded environment. It is guaranteed that the iterators will remain valid even if another thread deletes the node the iterator points to: Hazard Pointer embedded into the iterator object protects the node from physical reclamation.
Each iterator object supports the common interface:
==
and !=
. Iterators are equal iff they point to the same cell of the same array node. Note that for two iterators it1
and it2
, the conditon it1 == it2
does not entail &(*it1) == &(*it2)
: welcome to concurrent containersrelease()
that clears internal hazard pointer. After release()
the iterator points to nullptr
but it still remain valid: further iterating is possible.During iteration you may safely erase any item from the set; erase_at() function call doesn't invalidate any iterator. If some iterator points to the item to be erased, that item is not deleted immediately but only after that iterator will be advanced forward or backward.

inline 
Clears the set (nonatomic)
The function unlink all data node from the set. The function is not atomic but is threadsafe. After clear
() the set may not be empty because another threads may insert items.
For each item the disposer
is called after unlinking.

inline 
Checks whether the set contains hash
.
The function searches the item by its hash
and returns true
if it is found, or false
otherwise.

inline 
Returns a const reverse iterator to the element following the last element of the reversed set.
It corresponds to the element preceding the first element of the nonreversed container. This element acts as a placeholder, attempting to access it results in undefined behavior.

inline 
Checks if the set is empty.
Emptiness is checked by item counting: if item count is zero then the set is empty. Thus, the correct item counting feature is an important part of the set implementation.

inline 
Deletes the item from the set.
The function searches hash
in the set, unlinks the item found, and returns true
. If that item is not found the function returns false
.
The disposer specified in Traits
is called by garbage collector GC
asynchronously.

inline 
Deletes the item from the set.
The function searches hash
in the set, call f
functor with item found, and unlinks it from the set. The disposer specified in Traits
is called by garbage collector GC
asynchronously.
The Func
interface is
If hash
is not found the function returns false
.

inline 
Deletes the item pointed by iterator iter
.
Returns true
if the operation is successful, false
otherwise.
The function does not invalidate the iterator, it remains valid and can be used for further traversing.

inline 
Extracts the item with specified hash
.
The function searches hash
in the set, unlinks it from the set, and returns an guarded pointer to the item extracted. If hash
is not found the function returns an empty guarded pointer.
The disposer
specified in Traits
class' template parameter is called automatically by garbage collector GC
when returned guarded_ptr object to be destroyed or released.
guarded_ptr
object uses the GC's guard that can be limited resource.Usage:

inline 
Finds an item by it's hash
.
The function searches the item by hash
and calls the functor f
for item found. The interface of Func
functor is:
where item
is the item found.
The functor may change nonkey fields of item
. Note that the functor is only guarantee that item
cannot be disposed during the functor is executing. The functor does not serialize simultaneous access to the set's item
. If such access is possible you must provide your own synchronization schema on item level to prevent unsafe item modifications.
The function returns true
if hash
is found, false
otherwise.

inline 
Finds an item by it's hash
and returns the item found.
The function searches the item by its hash
and returns the guarded pointer to the item found. If hash
is not found the function returns an empty guarded_ptr
.
guarded_ptr
object uses one GC's guard which can be limited resource.Usage:

inline 
Collects tree level statistics into stat
.
The function traverses the set and collects statistics for each level of the tree into feldman_hashset::level_statistics
struct. The element of stat
[i] represents statistics for level i
, level 0 is head array. The function is threadsafe and may be called in multithreaded environment.
Result can be useful for estimating efficiency of hash functor you use.

inline 
Inserts new node.
The function inserts val
in the set if it does not contain an item with that hash.
Returns true
if val
is placed into the set, false
otherwise.

inline 
Inserts new node.
This function is intended for derived nonintrusive containers.
The function allows to split creating of new item into two part:
f
functor to initialize val
.The functor signature is:
where val
is the item inserted.
The userdefined functor is called only if the inserting is success.

inline 
Returns a reverse iterator to the element following the last element of the reversed set.
It corresponds to the element preceding the first element of the nonreversed container. This element acts as a placeholder, attempting to access it results in undefined behavior.

inline 
Returns a const reverse iterator to the element following the last element of the reversed set.
It corresponds to the element preceding the first element of the nonreversed container. This element acts as a placeholder, attempting to access it results in undefined behavior.

inline 
Unlinks the item val
from the set.
The function searches the item val
in the set and unlink it if it is found and its address is equal to &val
.
The function returns true
if success and false
otherwise.

inline 
Updates the node.
Performs inserting or updating the item with hash value equal to val
.
val
, old item is disposed with Traits::disposer
. Note that the disposer is called by GC
asynchronously. The function returns std::pair<true, false>
bInsert
is true
then val
is inserted, the function returns std::pair<true, true>
bInsert
is false
then the set is unchanged, the function returns std::pair<false, false>
Returns std::pair<bool, bool>
where first
is true
if operation is successful (i.e. the item has been inserted or updated), second
is true
if new item has been added or false
if the set contains that hash.