Container Types in C++ (2)
30 Nov 2019 Languages C/C++This post has more container types from STL (Standard Template Library), including map, unordered_map, set, unordered_set. Comparison examples are made to show their differences.
Comparisons
- map vs. unordered_map
The containers map and unordered_map provides some common methods: insert, find, erase, and iterator. The difference comes from their underlying data structures. What map uses is red-black tree, so its elements are maintained in the order of their keys. While unordered_map uses hash table, taking the hash of the key to index. Therefore, when iterating map and unordered_map with the same key value pairs, we can see ascending keys in map but random sequence in unordered_map as shown in the output of the following example. In addition, operations on hash table have lower time complexity than those on map, so the performance would be different as well. Here is the example code: And its example output:
n = 1
map insert: 8063
search hits: 1107
search misses: 426
delete misses: 979
delete hits: 5476
unordered_map insert: 10620
search hits: 737
search misses: 327
delete misses: 284
delete hits: 1480
n = 256
insert order: 1061730690 2004504234 1184214677 1989806367 ... 552473416 1402586708 1470503465 1977648522 434248626
map order: 2416949 7684930 11614769 12260289 ... 2025187190 2058657199 2113903881 2114937732
unordered_map order: 1470503465 1143408282 188213258 559412924 ... 1433102829 476667372 1555319301 677870460
map insert: 331511
search hits: 217137
search misses: 231464
delete misses: 296207
delete hits: 335181
unordered_map insert: 350584
search hits: 93975
search misses: 87372
delete misses: 69833
delete hits: 152499
n = 65536
insert order: 311267015 1335107622 857770492 157682926 ... 641367770 245080885 817867048 1368615282 1035118920
map order: 27975 35554 139321 199133 ... 2147283403 2147378031 2147388370 2147437002
unordered_map order: 1368615282 245080885 641367770 69179529 ... 1693433485 213249252 1815375700 364024234
map insert: 130303651
search hits: 109726040
search misses: 103807256
delete misses: 127026822
delete hits: 155480624
unordered_map insert: 82088965
search hits: 39713616
search misses: 27353719
delete misses: 21851616
delete hits: 48941932
n = 16777216
insert order: 264257611 1892075955 1603255718 5868700 ... 528685141 1230667579 658677251 1955003159 1316100773
map order: 315 452 490 657 ... 2147482973 2147483156 2147483297 2147483327
unordered_map order: 1955003159 658677251 1230667579 1005324365 ... 119267416 244994969 1788177400 651521018
map insert: 80245488460
search hits: 76534737823
search misses: 76905659798
delete misses: 87983001837
delete hits: 101008341229
unordered_map insert: 31688808239
search hits: 19101835324
search misses: 18114275484
delete misses: 16687859933
delete hits: 26223181698
I tried to understand why unordered_map has such sequence when iterating its elements with the following code, where I make the hash function simply return the key: The output does not really solve my confusion:
insert order: 1801979802 1101513929 468703135 2145174067 233665123 278722862 861021530 336465782 1726956429 294702567 521595368 35005211 1303455736 304089172 1540383426 1365180540
#bucket: 1 2 5 5 5 11 11 11 11 11 11 23 23 23 23 23 23
unordered_map order: 1365180540 1540383426 304089172 1303455736 521595368 1101513929 468703135 233665123 1801979802 278722862 1726956429 35005211 2145174067 861021530 336465782 294702567
bucket[0]:
bucket[1]: 336465782:596516649
bucket[2]: 294702567:719885386
bucket[3]: 1726956429:1649760492
bucket[4]: 468703135:1102520059 233665123:1350490027
bucket[5]:
bucket[6]:
bucket[7]: 278722862:1025202362
bucket[8]: 304089172:1681692777
bucket[9]:
bucket[10]: 1540383426:846930886
bucket[11]: 521595368:424238335
bucket[12]: 1303455736:1714636915
bucket[13]:
bucket[14]:
bucket[15]: 1365180540:1804289383
bucket[16]: 35005211:1957747793 2145174067:783368690 861021530:1189641421
bucket[17]:
bucket[18]:
bucket[19]:
bucket[20]:
bucket[21]: 1801979802:1967513926
bucket[22]: 1101513929:2044897763
It is clear that keys in the same bucket come continuously in sequence during iteration. That makes me think chaining is used to solve the collision (another solution is open addressing that what inserted later could shift to next empty table slot). However, the order for buckets to get traversed is random, also the number of buckets could increase along the number of elements. Then I have no idea how this is implemented.
- set vs. map
Unlike map, the elements in set are not key value pairs. This is useful when you only want to keep track of what are the keys have encountered.
- set vs. unordered_set
The fundamental difference between set and unordered_set is also the underlying data structure, and here is the code to show the performance difference: The example output:
n = 1
set insert: 12324
search hits: 2262
search misses: 1232
delete misses: 2748
delete hits: 9966
unordered_set insert: 25030
search hits: 1690
search misses: 884
delete misses: 780
delete hits: 3128
n = 256
set insert: 880318
search hits: 535922
search misses: 526534
delete misses: 663520
delete hits: 846508
unordered_set insert: 707624
search hits: 242320
search misses: 238602
delete misses: 186220
delete hits: 400322
n = 65536
set insert: 136285923
search hits: 108878299
search misses: 104043666
delete misses: 128401040
delete hits: 146288593
unordered_set insert: 74565550
search hits: 44290937
search misses: 26637407
delete misses: 21809733
delete hits: 47749717
n = 16777216
set insert: 79887200446
search hits: 77006315618
search misses: 76811770470
delete misses: 87920583767
delete hits: 100167624567
unordered_set insert: 29805463645
search hits: 18657390235
search misses: 17488587810
delete misses: 16185437271
delete hits: 24712411054
Credits
- cppreference.com - A great website for reference of C++.
- Why can’t I do std::map.begin() + 1? - Thanks Holt and PlasmaHH for your answers in the thread.
- Fisher-Yates shuffle - A wonderful algorithm to shuffle an array in palce with liner time.