Hash Strings at Compile Time

It is convenient to index data stored in map, and hash table by name, and the theoretical time complexity is as low as O(logn) or O(1). Nevertheless, the overhead to convert a string name into the key to index the map or hash table structure could hurt the performance sometimes. This post demonstrates a technique do the hash computation on constant strings earlier at compile time.

The trick uses

  • constexpr that marks the value of a function return or a variable could be resolved at compile time
  • user-defined string literal that converts a string literal to any type

The following code snippet defines a string literal function operator""_hash() which takes a constant string literal as the input, does some simple hash computation then returns a 64-bit unsigned integer. To invoke this function, constant string literals in the code should be suffixed with _hash. Because this function is marked as constexpr so when the optimization level is set high, the compiler would perform the hash computation and encode the results in the code. To verify this, we can dump the assemble and find the integer values are moved into register by several mov and movk instructions.

With this trick users still could index a container with string literals similar as before (but do not forget the suffix). Please note the container would still apply its own hash function, but now on integer keys rather than strings, so might have lower overheads.

Credits