Hash Strings at Compile Time

It is convenient to index data stored in map, and hash table by name, and the theoretical time complexity is as low as O(logn) or O(1). Nevertheless, the overhead to convert a string name into the key to index the map or hash table structure could hurt the performance sometimes. This post demonstrates a technique do the hash computation on constant strings earlier at compile time.

Read More

C++ Template Method Polymorphism

Polymorphism in C++ means a function or object of the same identity/name behaves differently under different scenarios. A common example of polymorphism is a derived class could override the method definition in the base class if it is marked virtual and a base class pointer would invoke the proper definition according to which class instance it is actually pointing to.

Polymorphism reduces trivial programming efforts in some cases and is very useful. Template method for generic programming is another style of polymorphism done at compile time to avoid unnecessary code duplication and eases the burden of programmers (see my earlier post series). However, C++ forbids template methods to be marked as virtual. So could somehow we have it all?

Read More

Relocation and Thread Local Storage

The following excellent articles in the past two days taught me a lot about how the loader does relocation for ELF executables and how thread local storage is handled:

Read More

Fix Broken Packages

A server machine failed at upgrading from Ubuntu14.04 to Ubuntu16.04, and many packages are left in half-installed status, even apt is not working. I struggled several days to fix those broken packages, and this post documents some tricks I learnt.

Read More

MPI Foundation

I attended MPI Foundation, a nice training given by TACC staff, at the end of March. This post briefly documents the basic MPI programming knowledge I learnt (:sweat: should have been written long ago, but did not even start until the day before follow-up training).

Read More

Jekyll Gist: Embed Code Snippets into Static Pages

Jekyll Gist is a plugin to embed a gist into the static pages (Jekyll-Gist Repository).

It is enabled at the backend of GitHub Page by default, and can be used as a liquid tag as follows:

{% gist 46b4401513cdbd8b2390172344157b7a UnionFind.cpp %}

The long hash code can be found in the URL of a specific gist repo. The Jekyll backend would render it to be a JavaScript tag. In addition, gist-it.appspot.com is a website providing JavaScript to embed files from a public github repository like a gist.

Read More

Git Empty Tree

4b825dc642cb6eb9a060e54bf8d69288fbee4904 is the hash code to index the “empty tree” (before your first commit) in git version control system.

Read More

Strongly Connected Components

Strongly connected components refer to connected components in a directed graph, and is useful to simplify a directed graph, for instance, merging all vertices in the same strongly connected components to be a big vertex, could produce a Directed Acyclic Graph (DAG) with less number of vertices. This post documents two well-known algorithms (i.e., Kosaraju’s Algorithm and Tarjan’s Algorithm) that solve the strongly connected component problem.

Read More

new and delete in C++

C++ introduced a built-in method to dynamically manage memory, as an alternative to library based malloc, realloc, free functions in C. This is achieved by two keywords, new and delete.

Read More

TeX Live Install Package

After switched to use Linux, I use vim-latex to extand vim for LaTeX editing. However, sometimes I would encounter errors like package not found. I thought that is because vim-latex supports limited packages (under ~/.vim/ftplugin/latex-suite/packages/), but actually vim-latex is just the plugin to turn vim into a LaTeX editor, while the complier behind it is something else, for example TeX Live.
TeX Live has its own package manager called tlmgr. This post documents some issues I had to install absent package for TeX Live.

Read More

Start a Process in Suspended State

Under some circumstances, I would like to launch the process, then let other programs attach to it. No matter how fast can I type, it is impossible for me to bind the processes from the very beginning. The solution is let the script to launch the process but suspend it immediately.

#!/bin/bash
$@ &
PID=$!
kill -STOP $PID
echo $PID
wait $PID

The script earns enough time for you to do your operations in another terminal, then to resume the process:

kill -CONT $PID

Note: need to replace $PID with the number printed in the first terminal.

Read More

Delete Lines from Huge Files

With huge files, tasks that as simple as removing first or last several lines could become hard. If you try to open a huge file with vim, the chance is that the system gets stuck struggling to load the huge file into memory. This post has several alternative approaches to delete lines from huge files.

Read More

Customized Key for Container Map: Bi-match Pair of Unsigned Integers

I had a post, Container Types in C++, where I introduced a bit about one of the C++ containters std::map.
Yesterday I ran into an interesting use case of map, namely, I wanted to use map with customized key, bi-match unsigned integer pair. This post talks about the way to customize key for std::map, and issues and solution to get a bi-match unsigned integer pair work correctly as the key.

Read More

A Bite of Template in C++ (3)

In the post A Bite of Template in C++ (1), I mentioned a constraint for template member functions in class that explicit specialization is not allowed. However, I realized there are more constraints on function templates later when I wanted to make two template varients with slightly difference on only one member function. I tried specializing member function template, as well as inheriting a template member function from base class. Both of them failed due to some constraints. Briefly, the constraints are 1) partial specialization is not allowed for function template, 2) using declaration plus template disambiguator is not the right way to inherit a template member function from a template class.

Read More

Tips to Find Predefines for Compilers

Two useful tips from TACC tips regarding compiler macro expanding.

Tip 120:
Want to know what MACROS are predefined by the GCC and Intel compilers? Do

gcc -dM -E - < /dev/null > gcc_defined_macros.txt
icc -dM -E - < /dev/null > icc_defined_macros.txt

The flag -E in the tip is telling the compiler to expand macros.

Tip 69:
Having trouble figuring out how a macro is being expanded? Try pasting the line into a terminal and adding a -E. For example try changing mpicc -DFOO -DBAR -c foo.c into mpicc -DFOO -DBAR -E foo.c > foo.i and look at the foo.i file.

Read More

A Bite of Template in C++ (2)

I encountered two issues when using C++ template. One is about why sometimes typename has to be prefixed to declarations, and another is how to access the types in the parameter list from outside of the template. This post has some ideas for those two questions, as well as some discussions about nested templates.

Read More

Find The Right Place For Const Qualifier In Declaration

The keyword const is a qualifier used to claim the values of variables are constant. It is useful, because compiler recognizes it and help programmers to avoid some bugs made unconsciously. However, I am always confused about where to put the const, especially using it with special types, for example pointer. This post is about declarations with const qualifier (also applicable to volatile of course).

Read More

How to Create Device Node in Kernel Module

Device node is a file-like abstraction in Linux for user to interact with devices. It could map the operations (read, write, fasync etc.) performed on the file to the associated device. Otherwise, user application needs to be executed with privilege to do memory map before using the device, and programer needs to know the physical address.

After reinstall OS, the SD card reader on my laptop stops work again. Because this time I use Linux Mint, which is based on higher version kernel, I encountered more troubles to drive the SD card reader.

Three steps, five errors and solutions are recorded.

Read More

Algorithms for Reuse Distance Calculation

Reuse distance calculation is a problem related to cache design and evaluation. With a sequence of address references, reuse distance is defined as the number of unique addresses between two references to the same address. For example, in “A B C A C B”, the first B has an infinite reuse distance since it has never been rereferenced, while the second B has a reuse distance of 2.

Several algorithms were discovered long time ago. The following three are included in this post:

  • Stack approach - Mattson et al. in 1970;
  • Bit Vector approach - Bennett & Kruskal in 1975;
  • AVL Tree approach - Olken in 1981;
Read More

Driver for RTS5227/RTS5229

After reinstall OS, the SD card reader on my laptop stops work again. Because this time I use Linux Mint, which is based on higher version kernel, I encountered more troubles to drive the SD card reader.

Three steps, five errors and solutions are recorded.

Read More

Jekyll Now

Jekyll is a static site generator that’s perfect for GitHub hosted blogs (Jekyll Repository).

Jekyll Now makes it easier to create your Jekyll blog, by eliminating a lot of the up front setup.

Read More