Thursday, January 8, 2009
summary of the interview
Interview at Ambrado.
{
Wednesday, January 7, 2009
Microsoft interview questions
phone interview for real networks on Jan.05 2009
Tuesday, January 6, 2009
Implementing the Singleton Design Pattern by Danny Kalev
Singleton is probably the most widely used design pattern. Its intent is to ensure that a class has only one instance, and to provide a global point of access to it. There are many situations in which a singleton object is necessary: a GUI application must have a single mouse, an active modem needs one and only one telephone line, an operating system can only have one window manager, and a PC is connected to a single keyboard. I will show how to implement the singleton pattern in C++ and explain how you can optimize its design for single-threaded applications.
Design Considerations
Using a global object ensures that the instance is easily accessible but it doesn't keep you from instantiating multiple objects—you can still create a local instance of the same class in addition to the global one. The Singleton pattern provides an elegant solution to this problem by making the class itself responsible for managing its sole instance. The sole instance is an ordinary object of its class, but that class is written so that only one instance can ever be created. This way, you guarantee that no other instance can be created. Furthermore, you provide a global point of access to that instance. The Singleton class hides the operation that creates the instance behind a static member function. This member function, traditionally called Instance(), returns a pointer to the sole instance. Here's a declaration of such a class:
class Singleton { public: static Singleton* Instance(); protected: Singleton(); Singleton(const Singleton&); Singleton& operator= (const Singleton&); private: static Singleton* pinstance; };Instead of Singleton, you can name your class Mouse, FileManager, Scheduler, etc., and declare additional members accordingly. To ensure that users can't create local instances of the class, Singleton's constructor, assignment operator, and copy constructor are declared protected. The class also declares a private static pointer to its instance, pinstance. When the static function Instance() is called for the first time, it creates the sole instance, assigns its address to pinstance, and returns that address. In every subsequent invocation, Instance() will merely return that address.
The class's implementation looks like this:
Singleton* Singleton::pinstance = 0;// initialize pointer Singleton* Singleton::Instance () { if (pinstance == 0) // is it the first call? { pinstance = new Singleton; // create sole instance } return pinstance; // address of sole instance } Singleton::Singleton() { //... perform necessary instance initializations }Users access the sole instance through the Instance() member function exclusively. Any attempt to create an instance not through this function will fail because the class's constructor is protected. Instance() uses lazy initialization. This means the value it returns is created when the function is accessed for the first time. Note that this design is bullet-proof—all the following Instance() calls return a pointer to the same instance:
Singleton *p1 = Singleton::Instance(); Singleton *p2 = p1->Instance(); Singleton & ref = * Singleton::Instance();Although our example uses a single instance, with minor modifications to the function Instance(), this design pattern permits a variable number of instances. For example, you can design a class that allows up to five instances.
fast bit counter
Puzzle: Devise a fast algorithm for computing the number of 1-bits in an unsigned integer. If the machine supports n-bit integers, can we compute the number of 1-bits in O(log n) machine instructions? Can we compute the number of 1-bits in O(b) machine instructions where b is the number of 1-bits in the integer?
Source: Commonly asked in job interviews for Software Engineers. I first heard it in 1996 when I was a graduate student.
Solution: This article presents six solutions to this problem.
1) Iterated Count int bitcount (unsigned int n) { int count = 0; while (n) { count += n & 0x1u; n >>= 1; } return count; } |
Iterated Count runs in time proportional to the total number of bits. It simply loops through all the bits, terminating slightly earlier because of the while condition. Useful if 1’s are sparse and among the least significant bits.
2a) Sparse Ones int bitcount (unsigned int n) { int count = 0 ; while (n) { count++ ; n &= (n - 1) ; } return count ; } |
Sparse Ones runs in time proportional to the number of 1 bits. The mystical line n &= (n - 1) simply sets the rightmost 1 bit in n to 0.
2b) Dense Ones int bitcount (unsigned int n) { int count = 8 * sizeof(int) ; n ^= (unsigned int) - 1 ; while (n) { count-- ; n &= (n - 1) ; } return count ; } |
Dense Ones runs in time proportional to the number of 0 bits. It is the same as Sparse Ones, except that it first toggles all bits (n ~= -1), and continually subtracts the number of 1 bits from sizeof(int).
Sparse Ones and Dense Ones were first described by Peter Wegner in “A Technique for Counting Ones in a Binary Computer”, Communications of the ACM, Volume 3 (1960) Number 5, page 322.
3a) Precompute_8bit static int bits_in_char [256] ; int bitcount (unsigned int n) { // works only for 32-bit ints return bits_in_char [n & 0xffu] + bits_in_char [(n >> 8 ) & 0xffu] + bits_in_char [(n >> 16) & 0xffu] + bits_in_char [(n >> 24) & 0xffu] ; } |
Precompute_8bit assumes an array bits_in_char such that bits_in_char[i] contains the number of 1 bits in the binary representation for i. It repeatedly updates count by masking out the last eight bits in n, and indexing into bits_in_char.
3b) Precompute_16bit static char bits_in_16bits [0x1u <<>> 16) & 0xffffu] ; } |
Precompute_16bit is a variant of Precompute_8bit in that an array bits_in_16bits[] stores the number of 1 bits in successive 16 bit numbers (shorts).
4) Parallel Count #define TWO(c) (0x1u << (c)) #define MASK(c) (((unsigned int)(-1)) / (TWO(TWO(c)) + 1u)) #define COUNT(x,c) ((x) & MASK(c)) + (((x) >> (TWO(c))) & MASK(c)) int bitcount (unsigned int n) { n = COUNT(n, 0) ; n = COUNT(n, 1) ; n = COUNT(n, 2) ; n = COUNT(n, 3) ; n = COUNT(n, 4) ; /* n = COUNT(n, 5) ; for 64-bit integers */ return n ; } |
Parallel Count carries out bit counting in a parallel fashion. Consider n after the first line has finished executing. Imagine splitting n into pairs of bits. Each pair contains the number of ones in those two bit positions in the original n. After the second line has finished executing, each nibble contains the number of ones in those four bits positions in the original n. Continuing this for five iterations, the 64 bits contain the number of ones among these sixty-four bit positions in the original n. That is what we wanted to compute.
5) Nifty Parallel Count #define MASK_01010101 (((unsigned int)(-1))/3) #define MASK_00110011 (((unsigned int)(-1))/5) #define MASK_00001111 (((unsigned int)(-1))/17) int bitcount (unsigned int n) { n = (n & MASK_01010101) + ((n >> 1) & MASK_01010101) ; n = (n & MASK_00110011) + ((n >> 2) & MASK_00110011) ; n = (n & MASK_00001111) + ((n >> 4) & MASK_00001111) ; return n % 255 ; } |
Nifty Parallel Count works the same way as Parallel Count for the first three iterations. At the end of the third line (just before the return), each byte of n contains the number of ones in those eight bit positions in the original n. A little thought then explains why the remainder modulo 255 works.
6) MIT HAKMEM Count int bitcount(unsigned int n) { /* works for 32-bit numbers only */ /* fix last line for 64-bit numbers */ register unsigned int tmp; tmp = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111); return ((tmp + (tmp >> 3)) & 030707070707) % 63; } |
MIT HAKMEM Count is funky. Consider a 3 bit number as being 4a+2b+c. If we shift it right 1 bit, we have 2a+b. Subtracting this from the original gives 2a+b+c. If we right-shift the original 3-bit number by two bits, we get a, and so with another subtraction we have a+b+c, which is the number of bits in the original number. How is this insight employed? The first assignment statement in the routine computes tmp. Consider the octal representation of tmp. Each digit in the octal representation is simply the number of 1’s in the corresponding three bit positions in n. The last return statement sums these octal digits to produce the final answer. The key idea is to add adjacent pairs of octal digits together and then compute the remainder modulus 63. This is accomplished by right-shifting tmp by three bits, adding it to tmp itself and ANDing with a suitable mask. This yields a number in which groups of six adjacent bits (starting from the LSB) contain the number of 1’s among those six positions in n. This number modulo 63 yields the final answer. For 64-bit numbers, we would have to add triples of octal digits and use modulus 1023. This is HACKMEM 169, as used in X11 sources. Source: MIT AI Lab memo, late 1970’s.
7) Builtin Instructions
GNU compiler allows for
int __builtin_popcount (unsigned int x);which translates into a single CPU instruction if the underlying machine architecture supports it. For example, Intel machines have POPCNT (SSE4 Instruction set announced in 2006). Many GCC builtin functions exist.
No Optimization Some Optimization Heavy Optimization Precomp_16 52.94 Mcps Precomp_16 76.22 Mcps Precomp_16 80.58 Mcps Precomp_8 29.74 Mcps Precomp_8 49.83 Mcps Precomp_8 51.65 Mcps Parallel 19.30 Mcps Parallel 36.00 Mcps Parallel 38.55 Mcps MIT 16.93 Mcps MIT 17.10 Mcps Nifty 31.82 Mcps Nifty 12.78 Mcps Nifty 16.07 Mcps MIT 29.71 Mcps Sparse 5.70 Mcps Sparse 15.01 Mcps Sparse 14.62 Mcps Dense 5.30 Mcps Dense 14.11 Mcps Dense 14.56 Mcps Iterated 3.60 Mcps Iterated 3.84 Mcps Iterated 9.24 Mcps Mcps = Million counts per second |
Which of the several bit counting routines is the fastest? Results of speed trials on an i686 are summarized in the table on left. “No Optimization” was compiled with plain gcc. “Some Optimizations” was gcc -O3. “Heavy Optimizations” corresponds to gcc -O3 -mcpu=i686 -march=i686 -fforce-addr -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -malign-functions=4.