Some New features in C++11

Lambda Expression
Lambda Expressions are nameless functions given as constant values, and written exactly in the place where it’s needed, typically as a parameter to some other function. The canonical example is that you’ll pass a comparison function to a generic “sort” routine, and instead of going to the trouble of defining a whole function (and incurring the lexical discontinuity and namespace pollution) to describe this comparison, you can just pass a lambda expression describing the comparison.

However, this misses one of the most important features of Lambda Expressions, which is that they execute in the context of their appearance. Therefore, they can use the values of the variables that are defined in that context.

Lambda expressions appear (with different syntax) in all LISPs, Perl, Python, and many other languages, but notably not C, Java, or any similar language, even though those all have a way to deal with passing functions (or some excuse for them) around as parameters.

C++11 has introduced a Lambda Expressions. Here is the typical usage of the same tried with Visual Studio 2010.

#include <stdlib.h>
#include <iostream>
#include <algorithm>

int main ( int argc, char **argv) {
    char s[]="Hello World!";
    int Uppercase = 0; 
    std::for_each(s, s+sizeof(s), [&Uppercase] (char c) {
        if (isupper(c))
            Uppercase++;
    });
    std::cout<< Uppercase<<" uppercase letters in: "<< s << std::endl;
    return 0;
}

Automatic Type deduction
C++11 allows you to declare the object without specifying there type. Consider following program which pushes some integers into the stl vector and prints the contents of the vector container.

#include <vector>
#include <iostream>
int main ( int argc, char **argv) {
    std::vector<int> myvector;
    for ( int i = 0; i < 10; i++) {
        myvector.push_back(i);
    }
    for ( std::vector<int>::iterator itr = myvector.begin();
        itr != myvector.end(); ++itr) {
            std::cout << *itr << " " ;
    }
    std::cout << std::endl;
}

Declaration of the std::vector iterator is very long and clumsy. C++11 allow you to use auto keywords to do the same.

#include <vector>
#include <iostream>
int main ( int argc, char **argv) {
    std::vector<int> myvector;
    for ( auto i = 0; i < 10; i++) {
        myvector.push_back(i);
    }
    for ( auto itr = myvector.begin(); itr != myvector.end(); ++itr) {
            std::cout << *itr << " " ;
    }
    std::cout << std::endl;
}

Note: The keyword auto isn’t new; it actually dates back the pre-ANSI C era. However, C++11 has changed its meaning; auto no longer designates an object with automatic storage type. Rather, it declares an object whose type is deducible from its initializer. The old meaning of auto was removed from C++11 to avoid confusion.

Range-based for-loop
In C++03, iterating over the elements of a list requires a lot of code. Other languages like C# and Java have shortcuts that allow one to write a simple “foreach” statement that automatically walks the list from start to finish.

C++11 added a similar feature. The statement for allows for easy iteration over a list of elements:

int main ( int argc, char **argv) {
    int my_array[5] = {1, 2, 3, 4, 5};
    for (int &x : my_array) {
        x *= 2;
    }

Note: This is not yet implemented in visual studio 2010 See here for more details.

Regular Expression support added

#include <regex>
#include <iostream>

int main ( int argc, char **argv) {
    const char *reg_esp = "[ ,.\\t\\n;:]"; 
    std::regex rgx(reg_esp);
    std::cmatch match;
    const char *target = "Unseen University - Ankh-Morpork";
    if( std::regex_search( target, match, rgx ) ) {
        const size_t n = match.size();
        for( size_t a = 0; a < n; a++ ) {
            std::string str( match[a].first, match[a].second );
            std::cout << str << "\n";
        }
    }
}

Cache Conscious programming.

Current processors have systems of cache that sits between the processors and memory. A Cache is a small high-speed memory device that operates on the assumption that recently accessed data is likely to be reference again in near future. Main memory which is RAM is contacted only when the data requested is not in the cache line. This process is called as cache-miss.

A Cache is not accessible to program and complete under the control of hardware. Although programs can be made better to be cache aware by improving spatial and temporal locality through regular predictable memory access pattern.

A Cache is a SRAM which is physically placed between processor and main memory. The goal of this memory is to keep the working set of data ( recently accessed data ) close to CPU. Working of cache is mainly depends upon the locality of reference. So the usefulness of this caches is mainly depends upon how your program access the cache and data. Temporal and Spatial are the two main components of the principle of locality.

Cache is divided into Layers. The first layer (L1) is located near the CPU core. It works as fast as CPU and has access latency of 1 to 3 CPU cycles, but the size of this cache is very limited not bigger than few KBs. The second layer ( L2 ) has little more space ( approx 512 MB to 4 MB ) but has access latency of around 10 CPU cycles. A third level of cache ( L3 ) can have capacity of 10’s of Megabytes but requires 10s of magnitude of CPU cycles, But all these 3 forms of Caches are still faster than Main memory.

Cache-Lines are the equal size blocks which divided your caches often between size of 64 and 256 bytes long. Cache-line size may vary between the CPU cache levels. Cache-lookup is the process which determine whether a cache-miss or cache-hit occurs. Cache-hit occurs when the requested data is found in one of the cache-lines. Cache miss occurs when the requested data is not found in the cache-lines and need to access further level of hierarchy.

To make your program more cache friendly, you need to improve on how your program is using the memory and improve its locality of access. Try to reduce the random access to the memory by using more predictable and regular memory access pattern. Try reducing or eliminating the user of pointers to exploit cache by using more regular and predictable memory access patterns. Pointers can cause two issues, first is the memory location they are pointing to is not known till they are reference. Second is they can cause you to access the Random memory ( SRAM ) which will result into poor usage of cache. It is yet to be proved that complete elimination of the pointers, but one can have techniques which will help to reduce the usage of pointers in such a way that resulting data structure is cache conscious.

Here is an example of cache conscious Binary search tree. ( see here for more details )

#include <malloc.h>
#include <string.h>

typedef unsigned long long  uint64_t;
typedef signed long int     int32_t;

#define PTR_SIZE 8
#define PTR_LEFT_RIGHT_SIZE 16

class cache_bst {
public:
    cache_bst();
    ~cache_bst();
    bool insert ( const char *);
    bool search ( const char *);
private:
    char  *m_buffer;
    size_t m_buffer_size;
    size_t m_offset;
    size_t _resize(size_t);
    static const int PAGE_SIZE = 10000000;
protected:
};
cache_bst::cache_bst() {
    m_buffer          = (char* )0;
    m_buffer_size     = 0;
    m_offset          = 0;
}
cache_bst::~cache_bst() {
    free(m_buffer);
}
size_t cache_bst::_resize(size_t length) {
    if ( m_offset == 0 ) {
        m_buffer_size = PAGE_SIZE;
        m_buffer    = (char*)malloc (m_buffer_size);
    } else {
        if(m_offset + length + PTR_LEFT_RIGHT_SIZE >= m_buffer_size) {
            char *tmp = (char*)malloc(m_buffer_size + PAGE_SIZE); 
            memcpy(tmp, m_buffer, m_buffer_size); 
            m_buffer_size += PAGE_SIZE;
            free(m_buffer);
            m_buffer = tmp;
        }
    }
    size_t rc = m_offset;
    m_offset += length + PTR_LEFT_RIGHT_SIZE;
    return rc;
}
bool cache_bst::insert ( const char *word) {
    char *root = m_buffer; 
    size_t offset = 0;
    char *key;
    if(m_buffer == (char *)0) {
        size_t len = strlen(word);
        offset     = _resize(len + 1 );
        key        = (char *)(m_buffer + PTR_LEFT_RIGHT_SIZE); 
        *(uint64_t *)(m_buffer) = *(uint64_t *)(m_buffer + PTR_SIZE) = 0;
        for(; *word != '\0'; word++, key++) {
            *key = *word;
        }
        *key = '\0';

        return true;
    } 
    while(1) {
        uint64_t *left = (uint64_t *)(root);
        uint64_t *right= (uint64_t *)(root + PTR_SIZE);
        char *key = (char *)(root + PTR_LEFT_RIGHT_SIZE);
        size_t rc = strcmp(word, key);
        if(rc == 0 ) {
            return false;
        } else if (rc < 0) {
            if( *left == 0) {
                rc    = _resize(strlen(word) + 1);
                left  = (uint64_t *)(m_buffer + offset);
                *left = rc;
                root  = (char *)(m_buffer + *left);
                left  = (uint64_t *)(root);
                right = (uint64_t *)(root + PTR_SIZE);
                key   = (char *)(root + PTR_LEFT_RIGHT_SIZE);
                *left = *right = 0;
                for(; *word != '\0'; word++, key++) {
                    *key = *word;
                }
                *key='\0';
                return true;
            }
            offset   = (size_t)*left;
            root     = (char *)(m_buffer + *left);
        } else {
            if( *right == 0) {
                size_t length = strlen(word);
                rc     =  _resize(length + 1);
                right  = (uint64_t *)(m_buffer + offset + PTR_SIZE);
                *right = rc;
                root   = (char *)(m_buffer + *right);
                left   = (uint64_t *)(root);
                right  = (uint64_t *)(root + PTR_SIZE);
                key    = (char *)(root + PTR_LEFT_RIGHT_SIZE);
                *left  = *right = 0; 
                for(; *word != '\0'; word++, key++){
                    *key=*word;
                }
                *key='\0';
                return true;
            }
            offset = (size_t)*right;
            root   = (char *)(m_buffer + *right);
        }
    }
    return true;
}
bool cache_bst::search( const char *word) {
    char *root = m_buffer;
    
    if(root==NULL) {
        return false;
    }
    while (1) {
        char * key = (root + PTR_LEFT_RIGHT_SIZE);  
        int32_t rc = strcmp(word, key);   

        uint64_t *left  = (uint64_t *)(root);  
        uint64_t *right = (uint64_t *)(root + PTR_SIZE);

        if ( rc == 0 ) {
            return true;
        } else if(rc  < 0) {
            if(*left == 0)  
                return false;
            root = (m_buffer + *left);
        } else {
            if(*right == 0)  
                return false;
            root = (m_buffer + *right);
        }
    } /* end of while (1)*/
    return false;
}

int main(int argc, char **argv) {
    cache_bst dict;
    dict.insert("A");
    dict.insert("B");
    dict.insert("C");
    dict.insert("D");
    dict.insert("E");
    dict.insert("F");
    dict.insert("G");
    dict.insert("H");

    bool found = dict.search("X");

    return 0;
}

User bin crash puzzle solution

Interesting puzzle on facebook job puzzle site. This is quick and fun puzzle to solve. Can be solve in many ways, but I choose to solve this using Knapsack approach.

You are on a cargo plane full of commercial goods when the pilot announces that the plane is short on fuel. Unless cargo is ejected from the plane, you will run out of fuel and crash. The pilot provides you with the number of pounds of weight that must be ejected from the plane. Fortunately, the manifest of the plane is both completely accurate, digitized, and you are a programmer of some repute. Unfortunately, your boss is going to be extremely unhappy, and wants you to exactly minimize the losses to the absolute minimum possible without crashing the plane. The manifest does not contain the exact quantities of every type of good, because you have so many on board. You may assume that you will not run out of any good in the course of saving the plane. You also realize this kind of program could be handy to others who find themselves in similar situations. Read More about puzzle description here.

Here is the C++ solution for the same.

#include <iostream>
#include <fstream>
#include <vector>
#include <limits>
#include <sstream>

inline int gcd(int a, int b) {
    while (a != b) {
        if (a > b)
            a = a - b;
        else
            b = b - a;
    }
    return a;
}

int main(int argc, char *argv[]) {
    // Read the file and populate internal data structures.
    std::ifstream input(argv[1]);
    if ( input.peek() == EOF ) {
        return false;
    }
    int total_weight = 0;
    std::string file_line; 
    // Read the minimum number of pounds of weight that must 
    // be ejected from the plane to prevent a crash
    getline(input,file_line);
    std::istringstream iss;
    iss.clear();
    iss.str(file_line);
    iss >> total_weight;

    // Read all the lines in the file.
    std::vector<std::string> lines;
    getline(input,file_line);
    while( input.good() ) {
        if ( !file_line.empty())
            lines.push_back(file_line);           
        getline(input,file_line);
    }
    int num_of_items = lines.size();
    int *weights  = new int [num_of_items];
    int *costs    = new int [num_of_items];
    for(int i = 0; i < num_of_items; ++i) {
        iss.clear();
        iss.str(lines[i]);
        std::string sku = "";
        iss >> sku >> weights[i] >> costs[i] ;
    }
    input.close();
    // calculate optimize gcd.
    int overall_gcd = gcd(weights[0], total_weight);
    for(register int i = 1; i < num_of_items; ++i) {
        overall_gcd = gcd(overall_gcd, weights[i]) ;
    }
    for(register int i = 0; i < num_of_items; ++i) {
        weights[i] /= overall_gcd ;
    }   
    total_weight /= overall_gcd;

    // Solve using DP.
    int *table = new int [total_weight];
    for(register int cur_weight = 0; cur_weight < total_weight; ++cur_weight) {
        int best = -1;
        for(register int j = 0; j < num_of_items; ++j) {
            int cur_cost = costs[j];
            int item_weight = weights[j];
            if ( item_weight <= cur_weight) {
                cur_cost += table[cur_weight - item_weight];
            }
            if ( best == -1  || cur_cost < best ){
                best = cur_cost ;
            }
        }
        table[cur_weight] = best;
    }
    std::cout << table[total_weight-1] << std::endl;

    delete [] table;
    delete [] weights;
    delete [] costs;

    return 0;
}