Cisco netmanager product version 1.1 now features a new notification filtering feature, which will allow users to specify rules to filter the email notifications they chose to recieve. To implement this feature I had to implement a rule engine, which would filter the event notifications according to the rules specified. In this section we shall investigate how this can be done in C++.

Though the new capability is fairly straight forward, the implementation is not so …well simple. When events happen in the system, the only information that is at hand is the type of event that has occured and the device that caused the event. Based on this the system has to filter the event based on a combination of groups or devices, the event severity and the actual event type. A single rule can specify multiple groups and multiple events.

The Challenge

The primary difficulty in applying the rules, in the netmanager system, is that netmanager has the capability for creating DYNAMIC groups. ie You could write an SQL query that returns the devices correspoding to it. A device can also be part of multiple groups at the same time due to these dynamic queries. In order to match a device against a group, one would have to evaluate all these SQL queries at rune time, which is clearly not feasible. Therefore there has to be some system that determines in an efficient and fast manner the groups a device belongs to.

In other words given a relation like A contains {1,4,5} and B contains {1,4,6,7}, we should be able to say that the value 3 is contained in sets B and A. How do you create something like this using C++ ?


Solution – Reverse lookup sets

The cost / challenge that we face here is the computation cost for evaluating n queries each time an event occurs. Caching is a good solutiuon to store precomputed results and offset any computational overheads.

Therefore in this case

Set A Contains -> {1,4,5}
Set B Contains -> {1,4,6,7}

Pre-computing the reverse mappings would produce mappings that store the values against the set identifiers.

1 -> {A,B}
4 -> {A,B}
5 -> {A}
6 -> {B}
7 -> {B}

With this data structure in place, finding the presence of a value in any of the sets is now reduced to a single hash_find, of 0(1) cost.

C++ Implementation

Well, since we have sets, nothing beats the STL vector class in having fast access to a set of values. Since you need a fast map, STL hash_map is the best in this class. As you can see STL, written using templates gives some really useful libraries (usually containers and algorithms) you can use in any situation. But this barely represents the full use of STL.

How would you represent our solution in C++?

First off we need to represent a set of integers. Since we are using a vector, this can be represented as vector<int>. Next off we need a hash that will map the integer values to a vector containing the set idenitifers.

integer -> vector<int> mapping is required.

Therefore this would be represented as a hash

hash<int, std::vector<int>>

Not Generic

However as you might have noticed even though we decomposed the problem into a generic set mapping problem, we are now speaking in concrete terms like int and std::vector<int>. This will not help in generalization at all. If we do not generalize we cannot create a class that can be re-used in multiple situations. If a class is not generic enough, folks who want to reuse (for eg find certain names in multiple sets) will be faced with the uphill task of customization or modification of old code, and no one will take that up coz everyone knows how easy it is to write new code rather than modify old code. (a popular fallacy).

Therefore we need to implement the solution to this problem such that you can create a reverse index of ANY type of values. ie the sets can have numbers or strings or classes or any other type. The class we write should be able to reverse calculate the memberships. Only then would the code we write be more useful and achieve a broader reuse target.

Generic C++ Solution

This was solved by creating a templated class that takes as its argument two types, which represents the type of the set idenitifer (A,B,C) and the type of the values that are contained in these sets. That right. The class takes TYPES as its arguments.

It would then create the right type of hashes and vector inside it, of type int or string or whatever type you chose and then perform the mapping and reverse mapping on these types. Here is the actual code that does this in just 3 pages. (WARNING – this code will loook scary if you do not know hash_map declaration syntax)
Isnt it cool? What you have just done is specified an algorithm or a procedure independent of the types it contains. This allows your algorthm to be used against any data types and more importantly other folks with similar requirement can easily reuse your code. This class can now be used to create precomposed lookups for any kindof of data types.

-> STL Power

What you have seen above is why STL is so useful and powerful a tool. It allows you to encaspulate a concept (in this case reverse lookups on precomputed indexes) compared to a day to day code which encasulates an extrememly specific action. In fact templates allows you to generate code, that does the type specific task your program requires.

Once you start writing code using templates, the surrouding code also soon becomes templatized and soon you will found yourself writing entire templates of generic algorithms that can be resued as a body of code. In fact the whole of STL has been created in this manner. Someone thought of generalizing algorithms and then they required other artifacts (eg pointer = iterator concept) to support this and so on an so forth. It is a really pweorful concept and one helluva useful tool.

The downside

Unfortunately the readabaility is a bit of a pain but once you get used to this sort of code it becomes easy. The pay offs are huge and the work satisfaction is also a major boost.

Advertisements