The other day, while automating our load tests (you heard that right: we are even automating load tests as part of our agile efforts), I came across an interesting scenario.

LoadRunner, which we were using against the clustered web application, spread all its output across tons of small files, each containing a few similar lines. The 2-3 lines in each file would look something like this:

Report.c(199): Error: App Error: refresh report. DataSet took longer than the refresh rate to load.  This may indicate a slow database (e.g. HDS) [loadTime: 37.708480, refreshRate: 26].      [MsgId: MERR-17999]
Report.c(83): Error: App Error: report took longer than refresh rate: 26; iter: 32;  vuser: ?; total time 41.796206    [MsgId: MERR-17999]

But what we wanted to know at the end of the test was a summary of errors, as in:

  • Report.c(199): Error: App Error: refresh report. DataSet took longer than the refresh rate to load.  This may indicate a slow database (e.g. HDS) – 100 errors
  • Report.c(200) Error Null arguments – 50 errors

Approaching the solution

How would you solve this problem? Think for a moment, and not more ...

Well, being the C++ junkie that I am, I immediately saw an opportunity to apply my sword and utility belt of hash tables, efficient string comparisons etc. and whipped out Visual Studio 2010 (I haven't used 2010 too much and wanted to use it), and then it occurred to me:

Why not use Java ? I had recently learned it and it might be easier.

The Final Solution

cat *.log | sed 's/total time \([0-9]*\)\.\([0-9]*\)/total time/;s/loadTime: \([0-9]*\)\.\([0-9]*\)/loadTime: /;s/iter: \([0-9]*\)/iter: /;s/refresh rate: \([0-9]*\)/refresh rate:  /;s/refreshRate: \([0-9]*\)/refreshRate:  /' | sort | uniq -c

I should probably tidy up the list of sed commands so that the replacements are read from a file. But I left it at that, happy that my solution was ready to use, would work all the time, did not require maintenance and saved me a lot of time.
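For comparison, here is roughly what the same normalise-then-count idea might have looked like had the sword been drawn after all. This is only a sketch: the names Rule, normalizeLine and summarize are made up for illustration, and unlike the sed commands (which replace only the first match on each line) std::regex_replace replaces every match.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// One substitution: a pattern and the text that replaces each match.
struct Rule {
    std::regex pattern;
    std::string replacement;
};

// Blank out the numbers that make otherwise identical messages differ.
// The patterns mirror the sed expressions above.
std::string normalizeLine(const std::string& line) {
    static const std::vector<Rule> rules = {
        {std::regex(R"(total time [0-9]*\.[0-9]*)"), "total time"},
        {std::regex(R"(loadTime: [0-9]*\.[0-9]*)"), "loadTime: "},
        {std::regex(R"(iter: [0-9]*)"), "iter: "},
        {std::regex(R"(refresh rate: [0-9]*)"), "refresh rate:  "},
        {std::regex(R"(refreshRate: [0-9]*)"), "refreshRate:  "},
    };
    std::string out = line;
    for (const auto& rule : rules) {
        out = std::regex_replace(out, rule.pattern, rule.replacement);
    }
    return out;
}

// Count how often each normalized line occurs (the "sort | uniq -c" step).
// std::map keeps the keys sorted, much like the output of sort.
std::map<std::string, int> summarize(const std::vector<std::string>& lines) {
    std::map<std::string, int> counts;
    for (const auto& line : lines) {
        ++counts[normalizeLine(line)];
    }
    return counts;
}
```

Reading the log files and printing the counts would still have to be added on top, which is rather the point of the one-liner.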

The end

 

Thinking about the episode, I was reminded of the hit Japanese samurai movie Zatoichi and its happy ending.

  • The villain (problem) is dead
  • Samurai is happy with his effort
  • The subjects (files/compiler/system/software/hardware) are happy due to the efficiency created
  • Karma (company + job) is happy due to the time saved.

And to boot, it nearly follows the Art of War zen, which advises the clever warrior to win without drawing his sword ...

 

Happy coding !!!

 

 


Having been a full-time C++ programmer for all of the past ten years, I recently had the unique opportunity of seriously studying Java almost 10-15 years after it started becoming popular. In the meantime I’m also playing with Python on a personal note. This affords me a unique perspective, of having seen the past and at the same time looking at two new languages with a fresh eye.

C++ vs Java in 2009 = Assembly vs C++ during the 1980’s

The impressions, or the vibes, that I get in this new learning process are eerily similar to the doubts and choices I felt and read about during the initial stages of my education, while playing around with assembly, C++ and Visual Basic.

Needs assembly or C

Assembly was THE choice if speed and memory were paramount and nothing else mattered, but it was damn hard to use. All the old timers claimed you had to know assembly to understand what was happening on the system and that C++ was only for productivity improvements. They lamented that new minds never learned or worked seriously with assembly and hence were at a disadvantage when it came to figuring out what really happened inside the box.

However, at the time (the 1980s) C++ was fast catching up, and even embedded systems were moving to C++ (maybe without using the templates feature). Java is currently in the same state with respect to C++ that C++ was in back then with respect to assembly, and there are many who clamour for C or C++ to be taught in universities rather than Java for the sake of better understanding.

How can Java be as fast as C++?

It is not. However, the hardware that is around now is vastly superior to the hardware of the 1980s and 1990s, which made Java seem slow THEN, in the same way C++ seemed slow on all those small devices compared to raw assembly or binary code. Our current crop of laptops is more powerful than the servers used when Java was beginning to get popular. This improvement in hardware makes the difference in speed immaterial for most practical purposes, because human beings cannot really tell the difference between 500 and 50 microseconds.

Fast forward to 2010

Is as fast as a 1999 computer and runs Java

C++ is what you require if you need pure speed, as in critical loops or core libraries. You could have substituted the word Assembly for C++ in the previous sentence had this post been written 15 years ago.

But Java is more useful from a business point of view for application development, due to increased productivity and more features. It is therefore no coincidence that embedded devices like mobile phones (and even dog collars, according to Java books) are now running Java. In fact, my choice of the last Nokia mobile phone I bought was guided by the fact that it had a data port through which I could upload custom Java programs onto it.

Java vs Python = C++ vs Visual Basic

When C++ used to be THE top language around, and assembly was gradually being kicked to the ground, there was something else on the periphery: Visual Basic and all the other RAD languages, which basically gave you forms-based development and automatic memory management.

Internal IT or back office apps need not be cool

However, they were nowhere near as fast as C++ and hence were relegated to internal IT applications, which never needed to be as slick as professional, cool-looking applications but had to be developed fast.

Fast development is critical for small shops with few resources or IT departments on a shoestring budget. Many business applications certainly fit the bill and were developed in Visual Basic. It is exactly the same situation now with Python when comparing it to Java.

Python in 2009 = Visual Basic in 1990

Developing using Python does definitely seem like a breeze, without much hassle around object orientation, strict rules for exception declarations or big infrastructures. Is Python as comprehensive as Java, with as many APIs as Java has? Probably not, although it is fast catching up in popularity, just as Java before it caught up with C++, and that is saying a lot. There is no denying that Python and Ruby will soon have many programmers who are trained in them by choice.

Will run Python as fast as Java

I have not compared the performance of Java vs Python, but from what I have heard on the web I assume Python could be slower. Please do correct me if I am wrong on this.

If the economy keeps growing like it did in the last ten years, quad-core or even eight-core desktops will soon be commonplace and we will not have to worry about Python's performance compared to Java. I just hope folks don't introduce messy corporate stuff and make it as verbose and bulky as Java before it.

3 cheers for easy coding !!!

The basic idea behind templates? It's been done before !!

The dictionary definition of template is "something that serves as a model / pattern".

Patterns ... hmmmm ... Ever heard of patterns in software? The idea behind patterns is that no software is totally unique. The same way all cars have a steering wheel and a set of tires, most software is also made up of similar pieces.

If you have to accept user input, that's been done before. If you have to sort, that's been done before. If you want to use a hash, that's been done too. The only difference in a new piece of software is the way all this common stuff is put together. So why re-invent the wheel ... err, sort?

The inspiration !!!

So let's say you have written a sort program in C to sort an array of integers. And let's say it is a binary insertion sort, which relies on binary search, an algorithm that is infamously hard to get right. After all those fixes and tests, you have an efficient, fast implementation of your binary sort.

Just as you are about to finish the task, the requirements change and now you get an array of structures to sort instead of integers. Oops !!! Wouldn't it be easier to have some way of encapsulating the logic behind the binary sort and using it for all data types? That way we could save a lot of time whenever we created new code or modified existing code.

This thought is the inspiration behind templates. Almost all advanced languages provide this facility. (Java and C# acquired it only relatively recently.)

Logic vs Data

Give some thought to how you would make this logic reusable:

int AddNumbers (int a, int b)
{
return a + b;
}

There is a way in C++ to achieve this without templates, using overloaded functions:

int Add(int a, int b) {return a + b;}
char Add(char a, char b) {return a + b;}
double Add(double a, double b) {return a + b;}

You get the idea. Once we have written enough overloaded functions (imagine the pain !!), you could write something that simply called Add(x,y) and the right version of the function would get invoked.

The realization is that the type of the data is an important constraint when creating reusable logic. Even if the logic is reusable, since the data type is different, statically compiled languages like C and C++ are forced to create multiple versions of the same code to satisfy compiler requirements.

Wouldn't it be cool if the compiler did this for you and saved you all the trouble of defining and creating many such overloaded functions?

Enter templates …

template <class T>

T Add(T a, T b) {return a+b;}

That's it. This will work EXACTLY LIKE THE OVERLOADED functions shown above.

There will be many versions of the function Add() created inside your binary, just as with overloaded functions.
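To make that concrete, here is a small sketch (the helper names addInts, addDoubles and addStrings are mine, purely for illustration). Each helper uses Add with a different type, so the compiler ends up generating three separate versions of Add from the single definition.

```cpp
#include <string>

// One definition; the compiler generates a separate Add for each type used.
template <class T>
T Add(T a, T b) { return a + b; }

// Each call below makes the compiler instantiate a different version.
int addInts() { return Add(2, 3); }                      // uses Add<int>
double addDoubles() { return Add(1.5, 2.25); }           // uses Add<double>
std::string addStrings() {
    // Works for any type that supports operator+, not just numbers.
    return Add(std::string("foo"), std::string("bar"));  // uses Add<std::string>
}
```

Note that a call such as Add(1, 2.5) does not compile as written, because the compiler cannot decide whether T should be int or double; you would have to spell it out as Add<double>(1, 2.5).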

What then is the difference between an overload and a template ?

When you use a template, the compiler is intelligent enough to make a first pass through your code to find out which versions of the function template (from now on simply referred to as the template) you require.

So if you have invoked Add with double as its parameter type, the compiler will EMIT code, replacing the type T with double wherever it is used inside the template of the function Add. If it encounters Add with, say, a structure type used as a parameter, it will EMIT the Add template code with the structure type used in place of the type T.

In C++, templates are purely a compiler technology

As you can see, templates work by creating copies of the real logic, substituting the type used in the algorithm with the one actually passed in, much like a programmer would copy and paste the original code with the types changed. This code is then compiled along with the other regular code and voila, your program magically has all the overloaded definitions of the logic that it requires.

Templates vs Overloads

Though a template behaves much like a set of overloads, the basic difference is that the versions are compiler generated. The compiler therefore knows which versions are required and generates only those.

So, for example, if program A includes the Add template and only uses it for integers, only one version of Add is created in that program. If program B uses it for both ints and doubles, the compiler creates two. As you can see, templated code is "instantiated" based on the source code that uses it.

Overloads, on the other hand, have to exist before you can use them and, most importantly, someone has to have written them by hand (buggy + painful). Not so cool or easy to use, if you ask me.

Any more uses ?

Let me spell this out in case you missed the idea: templates are actually creating new code. CODE GENERATION !!!! Now that is the holy grail of computing. Imagine if computers could generate code automatically !! Templates are one step towards this. They give us a way to write programs that create other programs. This gives ultimate power (i.e. if you are a geek).

Clever people have used and abused this technique for even more powerful stuff. How about passing a templated function or class itself as a parameter to another template?
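As a small taste of passing a template to a template, here is a sketch of a stack that takes the container template itself as a parameter (a so-called template template parameter). The Stack class and its member names are invented for this example, and it assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Container is itself a template, handed over as a template argument.
// Stack<int, std::vector> stores its items in a std::vector<int>,
// Stack<int, std::deque> in a std::deque<int>.
template <class T, template <class...> class Container>
class Stack {
public:
    void push(const T& value) { items_.push_back(value); }

    // Removes and returns the most recently pushed item.
    // Calling pop() on an empty stack is undefined, as with the containers.
    T pop() {
        T top = items_.back();
        items_.pop_back();
        return top;
    }

    std::size_t size() const { return items_.size(); }

private:
    Container<T> items_;
};
```

The user only names the element type and the kind of container; the compiler stitches the two templates together.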

Drawbacks

Most people are put off by the syntax of the code and, more importantly, the syntax of the error messages that templates produce. The angle brackets that templates use can be nested to define a template of a template of ... etc.

Remember that the compiler simply substitutes the actual type into the template and then compiles the code. Its error messages therefore look something like class A < class B , class c> :: line x error. Some people hate this, and I must admit it becomes more difficult to parse once you start using templates in a more powerful manner.

But that is no reason to ignore something as powerful as templates. IMO if you are not using templates, you are simply using C++ as a glorified C and nothing more.

So get cracking and write some cool logic, THAT WILL BE REUSED.

ps: Here is a real life example where I could use templates to create a generic solution to a very commonly occurring problem.

Cisco netmanager product version 1.1 now features a new notification filtering feature, which allows users to specify rules to filter the email notifications they choose to receive. To implement this feature I had to implement a rule engine, which would filter the event notifications according to the rules specified. In this section we shall investigate how this can be done in C++.

Though the new capability is fairly straightforward, the implementation is not so ... well, simple. When events happen in the system, the only information at hand is the type of event that has occurred and the device that caused the event. Based on this, the system has to filter the event on a combination of groups or devices, the event severity and the actual event type. A single rule can specify multiple groups and multiple events.

The Challenge

The primary difficulty in applying the rules in the netmanager system is that netmanager has the capability of creating DYNAMIC groups, i.e. you could write an SQL query that returns the devices belonging to the group. A device can also be part of multiple groups at the same time due to these dynamic queries. In order to match a device against a group, one would have to evaluate all these SQL queries at run time, which is clearly not feasible. Therefore there has to be some system that determines, in an efficient and fast manner, the groups a device belongs to.

In other words, given a relation like A contains {1,4,5} and B contains {1,4,6,7}, we should be able to say that the value 4 is contained in both A and B. How do you create something like this using C++?


Solution – Reverse lookup sets

The challenge we face here is the cost of evaluating n queries each time an event occurs. Caching is a good solution: it stores precomputed results and offsets the computational overhead.

Therefore in this case

Set A Contains -> {1,4,5}
Set B Contains -> {1,4,6,7}

Pre-computing the reverse mappings would produce mappings that store the values against the set identifiers.

1 -> {A,B}
4 -> {A,B}
5 -> {A}
6 -> {B}
7 -> {B}

With this data structure in place, finding the sets a value belongs to is reduced to a single hash lookup, of O(1) average cost.

C++ Implementation

Well, since we have sets, nothing beats the STL vector class in having fast access to a set of values. Since you need a fast map, STL hash_map is the best in this class. As you can see STL, written using templates gives some really useful libraries (usually containers and algorithms) you can use in any situation. But this barely represents the full use of STL.

How would you represent our solution in C++?

First off, we need to represent a set of integers. Since we are using a vector, this can be represented as vector<int>. Next, we need a hash that will map the integer values to a vector containing the set identifiers.

An integer -> vector<int> mapping is required.

Therefore this would be represented as a hash

hash_map<int, std::vector<int> >
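A minimal sketch of building that reverse mapping for plain integers follows. It assumes std::unordered_map, the standard C++11 container that took over the role of the older, non-standard hash_map; the names Sets, ReverseIndex and buildReverseIndex are made up for illustration, and the sets A and B are represented by the integer identifiers 1 and 2 in the usage described below.

```cpp
#include <unordered_map>
#include <vector>

// Forward relation: set identifier -> values contained in that set.
using Sets = std::unordered_map<int, std::vector<int>>;
// Reverse index: value -> identifiers of the sets that contain it.
using ReverseIndex = std::unordered_map<int, std::vector<int>>;

// Walk every set once and record, for each value, which sets it appears in.
ReverseIndex buildReverseIndex(const Sets& sets) {
    ReverseIndex index;
    for (const auto& entry : sets) {
        for (int value : entry.second) {
            index[value].push_back(entry.first);
        }
    }
    return index;
}
```

With set 1 holding {1,4,5} and set 2 holding {1,4,6,7}, looking up 4 in the result yields both identifiers, while 3 is simply absent from the index.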

Not Generic

However, as you might have noticed, even though we decomposed the problem into a generic set mapping problem, we are now speaking in concrete terms like int and std::vector<int>. This will not help in generalization at all. If we do not generalize, we cannot create a class that can be re-used in multiple situations. If a class is not generic enough, folks who want to reuse it (for example, to find certain names in multiple sets) will be faced with the uphill task of customizing or modifying old code, and no one will take that up because everyone "knows" it is easier to write new code than to modify old code (a popular fallacy).

Therefore we need to implement the solution to this problem such that you can create a reverse index of ANY type of values, i.e. the sets can hold numbers or strings or classes or any other type. The class we write should be able to reverse calculate the memberships. Only then would the code we write be more useful and achieve a broader reuse target.

Generic C++ Solution

This was solved by creating a templated class that takes two types as its arguments: the type of the set identifier (A,B,C) and the type of the values that are contained in these sets. That's right. The class takes TYPES as its arguments.

It would then create the right type of hashes and vectors inside it, of type int or string or whatever type you chose, and then perform the mapping and reverse mapping on these types. Here is the actual code that does this in just 3 pages. (WARNING – this code will look scary if you do not know hash_map declaration syntax)

Isn't it cool? What you have just done is specify an algorithm or a procedure independent of the types it contains. This allows your algorithm to be used against any data type and, more importantly, other folks with similar requirements can easily reuse your code. This class can now be used to create precomputed lookups for any kind of data type.

-> STL Power

What you have seen above is why STL is so useful and powerful a tool. It allows you to encapsulate a concept (in this case reverse lookups on precomputed indexes), whereas day-to-day code encapsulates an extremely specific action. In fact, templates allow you to generate the code that does the type-specific task your program requires.

Once you start writing code using templates, the surrounding code soon becomes templatized too, and you will find yourself writing entire templates of generic algorithms that can be reused as a body of code. In fact, the whole of STL has been created in this manner. Someone thought of generalizing algorithms, and then they required other artifacts (e.g. pointer = iterator concept) to support this, and so on and so forth. It is a really powerful concept and one helluva useful tool.

The downside

Unfortunately, readability is a bit of a pain, but once you get used to this sort of code it becomes easy. The payoffs are huge and the work satisfaction is also a major boost.

Design Compromises

The ideal design would be a mix of all good things in life, which as we know, is never ever achievable.

Therefore, in real life, designs strive to achieve an excess of at least one factor, which determines the personality of the system being constructed. It could therefore tend towards performance, simplicity, aesthetic beauty (I hope you thought of Apple here *) and so on.

Whenever a design achieves more than one of these goals, it comes into the realm of classics.

Change

I will use a piece of the data collection framework which we were discussing to introduce the concept of change. Our framework uses a callback pattern whereby handlers can be associated with the data that is collected. The framework later invokes them to enable modification and processing.

This used to be done like this

framework.addProcessor(fn_x1)
framework.addProcessor(fn_x2)
framework.addProcessor(fn_x3)
framework.DataToBeCollected = MetaDataSpecs
framework.collect()
framework.process()   (invokes the callback functions fn_x1, fn_x2, fn_x3 on the collected data)

Later on, a need for pre-processing the metadata was felt, i.e. even before the data was collected, some processing had to be done, like loading already collected data into the columns from the database so that it need not be collected again.

We achieved this by adding a new class of functions to the framework. Now the sequence of calls became something like this:

framework.addPreProcessor(fn1)
framework.addPreProcessor(fn2)
framework.addPreProcessor(fn3)
framework.addProcessor(fn_x1)
framework.addProcessor(fn_x2)
framework.addProcessor(fn_x3)
framework.DataToBeCollected = MetaDataSpecs

framework.preprocess()   (invokes the callback functions fn1, fn2, fn3 on the metadata)
framework.collect()
framework.process()   (invokes the callback functions fn_x1, fn_x2, fn_x3 on the collected data)

This is stupid. We had to add a new set of functions to the base framework to add this new functionality. Adding a new set of functions to a framework that is already in use requires a lot of attention and thoroughness to ensure that nothing breaks, not to mention the extra bit of reading needed to understand how things used to work before.

However, had we designed the classes like this instead, we could have saved a lot of trouble in this area.

framework.addCallBack(META, fn1)
framework.addCallBack(META, fn2)
framework.addCallBack(META, fn3)

framework.addCallBack(DATA, fn_x1)
framework.addCallBack(DATA, fn_x2)
framework.addCallBack(DATA, fn_x3)

framework.DataToBeCollected = MetaDataSpecs

framework.process(META)   (invokes the callback functions tagged with META: fn1, fn2, fn3 on the metadata)
framework.collect()
framework.process(DATA)   (invokes the callback functions tagged with DATA: fn_x1, fn_x2, fn_x3 on the collected data)
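In C++ the tagged-callback design could be sketched as below. This is an illustration only: the Stage enum, the Framework class and the use of a vector of strings as the payload are all assumptions, not the real framework's types.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical processing stages. A new stage is one more enumerator;
// the Framework class below does not change.
enum class Stage { META, DATA };

class Framework {
public:
    // A callback receives the payload (stand-in for metadata or collected data).
    using Callback = std::function<void(std::vector<std::string>&)>;

    // Register a callback under a tag instead of in a dedicated list.
    void addCallBack(Stage stage, Callback fn) {
        callbacks_[stage].push_back(std::move(fn));
    }

    // Invoke, in registration order, every callback tagged with `stage`.
    void process(Stage stage, std::vector<std::string>& payload) {
        for (auto& fn : callbacks_[stage]) {
            fn(payload);
        }
    }

private:
    std::map<Stage, std::vector<Callback>> callbacks_;
};
```

Adding a post-save stage later would mean adding one enumerator and registering callbacks against it, without touching addCallBack or process.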

Extensibility

I hope the above example was able to demonstrate what it means to have extensibility, aka maintainability, in a design. Extensibility in common parlance means avoiding changes to existing code, i.e. code which is already tested and running should not require modification to add new functionality. Changes are evil, extensibility is cool.

Changes to existing code are nothing but lost investment and time spent on old, stable code. Classes should be designed to avoid them. The cool way of expressing this idea is to say that classes should be open for extension, but closed for modification.

* Apple systems almost always seem to achieve and exceed their aesthetic goals, but unlike the systems built by the legendary Steve Wozniak, none of them seems to have achieved the same level of engineering wizardry. (Though the spirit of engineering excellence might have survived.)

The design from part 2 of the series demonstrated the flexibility arising from the ability to mix and match framework classes or roll custom classes. In this section let us check out another aspect of design, arising from a very common but frequently ignored paradigm.

Check out how our current design handles multiple collectors used as part of the framework:

// pSoap and pSnmp are ICollector * members of UserClass
UserClass::UserClass()
{
    pSoap = new CSoapCollector(SoapMetaData);
    pSnmp = new CSnmpCollector(SnmpMetaData);
}

void UserClass::Collect()
{
    pSoap->Collect();   // + error checking / logging / storing data pointers etc
    pSnmp->Collect();   // + error checking / logging / storing data pointers etc
}

void UserClass::Save()
{
    pSoap->GetData();
    // Loop tables and save data / check errors using DataSave class.

    pSnmp->GetData();
    // Loop tables and save data / check errors using DataSave class.
}

AVOID CODE DUPLICATION

The more collectors are used, the more code is duplicated, and this gets multiplied across the number of user classes that are created. Imagine the number of changes that have to be made if one were to change even a simple thing like an error code. Hiving off such common functionality to a helper class would be useful.

class CCollectorBasket
{
public:
    // Adds the collector instance to the list of collectors present in the basket
    void AddCollector(ICollector *);

    void Collect()
    {
        /* loops over all collectors
           invokes Collect()
           does error checks */
    }

    void Poll()
    {
        /* loops over all collectors
           invokes Poll()
           does error checks etc */
    }

    void Inspect()
    {
        /* loops over all collectors
           loops over data collected from the collectors
           hands them to the inspection class */
    }
};

After employing the helper class in our design, the user classes would now look like this, unless of course any custom mixing is required.

class UserClass
{
    CCollectorBasket oBasket;

public:
    UserClass()
    {
        oBasket.AddCollector(new CSnmpCollector(MetaData));
        oBasket.AddCollector(new CSoapCollector(MetaData));
    }

    void Poll()
    {
        oBasket.Poll();
    }

    void Save()
    {
        oBasket.Save();   // assumes the basket also offers a Save()
    }
};

ps: Note that the user does not have to use the CCollectorBasket if it is not required, and retains all freedom to create a custom mix of collectors and the work done using them.
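For the curious, a compilable sketch of the basket idea is given below. The ICollector shown here is a stand-in (the real interface is not listed in this series), FakeCollector exists only to exercise the basket, and ownership through std::unique_ptr is my assumption rather than what the framework does.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the framework's collector interface.
class ICollector {
public:
    virtual ~ICollector() = default;
    virtual bool Collect() = 0;            // returns true on success
    virtual std::string Name() const = 0;  // used when reporting errors
};

class CCollectorBasket {
public:
    // The basket takes ownership of every collector added to it.
    void AddCollector(std::unique_ptr<ICollector> collector) {
        collectors_.push_back(std::move(collector));
    }

    // Runs every collector. The error handling lives here once instead of
    // being repeated in each user class. Returns the names of the
    // collectors that failed.
    std::vector<std::string> Collect() {
        std::vector<std::string> failed;
        for (auto& collector : collectors_) {
            if (!collector->Collect()) {
                failed.push_back(collector->Name());
            }
        }
        return failed;
    }

private:
    std::vector<std::unique_ptr<ICollector>> collectors_;
};

// Hypothetical collector that succeeds or fails on demand.
class FakeCollector : public ICollector {
public:
    FakeCollector(std::string name, bool ok) : name_(std::move(name)), ok_(ok) {}
    bool Collect() override { return ok_; }
    std::string Name() const override { return name_; }

private:
    std::string name_;
    bool ok_;
};
```

Changing how a failure is reported now means touching CCollectorBasket::Collect() and nothing else.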

OO theory states that if you have a class Fly(), the next time you require flight you could reuse Fly(). That class would get an A+ for reuse, right? However, experience will teach you that you will never require the exact Fly() capability ever again. What might be useful, though, are methods like Find-Co-ordinates() / GetRadioSignal() / GetFlyingTime() / GetTimeZone() / ConvertToTimeZone() / GetDestinationCoordinates() / GetWeatherForeCast() and such. Even these might need to be customized; nevertheless you will find this type of minute reuse more plausible than the wholesale reuse that those countless books on C++ in 24 hours would like you to believe in.

PARANOID DE-DUPLICATION

Knowing that reuse is good is one thing. Knowing which code can be reused is another. It is likely that the original Fly() author never, in his wildest dreams, imagined that the system he created would be a modest success (aka make profits) and form the basis of two or three other implementations built on the same base code. Therefore the chances of the system being designed for reuse as a base library are remote. What might help here is a paranoid fear of duplication which, if followed religiously, will lead to the creation of countless small pockets of code that can be reused and add value to the overall system.

It is not just concrete action functions like ConvertToTimeZone() that should be de-duplicated; just about ANY code fits this bill. Even if they might not make sense in another context, these blocks of de-duplicated code make sense from a maintainability and extensibility point of view. Once factored out, like fractals that grow on their own, these pieces of functionality slowly start to make sense on their own, and useful reasons for extending them keep emerging as the code base grows. Frameworks are all about value add, and any additional value keeps adding to the overall usability, until it finally reaches the tipping point that makes a product or code valuable enough to be purchased.

In the next part of the series we shall look at another piece of code that got factored into its own small class to aid in reuse.

Part 1 of the series, on real life class design using C++, established our requirement

– we require a framework for collecting data easily from network devices –

The act of collecting network data is popularly termed as Inventory collection in the NMS (Network Management System) space. Now that we have a requirement, how do we implement this in C++?

It does make a difference who you learn it from


When I started out with C++, I learned the language mostly by reading multiple text books on the topic. My assumption was that if I went through at least 4-5 texts, I would encounter more facts (tid-bits, really) than any one author could cover in a single book.

Unfortunately I was right about this assumption, because that is what I remained for a long time: a tid-bit person. I could quote clever tricks with the type system, quote reams of incompatibilities and feature comparisons between compilers, and put my finger on many such clever-sounding facts.

Ideal Learning

Alas, this, my dear reader, while definitely good to know for practicing programmers, is not the MOST important piece of knowledge you need from C++. The thing that really matters is how you would use C++ to control complexity and make maintenance easier, which in reality is the most important reason why you should be using C++ rather than, say, C or FORTRAN or something else.

The Swiss Army Design Methodology

Instead, most C++ text books I encountered from local publishers start out with a SINGLE chapter meant to impress the power of C++ on you, giving an example of a CWindow class which reduces the entire WinMain to 3 or 4 lines, or a CEmployee class which can do everything related to employee details.

Many in one

All that was required of you was to stick more and more functions into these super classes and you had all the benefits of C++, much like the Swiss army knives which come with 5, 10 or 15 different functions, from your basic knife to a basic lens.

If you learned C++ in this manner, and read similar crappy text books, this is how you might start out your design of the Inventory Collection Framework.

Swiss Army Inventory Class


class CInventory
    {
    public:
    void Collect (MetaData *);
    void PollData (MetaData *);
    void InspectData (MetaData *);
    void SaveData (MetaData *);
    };

This single class would be THE framework if you are from the Swiss army design school. It could do everything related to inventory collection, and all that your users would have to do is invoke this class and its member functions. I wish using C++ to control complexity were as simple as this.

Whats Wrong with Swiss Army Classes

Swiss army is the easiest way to get off the design ground and start coding. In fact, that's just what people do when they try to design as they code. Soon you will see that this one big class has many smells:

1. Way too many unique functions ( > 10?)
2. Way too many member variables ( > 10?)
3. Too many lines of code in class source ( > 300 ?)

Think of what you would have done without the BIG class you just "designed". All the individual functions that are a part of this class would have existed outside of it, say in main.cpp, and supposedly this is bad for program maintenance and for getting complexity under control.

All that you have achieved by putting together this one big class is to transfer all of this complexity from main.cpp into CInventory.cpp, so that your main function looks much neater. So get this into your head: cleaning up main() is not the reason why you write classes.

Separation of Concerns

The little tools that come with the Swiss army knife are just that: tiny little tools that can be used in an emergency but are never as good as the real big carving knife or the real lens or the big scissors. So that's what we will do with our big Inventory class: break it up into proper classes that each handle one responsibility and do that well.

The Broken up CInventory Class

  1. class DataSave; // Reads MetaData and saves the data
  2. class Inspect; // Reads MetaData and allows data processing functions to be invoked
  3. class SnmpCollect; // Reads MetaData and performs data collection
  4. class CInventory; // Central class to integrate all the above

Once you have done the breakup of functionality, more possibilities open up.

Possibilities

  1. Why not make the SnmpCollector class an implementation of ICollect, such that later on I can have a SOAP data collection class that implements ICollect and just plugs into this framework, truly making multiple data sources a possibility?
  2. MetaData is being read again and again by all the different classes. Why not make this a single class, such that no other class needs to understand how to do this? Additionally, if I change my metadata format from structures in .h files to, say, XML, I need not worry about a wave of changes in other areas of my program.
  3. Do I need to make DataSave an implementation of IDataSave, such that I can have different classes for, say, talking to SQL Server / PostgreSQL?
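The third possibility might be sketched like this. Everything here is hypothetical: the Row alias stands in for the framework's table data, and CMemoryDataSave is a toy backend invented for the example; a real SQL Server or PostgreSQL class would implement the same interface.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for one row of collected data.
using Row = std::vector<std::string>;

// The only thing the rest of the framework knows about persistence.
class IDataSave {
public:
    virtual ~IDataSave() = default;
    virtual void SaveRow(const std::string& table, const Row& row) = 0;
};

// Toy backend that remembers "table:columnCount" for each saved row.
class CMemoryDataSave : public IDataSave {
public:
    void SaveRow(const std::string& table, const Row& row) override {
        saved_.push_back(table + ":" + std::to_string(row.size()));
    }
    const std::vector<std::string>& Saved() const { return saved_; }

private:
    std::vector<std::string> saved_;
};

// Framework code depends on IDataSave only, never on a concrete database.
// Returns the number of rows handed to the backend.
std::size_t SaveTable(IDataSave& sink, const std::string& table,
                      const std::vector<Row>& rows) {
    for (const auto& row : rows) {
        sink.SaveRow(table, row);
    }
    return rows.size();
}
```

Swapping databases then means writing one new IDataSave implementation; SaveTable and its callers stay untouched.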

After further breakup –

  1. class CMetaData; //Knows how to interpret MetaData and translate them into generic data specs of row/table
  2. class CRow,CColumn, CTable,CAttribute, CScalars – Common classes to represent the interpreted MetaData
  3. class CDataSave; // saves the data, given an in-memory data spec
  4. class CAdoDbase; // Provides implementation of data persistence for all ADO compatible data sources in windows
  5. class CDataCollectionConductor; // Interprets the data spec and invokes the required ICollector functions
  6. interface ICollector;
  7. class SnmpCollector; // Implements ICollector and performs SNMP;
  8. class CDataInspect; // Provides a way to inspect your data before and after data collection and modify it
  9. class CDataCollectorBucket; // Provides ways to put together multiple instances of data conductors and save them

Benefits

  1. The entire program and its associated complexity is divided into nice little pieces that you can understand
  2. The pieces can be improved without affecting others
  3. Maybe the naming of the little pieces would be good enough to add more meaning to the program.
  4. Easier to split work between different developers.
  5. Better chances of reuse
  6. Flexibility – It's easy to use pieces of the framework without swallowing the whole design.

The most important benefit of distributing the scope and the different responsibilities amongst multiple classes was that new patterns began to emerge. We could find further commonalities that were ideal to be spun off into their own individual classes, using new interfaces to break dependencies. Only when you get to this stage do you start seeing the benefits of other features of C++, like virtual functions, interfaces, inheritance and such.

So the guide to getting on the track to good design is to start splitting your classes into smaller ones, with specific responsibilities and specific NAMES. (Your class names are an important smell as to how good the design is.)