C++


The latest version of Visual Studio is here, and here are the editions it ships in (Premium / Ultimate / Professional) and their feature differences.

Personally I'm most thrilled about these VC++ features, and will be testing them out in this order:

  • Call Hierarchy Window (great for learning existing code layout)
  • VC++ Parallel programming Library (PPL)
  • Return of the MFC class wizard.
  • Ribbon Designer for MFC. Wonder how this control works and how to define a context.
  • Code profiling and coverage, with emphasis on threading (out with Rational for these tasks)
  • Generating code and architecture diagrams
  • Recordings of manual tests and their playback. A wow for QA teams, I suppose?

🙂

I recently came across a cool replacement for Acrobat Reader: Foxit. Why do I want to use Foxit?

  1. Less memory consumption
  2. Feels more responsive
  3. Tabbed reading

It goes to prove that building a better mousetrap still works, even against a de facto standard like Adobe Reader for reading PDF files.

Here is the list of MAJOR breaking changes to be aware of when moving from VC 7.1 (Visual Studio 2003) to VC 8.0 (Visual C++ 2005).

Pretty old information, but there was no page that listed everything important in one place with contextual info. For a FULL list of C++ changes consult MSDN.

New features in Visual Studio 2005 will be a separate (bigger) post.

Breaking Changes in Environment

1. VS 2005 removes support for Windows 95 and NT.

2. Single-threaded C run-time library support is no longer available.

3. New versions of the common shared DLLs have been introduced. Mixing code that uses different DLL versions could cause issues.

4. The application and library deployment model has changed. Core run-time DLLs now have to be stored in the system cache (WinSxS) as side-by-side assemblies, or as private assemblies, and applications can link to the DLLs ONLY by describing them in application manifests (generated and embedded automatically into the binary by VS 2005). More info here. Some programmatic info about manifests here.

5. A new run-time DLL (MSVCP80.DLL) is required for applications linking against the DLL version of the CRT.

NOTE: These are the new common DLLs: MSVCR80.dll, MFC80*.dll, ATL80.dll, MSVCP80.DLL. The version number used is 8.0.50608.0. (This is shipped with Vista too.) However, your app might link to other versions during compilation due to policy redirects.

NOTE: The version number of the DLLs installed by Visual Studio 2005 SP1 is 8.0.50727.762. If these are not present your app will not run.

Breaking Changes in C++ / Features that show new behaviour


WARNINGS

1. Security is now officially supported. 🙂 Expect loads of deprecation warnings for the ANSI C string routines. Define _CRT_SECURE_NO_DEPRECATE and _SCL_SECURE_NO_DEPRECATE to shut up the compiler. The best way to do this is to put them in the project's preprocessor definitions, rather than figuring out which include file gets included first from all sources.

2. Visual Studio supports lots of custom non-ANSI C functions, but the naming convention for those functions is supposed to be _fn(). If you were using them without the leading underscore you will get warnings.

BREAKING CHANGES

1. new will now by default throw an exception (std::bad_alloc) if it fails, instead of returning NULL, which was the wrong, non-standard behavior up until now. (Vanilla 2005 has a bug which causes the release version to throw and the debug version to return NULL. This has been fixed in SP1. Another reason why you should install SP1 and have a catch block for new.)

2.  volatile has now acquired memory fencing semantics, which changes the behaviour of read-write operations in multi-threaded code.

3. Named Return Value Optimization is now available. Classes that have side effects while being copied / constructed will face problems. Discussed here.

4. CRT functions now have SAL annotations and parameter validation, i.e. standard C functions can now validate their input parameters and will throw asserts in debug mode if used wrongly. This is helpful for catching more errors.

5. std::time_t is now 64 bit (up from 32 bit)

6. STL now supports iterator debugging and checks. Debug builds have STL iterator debugging enabled by default. This will throw asserts and cause failures in incorrect code, and so helps to ensure correctness. It can be switched off by defining _HAS_ITERATOR_DEBUGGING=0 and _SECURE_SCL=0 (not recommended).

7. wchar_t used to be a typedef for unsigned short. Now it is a built-in type, and this can be controlled via project settings (/Zc:wchar_t). So code that treats unsigned short as wchar_t will now break due to type mismatch. You will get linker errors if related modules use different values for this setting, because the C++ name mangling scheme generates different linkage names depending on the wchar_t type.

8. Many CRT functions now set errno where they never used to set it before. This is good for debuggability but bad for code that delays its errno checks.

9. Processor-architecture-specific optimizations have been removed and a blended architecture is targeted. If your binary has subtle bugs that were hidden by the specific code that used to be produced, they might now surface.
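Items 1 and 7 above are the ones most likely to bite in everyday code, so here is a small sketch of both. The function names (try_allocate, try_allocate_nothrow, describe) are made up for illustration; the behaviour shown is what standard C++ specifies, which is what VC 8.0 moved towards.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Item 1: the plain form of new reports failure by throwing std::bad_alloc,
// so a NULL check placed after it never sees a failed allocation.
bool try_allocate(std::size_t n)
{
    try {
        char* p = new char[n];
        delete[] p;
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}

// The old "return NULL" behaviour is still available, but must be requested.
bool try_allocate_nothrow(std::size_t n)
{
    char* p = new (std::nothrow) char[n];
    if (p == NULL) return false;
    delete[] p;
    return true;
}

// Item 7: with wchar_t as a built-in type these are two distinct overloads.
// When wchar_t is a typedef for unsigned short they would be one and the same
// function, and the second definition would not compile.
std::string describe(unsigned short) { return "unsigned short"; }
std::string describe(wchar_t)        { return "wchar_t"; }
```

Calling describe(L'a') picks the wchar_t overload here; old code that passed wide characters to a function taking unsigned short relied on the two being the same type.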

COMPILATION ERRORS / REMOVED FEATURES

1. The global functions and variables that report OS versions from within the CRT are now gone.

2. Some C++ functions (e.g. strstr) now return const pointers instead.

3. You now need to use the address-of operator (&) to get the address of a class member function. Link

4. Managed C++ support is now dropped; it has been replaced by C++/CLI.

5. The exception class has been moved to the std namespace.
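Item 3 is easiest to see in code. A minimal sketch, using a made-up Counter class: the & and the fully qualified name are both required by standard C++ when forming a pointer to a member function.

```cpp
struct Counter {
    int value;
    Counter() : value(0) {}
    void increment() { ++value; }
};

// Pointer-to-member-function type for Counter.
typedef void (Counter::*CounterAction)();

int apply_twice()
{
    // Writing just `Counter::increment` (no &) is what older compilers
    // tolerated; the standard form is shown here.
    CounterAction action = &Counter::increment;

    Counter c;
    (c.*action)();
    (c.*action)();
    return c.value;   // 2: increment ran twice through the pointer
}
```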

ps : Here is a brief List of Visual Studio and .NET versions

The basic idea behind templates? It's been done before!!

The dictionary definition of template is "something that serves as a model / pattern".

Patterns... hmmmm... Ever heard of patterns in software? The idea behind patterns is that no software is totally unique. The same way all cars have a steering wheel and a set of tires, most software is also made up of similar pieces.

If you have to accept user input, that's been done before. If you have to sort, that's been done before. If you want to use a hash, that's been done too. The only difference in a new piece of software is the way all this common stuff is put together. So why re-invent the wheel... err, sort?

The inspiration !!!

So let's say you have written a sort program in C to sort an array of integers. And let's say you have used binary search in it, which is infamously hard to get right. After all those fixes and tests, you have an efficient, fast implementation of your sort.

Just as you were about to finish the task, the requirements change, and now you get an array of structures to be sorted instead of integers. Oops!!! Wouldn't it be easier to instead have some way of encapsulating the logic behind the sort and have it work for all data types? That way we could save a lot of time whenever we create new code or modify existing code.

This thought is the inspiration behind templates. Almost all advanced languages provide this facility. (Java and C# only recently acquired it.)

Logic vs Data

Give a thought to how you would make this logic reusable:

int AddNumbers(int a, int b)
{
    return a + b;
}

There is a way in C++ to achieve this without templates, using overloaded functions:

int Add(int a, int b) { return a + b; }
char Add(char a, char b) { return a + b; }
double Add(double a, double b) { return a + b; }

You get the idea. Once we have written enough overloaded functions (imagine the pain!!), we can write code that simply calls Add(x, y) and the right version of the function gets invoked.

The realization is that the type of the data is an important constraint when creating reusable logic. Even if the logic is reusable, since the data types differ, statically compiled languages like C and C++ are forced to create multiple versions of the same code to satisfy compiler requirements.

Wouldn't it be cool if the compiler did this for you and saved you all the trouble of defining and creating many such overloaded functions?

Enter templates …

template <class T>
T Add(T a, T b) { return a + b; }

That's it. This will work EXACTLY LIKE THE OVERLOADED functions shown above.

There will be many versions of the function Add() created inside your binary, just as with overloaded functions.
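To see the instantiation at work, here is a small sketch that calls the same template with three different types. The variable names are just for illustration.

```cpp
#include <string>

template <class T>
T Add(T a, T b) { return a + b; }

// Each call below makes the compiler stamp out a separate Add for that type.
const int         sumOfInts    = Add(2, 3);        // Add<int>
const double      sumOfDoubles = Add(1.5, 2.25);   // Add<double>
const std::string joined       = Add(std::string("ab"), std::string("cd"));  // Add<std::string>
```

One place where this differs from hand-written overloads: a mixed call such as Add(1, 2.5) does not compile as written, because T cannot be both int and double. Spelling out the type, as in Add<double>(1, 2.5), resolves it.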

What then is the difference between an overload and a template?

When you use a template, the compiler is intelligent enough to make a first pass through your code to find which versions of the function template (from now on simply referred to as the template) you require.

So if you have invoked Add with double as its parameter type, the compiler will EMIT code, replacing T with double wherever the type T is used inside the template of the function Add. If it encounters Add with, say, a structure type used as the parameter, it will EMIT the Add template code with the structure type used in place of T.

In C++, templates are purely a compiler technology

As you can see, templates work by creating copies of the real logic, substituting the type used in the algorithm with the one actually passed in, much like a programmer would copy-paste his original code and change the types. This code is then compiled along with the other regular code and voila, your program magically has all the overloaded definitions of the logic it requires.

Templates vs Overloads

Though a template ends up implemented as overloads, the basic difference is that templates are compiler generated. Therefore the compiler knows which overloads are required and only generates those.

So for example, if program A includes the Add template and only uses it for adding integers, there would be only one overloaded form of Add created in the program. If program B adds both ints and doubles, two of these would be created by the compiler. As you can see, templated code is "instantiated" based on the source code that uses it.

Overloads, on the other hand, have to exist before you can use them, and most importantly, someone has to have written them by hand (buggy + painful) for it to work. Not so cool or easy to use, if you ask me.

Any more uses ?

Let me spell this out, in case you missed the idea: templates are actually creating new code. CODE GENERATION!!!! Now that is the holy grail of computing. Imagine if computers could generate code automatically!! Templates are one step towards this. They give us a way to write programs that create other programs. This gives ultimate power (i.e. if you are a geek).

Clever people have used and abused this technique for even more powerful stuff. How about passing a templated function or class itself as a parameter to another template?
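Passing a template to a template sounds exotic, but it is only a few lines. Here is a minimal sketch, with a made-up Stack class whose underlying container is itself a template parameter (a so-called template template parameter):

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

// Container is itself a template, handed to another template as a parameter.
// It takes two arguments because std::vector and std::list both take an
// element type and an allocator type.
template <class T, template <class, class> class Container>
class Stack {
public:
    void push(const T& value) { items_.push_back(value); }

    // Removes and returns the most recently pushed value.
    // Calling pop() on an empty Stack is not handled in this sketch.
    T pop()
    {
        T top = items_.back();
        items_.pop_back();
        return top;
    }

    std::size_t size() const { return items_.size(); }

private:
    Container<T, std::allocator<T> > items_;
};
```

Stack<int, std::vector> and Stack<int, std::list> are then two different classes generated from the same source, without the caller ever writing std::vector<int> by hand.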

Drawbacks

Most people are put off by the syntax of the code and, more importantly, the syntax of the error messages that templates produce. The angle brackets that templates use can be nested to define a template of a template of... etc.

If you remember that the compiler simply substitutes the actual type into the template and then compiles the code, you can see why it creates error messages that look something like class A < class B , class C > :: line x error. Some people hate this, and I must admit it becomes more difficult to parse once you start using templates in more powerful ways.

But that is no reason to ignore something as powerful as templates. IMO if you are not using templates, you are simply using C++ as a glorified C and nothing more.

So get cracking and write some cool logic, THAT WILL BE REUSED.

ps: Here is a real-life example where I could use templates to create a generic solution to a very commonly occurring problem.

Cisco netmanager product version 1.1 now features a new notification filtering feature, which allows users to specify rules to filter the email notifications they choose to receive. To implement this feature I had to implement a rule engine, which would filter the event notifications according to the rules specified. In this section we shall investigate how this can be done in C++.

Though the new capability is fairly straightforward, the implementation is not so... well, simple. When events happen in the system, the only information at hand is the type of event that has occurred and the device that caused the event. Based on this, the system has to filter the event on a combination of groups or devices, the event severity and the actual event type. A single rule can specify multiple groups and multiple events.

The Challenge

The primary difficulty in applying the rules in the netmanager system is that netmanager has the capability to create DYNAMIC groups, i.e. you could write an SQL query that returns the devices belonging to the group. A device can also be part of multiple groups at the same time due to these dynamic queries. In order to match a device against a group, one would have to evaluate all these SQL queries at run time, which is clearly not feasible. Therefore there has to be some system that determines, in an efficient and fast manner, the groups a device belongs to.

In other words, given a relation like A contains {1,4,5} and B contains {1,4,6,7}, we should be able to say that the value 4 is contained in sets A and B. How do you create something like this using C++?


Solution – Reverse lookup sets

The challenge we face here is the computation cost of evaluating n queries each time an event occurs. Caching is a good solution: store precomputed results and offset the computational overhead.

Therefore in this case

Set A Contains -> {1,4,5}
Set B Contains -> {1,4,6,7}

Pre-computing the reverse mappings would produce mappings that store the values against the set identifiers.

1 -> {A,B}
4 -> {A,B}
5 -> {A}
6 -> {B}
7 -> {B}

With this data structure in place, finding the presence of a value in any of the sets is now reduced to a single hash lookup, of O(1) cost.
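The precomputation step can be sketched in a few lines. This is an illustration only: it assumes std::unordered_map as the hash table and uses made-up names (SetsById, SetsByValue, buildReverseLookup), with chars as set identifiers to match the A / B example above.

```cpp
#include <map>
#include <unordered_map>
#include <vector>

typedef std::map<char, std::vector<int> >           SetsById;     // 'A' -> {1,4,5}
typedef std::unordered_map<int, std::vector<char> > SetsByValue;  // 4 -> {'A','B'}

// Walks every set once and records, for each value, which sets contain it.
SetsByValue buildReverseLookup(const SetsById& sets)
{
    SetsByValue reverse;
    for (const auto& entry : sets)
        for (int value : entry.second)
            reverse[value].push_back(entry.first);
    return reverse;
}
```

The one-time cost is a pass over all members of all sets; after that, each event needs a single lookup instead of n query evaluations.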

C++ Implementation

Well, since we have sets, nothing beats the STL vector class for fast access to a set of values. Since you need a fast map, the STL hash_map is the best in its class. As you can see, STL, written using templates, gives you some really useful libraries (mostly containers and algorithms) that you can use in any situation. But this barely scratches the surface of what STL can do.

How would you represent our solution in C++?

First off we need to represent a set of integers. Since we are using a vector, this can be represented as vector<int>. Next we need a hash that will map the integer values to a vector containing the set identifiers.

An integer -> vector<int> mapping is required.

Therefore this would be represented as a hash:

hash_map<int, std::vector<int> >

Not Generic

However, as you might have noticed, even though we decomposed the problem into a generic set mapping problem, we are now speaking in concrete terms like int and std::vector<int>. This will not help in generalization at all. If we do not generalize, we cannot create a class that can be re-used in multiple situations. If a class is not generic enough, folks who want to reuse it (e.g. to find certain names in multiple sets) will be faced with the uphill task of customizing or modifying old code, and no one will take that up because everyone "knows" it is easier to write new code than to modify old code (a popular fallacy).

Therefore we need to implement the solution to this problem such that you can create a reverse index of ANY type of values, i.e. the sets can hold numbers or strings or classes or any other type. The class we write should be able to reverse-calculate the memberships. Only then will the code we write be more useful and achieve broader reuse.

Generic C++ Solution

This was solved by creating a templated class that takes two types as its arguments, which represent the type of the set identifier (A, B, C) and the type of the values that are contained in these sets. That's right. The class takes TYPES as its arguments.

It then creates the right type of hashes and vectors inside it, of type int or string or whatever type you chose, and performs the mapping and reverse mapping on these types. Here is the actual code that does this in just 3 pages. (WARNING: this code will look scary if you do not know the hash_map declaration syntax.)

Isn't it cool? What you have just done is specify an algorithm or a procedure independent of the types it works on. This allows your algorithm to be used against any data type, and more importantly, other folks with similar requirements can easily reuse your code. This class can now be used to create precomputed lookups for any kind of data type.
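The original listing is not reproduced in this text. To give a rough idea of the shape such a class could take, here is a minimal sketch. It is not the original code: it assumes std::unordered_map in place of hash_map, and the names ReverseLookup, add, members and setsContaining are made up for illustration.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// SetId: type of the set identifier (A, B, ...). Value: type of the members.
// Both must be hashable and comparable, as std::unordered_map requires.
template <class SetId, class Value>
class ReverseLookup {
public:
    // Records that `value` is a member of the set `id`.
    // Adding the same pair twice stores it twice; this sketch does not de-duplicate.
    void add(const SetId& id, const Value& value)
    {
        forward_[id].push_back(value);
        reverse_[value].push_back(id);
    }

    // Members of the set `id`; empty if the set is unknown.
    std::vector<Value> members(const SetId& id) const
    {
        auto it = forward_.find(id);
        return it == forward_.end() ? std::vector<Value>() : it->second;
    }

    // Identifiers of the sets containing `value`; empty if it is in none.
    std::vector<SetId> setsContaining(const Value& value) const
    {
        auto it = reverse_.find(value);
        return it == reverse_.end() ? std::vector<SetId>() : it->second;
    }

private:
    std::unordered_map<SetId, std::vector<Value> > forward_;
    std::unordered_map<Value, std::vector<SetId> > reverse_;
};
```

ReverseLookup<std::string, int> covers the device / group case above, and ReverseLookup<int, std::string> would index names against numbered sets, with no change to the class.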

STL Power

What you have seen above is why STL is so useful and powerful a tool. It allows you to encapsulate a concept (in this case reverse lookups on precomputed indexes), compared to day-to-day code which encapsulates an extremely specific action. In fact templates allow you to generate the code that does the type-specific task your program requires.

Once you start writing code using templates, the surrounding code soon becomes templatized too, and soon you will find yourself writing entire templates of generic algorithms that can be reused as a body of code. In fact the whole of STL has been created in this manner. Someone thought of generalizing algorithms, and then they required other artifacts (e.g. the iterator concept, a generalization of the pointer) to support this, and so on and so forth. It is a really powerful concept and one helluva useful tool.

The downside

Unfortunately the readability is a bit of a pain, but once you get used to this sort of code it becomes easy. The payoffs are huge and the work satisfaction gets a major boost too.

Design Compromises

The ideal design would be a mix of all good things in life, which as we know, is never ever achievable.

Therefore, in real life, designs strive to excel in at least one factor, which determines the personality of the system being constructed. A design could therefore tend towards performance, simplicity, aesthetic beauty (I hope you thought of Apple here *) and so on.

Whenever a design achieves more than one of these goals, it comes into the realm of classics.

Change

I will use a piece of the data collection framework which we were discussing to introduce the concept of change. Our framework uses a callback pattern, whereby handlers can be associated with the data being collected and later invoked from within the framework to enable modifications and processing.

This used to be done like this

framework.addProcessor(fn_x1)
framework.addProcessor(fn_x2)
framework.addProcessor(fn_x3)
framework.DataToBeCollected = MetaDataSpecs;
framework.collect()
framework.process()      (invokes the callback functions x1, x2, x3 on the collected data)

Later on, a need for pre-processing the metadata was felt, i.e. even before the data was collected, some processing had to be done, like loading already collected data into the columns from the database so that it need not be collected again.

We achieved this by adding a new class of functions to the framework. Now the sequence of calls became something like this:

framework.addPreProcessor(fn1)
framework.addPreProcessor(fn2)
framework.addPreProcessor(fn3)
framework.addProcessor(fn_x1)
framework.addProcessor(fn_x2)
framework.addProcessor(fn_x3)
framework.DataToBeCollected = MetaDataSpecs;
framework.preprocess()   (invokes the callback functions fn1, fn2, fn3 on the metadata)
framework.collect()
framework.process()      (invokes the callback functions x1, x2, x3 on the collected data)

This is stupid. We had to add a new set of functions to the base framework to get this new functionality. Adding a new set of functions to a framework that is already in use requires a lot of attention and thoroughness to ensure that nothing breaks, not to mention that extra bit of reading to understand how things used to work before.

However, had we designed the classes like this instead, we could have saved a lot of trouble in this area.

framework.addCallBack(META, fn1)
framework.addCallBack(META, fn2)
framework.addCallBack(META, fn3)

framework.addCallBack(DATA, fn_x1)
framework.addCallBack(DATA, fn_x2)
framework.addCallBack(DATA, fn_x3)

framework.DataToBeCollected = MetaDataSpecs;
framework.process(META)  (invokes the callback functions tagged with META: fn1, fn2, fn3 on the metadata)
framework.collect()
framework.process(DATA)  (invokes the callback functions tagged with DATA: x1, x2, x3 on the collected data)
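In C++ the tagged version could look roughly like the sketch below. It is only an illustration of the idea: the Stage enum, the std::string payload and the use of std::function are assumptions, not the actual framework.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

enum Stage { META, DATA };   // a new stage is a new enum value, not a new method

class Framework {
public:
    typedef std::function<void(std::string&)> Callback;

    // Registers fn under the given stage tag.
    void addCallBack(Stage stage, Callback fn)
    {
        callbacks_[stage].push_back(fn);
    }

    // Invokes every callback tagged with `stage`, in registration order.
    void process(Stage stage, std::string& payload)
    {
        for (const Callback& fn : callbacks_[stage])
            fn(payload);
    }

private:
    std::map<Stage, std::vector<Callback> > callbacks_;
};
```

Adding, say, a post-save stage later means adding one enum value; addCallBack and process stay exactly as they are.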

Extensibility

I hope the above example was able to demonstrate what it means to have extensibility, aka maintainability, in design. Extensibility in common parlance means avoiding changes to existing code, i.e. code which is already tested and running should not require modification to add new functionality. Changes are evil, extensibility is cool.

Changes to existing code are nothing but lost investment and time spent on old stable code. Classes should be designed to avoid this. The cool way of expressing this idea is to say that classes should be open for extension, but closed for modification.

* Apple systems almost always seem to achieve and exceed their aesthetic goals, but unlike the systems built by the legendary Steve Wozniak, none of them seem to have reached the same level of engineering wizardry (though the spirit of engineering excellence might have survived).

The design from part 2 of the series demonstrated the flexibility arising from the ability to mix and match framework classes or roll custom classes. In this section let us check out another aspect of design, arising from a very common but frequently ignored paradigm.

Check out how our current design handles multiple collectors used as part of the framework:

UserClass::UserClass()
{
    // pSoap and pSnmp are ICollector * members of UserClass
    pSoap = new CSoapCollector(SoapMetaData);
    pSnmp = new CSnmpCollector(SnmpMetaData);
}

UserClass::Collect()
{
    pSoap->Collect();   // + error checking / logging / storing data pointers etc
    pSnmp->Collect();   // + error checking / logging / storing data pointers etc
}

UserClass::Save()
{
    pSoap->GetData();
    // Loop tables and save data / check errors using DataSave class.

    pSnmp->GetData();
    // Loop tables and save data / check errors using DataSave class.
}

AVOID CODE DUPLICATION

The more collectors are used, the more code is duplicated, and this gets multiplied across the number of user classes that are created. Imagine the number of changes that have to be made if one were to change even a simple thing like an error code. Hiving off such common functionality to a helper class would be useful.

class CCollectorBasket
{
    // adds the collector instance to the list of collectors present in the basket
    AddCollector(ICollector *);

    Collect()
    {
        /* loops over all collectors
           invokes Collect()
           does error checks */
    }

    Poll()
    {
        /* loops over all collectors
           invokes Poll()
           does error checks etc */
    }

    Inspect()
    {
        /* loops over all collectors
           loops over data collected from the collectors
           hands them to the inspection class */
    }
};
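A compilable sketch of the basket idea is shown below. It is deliberately cut down and rests on assumptions: the real ICollector interface is not shown in this post, so a one-method version is invented here, ownership is handled with std::unique_ptr rather than the raw pointer in the outline above, and CStubCollector exists only to exercise the loop.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical minimal interface, standing in for the real ICollector.
class ICollector {
public:
    virtual ~ICollector() {}
    virtual bool Collect() = 0;   // returns false on failure
};

// Stand-in collector that simply reports a preset result.
class CStubCollector : public ICollector {
public:
    explicit CStubCollector(bool succeeds) : succeeds_(succeeds) {}
    bool Collect() { return succeeds_; }
private:
    bool succeeds_;
};

class CCollectorBasket {
public:
    // Adds the collector to the list held by the basket, which now owns it.
    void AddCollector(std::unique_ptr<ICollector> collector)
    {
        collectors_.push_back(std::move(collector));
    }

    // Invokes Collect() on every collector and returns the number of
    // failures, so the error handling lives in exactly one place.
    std::size_t Collect()
    {
        std::size_t failures = 0;
        for (const auto& collector : collectors_)
            if (!collector->Collect())
                ++failures;
        return failures;
    }

private:
    std::vector<std::unique_ptr<ICollector> > collectors_;
};
```

Poll() and Inspect() would follow the same loop-and-check pattern, which is exactly the duplication the basket is meant to absorb.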

After employing the helper class in our design, the user classes would now look like this, unless of course some custom mixing is required.

class UserClass
{
    CCollectorBasket oBasket;

    UserClass()
    {
        oBasket.AddCollector(new CSnmpCollector(MetaData));
        oBasket.AddCollector(new CSoapCollector(MetaData));
    }

    Poll()
    {
        oBasket.Poll();
    }

    Save()
    {
        oBasket.Save();
    }
};

ps : Note that the user does not have to use the CCollectorBasket if not required and retains all freedom to create a custom mix of collectors and the work done using them.

OO theory states that if you have a class Fly(), the next time you require flight you can reuse Fly(). That class would get an A+ for reuse, right? However, experience will teach you that you will never ever require the exact Fly() capability again. What might be useful though are methods like FindCoordinates() / GetRadioSignal() / GetFlyingTime() / GetTimeZone() / ConvertToTimeZone() / GetDestinationCoordinates() / GetWeatherForeCast() and such. Even these might need to be customized; nevertheless you will find this type of minute reuse far more plausible than the wholesale reuse those countless "C++ in 24 hours" books would like you to believe in.

PARANOID DE-DUPLICATION

Knowing that reuse is good is one thing. Knowing which code can be reused is another. It is likely that the original Fly() author never, in his wildest dreams, imagined that the system being created would be a modest success (aka make profits) and form the basis of two or three other implementations built on the same code. Therefore the chances of the system having been designed for reuse as a base library are remote. What might help here is a paranoid fear of duplication which, if followed religiously, will lead to the creation of countless small pockets of code that can be reused and that add value to the overall system.

It is not just concrete action functions like ConvertToTimeZone() that should be de-duplicated; just about ANY code fits this bill. Even if they might not make sense in another context, these blocks of de-duplicated code make sense from a maintainability and extensibility point of view. Once factored out, like fractals that grow on their own, these pieces of functionality slowly start to make sense on their own, and useful reasons for extending them keep emerging as the code base grows. Frameworks are all about value add, and every additional bit of value adds to the overall usability, until it finally reaches the tipping point that makes a product or piece of code valuable enough to be purchased.

In the next part of the series we shall look at another piece of code that got factored into its own small class to aid reuse.
