The other day, while automating our load tests (you heard that right: we are even automating load tests as part of our agile efforts), I came across an interesting scenario.

LoadRunner, which we were using to automate load tests of the clustered web application, spread all its output across tons of small files, each containing a few similar lines. The two or three lines in each file looked something like this:

Report.c(199): Error: App Error: refresh report. DataSet took longer than the refresh rate to load.  This may indicate a slow database (e.g. HDS) [loadTime: 37.708480, refreshRate: 26].      [MsgId: MERR-17999]
Report.c(83): Error: App Error: report took longer than refresh rate: 26; iter: 32;  vuser: ?; total time 41.796206    [MsgId: MERR-17999]

But what we wanted to know at the end of the test was a summary of errors, as in:

  • Report.c(199): Error: App Error: refresh report. DataSet took longer than the refresh rate to load.  This may indicate a slow database (e.g. HDS) – 100 errors
  • Report.c(200) Error Null arguments – 50 errors

Approaching the solution

How would you solve this problem? Think for a moment, and not more…

Well, being the C++ junkie that I am, I immediately saw an opportunity to apply my sword and utility belt of hash tables, efficient string comparisons and so on, and whipped out Visual Studio 2010 (I haven't used 2010 much and wanted to try it). And then it occurred to me:

Why not use Java? I had recently learned it, and it might be easier.

The Final Solution

cat *.log | sed 's/total time [0-9]*\.[0-9]*/total time/;s/loadTime: [0-9]*\.[0-9]*/loadTime: /;s/iter: [0-9]*/iter: /;s/refresh rate: [0-9]*/refresh rate: /;s/refreshRate: [0-9]*/refreshRate: /' | sort | uniq -c

I should probably move the list of sed commands into a script file and pass that file to sed (sed -f). But I left it at that, happy that my solution was ready to use, would work all the time, required no maintenance, and saved me a lot of time.
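For comparison, here is a rough sketch of the same normalize-then-count idea in Java. The class and method names (LogSummary, normalize, summarize) are made up for illustration, and the sample lines in main are abbreviated versions of the log lines shown earlier. It is not a drop-in replacement for the pipeline: it does not read the *.log files, and replaceAll rewrites every match in a line, whereas the sed commands above rewrite only the first match per line.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LogSummary {
    // Strip the values that vary from line to line so that
    // otherwise identical errors compare equal.
    static String normalize(String line) {
        return line
            .replaceAll("total time [0-9]*\\.[0-9]*", "total time")
            .replaceAll("loadTime: [0-9]*\\.[0-9]*", "loadTime: ")
            .replaceAll("iter: [0-9]*", "iter: ")
            .replaceAll("refresh rate: [0-9]*", "refresh rate: ")
            .replaceAll("refreshRate: [0-9]*", "refreshRate: ");
    }

    // Rough equivalent of `sort | uniq -c`: a sorted map from
    // normalized line to the number of times it occurred.
    static Map<String, Integer> summarize(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            counts.merge(normalize(line), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Abbreviated sample lines, for illustration only.
        List<String> sample = List.of(
            "Report.c(83): Error: report took longer than refresh rate: 26; iter: 32; total time 41.796206",
            "Report.c(83): Error: report took longer than refresh rate: 26; iter: 7; total time 30.5");
        summarize(sample).forEach((line, count) -> System.out.println(count + " " + line));
    }
}
```

Both sample lines collapse to the same normalized text, so the summary reports a single entry with a count of 2. It also shows how much typing the one-line pipeline saves.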

The end

Thinking about the episode, I was reminded of the hit Japanese samurai movie Zatoichi and its happy ending.

  • The villain (problem) is dead
  • Samurai is happy with his effort
  • The subjects (files/compiler/system/software/hardware) are happy due to the efficiency created
  • Karma (company + job) is happy due to the time saved.

And to boot, it nearly follows the Art of War maxim that advises the clever warrior to win without drawing his sword…

Happy coding!!!

Having been a full-time C++ programmer for all of the past ten years, I recently had the opportunity to study Java seriously, some 10-15 years after it started becoming popular. In the meantime I am also playing with Python on the side. This gives me an unusual perspective: I have seen the past, and at the same time I am looking at two new languages with a fresh eye.

C++ vs Java in 2009 = Assembly vs C++ during the 1980’s

The impressions I get in this new learning process are eerily similar to the doubts and choices I felt and read about during the early stages of my education, while playing around with assembly, C++ and Visual Basic.

[Image: Needs assembly or C]

Assembly was THE choice if speed and memory were paramount and nothing else mattered, but it was damn hard to use. All the old-timers claimed you had to know assembly to understand what was happening on the system, and that C++ was only for productivity improvements. They lamented that new minds never learned or worked seriously with assembly and hence were at a disadvantage when it came to figuring out what really happened inside the box.

However, at the time (the late 1980s) C++ was fast catching up, and even embedded systems were moving to C++ (maybe without using the templates feature). Java is currently in the same position with respect to C++ that C++ was in then with respect to assembly, and there are many who clamour for C or C++ to be taught in universities rather than Java, for the sake of better understanding.

How can Java be as fast as C++?

It is not. However, the hardware that is around now is vastly superior to the hardware of the 1980s and 1990s. Java seemed slow on the machines of its early days, in the same way C++ once seemed slow on small devices compared to raw assembly or binary code. Our current crop of laptops is more powerful than the servers used when Java was beginning to get popular. This improvement in hardware makes the difference in speed immaterial for most practical purposes, because human beings cannot really tell the difference between 500 and 50 microseconds.

Fast forward to 2010

[Image: Is as fast as a 1999 computer and runs Java]

C++ is what you require if you need pure speed, as in critical loops or core libraries. You could substitute the word Assembly for C++ in the previous line, had this post been written 15 years ago.

But Java is more useful from a business point of view for application development, due to increased productivity and more features. It is therefore no coincidence that embedded devices like mobile phones (and even dog collars, according to Java books) are now running Java. In fact, my choice of the last Nokia mobile phone I bought was guided by the fact that it had a data port through which I could upload custom Java programs onto it.

Java vs Python = C++ vs Visual Basic

When C++ was THE top language around, and assembly was gradually being kicked to the ground, there was something else on the periphery: Visual Basic and the other RAD languages, which basically gave you forms-based development and automatic memory management.

[Image: Internal IT or back office apps need not be cool]

However, they were nowhere near as fast as C++ and hence were relegated to internal IT applications, which never needed to be as slick as professional, cool-looking applications and could be developed fast enough.

Fast development is critical for small shops with fewer resources or IT departments on a shoestring budget. Many business applications certainly fit the bill and were developed in Visual Basic. It is exactly the same situation now with Python when compared to Java.

Python in 2009 = Visual Basic in 1990

Developing in Python definitely seems like a breeze, without much hassle around object orientation, strict rules around exception declaration, or big infrastructures. Is Python as comprehensive as Java, with as many APIs as Java has? Probably not, although it is fast catching up in popularity, just as Java before it caught up with C++, and that is saying a lot. There is no denying that Python and Ruby will soon have many programmers who are trained in them by choice.

[Image: Will run Python as fast as Java]

I have not compared the performance of Java and Python. I have assumed Python is slower, based on what I have read on the web. Please do correct me if I am wrong on this.

If the economy keeps growing like it did in the last ten years, quad-core or even eight-core desktops will soon be commonplace, and we will not have to worry about Python's performance compared to Java. I just hope folks don't introduce messy corporate stuff and make it as verbose and bulky as Java before it.

3 cheers for easy coding!!!

I just noticed that the biggest app running on my work laptop, in terms of memory consumption, is Firefox (3.0.4, if you have to know). It consumes about 500 MB of memory on average. Even Outlook comes only a distant second. Of course I have Visual Studio, MSDN, SQL Server and so on installed. But Firefox beats everything hands down when it comes to how frequently I use it.

All other applications seem to support the browser, or launch it within themselves. No application seems to be made these days without the web as its core interface, and to that end the browser has become a single point of access. Even VMs seem to favour running their clients inside a browser window. Viewed from this angle, the rationale behind Google's shiny new web browser, Chrome, becomes more apparent.

In short, the browser has become a platform of sorts, with the biggest players vying with each other for a piece of the action. With all the VM efforts and JavaScript optimization efforts bearing fruit, the concept of the browser as a platform, and JavaScript as the ultimate language, might be happening faster than we realize!!!

(Note: JavaScript is still about 10x slower than compiled Java, but the point is that everyone involved seems to be accelerating their efforts to close the gap.)

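This was a question someone asked me recently. It was specific to Java, but I consider it to apply to all garbage-collected systems (.NET, JavaScript, etc.). The explanation below uses reference counting as a mental model. The JVM and .NET actually use tracing collectors that check whether an object is still reachable, but the outcome for this example is the same.

Consider a class Box.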

Box b1 = new Box();
Box b2 = b1;
.
.
.
b1 = null;

b1 is set to null, but b2 still points to the original object. How is this? Why is the object not destroyed?

Answer

b1 = new Box()
  • A Box object is created in memory (say at address 123)
  • The Box object's reference count is set to 1
  • b1 = 123
b2 = b1
  • The Box object's reference count is incremented by 1, i.e. reference count = 2
  • b2 = 123
b1 = null
  • The Box object's reference count is decremented by 1
  • If the reference count is now 0, the Box object is destroyed (here it is 1, so nothing happens)
At this point there is still an outstanding reference to the Box object. In plain words, the reference count of the object is not zero. This is the reason why the object does not get destroyed when b1 alone is set to null.
b2 = null
  • The Box object's reference count is decremented by 1
  • The reference count now becomes 0 and the Box object is destroyed
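The first three steps above can be checked with a few lines of Java. This is only a sketch: the width field and the AliasDemo class are hypothetical, added so that there is something to read through b2 after b1 has been cleared.

```java
class AliasDemo {
    // Hypothetical Box with one field, just so the object has observable state.
    static class Box {
        int width = 10;
    }

    static int widthSeenThroughSecondReference() {
        Box b1 = new Box();
        Box b2 = b1;     // b2 now refers to the same object as b1
        b1 = null;       // clears only the variable b1; the object is untouched
        return b2.width; // the object is still reachable through b2
    }

    public static void main(String[] args) {
        System.out.println(widthSeenThroughSecondReference()); // prints 10
    }
}
```

Setting b1 to null changes the variable, not the object, which is why the read through b2 still works.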

Object Deletion

Once the reference count of the object goes to zero (in a real JVM: once the object is no longer reachable), the object is considered safe for deletion.

But the actual cleanup of the object might not happen at that point. This is because the garbage collector has to run, find all the objects that are ready for destruction, and reclaim their memory. This is rather a pain in the neck and can cause large memory pile-ups in some kinds of programs. You might then have to request a GC run yourself (System.gc() in Java, which is only a hint to the JVM).

When the GC gets around to the object, it may call its finalizer (finalize() in Java-ese), but there is no guarantee about when, or even whether, that happens. So if you have time-bound programs that must do something immediately when the last reference goes away, you can't depend on finalize().
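One common alternative when cleanup has to happen at a known point is to put it in an explicit close() method and use try-with-resources, instead of relying on finalize(). The sketch below assumes a hypothetical Connection resource that records its events in a list so the ordering is visible. It illustrates the idea only, and is not how any particular library does it.

```java
import java.util.ArrayList;
import java.util.List;

class CleanupDemo {
    // Hypothetical resource: logs when it is opened and closed.
    static class Connection implements AutoCloseable {
        private final List<String> log;

        Connection(List<String> log) {
            this.log = log;
            log.add("open");
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Connection c = new Connection(log)) {
            log.add("work");
        } // close() is called here, whether or not the GC has run
        log.add("after");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [open, work, close, after]
    }
}
```

The cleanup runs when the try block exits, so its timing does not depend on the garbage collector at all.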

Hope that makes everything clear.

Corollary: The system of outstanding references works transitively. As long as an object is referenced, every object it refers to is kept alive as well. This creates chains of references, which are sometimes hard to debug.

In fact, I have heard folks who moved to Java and .NET from C++ complain about this. In C++ you can delete an object by calling delete on the pointer. The object gets destroyed then and there.

But this never happens in GC environments. You have to make sure all the references have gone away. It is up to you to decide which is harder.

Personally, I think that it might be possible to mathematically prove that the GC reference counting scenario has more complexity.