How do you train trainee programmers ? Simple, you do as the Shaolin masters do – keep them working and working and working on the same thing till, light dawns. ie the light of wisdom.

My sister who had joined an IT services company gets tasked with writing many such tools every week. implement grep. implement ls. implement malloc so on and so forth.

I get tasked with validating the output and sometimes i try some myself, before advocating the approach & piling on some advices for good ol’ sis.

So how would you write a small & fast grep equivalent? Any thoughts? Think for a moment before you read further.

Obvioulsy I needed lots of clever optimizations. right? The question is what are those. But before I optimized the hell out of the little grep my sis wrote, like any self respecting hacker, i sat down to compare the performance against a suitable competitor before making my advocacies. I chose to compare the speed of the grep we wrote to the DOS find program, since at the time i was working on a VC++ project back at the cube. These are the numbers –

myfind – runs on a 600K working set using a VM size of 108 K – CPU hits is 50%
DOS find – climbs upto a 200 MB working set using a VM size of 1500K – CPU howers around 20%

myfind – finishes searching in 5-7 secs
DOS find – takes 1 minute !!!!

Excuse me, but did my kid sister just whack the hell out of a code thats been running for years? Definitely not i suppose. Perhaps Microsoft never wanted to optimize FIND for huge files. Perhaps it works faster on small files. Perhaps my program is broken. Perhaps DOS FIND does many more super smart things when it searches for a sub string. Maybe it does not want to fry the processor. Maybe my measurement was off.
But the point is, we measured before optimizing and wow does it make a difference. The performance profile of my program was exactly the way i liked it.

I dont think all of us might get this lucky everytime we construct code. But it definitely pays to see the light – “premature optimization is the root of all evil” – Knuth

GREP code