Process


An internal mail on the cost and success of our agile implementation efforts describes the cost of a single bug as x ……person days …

Finally, change!! 🙂


Just listened to a podcast from Scoble, interviewing the so-called king of tech shows, Leo Laporte. The talk gives the impression that content matters more than any glam and glitter and the wows about the new media, along with some other general information about content sharing.

While they went on about the history of the show, they mentioned how they had run into one another a long time back, and how Leo has touched base with many folks in the industry and even launched their careers.

This seems to be a theme that I have heard repeated every now and then by Scoble and many others. All these guys who have been in the Valley, or who manage to be in the tech scene in the States, seem to have run into the others “who matter” at some time or other.

It really does seem to matter where you are from, for you to get picked up in the flow and noticed, and perhaps also to feed you the enthusiasm to do more, see more groovy stuff done, and launch you into new circles that matter (VCs, anyone?).

Since all that is not here, I might as well keep myself fixated on the idea that the useful tool I will create some day will have to be at least 50% better than might have been necessary had I been from the Valley.

I hope you realize that too.

Cheers !!

Deduct one mark for each answer in the negative and add one for each answer in the positive.

  1. Is the forum URL widely published on the product page?
  2. Is the forum accessible without a password?
  3. Does the content of the forum get indexed by Google, such that a search for the product name + problem shows up past issues along the same lines?
  4. Do all customer interactions add something of value to the forum and act as THE tool for capturing information that never got captured in the requirements gathering / documentation, such that you might not even require a wiki page apart from the forum and your own documentation?
  5. Does it take only a single click for the user to start creating a new case and interacting with the forum?
  6. Is the relevant product page a single click away from the home URL?
  7. Are the histories of all cases handled so far easily viewable within a single click?
  8. Are 5, 6 and 7 easily discoverable on the home page? Rather, is the UI intuitive enough?
  9. Is it possible to conduct customer surveys from your support site?
  10. Does your support site have RSS feeds for each topic?
  11. Do extracts of the content revealed in your forum regularly get merged into your documentation, to aid future customers?

Now what was the score of your support tool?
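If you would rather let a script do the tallying, the scoring rule above can be sketched roughly like this. The function name and the example answers are made up for illustration; only the plus one / minus one rule comes from the questionnaire.

```python
def support_score(answers):
    """Return the questionnaire score: +1 for each 'yes', -1 for each 'no'."""
    return sum(1 if yes else -1 for yes in answers)


# Hypothetical answers to the eleven questions, in order (True means "yes").
example_answers = [True, True, True, False, True, True,
                   True, True, False, True, True]

print(support_score(example_answers))  # 9 yes, 2 no -> 7
```

Note that a single "no" costs two points relative to a "yes", so the rule punishes missing basics fairly hard.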

Here is a working, healthy forum that scores 10 out of 10 in my questionnaire.

Here is another that is definitely below 10, but how much below I can't say, because I can't figure out how to use this site yet. Heck, I can't even figure out what the external URL for this site is, though I have been developing on one of our company's products for close to two years now. But hopefully someone is reading this so things can change for the better.

When developing in a highly mature, business-like environment vs a pure software-play, startup-like environment, the disadvantages of a 3 month coding period (see here to know how this can be possible) for the entire year start to become extremely obvious.

Loss of familiarity with code

The 3 month coding period's first casualty is familiarity with the code base. Even code you wrote yourself a year back looks alien. As we all know, unlike in other engineering disciplines, the code is the real design when it comes to programming.

Code kept in head is a set of designs and interactions between data structures.


However, even the best code has small inefficiencies. There are pieces that can be vastly improved: the ones with bungled interfaces, sloppy designs, edge cases that one knows can break, or modules that are downright ugly. But only the primary developer is cognizant of these shortcomings. When the code is forgotten or becomes unfamiliar, the possible improvements to these sloppy areas and modules suffer. Once this code has not been looked at for a year, when someone else (most likely) or even the primary developer comes back to it, it is reduced to a jumble of interactions, which one must make sense of.

Many of the fine details are lost and you look at the code only from the vantage point of high-level interactions, i.e. can do xyz, has messaging etc. But the knowledge of the caveats in your code, the hidden loopholes, and the hacks that were applied the first time over is lost.

The opportunity to improve and change the areas that were lacking is lost here. A new image / idea of the code has to be built up from which to start hacking on the existing code. Any fixes or modifications incur the risk of bungling the original design, precisely because the familiarity with the code base is lost.

This is why programmers sometimes prefer to re-write rather than maintain old code. The fight against never-ending complexity develops hiccups when fought in such short, bursty cycles of sporadic coding.

No longer in touch

One of the worst kept secrets of coding is that, being a programmer is much like being a swordsman. If you do not practice daily, you become dangerously slow and outmoded. You might know all the moves, but you would not have perfect recall, the reflexive reaction that can be a life saver.


Similarly, if you no longer code every day, you might know the basics of the data structures but you cannot be as fast and as correct as someone who has been writing code daily. Apart from the obvious stops to look up data structures, API calls and such, you start forgetting the details and side effects of the APIs, which means that for all practical purposes your code is no longer clean and carefully written. At this point the quality of the code suffers.

Impact

The first casualty is the engineering culture. For lack of anything else, engineers start to game the system reflexively, without being consciously aware of it, because of the appraisal implications. Most of this is usually observed and learned from more successful engineers who have taken the same route.

The *implicit thought process* and actions go along these lines

  1. You need to achieve something every fiscal
  2. I create great code – but why am I not successful?

No real coding targets

In the presence of a cycle that disallows coding / feature creation for the better part of the year, programmers have to resort to make-believe to meet their hopes of how they perform in the appraisals.

Ideally the appraisals for software engineers should have dealt with things like –

  1. Number and type of features they were able to produce
  2. The quality of the designs
  3. Code quality
  4. Number of bugs that were observed per line of code
  5. Other such measurements related to code

But the 2-4 month coding cycle ensures that a single person can deliver only one feature and no more. The quality of the feature itself is of not much concern, since it is not measured either. So how do you get ahead of your colleagues in appraisals?

Weasel Dictionary Starts – Initiative, Impact, Communicate, Perspective etc


Folks can be extremely innovative when it comes to gaming the system. This is a gradual process and often learned from observations of their peers who magically seem to achieve high grades despite their low coding prowess / output.

For a pure coder, the magical lessons learned for this business transformation mean taking on more activities that are anything but coding. These would probably go along the lines of

  1. Communication
  2. Collaboration
  3. Initiatives
  4. Vision statements
  5. Visibility
  6. Impact
  7. etc

Soon people start being more vocal / outspoken and argumentative in a clever sort of way. They become interested in giving out more presentations and writing more forceful mails and everything and anything that can make a positive impact on their appraisal sheet apart from pure coding.

This effort, which resulted from not having any solid work to do, soon degenerates into an effort to avoid all coding and to transition into roles that give them more visibility and aid their impending switch to a management career track. The side effects are all arguably good, but

ARE THESE AT THE EXPENSE OF CODING

Where in all this does the code creation take place? Instead of running the show with a tight team of 3-4 skilled developers, double that number of developers (all equally vocal, solution oriented and customer driven) are now required. Of course there is not enough work / money to pass around, which means all these folks are going to be even more vocal, solution oriented and customer driven than before.

It's a sheer uphill task getting such really experienced management-type coders to get some work done after their drastically larger communication requirements and visibility needs are met. All this translates into visibly higher manpower and resource requirements to get any work done.

Management complications – percentage time and context switches

The higher resource requirements mean that there is often work left undone in other areas. Management tries to juggle this by allotting percentages of a developer's time across different projects.

  1. 20% on answering queries for really old project
  2. 60% on developing new feature for current project
  3. another 20% on handling bug fixes for the previous release.

All this generates tons of emails, which force you to switch between many threads of thought; that is nothing but more and more context switches for the developer concerned. This drives productivity down further, as it is well known that task switching hurts productivity.

How would you fight this scenario?

Now, that is the correct question, the bitter pill no one really likes to know about. Let's try to take this up in the next and final installment of this series.

Meanwhile please leave any comments if you know of any good ways to arrest this rot.

In the first installment of this four-part series I essentially described how a good pure-play software company operates and how the process, or rather the lack of process, enabled much code to be created efficiently.

Compare this to any / most corporate environments. In this post I want to detail how software development happens in such highly charged but disabled environments. This would also be a scary but educational read for any uninitiated newbies who want to shift into such an environment but do not know what to expect.


Requirements

After the *soft* in-principle decision to create a product is made, a master list of un-meet-able requirements is made. This is called the PRD, aka the Project Requirements Document.

Evaluate Requirements (1 month)

The managers task their teams to evaluate the PRDs and come up with an estimated time and the doable items. The developers then go into an investigative mode, which again is loosely time bound. The developers come up with some arbitrary numbers after a rather relaxed period.

Estimation (1 hour – 1 day)

These numbers might get shortened or not, depending on how much effort the manager arbitrarily decides the task is going to take. These numbers are informally communicated and then a rough priority of the doable items is decided. Once this rough estimation is done, someone gets tasked with creating the final list of items that will be taken up.

This phase takes a minimum of one month to an average of 2 months to complete.

SFS & SFS Review ( > 1 month)

The new task list is communicated in a new document called the SFS (system functional spec). During all this time meetings keep happening twice a week between developers, and between development, management, program management and marketing, to evaluate the concerns and overall progress.

The SFS, once completed, is sent out for review, where developers, testers and anyone else remotely connected with the program take a dig at the rationale behind the features, the implications and so on. In short, everything that can be criticized will be criticized, and the necessary concerns are addressed in the document. This phase generally takes over a month to complete.

Executive Decision (2-4 weeks)

Once the SFS has been reviewed over and over again, the program starts moving towards what is known as the EC (Execution Commit), which is a final go that the management has to get from the upper echelons who are ready to fund the program, for the features they are going to commit (aka promise) for the release.

As EC dates approach (and keep slipping), everyone goes into overdrive to actually start more investigations into the features they are supposed to investigate, and more issues crop up all over the place, which in all likelihood delays the EC plans. After a number of delays, the SFS is finally frozen and the EC happens. Most of the SFS reviews and answers actually happen towards this period.

Committed, i.e. promised, features have to be released without fail, and hence everyone including managers makes sure to take on features only after adding liberal buffer zones to their individual plans, to make failure a low-probability affair.

Design ( < 1 month)


Once EC is made, the design starts and this is when the actual troubles that a feature could run into are encountered (assuming the design is properly done). This might cause further delays as the issues are communicated back and forth to the management. The manager ensures that the teams actually do the designs by ensuring design review meetings happen. During this time the testing teams also start writing test cases, which the developers responsible for the modules are supposed to review and ok, apart from being reviewed by their own peers.

This phase can take from two to four weeks overall. When controversies erupt, the design could be stretched by another two weeks.

Coding (2-3 months)

Once this phase is over, a hectic phase of coding starts, which is never more than 2-3 months on an average.

Developer Testing ( > 1 month)

Once this short affair is over, development testing / integration testing etc starts, which one or two of the developers do to ensure things are fine before handing over the first cut of the product to the testing team.

During this phase not much happens with the other developers, and they sporadically fix bugs that appear off and on in their individual modules. The integrated pieces are handed over to the testing team, who now start on each of their test cases and again report issues, which basically reflect the health of the system. Many of the issues come up during this phase, and there are some or more delays depending on how sloppy the coding was. One more release is then made within a short span of time, addressing the issues discovered and finishing whatever features were unfinished in the first release.

This developer done testing and fixing phase takes well over a month totally.

Formal Testing (2 months)


Next the formal testing, backed by formal test cases, starts in full earnest and a HOST of issues are discovered. The issues that consume the most time are those related to system performance, and this drags on for quite a while until the test team is finally pressured into releasing the product and sweeping some of the critical issues under the rug to at least enable a delayed shipment.

The number of emails that get sent during this phase, and the meetings between the test and development teams haggling over the actual nature of the issues observed, make for a tumultuous affair. The entire testing phase takes up at least 1/3rd of the entire development time, so you can imagine how long this affair drags on.

In the meanwhile folks who have done their work fast and in a high quality manner can actually relax and drag their feet while the rest of the modules come up to speed. Most of the documentation effort starts towards the end of this phase and gets done in a few days time.

Release (1 month)

Once the entire affair is finished, the release is finally ok-d and then again starts a phase where the CD, licenses and other nitty gritty project management tasks are taken up and completed. Chalk up another month for this to finish.

Next Phase = Next Year


Once this is completed, a new release and a new PRD are announced and circulated, and the whole routine starts all over again. In my experience only 2 such phases ever get completed in one year, with a little bit of the next release overlapping into the same year.

Total amount of time spent coding = 2-3 months.
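To put that number in perspective, here is a rough tally of the phase durations given in the headings above. This is only a back-of-the-envelope sketch: the ">" and "<" bounds are taken at the stated figure, 2-4 weeks is treated as 0.5-1 month, estimation is rounded down to zero, and the names in the code are made up for illustration.

```python
# Phase durations in months as (low, high), read off the headings above.
# Assumptions: ">" and "<" bounds are taken at the stated figure, 2-4 weeks
# is treated as 0.5-1 month, and estimation (an hour to a day) counts as zero.
PHASES = {
    "evaluate requirements": (1.0, 1.0),
    "estimation": (0.0, 0.0),
    "SFS and SFS review": (1.0, 1.0),
    "executive decision": (0.5, 1.0),
    "design": (0.5, 1.0),
    "coding": (2.0, 3.0),
    "developer testing": (1.0, 1.0),
    "formal testing": (2.0, 2.0),
    "release": (1.0, 1.0),
}


def coding_share(phases):
    """Return (low_total, high_total, low_share, high_share) for coding."""
    low_total = sum(low for low, _ in phases.values())
    high_total = sum(high for _, high in phases.values())
    coding_low, coding_high = phases["coding"]
    return low_total, high_total, coding_low / low_total, coding_high / high_total


low_total, high_total, low_share, high_share = coding_share(PHASES)
print(f"cycle length: {low_total:.0f} to {high_total:.0f} months")
print(f"coding share: {low_share:.0%} to {high_share:.0%}")
```

Under these assumptions the cycle comes to 9 to 11 months, with coding taking roughly 22% to 27% of it. Since several phases are stated as "more than" a month, the real share is likely to be lower.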

No matter how you look at this, there is no way a company that works along these lines can produce new code or new features that matter without a HUGE amount of time. Some companies beat this by regularly acquiring small companies and customizing their well-written products in ways they think suitable. Sometimes the results are terrible, but at times they are good. Hardware folks can also get away with this, since the over-priced items can always be packaged along with hardware solutions and sold to enterprise-y folks.

But never would a pure-play software firm be able to survive along these lines. For that matter, never would an enterprise be able to create quality products the way these startups / pure-play software firms do.

Key observations

  1. Coding happens for a max of four months a year
  2. Design happens for a maximum of 2-4 weeks a year and estimation takes only a day or an hour at the max.
  3. Documents are generated at each phase mostly as a way of keeping tabs
  4. Since these documents are control points, they need to be reviewed.
  5. Since review is work, and by reviewing you are agreeing to whatever is being said, no one individual is allowed to do this. Instead the “team” does this, which forms a type of shared responsibility.
  6. Creating the SFS and design documents, reviewing the said documents, coding and some amount of formal testing is done by the engineers themselves.
  7. Since reviews are public and everyone reviews everyone else's designs, there is a certain softness in the design comments. Maybe this applies only to India, but it's common to let docs pass that are even mediocre, up to a point, due to a softness in cultural values that prevents one from being rude to one's colleagues. No one but managers can therefore really enforce design quality beyond a certain basic level.
  8. No formal time is allocated for design reviews

All of this has many implications for the way an organization engaged in product development works. But for this series, we are interested only in what the low levels of coding time imply. That brings us to the final post in this series.


When I was asked to create a custom server monitoring solution, in a start-up environment run by a very wise team, the only requirements list that I received was a call from my overseas manager.

“Have you seen Task Manager on XP?”
“Yes ?? !!”
“Well, we need the exact same data, except, your process must broadcast it to our messaging network”
“Get cracking on it.”

It was my sole responsibility to figure out

  1. How to extract this data
  2. How to efficiently store and compare successive values for % calculations
  3. How to make the program faster and also consume little memory
  4. Eventually, how to port the code to Solaris & Linux (I received 2 more calls)
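To give a feel for item 2, here is a minimal sketch of how a percentage can be derived from two successive samples of cumulative counters. This is not the agent's actual code; the `CpuSample` type, its field names and the sample values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    busy_ticks: int   # cumulative ticks spent doing work
    total_ticks: int  # cumulative ticks elapsed (busy + idle)


def cpu_percent(previous, current):
    """Percentage of time spent busy between two cumulative samples."""
    elapsed = current.total_ticks - previous.total_ticks
    if elapsed <= 0:
        return 0.0  # no time passed (or a counter reset): nothing to report
    busy = current.busy_ticks - previous.busy_ticks
    return 100.0 * busy / elapsed


# Hypothetical readings taken one polling interval apart.
previous = CpuSample(busy_ticks=1_000, total_ticks=10_000)
current = CpuSample(busy_ticks=1_250, total_ticks=11_000)
print(cpu_percent(previous, current))  # 250 busy ticks out of 1000 -> 25.0
```

Only the previous sample has to be kept around per counter, which is one way of keeping both memory use and per-poll work small.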

The Process

We had frequent commits into the code server, and bi-weekly status checks to ensure things progressed as required. The output of my code was used by another programmer for his work, and occasionally he would alert me to a bug, which I would then fix.

Working Managers

My managers and HIS supervisors were the ones who actually verified this code and made sure the performance and data were correct. The initial effort that these guys put in ensured that I understood the goals of the program from the beginning, and what was important vs what was not.

They frequently re-aligned what I developed with what they thought the customer might want. There were no super designs with a super-human architect / program manager deciding beforehand what the customer would like to see. Our program grew in bits and pieces, but relevant bits and pieces.

Deciding the next small bit, and ensuring that coding standards were met and that the output looked neat and clean, was what the management did, along with other stuff that I never knew or cared about. Oh, and almost all discussions were over the phone, and emails summarized whatever we discussed.

Formal Testing


Once the program was big enough (3 processes and 3 operating systems), we had a dedicated person to verify that everything was correct and also integrated well. Of course the managers still did their bit at least every 2 months. But basically this single person's ass was fried if something went badly wrong, and he did a damn good job of making sure the stuff I said I had done actually worked.

What did I do – Hands Free

During all this time, all that I did was code, learn, investigate and code again. Of course the code had to be of the highest possible standard, and I had enough time to make this happen.

Lots of Good Work

I ended up creating 3 different agents for 3 different OSes, collecting ALL sorts of data you could imagine about the internals of processes / memory architecture / networking throughput / files and ports opened / disk sub-system / versioning information and what not. The agents worked across almost all in-production server versions and editions of the OSes concerned.

Lots of Education

Additionally, all this information was standardized; that is to say, the free memory reported by Linux might not logically mean the same thing as on Solaris. Our agents normalized this information, and this required a lot of peering into kernel code where available, or reading heaps of internal documentation.

Designing all this stuff and constantly striving to improve quality taught me a lot. The fact that I had some top-notch designers / coders to review and suggest changes made things even better.

Code smells / data structures / design decisions / optimization concerns / OS internals are a few of the things I eventually learned.

Quality

On Solaris my agent could beat top in its CPU consumption, and on the whole I might have received about 2 bugs max for the entire code base after it had shipped to the customer. (We had a single huge customer for this program at the time.)

Initiatives

In between all this, I had time to work on another side project (to get real-time feeds into Excel) and on formal investigations like creating a distributed real-time Excel prototype (you change a cell on your laptop and anyone viewing the same sheet would be immediately updated), and other work my managers deemed worthwhile to investigate.

Personal Initiatives

I had time in the interim to investigate something I thought was worth doing, on scaling our middle-level layers. It never took off, but the investigations taught me a lot about scalability and scales of efficiency in database-related work, which helped a lot with my current project and current job.

I also ended up creating a prototype trouble-shooter for our stock trading network that listened to the different TROUBLE broadcasts our individual processes made and showed them in real time to the administrator in a tree view / LIFO ordering, hashed and searchable overall. That again never took off, but I have fond memories of it and I'm sure the folks who were in charge still remember the tool.

Results

It's been over 3 years since I left the place and I'm yet to receive / know of any important bugs affecting the functionality of the program, apart from of course keeping the agents up to date with the release of new OS versions. My wife joined the same place I used to work at, and she sometimes tells me that the people I worked with, the engineers over there, thought the world of me. That is good karma any way you look at it.

The other fact that got me into writing this blog from a corporate environment was the amount of code that we wrote and how hassle free the whole process was, which brings me to the next installment of this series.