Some time back I encountered a scenario in which a chain of stored procedures was being called recursively, and this was eating into our performance (90% CPU time). All of a sudden, an obscure micro-issue about whether to use table variables or temp tables became important.
After ferreting out some good sources of information comparing the two, we managed to reduce the CPU consumption:
  • Making all our stored procedures use temp tables rather than table variables made a difference for us, because our procedures were being invoked frequently, and stored procedures are recompiled on every invocation when table variables are used.
  • The extra DB writes that temp tables incur had minimal impact on us, because very little data was actually involved.
Ultimately, the maximum improvement came from changes to the algorithm, ensuring that the procedures are not needlessly invoked, or at least that parts of them do not run (the recursive algorithm was flattened out).

Learning: If micro-optimizations start to look important, it is time to look for improvements in the algorithms that are used.
NOTES – Information gleaned from the exercise
  1. Most often, table variables will be all you require
  2. Table variables cause FEWER recompilations than temporary tables (source: MSDN)
  3. Table variables seem to be scoped at the stored-procedure level. So if you create a table variable inside a while loop, it will retain data from previous iterations (painfully found via a bug)
  4. Table variables cannot have indexes created on them, except for the primary key specified when the table is created
  5. The CREATE INDEX command can cause query recompilations
  6. Table variables are typically held in memory and involve little disk IO – and no rollbacks are possible on them. On the upside, no locks are acquired when table variables are used
  7. Table variables cannot have statistics created on them, so for huge data sets they COULD be inefficient

The piece of code we are about to discuss was used in production software, and I was tasked with reviewing the patch. Try to see if you can spot *any* issues with the code. Later we will take a look at the exact observations, which are hugely interesting and instructive regarding system behavior.

The Query

This query retains the top n records of a particular type and deletes all other rows of the same type from a table in a Sybase database.

for (int i = 0; i < ListOfTypes; i++)
    ExecuteQuery("delete from table where PrimaryKey NOT IN (select top n PrimaryKey from table where Id = i) AND Id = i");
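For illustration, here is a runnable sketch of the same pattern using Python's built-in sqlite3 module. This is NOT the production Sybase code: the table and column roles are invented, SQLite's `LIMIT` stands in for `top n`, and an `ORDER BY` is added as an assumption, since "top n" needs a defined order.

```python
import sqlite3

# A sketch of the pattern under discussion, not the production code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (PrimaryKey INTEGER PRIMARY KEY, Id INTEGER)")
conn.executemany("INSERT INTO events (Id) VALUES (?)",
                 [(t,) for t in range(3) for _ in range(10)])  # 3 types x 10 rows

n = 5  # keep the top n rows of each type
for type_id in range(3):  # one dynamic DELETE per type, as in the loop above
    conn.execute(
        "DELETE FROM events WHERE PrimaryKey NOT IN "
        "(SELECT PrimaryKey FROM events WHERE Id = ? ORDER BY PrimaryKey LIMIT ?) "
        "AND Id = ?",
        (type_id, n, type_id))
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(remaining)  # 15 (5 rows kept per type)
```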

Observation – No Data to be Deleted in the database

  1. CPU consumption during query execution is negligible when there are no matching rows to be deleted. The fact that this code issues multiple dynamic queries, which have to be parsed, does not seem to be much of an issue for Sybase.
  2. 2000 queries took around 3 seconds overall at around 12% of CPU cost on a Xeon box.

Observation – delete finds rows for deletion

  1. With data to be deleted, 30 iterations of the loop took around 5 minutes – extrapolated to 2000 iterations, that is over 5 hours.
  2. The CPU consumption figure remained the same.
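The extrapolation above is simple arithmetic on the measured figures:

```python
# Extrapolating the measured delete time (5 minutes for 30 loop
# iterations, from the observation above) to the full 2000 iterations.
minutes_per_iteration = 5 / 30
total_minutes = minutes_per_iteration * 2000
total_hours = total_minutes / 60
print(round(total_minutes), round(total_hours, 1))  # 333 5.6
```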

Why are Database Deletes So Slow?

Relational databases track everything using logs – called transaction logs because they record all the transactions, aka changes, that happen to the database. The ACID nature of transactions is guaranteed using these logs. So when you make a change to the database, a log record is written to disk.

The actual change is only made in memory and not written to disk immediately, for efficiency's sake. But logs are always written promptly and without fail, so that if the machine crashes or a rollback is required, the log can be used to reconstruct the state of the database with minimal loss of data. This means that every individual delete makes the disk spin, and spinning disks are horridly slower than anything else you can think of.

How slow can the disks be?

Numbers from a talk by Jeff Dean of Google (who worked on both MapReduce and BigTable), as reported here

Comparative slowness aka latency in terms of CPU cycles

Floating point calculation             = 0.25 to 3 cycles
L1 cache access                        = 3 cycles
L2 cache access                        = 14 cycles
DRAM access                            = 200 cycles
Flash access                           = 3 million cycles
Disk access                            = 15 million cycles

Comparative Slowness aka Latency in absolute time

L1 cache reference                            0.5 ns
Branch mispredict                               5 ns
L2 cache reference                              7 ns
Mutex lock/unlock                             100 ns
Main memory reference                         100 ns
Compress 1K bytes with XXX                 10,000 ns
Send 2K bytes over 1 Gbps network          20,000 ns
Read 1 MB sequentially from memory        250,000 ns
Round trip within same datacenter         500,000 ns
Disk seek                              10,000,000 ns
Read 1 MB sequentially from network    10,000,000 ns
Read 1 MB sequentially from disk       30,000,000 ns
Send packet CA->Netherlands->CA       150,000,000 ns
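To put the table in perspective, a quick calculation from its numbers:

```python
# Latency figures taken from the table above, in nanoseconds.
main_memory_reference_ns = 100
disk_seek_ns = 10_000_000

# One disk seek costs as much as ~100,000 main-memory references.
ratio = disk_seek_ns // main_memory_reference_ns
print(ratio)  # 100000
```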

Can you speed up the disks?

SCSI disk array on a server


You bet!! We tried to speed up the whole experiment by moving the tests from an IDE-based system (with a 7500 rpm drive) to a system with a single 15k RPM SCSI disk + SCSI controller with no onboard memory (the picture has six disks – no, we used just one of them).

The results from moving to SCSI system –

  1. The delete time improved by a factor of 4–5, and the worst-case average time dropped to around 133 minutes from 5 hours.
  2. However, with the SCSI system in use, CPU consumption during query execution also jumped by the same factor of 4–5, because the engine blocks on the disks less often and therefore gets through more query execution (4–5 times as much) in the same time.

As you can see, it would be really easy to calculate how many disks we would need to speed up the deletes to a manageable time. I might have to go with a couple of SCSI disks in a RAID configuration to achieve the speeds that I want. At the same time, the database engine's CPU consumption would have to be separately addressed, so that it does not have to do much work finding the rows to be deleted.

Is there any way to improve this?

  1. Use Transactions –

    The SQL server engine can batch the log writes for many changes into a single transaction. I am not sure how this works vis-à-vis Java and the Sybase engine with which this query was tested. But if you ensure all your changes are made as part of a single transaction, the disks will be far less affected than if each change commits on its own.

  2. Do not use a database for real-time stuff
    An even better solution would be to not rely on the database at all, and instead do all this in a specialized in-memory component – a thread-safe hash or something equivalent – using the database only as a slow backup store for failure recovery. This would give incomparable throughput compared to any RDBMS/disk-based solution.
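Suggestion 1 can be sketched with Python's built-in sqlite3 module (the schema is made up for illustration). Issuing all the DELETEs inside one explicit transaction means the log is flushed once at COMMIT rather than once per statement:

```python
import sqlite3

# A sketch, assuming a made-up schema: many DELETEs in ONE transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (PrimaryKey INTEGER PRIMARY KEY, Id INTEGER)")
conn.executemany("INSERT INTO events (Id) VALUES (?)",
                 [(i % 10,) for i in range(1000)])
conn.commit()

# The connection used as a context manager wraps the loop in a single
# transaction and commits it on exit.
with conn:
    for type_id in range(10):
        conn.execute("DELETE FROM events WHERE Id = ?", (type_id,))

remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(remaining)  # 0 – all rows deleted, one log flush at commit
```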

My suggestion to improve CPU (untested)

  1. Move the entire loop into a stored procedure.
  2. Run a select to get the top 20 and extract the ID of the bottom one.
  3. Run a delete to remove all rows with ID greater than the extracted ID. What are the advantages? You remove possible query compilation for the execution of 2000 queries and, at the same time, avoid the possibly highly inefficient NOT IN operator. This needs to be verified, though.
  4. Remove all foreign key references to the current table – but of course that would be going against the grain of the RDBMS concept itself. However, for desperate situations, this might work ….
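Steps 2 and 3 can be sketched with sqlite3 (the schema is made up; `LIMIT 1 OFFSET n-1` stands in for "select top n and take the bottom one", and ordering by primary key is my own assumption for illustration):

```python
import sqlite3

# A sketch of steps 2-3: find the boundary key of the top-n rows,
# then use a simple range DELETE instead of NOT IN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (PrimaryKey INTEGER PRIMARY KEY, Id INTEGER)")
conn.executemany("INSERT INTO events (Id) VALUES (?)",
                 [(t,) for t in range(3) for _ in range(10)])

n = 5  # keep the top n rows of each type
for type_id in range(3):
    # Step 2: the key of the n-th row for this type (None if fewer than n rows).
    row = conn.execute(
        "SELECT PrimaryKey FROM events WHERE Id = ? "
        "ORDER BY PrimaryKey LIMIT 1 OFFSET ?",
        (type_id, n - 1)).fetchone()
    if row is None:
        continue  # fewer than n rows of this type: nothing to delete
    # Step 3: a range delete - no NOT IN, trivially indexable.
    conn.execute("DELETE FROM events WHERE Id = ? AND PrimaryKey > ?",
                 (type_id, row[0]))
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(remaining)  # 15
```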

What was implemented in the end?

Well, the production software had this delete sequence running every 30 minutes, and the number of events the system claims to support is only up to 25 per hour. At that rate there would be no pile-up and no big deletes happening. So we don't have a problem!!

So how do you make a table that does not enforce its foreign key relationships? That's quite easily done.

  1. For a foreign key, explicitly set “Enforce Foreign Key Relationship” = No (Default is yes)
  2. Set Cascade Deletes to be No Action instead of Delete

There, now you have a table that will not enforce foreign key relationships !!!

PS: By default, when a foreign key relationship is created, the relationship's fields are set to enforce the FK. You have to explicitly set them to No.

Disclaimer: I can think of situations where this behavior is required, as in report tables or some funky de-normalized high-performance corner cases. But I would ask you to rethink those cases if ever you come up against them.

The MSDE workload governor kicks in whenever there are more than 8 concurrent queries running against the SQL server engine. It then proceeds to introduce artificial delays in your query results, such that the performance of the entire system is affected.

We faced this issue with the Cisco NetManager tool we were building, and in the end, after a long period of trial and error, we managed to cure ourselves of this persistent little "feature" of MSDE.

So how does one go about beating / disabling the Workload Governor ?

  1. Optimize all DB operations.
  2. Reduce the number of round trips to the database by fetching the data you know will be required later in one shot – in a single stored procedure, or by performing multiple operations together.
  3. Use a semaphore to guard the number of concurrent accesses to the database – make sure all DB access passes through this guard. The ADO methods one would have to guard are
    1. Execute
    2. Connection Open
    3. Connection Close
    4. Recordset – Open / Update / NextRecordset / MoveFirst
    5. BeginTransaction / CommitTransaction / Rollback
  4. Perform application-level caching of DB connections if the number of connection open/close operations is too high.
  5. Reduce the load on the SQL server – how? By caching the most frequently required DB data in memory. Not all applications can do this, as data integrity becomes an issue, but if properly done, this is the single most important step, the one that gives the biggest bang for your buck.

    However, please do note that this is also the single most dangerous thing that can be done in a DB application. Managing a cache and controlling concurrent updates is not an easy task. We managed to achieve it by using only volatile reads and interlocked updates; since no explicit locks are involved, this gives us a high level of concurrency.
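Step 3 above can be sketched in Python with a bounded semaphore. The helper name and the limit of 7 are my own choices; the idea is simply that every DB call acquires the semaphore first, keeping concurrency below the governor's threshold of 8:

```python
import sqlite3
import threading

# Gate every DB call behind a bounded semaphore so at most MAX_CONCURRENT
# queries are in flight at once (below MSDE's threshold of 8).
MAX_CONCURRENT = 7
db_guard = threading.BoundedSemaphore(MAX_CONCURRENT)

def guarded_execute(conn, sql, params=()):
    with db_guard:  # blocks while MAX_CONCURRENT calls are already running
        return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (42)")

rows = guarded_execute(conn, "SELECT x FROM t")
print(rows)  # [(42,)]
```

In a real application, the guard would wrap every Execute, Open, Close, and transaction call listed above, so no code path can reach the engine without passing through it.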

That's it – these 5 small steps are all it takes to transform your slow-moving implementation into a killer app, at least with respect to the DB. Verifying these 5 steps took 6 months of skunkworks-style investigation, but now that I have a working application with these features built in, I can confidently vouch for their efficiency.

SQL server manages memory quite unlike other applications. This has been the cause of many heartburns, especially when it comes to explaining to customers and test teams why SQL server consumes so much memory, or why the observed "memory leak" is not really a leak.

One key factor that contributes to this perceived memory leak is SQL server's aggressive caching policy. It primarily caches data and SQL execution plans, and – dig this – "never gives up the memory unless and until it feels the system is starved of memory". However, this behavior is not all that great for less-than-perfect applications. There seems to be a flaw (at least with SQL server 2000) in the way it releases memory, in that the operating system seems to become aware of memory starvation well before SQL server realizes it, and expands its paging file to increase the amount of "total memory" available in the system.

So it is in the application's best interest not to eat up unnecessary cache. This post covers the plan cache and how its growth can be curtailed without sacrificing any performance. In fact, you gain performance through better cache utilization, in terms of both CPU and memory. Read ahead to find out how.

Facts of life

  • SQL server creates execution plans for all the queries and stored procedures it executes. Query plans are detailed steps describing how to get the results for the query specified: which tables to query, which indexes to use, how to join or merge the results, and so on. Quite understandably, all this magic does not come free.
  • Creating query plans consumes CPU, as the server needs to understand what is asked of it, and quite probably run it through tons of algorithms and data structures to create a reusable, generic plan that will fetch the results.
  • This means that if SQL server can reuse the plans it creates, quite a bit of CPU can be saved whenever it runs the same queries again and again. Therefore, SQL server caches the plans it creates.


What does this imply for applications and administrator?

  1. Cache means store. So if you are going to store the plans (in memory, of course), your memory requirements are going to go up. If that saves CPU consumption, that is good.
  2. However, you have a problem if the store grows uncontrollably, or if the CPU savings do not measure up to the memory costs.
  3. You can measure the amount of memory used by the procedure cache from Perfmon: SQLInstance – Buffer Manager – Procedure Cache Pages (8 KB pages). Do check out the other counters in the Buffer Manager object to get a relative measure of how much memory the procedure cache occupies compared with other pages.
  4. Most installations face this problem: the memory SQL server consumes for query plans goes up, and unless you know exactly what you are looking for, it is hard to keep under control, much less account for what eats up this memory.


How does query caching work?

  • Exec “Proc_Get_Orders”
    1. A plan for the stored procedure “Proc_Get_Orders” is created once and used multiple times.
  • sql_string = “select * from orders where customerid=” + customerid

    1. A plan gets created for the string ‘select * from orders where customerid=1’
    2. A plan gets created for the string ‘select * from orders where customerid=2’
    3. … ad infinitum

    SQL server creates n plans for n ad-hoc queries. THIS is what bites you. THIS is what contributes to the bulging memory pages of the query plan cache. To control this growth, either switch to stored procedures or switch to prepared statements with parameter placeholders, which have the added benefit of preventing SQL injection attacks.

  • Prepared statements

    sql_string = ‘select * from orders where customerid=?’
    AdoCommandObject.CreateParameter(customerid)

    A plan gets created for the string ‘select * from orders where customerid=?’. Each time this query is executed using parameters, the cached plan gets reused, with the customer id as the varying input.
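The difference is easy to model. This is a toy simulation, not SQL server internals – just an illustration that the plan cache is keyed by the exact statement text:

```python
# Toy model: the plan cache is keyed by the EXACT statement text, which is
# why ad-hoc strings multiply while a parameterized statement yields a
# single reusable entry.
plan_cache = {}

def get_plan(sql_text):
    if sql_text not in plan_cache:
        plan_cache[sql_text] = object()  # stand-in for a compiled plan
    return plan_cache[sql_text]

# Ad-hoc: every customer id produces a new statement text -> a new plan.
for customerid in range(100):
    get_plan("select * from orders where customerid=" + str(customerid))
adhoc_plans = len(plan_cache)

plan_cache.clear()

# Parameterized: one statement text, one plan, reused 100 times.
for customerid in range(100):
    get_plan("select * from orders where customerid=?")
param_plans = len(plan_cache)

print(adhoc_plans, param_plans)  # 100 1
```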

How does one see what is inside the plan cache?

  • Or, in other words, how can one verify that these recommendations work?
  • Query syscacheobjects to display the contents of the plan cache.
  • The great thing about this is that you can actually see the exact text of the plan that is in the cache – talk about accountability!!
  • (screenshot: query_cache_plan)
  • The key to understanding the output is usecounts. Good applications will have a high ratio of reused plans.
  • Note: it is easy to see from this output how much memory is wasted by a particular ad-hoc statement. You could calculate the savings from converting a particular module to prepared statements by adding up the memory usage counts.
  • Use the MSDN explanation of syscacheobjects to further understand what the different fields signify, and to find out other issues that may be plaguing your application.

Once you are done optimizing away all your SQL statements, the memory growth – or leak, if you prefer – from the procedure cache will be well under control for your application. Happy optimization!!! 🙂

For further detailed reference –

Technet Article on MSDN

Minding Memory, from Microsoft’s customer Advisory Team

If you are facing issues with SQL server memory, here are a few tips from the trenches that may help you keep up with what could be happening.

1. SQL server WILL consume all free memory

SQL server is a server class of application, and it assumes it is one of the most important applications running on the machine. To this end, it simply grows to occupy all available memory in the system. By not releasing the memory used by its various caches, SQL server tries to ensure a higher level of performance, on the expectation that those cached objects and extra memory will be required again and again. However, this assumption generally holds good only for the most well-written applications, in which case it is best to let SQL server grow and to increase the available physical memory. For all other cases, read on.

2. SQL server claims to free memory to applications as and when they require it

In fact, it does do this. All you have to do is write a program that claims a HUGE amount of memory in one shot. SQL server will notice the relative lack of available memory in the system and cut its flab to come back to normal.

HOWEVER, in normal scenarios, what I have found is that the OS is much nimbler than SQL server in managing its memory. So in a typical memory-creep situation, where all the applications (including SQL server) are slowly eating up more and more memory, the OS – i.e. Windows – will simply increase its page file size to accommodate the extra memory requirements. Now, everyone knows what paging implies: a slower system. This situation is the bugbear of SQL server memory management.

3. What can you do?

  • Know about the SQL server procedure cache – Almost all the memory that SQL server caches or uses is more useful and genuine than this cache, unless of course your system is so well written that only genuine entries remain here. My next blog entry will be about what this cache does and how it can bring down your system, along with commands to inspect the cache and its contents.
  • Restrict SQL server memory – Once the procedure cache growth is under control, use the EXEC sp_configure command to restrict SQL server memory to a maximum beyond which it will not grow. Our installation includes scripts that restrict SQL server memory to different limits based on the configuration of the system on which it is installed. However, this is not recommended for applications where SQL server plays an extremely central role – data center applications and such, where the entire application revolves around SQL server.

    This step is only advisable for the class of applications that need to deploy SQL server alongside other applications on the same server.

  • How do you calculate the maximum amount of memory that might be required? You could profile your SQL server memory growth for a typical work profile and then extrapolate over the period the system needs to be running. Periodic backups and maintenance activities would have to be factored in, depending on your application profile.

Explaining all this to customers is hard
When your application is set to run on a non-dedicated box, along with other equally important applications, frequent questions can ensue about why SQL server is consuming so much memory. Most often, for such classes of applications, SQL server might not really require all that much memory. In such cases these steps are most relevant, and the money and time wasted on support calls could be much better used elsewhere; restricting SQL server memory growth could be a wise decision.