Home |  Writing |  Consulting |  Schedule |  Scripts |  Links |  About










Paul's Blog...
Top Posts
Why use Stored Procs?
10 lessons from 35K tps
SQL Server:Fresh Vision
SQL Myths
Data Architecture
CodeGen AutoAudit Trail

Interviews...
CS TechCast
TechEd ArCast
PASSCamp 2008 Panel
SQL DownUnder


Locations of visitors to this page

Optimization Theory

One of the most enjoyable projects of my career was spending a couple years reworking and tuning an insurance document database search engine. When I first saw the project the queries were running as long as 20 minutes if they finished at all. At the end of the tuning the search time was consistently less than one second. There's nothing magical about it, just good solid design combined with optimization and scalability techniques.

Not all performance strategies perform well or uniformly because there's an inherent hierarchy or dependency between performance strategies that's often overlooked. Optimization Theory addesses these dependencies and provides a framework for planning and developing an optimized data store.

In Optimization Theory, each layer is enabled by the supporting layer. Due to the dependency, no layer can overcome deficiencies in their supporting layers.

Schema Design

1. Schema Design is my number one performance strategy. It's far easier to write clean set-based queries, tune the indexes, reduce locking, and fine-tune a database with the simpler, generalized, and normalized data schema than it is to optimize a database with an overly complex schema. I believe the purist logical data modeler is the number one performance problem in our industry because of the cascading problems caused by his burdensome designs.

Queries

2. Set-based solution. SQL is a set-based system and iterative row-by-row operations actually function as zillions of small single row sets. Whether the iterations take place as server-side SQL cursors, or ADO loops through a record set, iterative code is costly. My number two performance strategy is to use set-based solutions. But good set-based code can't overcome a clumsy or overly complex schema design.

Indexing

3. Indexing is the performance bridge between queries and data and a key performance strategy. An index can make the difference between an instant response vs. a hour wait. But well-designed indexes can't overcome non-scalable iterative code.

Concurrency

4. Locking and blocking is more common a problem than most developers think, and it's an issue that needs to be addressed and solved. But reducing blocked resources won't overcome an unnecessary table scan.

Server Tuning

5. Server tuning, adding memory, and improving disk i/o all make a significant difference. In fact, many managers respond to poor performance by blindly throwing money at hardware. However, running slow software faster is still just running slow software and even the best hardware will only "wait faster" for a blocked resource.

So, while all these performance strategies function in concert, optimization begins with a clean, simplified schema and set-based queries. It then progresses to tuning the indexes, shaking out any locking problems, and finally tuning the server without wasting money on hardware that's not needed.