In the latest edition of the “Working SQL” column, Peter Vogel looks at some of the general issues around creating SQL queries that run quickly.
THIS is a bell that I keep ringing: The key to great
performance in a database application is using
SQL. Every time you find yourself looping
through a recordset and doing updates, you should be
figuring out how to replace your code with an SQL
statement. If there isn’t an SQL statement that will do
the job, take it as a sign that your database design is
wrong and fix it so that you can replace your recordsetbased
processing with an SQL statement.
But, having said that, some SQL statements run
more slowly than others. So how can you build SQL
statements that will run as fast as possible? Related to
this issue is what you can do to make all of your SQL
queries run faster.
Optimizing the database
The second question—optimizing the database to
support your SQL statements—is the easiest to
answer, as the best practice is to follow good database
maintenance. You’ll want, for instance, to keep your
indexes defragmented, move the log file on a separate
drive from the database file, and perform other
common best practices.
Many database management systems (DBMSs) offer
ways to physically organize your data so that queries
will run faster. For instance, where there’s an index
that your SQL statements use frequently (more on this
later), the DBMS can cluster your tables on that index.
Clustering in this sense refers to physically storing the
records in order by the values in the index. This can be
especially worthwhile if the index is used to sort the
records into a specific order. If you have two tables that
are frequently joined, many DBMSs will allow you to
store related records from different tables in the same
physical location on your hard disk.
There are caveats in all of these solutions. The first,
of course, is that not all DBMSs offer these options. If
you take advantage of them, you’re locking yourself
into a particular vendor. On the other hand, in my
experience, changing databases isn’t all that common
an event, so why worry? The second caveat is that all of
these solutions tend to contain the words “frequently”
or “usually” in their descriptions. In other words, these
solutions depend on how your current users use your
current applications to access data for their current
business environment. If any of these variables change,
not only will these kinds of optimizations not improve
the performance of your system, they’ll actually slow
down your applications.
For most applications, you’re better off creating
the most efficient SQL statement that you can and
ignoring these physical options. One of the benefits of
creating efficient SQL statements is that the principles
for creating efficient SQL apply, with some variations,
across all DBMSs. What works in one environment will
port to others.
Rather than provide you with a laundry list of SQL
“speed up” techniques, in the rest of this article I’ll give
you the understanding that you need to make your
SQL is a non-procedural language. Unlike, for instance,
Visual Basic, you don’t say how you want the job done.
Instead, you specify what you want as your result.
The following SQL query says that I want the customer
last name and total orders but only for customers in
Canada, sorted by the customer’s Last Name:
Select Customer.LastName, Count(Order.*)
From Customer Inner Join Order
On Customer.CustomerNumber = Order.CustomerNumber
Where Customer.Country = 'Canada'
Group By Customer.LastName
Order By Customer.LastName
While SQL is a non-procedural language, in the
end, the computer must execute a series of instructions
that will retrieve the necessary data. So, when you
consider an SQL statement, you should also consider
the procedural code that will be required to retrieve the
data that you’ve requested.
None of this is magic. Take my previous SQL
statement, for instance. There are a limited number
of ways that the data could be retrieved. Here’s
1. The customer table is read sequentially.
2. As each customer in Canada is found, the Orders
table is accessed.
3. The Order table is read sequentially each time
4. When an order for the Customer currently being
Read More Here: