Welcome to dbForumz.com!
FAQFAQ    SearchSearch      ProfileProfile    Private MessagesPrivate Messages   Log inLog in

Multicolumn correlated subquery?

 
   Database Forums (Home) -> General Discussions RSS
Next:  Running SQL and Oracle on the same server  
Author Message
sql_server_user

External


Since: Mar 09, 2006
Posts: 4



(Msg. 1) Posted: Tue May 09, 2006 7:56 am
Post subject: Multicolumn correlated subquery?
Archived from groups: comp>databases>ms-sqlserver (more info?)

Hi,

I have a history table with about 400 million rows, with a unique
composite nonclustered index on two columns (object id and time period)
which is used for most of the queries into the table, and an identity
column for the clustered primary key.

Many of my queries use correlated subqueries to pull unique history
rows from the history table for each of a set of objects from the
object table, for instance, pulling the earliest history row for each
object in a set. These correlated subqueries reference the object table
and return the primary key of the history table, e.g.:

select *
from lp_object l
inner join lp_object_history h
on h.lp_object_id = l.lp_id

where l.lp_set_id = 'SOME_LITERAL'

and h.lp_id = (
select top 1 lp_id
from lp_object_history
where lp_object_id = l.lp_id
and lp_some_column > 0
order by lp_time_period)

Now, if lp_some_column is not indexed, this query has no choice but to
read the entirety of every single history row for every object in the
set where lp_set_id = 'SOME_LITERAL', so that it can determine if
lp_some_column > 0, and because the history table is clustered by the
identity column rather than the ID of the relevant object whose history
we're tracking, the reads take forever - they have to bop all around
the disk. The sets I deal with tend to have about 5K objects in them
and about 200K associated history rows.

I'm considering reclustering by the (object id, time period) index, but
then my queries will need an extra bookmark lookup step to get the row
data from the identity value returned by the correlated subquery. I
think it will still be faster, though, so I will probably build a copy
of the table with the alternative clustering scheme to run some
performance tests.

What I'm wondering is, if I were to dispense with the identity column
altogether and replace it with a composite primary key of (object id,
time period), would I be still be able to use my correlated subqueries?
Because then there wouldn't be a single column that uniquely identifies
each row in the history table and I don't think SQL Server supports
multicolumn correlated subqueries.

Thanks for reading,
Seth

 >> Stay informed about: Multicolumn correlated subquery? 
Back to top
Login to vote
Erland Sommarskog2

External


Since: May 30, 2004
Posts: 1649



(Msg. 2) Posted: Tue May 09, 2006 9:45 pm
Post subject: Re: Multicolumn correlated subquery? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

sql_server_user ( ) writes:
> Many of my queries use correlated subqueries to pull unique history
> rows from the history table for each of a set of objects from the
> object table, for instance, pulling the earliest history row for each
> object in a set. These correlated subqueries reference the object table
> and return the primary key of the history table, e.g.:
>
> select *
> from lp_object l
> inner join lp_object_history h
> on h.lp_object_id = l.lp_id
>
> where l.lp_set_id = 'SOME_LITERAL'
>
> and h.lp_id = (
> select top 1 lp_id
> from lp_object_history
> where lp_object_id = l.lp_id
> and lp_some_column > 0
> order by lp_time_period)
>...
> I'm considering reclustering by the (object id, time period) index, but
> then my queries will need an extra bookmark lookup step to get the row
> data from the identity value returned by the correlated subquery. I
> think it will still be faster, though, so I will probably build a copy
> of the table with the alternative clustering scheme to run some
> performance tests.
>
> What I'm wondering is, if I were to dispense with the identity column
> altogether and replace it with a composite primary key of (object id,
> time period), would I be still be able to use my correlated subqueries?
> Because then there wouldn't be a single column that uniquely identifies
> each row in the history table and I don't think SQL Server supports
> multicolumn correlated subqueries.

Unless I'm missing something, the query without the IDENTITY column would
be:

select *
from lp_object l
inner join lp_object_history h on h.lp_object_id = l.lp_id
where l.lp_set_id = 'SOME_LITERAL'
and h.lp_time_period = (select max(h2.lp_time_period)
from lp_object_history h2
where h2.lp_object_id = l.lp_id
and h2.lp_some_column > 0)

Whether this will actually perform that much better I don't know, but
I can't see anything good coming through that IDENTITY column.

You may also want to try the effect of changíng

where h2.lp_object_id = l.lp_id

to

where h2.lp_object_id = h.lp_object_id

or even

where h2.lp_object_id = l.lp_id
and h2.lp_object_id = h.lp_object_id

--
Erland Sommarskog, SQL Server MVP, esquel.RemoveThis@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

 >> Stay informed about: Multicolumn correlated subquery? 
Back to top
Login to vote
sql_server_user

External


Since: Mar 09, 2006
Posts: 4



(Msg. 3) Posted: Wed May 10, 2006 11:55 am
Post subject: Re: Multicolumn correlated subquery? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

Thanks Erland.

The query as you've rewritten it is right on, as long as we have the
object id specified in the outer part of the query so that the history
row is fully specified when given the time period from the subquery. I
guess what I was concerned about was correlated subqueries where this
isn't the case, but I can't think of any actual examples for this
application, so maybe I don't have to worry about it.

I will definitely try the other permutations of the query you
suggested.

Seth
 >> Stay informed about: Multicolumn correlated subquery? 
Back to top
Login to vote
sql_server_user

External


Since: Mar 09, 2006
Posts: 4



(Msg. 4) Posted: Thu May 11, 2006 7:38 am
Post subject: Re: Multicolumn correlated subquery? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

I reclustered the table and the query is of course much faster now. I
need to keep both versions of the table around for a while to see how
they compare for all the queries we run, so I have an insert trigger
keeping them in sync. I tried the other permutations of the query you
suggested, Erland, and they had no effect on the execution plan or
performance.

Thanks,
Seth
 >> Stay informed about: Multicolumn correlated subquery? 
Back to top
Login to vote
Erland Sommarskog2

External


Since: May 30, 2004
Posts: 1649



(Msg. 5) Posted: Thu May 11, 2006 9:33 pm
Post subject: Re: Multicolumn correlated subquery? [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

sql_server_user ( ) writes:
> I reclustered the table and the query is of course much faster now. I
> need to keep both versions of the table around for a while to see how
> they compare for all the queries we run, so I have an insert trigger
> keeping them in sync.

Good to hear that it worked out.

> I tried the other permutations of the query you suggested, Erland, and
> they had no effect on the execution plan or performance.

Good! This indicates that the query might be stable. Smile I've experienced
situations where such "meaningless" permuation affected the query plan,
so I wanted you to try all three.

--
Erland Sommarskog, SQL Server MVP, esquel.RemoveThis@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 >> Stay informed about: Multicolumn correlated subquery? 
Back to top
Login to vote
Display posts from previous:   
Related Topics:
Subquery in DTS No Longer Works After Years - Hi folks, A DTS package we have run for years now no longer works. The specific part that is not working is a subquery in the SOURCE object of a transformation. The source is based on a Microsoft Data Link to a Sybase database (DSN changed a couple..

Concurrency in Transaction - hi gurus the scenario Frontend - MS Access (not yet decided whether MDB or ADP) Backend - MS SQL Server it is a conversion from MS Access backend to MS SQL Server Backend. Planning to create stored procedures for all the Inserts, Updates, Deletes and....

Configuring SQL MAIL - Hi All, My question is that can I configure SQL MAIL, if my EXCHANGE SERVER is on different domain and my SQL is running on different domain.Can anyone guide me for this because I googled a lot without any result. I am using SQL SERVER 2000..

working with daily results : results per day - Hi, I'm trying to get an overview of counted visits per day for a webapplication. The thing is that SQL doesn't work with full days. For instance: getdate() returns a fulldate instead of a date in the format 'yyyy-mm-dd'. I need this format, without....

Importing multiple files to SQL - I have over three hundred text files that I need to import to SQL Server. Each is in the exact same format. I want to import tham as seperate tables. Is there any way to do it in one process? Regards, Ciarán
   Database Forums (Home) -> General Discussions All times are: Pacific Time (US & Canada)
Page 1 of 1

 
You can post new topics in this forum
You can reply to topics in this forum
You can edit your posts in this forum
You can delete your posts in this forum
You can vote in polls in this forum



[ Contact us | Terms of Service/Privacy Policy ]