 |
|
 |
|
Next: [Info-Ingres] FW: VIP Webinar - The Ingres Source..
|
| Author |
Message |
External

Since: Nov 23, 2007 Posts: 27
|
(Msg. 1) Posted: Wed Jan 16, 2008 2:59 pm
Post subject: How top actually works Archived from groups: comp>databases>ms-sqlserver (more info?)
|
|
|
I'm coming from Oracle world and trying to find something similar to
rownum in Oracle. I know there exists TOP which normally if used in
the same select woth order by firstly sorts data and then only gets
top n. So the question is what actually happens when top is used in
inner query and order by in outer query. The problem is that it seems
to be somehow inconsistent at least for the first sight.
Using SQL Server 2005
So I have following test case:
create table t3 (id integer, data varchar(4000));
insert into t3 values (1, replicate('a', 4000));
insert into t3 values (2, replicate('b', 4000));
insert into t3 values (3, replicate('c', 4000));
insert into t3 values (4, replicate('d', 4000));
insert into t3 values (5, replicate('e', 4000));
insert into t3 values (6, replicate('f', 4000));
insert into t3 values (7, replicate('g', 4000));
insert into t3 values (8, replicate('h', 4000));
insert into t3 values (9, replicate('i', 4000));
SET STATISTICS IO ON
firstly just select all rows to know how many logical reads are needed
for all table.
select * from t3
1 aaa...
....
9 iiii....
logical reads 5
Now get first two rows without any where clause:
select top 2 * from t3
1 aaa...
2 bbb...
logical reads 1
Now the same first two rows just with outer select without any order
by:
select * from (
select top 2 * from t3
) as q
1 aaa...
2 bbb...
logical reads 1
OK till now it's as expected, just one logical read get first 2 rows
and end query.
However look at next query's logical reads 5. This somehow is very
interestingly equal to logical reads for select all rows from t3.
select * from (
select top 2 * from t3
) as q
order by data asc
1 aaa...
2 bbb...
logical reads 5
So the next one shows that order by clause has affected the result set
and actually semms to be pushed into inner query. Also logical reads
are 5 meaning that actually we have scanned all the table.
select * from (
select top 2 * from t3
) as q
order by data desc
9 iii..
8 hhh...
logical reads 5
However for TOP 1 everything works in a different way i.e. there is
always the same one row and the same one logical read in spite of
diffferent order by clauses:
select * from (
select top 1 * from t3
) as q
order by data asc
1 aaa....
logical reads 1
select * from (
select top 1 * from t3
) as q
order by data desc
1 aaa....
logical reads 1
So where is the truth? Why the functionality is different?
The business case is that we have search with potentially weak user
criteria resulting in BIG potential result sets, but we want to show
the user just ANY N rows satisfying criteria. But these N rows should
be ordered. So what I'd like to achieve is:
1) get ANY no more than N rows according to my criteria
2) sort these N rows according to my order by clause.
I DEFINITELY don't want:
1) get ALL rows
2) sort them and throw away all but first N.
TIA, Gints >> Stay informed about: How top actually works |
|
| Back to top |
|
 |  |
External

Since: Nov 11, 2003 Posts: 678
|
(Msg. 2) Posted: Wed Jan 16, 2008 9:01 pm
Post subject: Re: How top actually works [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
"Gints Plivna" wrote in message
> I'm coming from Oracle world and trying to find something similar to
> rownum in Oracle. I know there exists TOP which normally if used in
> the same select woth order by firstly sorts data and then only gets
> top n. So the question is what actually happens when top is used in
> inner query and order by in outer query. The problem is that it seems
> to be somehow inconsistent at least for the first sight.
>
> Using SQL Server 2005
> So I have following test case:
>
> create table t3 (id integer, data varchar(4000));
> insert into t3 values (1, replicate('a', 4000));
> insert into t3 values (2, replicate('b', 4000));
> insert into t3 values (3, replicate('c', 4000));
> insert into t3 values (4, replicate('d', 4000));
> insert into t3 values (5, replicate('e', 4000));
> insert into t3 values (6, replicate('f', 4000));
> insert into t3 values (7, replicate('g', 4000));
> insert into t3 values (8, replicate('h', 4000));
> insert into t3 values (9, replicate('i', 4000));
>
> SET STATISTICS IO ON
> firstly just select all rows to know how many logical reads are needed
> for all table.
>
> select * from t3
> 1 aaa...
> ...
> 9 iiii....
> logical reads 5
>
> Now get first two rows without any where clause:
> select top 2 * from t3
> 1 aaa...
> 2 bbb...
> logical reads 1
>
> Now the same first two rows just with outer select without any order
> by:
> select * from (
> select top 2 * from t3
> ) as q
> 1 aaa...
> 2 bbb...
> logical reads 1
>
> OK till now it's as expected, just one logical read get first 2 rows
> and end query.
> However look at next query's logical reads 5. This somehow is very
> interestingly equal to logical reads for select all rows from t3.
>
> select * from (
> select top 2 * from t3
> ) as q
> order by data asc
> 1 aaa...
> 2 bbb...
> logical reads 5
>
> So the next one shows that order by clause has affected the result set
> and actually semms to be pushed into inner query. Also logical reads
> are 5 meaning that actually we have scanned all the table.
>
> select * from (
> select top 2 * from t3
> ) as q
> order by data desc
> 9 iii..
> 8 hhh...
> logical reads 5
>
> However for TOP 1 everything works in a different way i.e. there is
> always the same one row and the same one logical read in spite of
> diffferent order by clauses:
>
> select * from (
> select top 1 * from t3
> ) as q
> order by data asc
> 1 aaa....
> logical reads 1
>
> select * from (
> select top 1 * from t3
> ) as q
> order by data desc
> 1 aaa....
> logical reads 1
>
> So where is the truth? Why the functionality is different?
>
> The business case is that we have search with potentially weak user
> criteria resulting in BIG potential result sets, but we want to show
> the user just ANY N rows satisfying criteria. But these N rows should
> be ordered. So what I'd like to achieve is:
> 1) get ANY no more than N rows according to my criteria
> 2) sort these N rows according to my order by clause.
>
> I DEFINITELY don't want:
> 1) get ALL rows
> 2) sort them and throw away all but first N.
>
> TIA, Gints
TOP n without ORDER BY is non-deterministic. Ie. you are telling SQL Server
to return ANY n rows from the table, which it will therefore do by whatever
method it finds convenient. Since a table is an unordered set of rows by
definition the only way to select a specific "top n" rows is to specify some
logical ordering. This is much the same as with rownum in Oracle, which is
not bound to any fixed ordering in the table.
See also the ROW_NUMBER() function, which is standard SQL and supported by
both Oracle and SQL Server.
Hope that helps.
--
David Portas >> Stay informed about: How top actually works |
|
| Back to top |
|
 |  |
External

Since: Sep 09, 2003 Posts: 152
|
(Msg. 3) Posted: Wed Jan 16, 2008 9:01 pm
Post subject: Re: How top actually works [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
Gints,
"How top actually works" cannot be determined with a table with only 9
rows in 48 KB.
I ran your example query
select * from (
select top 2 * from t3
) as q
order by data asc
and replaced "t3" with a 65 million row table (18 GB). It would complete
with only 4 logical reads. When I replaced "top 2" with "top 10" in the
inner query, it would complete after 4 logical reads.
Your table is so small, that even the simplest query plan is so cheap
that the optimizer seems to consider it useless to search for anything
better.
--
Gert-Jan
Gints Plivna wrote:
>
> I'm coming from Oracle world and trying to find something similar to
> rownum in Oracle. I know there exists TOP which normally if used in
> the same select woth order by firstly sorts data and then only gets
> top n. So the question is what actually happens when top is used in
> inner query and order by in outer query. The problem is that it seems
> to be somehow inconsistent at least for the first sight.
>
> Using SQL Server 2005
> So I have following test case:
>
> create table t3 (id integer, data varchar(4000));
> insert into t3 values (1, replicate('a', 4000));
> insert into t3 values (2, replicate('b', 4000));
> insert into t3 values (3, replicate('c', 4000));
> insert into t3 values (4, replicate('d', 4000));
> insert into t3 values (5, replicate('e', 4000));
> insert into t3 values (6, replicate('f', 4000));
> insert into t3 values (7, replicate('g', 4000));
> insert into t3 values (8, replicate('h', 4000));
> insert into t3 values (9, replicate('i', 4000));
>
> SET STATISTICS IO ON
> firstly just select all rows to know how many logical reads are needed
> for all table.
>
> select * from t3
> 1 aaa...
> ...
> 9 iiii....
> logical reads 5
>
> Now get first two rows without any where clause:
> select top 2 * from t3
> 1 aaa...
> 2 bbb...
> logical reads 1
>
> Now the same first two rows just with outer select without any order
> by:
> select * from (
> select top 2 * from t3
> ) as q
> 1 aaa...
> 2 bbb...
> logical reads 1
>
> OK till now it's as expected, just one logical read get first 2 rows
> and end query.
> However look at next query's logical reads 5. This somehow is very
> interestingly equal to logical reads for select all rows from t3.
>
> select * from (
> select top 2 * from t3
> ) as q
> order by data asc
> 1 aaa...
> 2 bbb...
> logical reads 5
>
> So the next one shows that order by clause has affected the result set
> and actually semms to be pushed into inner query. Also logical reads
> are 5 meaning that actually we have scanned all the table.
>
> select * from (
> select top 2 * from t3
> ) as q
> order by data desc
> 9 iii..
> 8 hhh...
> logical reads 5
>
> However for TOP 1 everything works in a different way i.e. there is
> always the same one row and the same one logical read in spite of
> diffferent order by clauses:
>
> select * from (
> select top 1 * from t3
> ) as q
> order by data asc
> 1 aaa....
> logical reads 1
>
> select * from (
> select top 1 * from t3
> ) as q
> order by data desc
> 1 aaa....
> logical reads 1
>
> So where is the truth? Why the functionality is different?
>
> The business case is that we have search with potentially weak user
> criteria resulting in BIG potential result sets, but we want to show
> the user just ANY N rows satisfying criteria. But these N rows should
> be ordered. So what I'd like to achieve is:
> 1) get ANY no more than N rows according to my criteria
> 2) sort these N rows according to my order by clause.
>
> I DEFINITELY don't want:
> 1) get ALL rows
> 2) sort them and throw away all but first N.
>
> TIA, Gints >> Stay informed about: How top actually works |
|
| Back to top |
|
 |  |
External

Since: Nov 23, 2007 Posts: 27
|
(Msg. 4) Posted: Thu Jan 17, 2008 12:28 am
Post subject: Re: How top actually works [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
On 17 Janv., 01:33, Gert-Jan Strik
wrote:
> Your table is so small, that even the simplest query plan is so cheap
> that the optimizer seems to consider it useless to search for anything
> better.
OK I've also tried it with much bigger table and got the same results
as you. Let's hope optimizer will be smart enough to distinguish big
work from small work and in case of big work won't do order before
top
Gints >> Stay informed about: How top actually works |
|
| Back to top |
|
 |  |
External

Since: Feb 01, 2007 Posts: 7
|
(Msg. 5) Posted: Thu Jan 17, 2008 3:58 pm
Post subject: Re: How top actually works [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
|
|
| Back to top |
|
 |  |
External

Since: May 30, 2004 Posts: 1649
|
(Msg. 6) Posted: Thu Jan 17, 2008 9:01 pm
Post subject: Re: How top actually works [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
|
|
| Back to top |
|
 |  |
External

Since: Nov 23, 2007 Posts: 27
|
(Msg. 7) Posted: Wed Jan 23, 2008 4:07 am
Post subject: Re: How top actually works [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
On 18 Janv., 00:28, Erland Sommarskog wrote:
> Gints Plivna ( ) writes:
> > OK I've also tried it with much bigger table and got the same results
> > as you. Let's hope optimizer will be smart enough to distinguish big
> > work from small work and in case of big work won't do order before
> > top
>
> Since you are on SQL 2005, any reason to not use row_number()? Then
> you would have code that would run both on SQL Server and on Oracle?
I'm quite sure row_number won't help me in this case, because I'd like
to limit found number of rows. row_number actually must have order by
clause and ordering before limiting returned number of rows is the
thing I'd like to avoid.
And also this code will run only on SQL Server, so no need for
"portable SQL" (I'm BTW quite sceptical about such "portable SQLs"
generally, because usually it means code will be slow on all
databases  .
Gints >> Stay informed about: How top actually works |
|
| Back to top |
|
 |  |
External

Since: May 30, 2004 Posts: 1649
|
(Msg. 8) Posted: Wed Jan 23, 2008 9:00 pm
Post subject: Re: How top actually works [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
Gints Plivna ( ) writes:
> I'm quite sure row_number won't help me in this case, because I'd like
> to limit found number of rows. row_number actually must have order by
> clause and ordering before limiting returned number of rows is the
> thing I'd like to avoid.
That can be achieved with row_number with some trickery. Consider:
with numbered AS (
select *, rn = row_number() OVER (order by x)
from (SELECT *, x = 'x' FROM Orders) as s
)
SELECT * FROM numbered WHERE rn < 20
ORDER BY CustomerID
> And also this code will run only on SQL Server, so no need for
> "portable SQL" (I'm BTW quite sceptical about such "portable SQLs"
> generally, because usually it means code will be slow on all
> databases .
Sorry, since you mentioned that you came from Oracle, I somehow drew
the conclusion that you were porting code.
I agree on your opinion on "portable SQL".
--
Erland Sommarskog, SQL Server MVP, esquel RemoveThis @sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx >> Stay informed about: How top actually works |
|
| Back to top |
|
 |  |
External

Since: Nov 23, 2007 Posts: 27
|
(Msg. 9) Posted: Thu Jan 24, 2008 2:25 am
Post subject: Re: How top actually works [Login to view extended thread Info.] Archived from groups: per prev. post (more info?)
|
|
|
On 24 Janv., 01:17, Erland Sommarskog wrote:
> Gints Plivna ( ) writes:
> > I'm quite sure row_number won't help me in this case, because I'd like
> > to limit found number of rows. row_number actually must have order by
> > clause and ordering before limiting returned number of rows is the
> > thing I'd like to avoid.
>
> That can be achieved with row_number with some trickery. Consider:
>
> with numbered AS (
> select *, rn = row_number() OVER (order by x)
> from (SELECT *, x = 'x' FROM Orders) as s
> )
> SELECT * FROM numbered WHERE rn < 20
> ORDER BY CustomerID
Ahh that's real trickery!
OK I'll remember it just in case in the future I'll need something
like that!
Thanks!
Gints >> Stay informed about: How top actually works |
|
| Back to top |
|
 |  |
| Related Topics: | Subquery in DTS No Longer Works After Years - Hi folks, A DTS package we have run for years now no longer works. The specific part that is not working is a subquery in the SOURCE object of a transformation. The source is based on a Microsoft Data Link to a Sybase database (DSN changed a couple..
Collation error - I can ussually resolve this issue but I'm looking for help this time. I'm running the following select statement on one server and pulling the information from a second server. SELECT CustomerName from [SQL-STL].FSDBPE.dbo.FS_Customer Results: Server:....
Concurrency in Transaction - hi gurus the scenario Frontend - MS Access (not yet decided whether MDB or ADP) Backend - MS SQL Server it is a conversion from MS Access backend to MS SQL Server Backend. Planning to create stored procedures for all the Inserts, Updates, Deletes and....
Configuring SQL MAIL - Hi All, My question is that can I configure SQL MAIL, if my EXCHANGE SERVER is on different domain and my SQL is running on different domain.Can anyone guide me for this because I googled a lot without any result. I am using SQL SERVER 2000..
working with daily results : results per day - Hi, I'm trying to get an overview of counted visits per day for a webapplication. The thing is that SQL doesn't work with full days. For instance: getdate() returns a fulldate instead of a date in the format 'yyyy-mm-dd'. I need this format, without.... |
|
You can post new topics in this forum You can reply to topics in this forum You can edit your posts in this forum You can delete your posts in this forum You can vote in polls in this forum
|
|
|
|
 |
|
|