Welcome to dbForumz.com!
FAQFAQ    SearchSearch      ProfileProfile    Private MessagesPrivate Messages   Log inLog in

calculating correlation coefficient

 
   Database Forums (Home) -> Datamining RSS
Next:  SQL Server 2000 SP4 client connection problem  
Author Message
csmba

External


Since: Dec 29, 2005
Posts: 3



(Msg. 1) Posted: Fri Feb 15, 2008 2:16 pm
Post subject: calculating correlation coefficient
Archived from groups: microsoft>public>sqlserver>datamining (more info?)

The end game is that I want a report to show me the correlation (number
between -1 to 1) between each 2 items in my transactional database.

1. Is there a way to have an SP or function that calculates correlation for
2 clms in a table?
2. is there a better way, so that I don't need to create fake tables (cause
I want correlation between lets say x,y,z and a,b,c, so that is:
x-a,x-b,x-c,y-a....)
3. would you tell me that the only way is to use Analysis Services?

What would be the recommendation that will let me at the end, deliver such
report?

thanks

 >> Stay informed about: calculating correlation coefficient 
Back to top
Login to vote
Dejan Sarka

External


Since: Mar 18, 2004
Posts: 317



(Msg. 2) Posted: Mon Feb 18, 2008 7:00 am
Post subject: Re: calculating correlation coefficient [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

Hi!

It is quite simple to calculate covariance, correlation coefficient and
coefficient of determination (squared correlation) with T-SQL queries. Here
is an example using vTargetMail view from AdventureWorksDW demo db:

/* Linear relationship */
USE AdventureWorksDW;
GO
-- Covariance
DECLARE @mean1 decimal(20,6)
DECLARE @mean2 decimal(20,6)
SELECT @mean1=AVG(YearlyIncome*1.0)
,@mean2=AVG(NumberCarsOwned*1.0)
FROM vTargetMail
SELECT CoVar=
SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*)
FROM dbo.vTargetMail
GO
-- Correlation and CD
DECLARE @mean1 decimal(20,6)
DECLARE @mean2 decimal(20,6)
SELECT @mean1=AVG(YearlyIncome*1.0)
,@mean2=AVG(NumberCarsOwned*1.0)
FROM vTargetMail
SELECT Correl=
(SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*))
/((STDEVP(YearlyIncome*1.0)*STDEVP(NumberCarsOwned*1.0))),
CD=SQUARE(
(SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*))
/((STDEVP(YearlyIncome*1.0)*STDEVP(NumberCarsOwned*1.0))))
FROM vTargetMail
GO

--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx

"csmba" wrote in message

> The end game is that I want a report to show me the correlation (number
> between -1 to 1) between each 2 items in my transactional database.
>
> 1. Is there a way to have an SP or function that calculates correlation
> for 2 clms in a table?
> 2. is there a better way, so that I don't need to create fake tables
> (cause I want correlation between lets say x,y,z and a,b,c, so that is:
> x-a,x-b,x-c,y-a....)
> 3. would you tell me that the only way is to use Analysis Services?
>
> What would be the recommendation that will let me at the end, deliver such
> report?
>
> thanks
>
>

 >> Stay informed about: calculating correlation coefficient 
Back to top
Login to vote
Display posts from previous:   
Related Topics:
SBA Invoice email - Although the standard invoice within sba is lame, it is much easier to use than exporting to Word, etc. Is there a way to email this form? If printed to a file, it uses a format that no one else can read. There should be an easy way to email this..

Shrink .LDF File - Dear All, The SQL Server 2000 in my network is hosting many database for many business units within the organization. The server is heavily used for data collection and web hosting. One of the .LDF file has grown very fast and I am very sure that it....

SELECT in sections - I want to know how to retrieve records from a SELECT command in sections. I mean, if total records of a SELECT command is 100,000, I want to use SELECT that retrieves me them in 20 "sections" of 5000 : first 5000, second 5000, sixth 5000 or ...

One query result included in another - I'm stumped with trying to create a dynamic query against items in two many-to-many relations that are both related to a single features table. For simplicities sake, I'll say we've got a table called FEATURES, to which we can add "features"...

Comparing Data and Filtering Results - shopper_tracking shopper_id tracking_num closed date_entered 12345 1za45690987 1 8/15/2005 65984 1za45690988 1 8/15/2005 23569 1za45690990 1 8/15/2005 25648 1za45690991 ...
   Database Forums (Home) -> Datamining All times are: Pacific Time (US & Canada)
Page 1 of 1

 
You can post new topics in this forum
You can reply to topics in this forum
You can edit your posts in this forum
You can delete your posts in this forum
You can vote in polls in this forum



[ Contact us | Terms of Service/Privacy Policy ]