Hi!
It is quite simple to calculate covariance, correlation coefficient and
coefficient of determination (squared correlation) with T-SQL queries. Here
is an example using vTargetMail view from AdventureWorksDW demo db:
/* Linear relationship */
USE AdventureWorksDW;
GO
-- Covariance
DECLARE @mean1 decimal(20,6)
DECLARE @mean2 decimal(20,6)
SELECT @mean1=AVG(YearlyIncome*1.0)
,@mean2=AVG(NumberCarsOwned*1.0)
FROM vTargetMail
SELECT CoVar=
SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*)
FROM dbo.vTargetMail
GO
-- Correlation and CD
DECLARE @mean1 decimal(20,6)
DECLARE @mean2 decimal(20,6)
SELECT @mean1=AVG(YearlyIncome*1.0)
,@mean2=AVG(NumberCarsOwned*1.0)
FROM vTargetMail
SELECT Correl=
(SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*))
/((STDEVP(YearlyIncome*1.0)*STDEVP(NumberCarsOwned*1.0))),
CD=SQUARE(
(SUM((YearlyIncome*1.0-@mean1)*(NumberCarsOwned*1.0-@mean2))/COUNT(*))
/((STDEVP(YearlyIncome*1.0)*STDEVP(NumberCarsOwned*1.0))))
FROM vTargetMail
GO
--
Dejan Sarka
http://blogs.solidq.com/EN/dsarka/default.aspx
"csmba" wrote in message
> The end game is that I want a report to show me the correlation (number
> between -1 to 1) between each 2 items in my transactional database.
>
> 1. Is there a way to have an SP or function that calculates correlation
> for 2 clms in a table?
> 2. is there a better way, so that I don't need to create fake tables
> (cause I want correlation between lets say x,y,z and a,b,c, so that is:
> x-a,x-b,x-c,y-a....)
> 3. would you tell me that the only way is to use Analysis Services?
>
> What would be the recommendation that will let me at the end, deliver such
> report?
>
> thanks
>
>