This blog has moved!

Check out www.CodeBetter.com/blogs/grant.killian

R-Squared et al in C# (Math Class anyone?)

Ever tried to program an R-Squared, Y-Intercept, and Slope calculator using linear regression, with the good old sums of squares and codeviates from math class so many years ago?  It sounds like an academic exercise, but this was an actual task I had to tackle for a custom reporting engine we wrote a while back.  I had a rough time finding the nuts and bolts of the algorithm: many online resources point you to Excel functions or a graphing calculator, but that wouldn't cut it for our app.  In case there's another poor soul out there looking, let me post the foundation:

First, we create a ReportPoint class:

using System ;
using System.Collections ;

public class ReportPoint
 {
  private double _dblX ;
  private double _dblY ;
 
  public double X_Coord
  {
   get{ return _dblX ; }
   set{ _dblX = value ; }
  }
  public double Y_Coord
  {
   get{ return _dblY ; }
   set{ _dblY = value ; }
  }
 
  public ReportPoint( double X_Coordinate, double Y_Coordinate )
  {
   _dblX = X_Coordinate ;
   _dblY = Y_Coordinate ;
  }
 }

Nothing extraordinary there.  Here is the good part, assuming you pass in an ArrayList of our ReportPoints above:

public static void calcValues( ArrayList alPoints )
  {
   double sumOfX = 0 ;
   double sumOfY =0 ;
   double sumOfXSq = 0 ;
   double sumOfYSq = 0 ;
   double ssX = 0 ;
   double ssY = 0 ;
   double sumCodeviates = 0 ;
   double sCo = 0 ;

   for( int ctr = 0; ctr < alPoints.Count; ctr++ )
   {
    ReportPoint objPoint = ( ReportPoint ) alPoints[ ctr ] ;
    // The coordinates are already doubles, so read them directly.
    double x = objPoint.X_Coord ;
    double y = objPoint.Y_Coord ;
    sumCodeviates+= ( x*y ) ;
    sumOfX += x ;
    sumOfY += y ;
    sumOfXSq = sumOfXSq + ( x*x ) ;
    sumOfYSq = sumOfYSq + ( y*y ) ;
   }
   // The sums are deliberately left unrounded: rounding them to two places
   // would skew the results for data with more than two decimal places.
   ssX = sumOfXSq - ( ( sumOfX*sumOfX ) / alPoints.Count ) ;
   ssY = sumOfYSq - ( ( sumOfY*sumOfY ) / alPoints.Count ) ;
   double RNumerator  = ( alPoints.Count * sumCodeviates ) - (sumOfX * sumOfY ) ;
   double RDenom = ( alPoints.Count*sumOfXSq - ( Math.Pow( sumOfX, 2 ) ) )
    * ( alPoints.Count*sumOfYSq - ( Math.Pow( sumOfY, 2 ) ) ) ;
   sCo = sumCodeviates - ( ( sumOfX*sumOfY ) / alPoints.Count ) ;
   double dblSlope = sCo / ssX ;
   double meanX = sumOfX / alPoints.Count ;
   double meanY = sumOfY /alPoints.Count ;
   double dblYintercept = meanY - ( dblSlope * meanX ) ;
   double dblR =  RNumerator / Math.Sqrt( RDenom ) ;
   Console.WriteLine( "R-Squared: {0}",  Math.Pow( dblR, 2 ) ) ;
   Console.WriteLine( "Y-Intercept: {0}",  dblYintercept ) ;
   Console.WriteLine( "Slope: {0}",  dblSlope ) ;
   Console.ReadLine() ;
  }
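For anyone who wants to check the method against the math, here is my reading of the formulas it computes, written out as a sketch. The symbols are my own shorthand, not anything from the reporting engine: n is alPoints.Count, SS_x and SS_y correspond to ssX and ssY, S_xy to sCo, m to dblSlope, b to dblYintercept, and r to dblR.

```latex
\begin{aligned}
SS_x   &= \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} &
SS_y   &= \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n} \\
S_{xy} &= \sum x_i y_i - \frac{\sum x_i \sum y_i}{n} &
m      &= \frac{S_{xy}}{SS_x} \\
b      &= \bar{y} - m\,\bar{x} &
r      &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}
               {\sqrt{\left(n \sum x_i^2 - \left(\sum x_i\right)^2\right)
                      \left(n \sum y_i^2 - \left(\sum y_i\right)^2\right)}} \\
R^2    &= r^2
\end{aligned}
```

Multiplying the numerator and denominator of r by 1/n shows that r is the same thing as S_xy / sqrt( SS_x * SS_y ), so the method could reuse sCo, ssX, and ssY for R-Squared as well.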

Yes, yes, yes, I know a typed collection instead of an ArrayList would be better; I moved this code into a Console program to make an easy-to-follow demo of the logic and wanted to keep non-essentials to a minimum.  Let's say I'm saving myself for Generics!   So, in our Main method we'd have:

[STAThread]
  static void Main(string[] args)
  {
   ArrayList al = new ArrayList() ;
   al.Add( new ReportPoint( 3, 2.6 ) ) ;
   al.Add( new ReportPoint( 5.6, 20 ) ) ;
   al.Add( new ReportPoint( 8.2, 30 ) ) ;
   al.Add( new ReportPoint( 8.4, 50.7 ) ) ;
   al.Add( new ReportPoint( 9, 51.4 ) ) ;
   al.Add( new ReportPoint( 10, 37.9 ) ) ;
   calcValues( al ) ;
  }
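Since I brought up typed collections, here is a rough sketch of how the same calculation might look with Generics. Treat it as one possible shape rather than the version from the reporting engine: Point2D, RegressionResult, and LinearRegression.Fit are names I made up for this illustration, and returning the values instead of printing them is my own choice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ReportPoint, kept here so the sketch is self-contained.
public struct Point2D
{
    public readonly double X;
    public readonly double Y;
    public Point2D(double x, double y) { X = x; Y = y; }
}

// Hypothetical container for the three values the demo prints.
public struct RegressionResult
{
    public readonly double Slope;
    public readonly double YIntercept;
    public readonly double RSquared;
    public RegressionResult(double slope, double yIntercept, double rSquared)
    {
        Slope = slope;
        YIntercept = yIntercept;
        RSquared = rSquared;
    }
}

public static class LinearRegression
{
    // Least-squares fit using the same sums as calcValues, without rounding.
    public static RegressionResult Fit(IList<Point2D> points)
    {
        int n = points.Count;
        if (n < 2)
            throw new ArgumentException("At least two points are required.", "points");

        double sumX = 0, sumY = 0, sumXX = 0, sumYY = 0, sumXY = 0;
        foreach (Point2D p in points)
        {
            sumX += p.X;
            sumY += p.Y;
            sumXX += p.X * p.X;
            sumYY += p.Y * p.Y;
            sumXY += p.X * p.Y;
        }

        double ssX = sumXX - (sumX * sumX) / n;   // sum of squares for X
        double ssY = sumYY - (sumY * sumY) / n;   // sum of squares for Y
        double sCo = sumXY - (sumX * sumY) / n;   // sum of codeviates

        if (ssX == 0)
            throw new ArgumentException("All X values are identical, so the slope is undefined.", "points");

        double slope = sCo / ssX;
        double yIntercept = (sumY / n) - slope * (sumX / n);

        // If every Y is identical, R-Squared is 0/0; report NaN rather than guess.
        double rSquared = (ssY == 0) ? double.NaN : (sCo * sCo) / (ssX * ssY);

        return new RegressionResult(slope, yIntercept, rSquared);
    }
}
```

Calling LinearRegression.Fit with the six demo points from Main above (loaded into a List<Point2D>) should give a slope of roughly 6.39, a Y-intercept of roughly -14.98, and an R-Squared of roughly 0.773, which is what I get working the sums by hand.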

There you have it.  You really need to watch your order of operations, which is why the calculations above are parenthesized so explicitly.

Happy .Netting!

posted on Friday, May 21, 2004 8:36 AM by grant.killian




