Thursday, May 03, 2012

R Is Not Enough For "Big Data"

by Douglas Merrill
“… // Side note 1: I was an undergraduate at the University of Tulsa, not a school that you’ll find listed on any list of the best undergraduate schools.  I did pretty well at Princeton in my doctoral studies.  I’ve hired a lot of people from “bad” schools — like Washington State University — that have been very successful.  Although school is a decent proxy for intellectual horsepower, it’s only a proxy — I believe that the top 1% at any school will likely be pretty awesome.  The hard part is finding that 1%, because there’s likely a material difference between the mean of a second-rate school and the mean of a, say, Harvard. //
// Side note 2: OK, I’m about to take some real liberties with the math here, to help make my point.  All the real mathematicians out there are going to experience almost uncontrollable body twitches over the next few paragraphs.  Breathe deeply, it will pass.  //
// Side note 3: There are all kinds of mathematical problems with most regression models, notably that few things are linearly related and that many things have “correlated errors”, but I’ll leave that to Wikipedia if you’re interested. // …”
