Apache Drill is an innovative distributed SQL engine designed to enable data exploration and analytics on non-relational datastores […] without having to create and manage schemas. […] It has a...continue reading.
Well, 2018 has flown by and today seems like an appropriate time to take a look at the landscape of R bloggerdom as seen through the eyes of readers of...continue reading.
This dance, it’s like a weapon: Radiohead’s and Beck’s danceability, valence, popularity, and more from the LastFM and Spotify APIs
Giddy up, giddy it up / Wanna move into a fool’s gold room / With my pulse on the animal jewels / Of the rules that you choose to use to get loose / With the luminous moves / Bored...continue reading.
The truth is out there, R readers, but often it is not what we have been led to believe. The previous post examined the strong positive results bias in optimism...continue reading.
In the previous parts of the series we demonstrated a positive results bias in optimism corrected bootstrapping by simply adding random features to our labels. This problem is due to...continue reading.
Welcome to part III of debunking the optimism corrected bootstrap in high dimensions (a fairly high number of features) over the Christmas holidays. Previously we saw with a reproducible code implementation...continue reading.
Some people are very fond of the technique known as ‘optimism corrected bootstrapping’. However, this method is clearly biased, and this becomes apparent as we increase the number of features...continue reading.
There are lots of ways to assess how predictive a model is while correcting for overfitting. In caret, the main methods I use are leave-one-out cross-validation, for...continue reading.