Blogging A to Z: The A to Z of tidyverse
Announcing my theme for this year’s blogging A to Z! The tidyverse is a set of R packages for data science. The big thing about the tidyverse is making sure your...continue reading.
Via Digg: This data visualization, put together by takeasecond on Reddit, shows the tallest building in all 50 states in 2020. As the graph demonstrates, the current tallest building in America...continue reading.
If you are a professor teaching, or a student enrolled in, a machine learning program or a non-technical program with a hands-on machine learning lab, becoming a member of the H2O.ai Academic...continue reading.
Revisiting the river flow profile plot from an earlier post, the video below loops each day’s flow profile for the Delaware River in 2019. Data is from USGS gages processed...continue reading.
# Calculate scores
testScore1 = h2o.predict( trainingModel1, testData_hex )
testScore2 = h2o.predict( trainingModel2, testData_hex )
testScore3 = h2o.predict( trainingModel3, testData_hex )
# Add row scores at the beginning of dataset
testData_hexScores = h2o.cbind( round( testScore1[,1],...continue reading.
H2O engineers continually innovate and introduce new techniques by adopting the latest research, working on cutting-edge use cases, and participating in and winning machine learning competitions like Kaggle. But thanks to...continue reading.
Introduction: We will identify anomalous patterns in data. This process is useful not only for finding inconsistencies and errors but also for finding abnormal data behavior, being useful even to find...continue reading.
Estimating the carbon cost of psycholinguistics conferences. Note: If I have made some calculation error, please point it out and I will fix it. The above analysis is probably very coarse-grained. One...continue reading.
So far on this blog, we’ve used the data containing information on Pitchfork music reviews (available on Kaggle at this link) for a number of different data analyses. I’ve found...continue reading.
In this post, we will see how to download personal Fitbit data histories for step counts, heart rate, and sleep via the Fitbit API. We will use a combination of...continue reading.
Using R and H2O Isolation Forest to predict car battery failures. Carlos Kassab, 2019-May-24. This is a study about what might happen if car makers start using machine learning in our cars to...continue reading.
As I was preparing some graphics for a presentation recently, I started digging into some of the different color palette options. My motivation was entirely about creating graphics that weren’t...continue reading.
Some say data is the new oil. Others equate its worth to water. And then there are those who believe that data scientists will be (in fact, they already are)...continue reading.
Steps Needed in a Process to Detect Anomalies And Have a Maintenance Notice Before We Have Scrap Created on The Production Line. Describing the process flow from my previous articles (1, 2): Get...continue reading.
Introduction: We will identify anomalous units on the production line by using measurement data from testing stations and an Isolation Forest model. Anomalous products are not failures; anomalies are units close to...continue reading.
In this post, we will solve a simple problem (called “FizzBuzz”) that is asked by some employers in data scientist job interviews. The question seeks to ascertain the applicant’s familiarity...continue reading.
Introduction: We will identify anomalous products on the production line by using measurements from testing stations and deep learning models. Anomalous products are not failures; these anomalies are products close to...continue reading.
As I conduct some analysis for a content validation study, I wanted to quickly blog about a fun plot I discovered today: ggpairs, which displays scatterplots and correlations in a...continue reading.
In this post, we will analyze government data from the Flemish region in Belgium on A) official crime statistics and B) self-reported feelings of safety among residents of Flanders. We...continue reading.
Now that the US Government shutdown is over, it is time to download NOAA weather daily summaries in bulk and store them somewhere safe so that at the next shutdown...continue reading.