A reader of my previous post (https://statcompute.wordpress.com/2018/09/03/playing-map-and-reduce-in-r-by-group-calculation), Mr. Wayne Zhang, made a good comment: “Why not use directly either Spark or H2O to derive such computations without involving...continue reading.
Introduction: Market basket analysis, or association rule mining, can be a very useful technique for gaining insight into transactional data sets, and it can also be useful for product recommendation. The...continue reading.
Evaluation metrics play a critical role in the machine learning ecosystem. Especially for machine learning products, evaluation metrics are like a heartbeat. They show how healthy the model is and...continue reading.
Consider the following two Spark dataframes:
df1.show()
+----+------+-------+
|id_a|time_a|value_a|
+----+------+-------+
|   1|     1|     CA|
|   1|     2|     CA|
|   2|     1|     TX|
|   3|     5|     NE|
|   4|     6|     WA|
+----+------+-------+
df2.show(…continue reading.