In the previous post (https://statcompute.wordpress.com/2015/12/27/import-csv-by-chunk-simultaneously-with-ipython-parallel), we showed how to implement parallelism with the IPython parallel package. However, in that specific case, we were not able to observe the efficiency gain...continue reading.
Monthly Archive: December 2015
IPython Parallel is a friendly interface for implementing parallelism in Python. Below is an example showing how to import a CSV file into a pandas DataFrame by chunk simultaneously...continue reading.
I work with R on both Mac OS and Windows. On Windows, you get the option to copy the path of a file or folder by holding Shift while right-clicking...continue reading.
In my previous blog post, I analyzed my Twitter archive and explored some aspects of my tweeting behavior. When do I tweet, how much do I retweet people, do I use...continue reading.
I’ve got a NetAtmo weather station. One can download the measurements from its web interface as a CSV file. I wanted to give time series analysis with the extraction of...continue reading.
Evaluation metrics play a critical role in the machine learning ecosystem. Especially for machine learning products, evaluation metrics are like heartbeats. They show how healthy the model is and...continue reading.
The matrixStats package provides highly optimized functions for computing common summaries over rows and columns of matrices. In a previous blog post, I showed that, instead of using apply(X, MARGIN...continue reading.
In : import statsmodels.datasets as datasets In : import sklearn.metrics as metrics In : from numpy import log In : from pyearth import Earth as earth In : boston =...continue reading.
In the operational loss calculation, it is important to use the CPI (Consumer Price Index) to adjust historical losses. Below is an example showing how to download CPI data online directly from...continue reading.
When conducting Cohort Analysis, one of the most important measures is Customer Retention Rate. I will share a few ideas for visualizing this parameter...continue reading.
In : # LOAD PACKAGES In : import pandas as pd In : import numpy as np In : from sklearn import preprocessing as pp In : from sklearn import...continue reading.
A new package has hit the CRAN shelves this week. While knitr is one of the most useful R packages in existence, ezknitr is a simple extension to it that...continue reading.
Poisson and Negative Binomial regressions are two popular approaches to modeling frequency measures in operational loss and can be implemented in Python with the statsmodels package as below: Although...continue reading.