Data Mining OCR PDFs — Using pdftabextract to liberate tabular data from scanned documents
During the last months I often had to deal with the problem of extracting tabular data from scanned documents. These ...continue reading.
Yesterday, I read ‘Measurement error and the replication crisis’ by Eric Loken and Andrew Gelman, which left me puzzled. The first part of the paper consists of general statements about...continue reading.
Lately I have noticed a sharp increase in my coffee consumption (reading Howard Schultz’s Starbucks book, which is actually quite good by the way, does not help either). Having recently...continue reading.
Couldn’t make it to Orlando in January? We’re excited to bring you the next best thing. Whether you missed the conference, missed a talk, or just want to refresh your...continue reading.
This week, the 30th issue of my Demographic Digest was published. Demographic Digest is a project I started in November 2015. Twice a month I select fresh demographic papers and...continue reading.
“MODIStsp” is an R package allowing automatic download and preprocessing of MODIS Land Products time series, available on GitHub at https://github.com/lbusett/MODIStsp (see also here for additional information). v1.3.1 adds...continue reading.
If big data is your thing, you use R, and you’re headed to Strata + Hadoop World in San Jose March 13 & 14th, you can experience in person how...continue reading.
In the process of designing my latest experiment in PsychoPy I realised that setting up the serial port connection is not the most obvious thing to do. I wrote this...continue reading.
What happens when you combine Pokemon with Neo4j? I’m a huge Pokemon fan. So, when I found out about this awesome post from Joshua Kunst, I just couldn’t wait to throw...continue reading.
We have just added weighted log-rank tests to the survminer package, thanks to survMisc::comp. What are they and why are they useful? Read this blog post to find out. I...continue reading.
We’re excited to announce the latest release of RStudio Connect: version 1.4.2. This release includes a number of notable features including an overhauled interface for parameterized R Markdown reports. Enhanced...continue reading.
We will use the Document-Term Matrix that results from vocabulary-based vectorization to train the model for Twitter sentiment analysis. Recently I’ve worked with word2vec and doc2vec algorithms that I found interesting...continue reading.
I’m excited to announce the release of my new e-book: Introduction to Empirical Bayes: Examples from Baseball Statistics, available here. This book is adapted from a series of ten posts...continue reading.
Note: Cross-posted with the Stack Overflow blog. Check out the code for this analysis on Kaggle. For me, the weekends are mostly about spending time with my family, reading for...continue reading.
R Tutorial: Visualizing multivariate relationships in Large Datasets. A tutorial by D.M. Wiig. In two previous blog posts I discussed some techniques for visualizing relationships involving two or three variables...continue reading.
On Thursday (12.01.2017) we had a chance to attend the first TriCity R Users Group (Pomerania, Poland) meeting. The meetup was unexpectedly successful! The success can be measured in...continue reading.
Russia is sadly notorious for its ridiculously high adult male mortality. According to Human Mortality Database data (2010), the probability for a Russian man to survive from 20 to 60...continue reading.
Sex ratios reflect the two basic regularities of human demographics: 1) there are always more boys being born; 2) males experience higher mortality throughout their life-course. The sex ratio at...continue reading.
I was working on expanding the disease transmission model from my first post, but with current events, I felt compelled to work on something different. The new United States presidential administration formulated an...continue reading.