Chunk Averaging of GLM

by statcompute · July 8, 2019

This article is originally published at https://statcompute.wordpress.com

Chunk Average (CA) is an interesting concept proposed by Matloff in the chapter 13 of his book “Parallel Computing for Data Science”. The basic idea is to partition the entire model estimation sample into chunks and then to estimate a glm for each chunk. Under the i.i.d assumption, the CA estimator with the chunked data is asymptotically equivalent to the estimator with the full data. The possibility of converting the full model estimation with an excessively large dataset to the chunked estimation with small pieces is particularly attractive in real-world model developments where the model convergence could be challenging given the data size.

The ca_glm() function below is my attempt to implement the Chunk Averaging of GLM. As shown, CA estimations by various chunks are consistent with the estimation with the full data.

Thanks for visiting r-craft.org
This article is originally published at https://statcompute.wordpress.com
Please visit source website for post related comments.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Chunk Averaging of GLM

You may also like...

Categories

Chunk Averaging of GLM

You may also like...

partial rankings and aggregate ranks

Customer segmentation – LifeCycle Grids, CLV and CAC with R

R/Medicine Conference 2021 Sponsored By Appsilon

Categories