In this next installment of the Data Science Maturity Model (DSMM) dimension discussion, I focus on 'methodology':
What is the enterprise approach or methodology to data science?
The most frequently cited methodology for 'data mining' - a key element of data science - is CRISP-DM. However, the breadth and growth of data science may require expanding beyond the traditional phases introduced by CRISP-DM: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. Indeed, explicit feedback loops or expanded data awareness/access phases may prove valuable. In addition, enterprise-specific workflows involving data science project players and work products may be necessary to increase productivity and derived value.
The five maturity levels of the 'methodology' dimension are:
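To make the idea of explicit feedback loops concrete, here is a minimal Python sketch that models the six CRISP-DM phases named above together with a few backward transitions. The phase names come from CRISP-DM; the `FEEDBACK` edges and the `allowed_next` helper are my own illustrative assumptions, not part of any standard or library.

```python
from enum import Enum


class Phase(Enum):
    BUSINESS_UNDERSTANDING = "Business Understanding"
    DATA_UNDERSTANDING = "Data Understanding"
    DATA_PREPARATION = "Data Preparation"
    MODELING = "Modeling"
    EVALUATION = "Evaluation"
    DEPLOYMENT = "Deployment"


# Forward order of the phases, taken from the enum's definition order.
FORWARD_ORDER = list(Phase)

# Hypothetical feedback loops: the earlier phase a project may return to.
# These specific edges are assumptions chosen for illustration only.
FEEDBACK = {
    Phase.DATA_UNDERSTANDING: Phase.BUSINESS_UNDERSTANDING,
    Phase.MODELING: Phase.DATA_PREPARATION,
    Phase.EVALUATION: Phase.BUSINESS_UNDERSTANDING,
}


def allowed_next(phase):
    """Return the phases a project may move to from the given phase."""
    options = []
    index = FORWARD_ORDER.index(phase)
    # The next phase in the forward order, if there is one.
    if index + 1 < len(FORWARD_ORDER):
        options.append(FORWARD_ORDER[index + 1])
    # An explicit loop back to an earlier phase, if one is defined.
    if phase in FEEDBACK:
        options.append(FEEDBACK[phase])
    return options
```

An enterprise that wants an additional data awareness/access phase could add a member to `Phase` and extend `FEEDBACK` accordingly; the point of the sketch is only that the loops become explicit and checkable rather than implied.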
Level 1: Data analytics are focused on business intelligence and data visualization using an ad hoc methodology.
For Level 1 enterprises, data analysts and other players typically follow no established methodology, relying instead on their experience, skills, and preferences. The focus is on business intelligence and data visualization through dashboards and reports, and the work relies on traditional deductive query formulation.
Level 2: Data analytics are expanded to include machine learning and predictive analytics for solving business problems, but still using ad hoc methodology.
Like Level 1, Level 2 enterprises typically follow no established methodology, relying instead on player experience, skills, and preferences. However, enterprises at Level 2 supplement traditional roles such as data analysts who provide business intelligence and data visualization with data scientists who introduce more advanced data science techniques such as machine learning and predictive analytics. With the introduction of data scientists, there are implicit enhancements to the ad hoc data science methodology.
Level 3: Individual organizations begin to define and regularly apply a data science methodology.
Level 3 enterprises are in the experimental stage where individual organizations start to define their own methodological practices or leverage existing ones. Goals include increasing productivity, consistency, and repeatability of data science projects while controlling risk. Data science projects may or may not effectively track the performance of deployed models.
Level 4: Basic data science methodology best practices are established for data science projects.
Level 4 enterprises build on the progress from Level 3 by establishing methodology best practices throughout the enterprise. Such best practices are derived from organizational experimentation or adopted from an existing methodology. As a result of establishing best practices, the enterprise sees increased productivity, consistency, and repeatability of data science projects with reduced risk of failure.
Level 5: Data science methodology best practices are formalized across the enterprise.
Having established best practices for data science in Level 4, the Level 5 enterprise formalizes additional key aspects of data science projects, including project planning, requirements gathering / specification, and design, as well as implementation, deployment, and project assessment.
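As a rough self-assessment aid, the five levels above can be condensed into a small Python function. This is a deliberate simplification under my own assumptions: the four yes/no questions and the name `methodology_level` are hypothetical, and a real assessment would weigh far more evidence than four flags.

```python
def methodology_level(
    uses_ml_and_predictive_analytics,
    org_defined_methodology,
    enterprise_best_practices,
    formalized_project_lifecycle,
):
    """Map four yes/no answers to a 'methodology' maturity level (1-5).

    The checks run from the highest level down, so the function returns
    the highest level whose condition is satisfied.
    """
    # Level 5: best practices plus formalized planning, requirements,
    # design, implementation, deployment, and project assessment.
    if enterprise_best_practices and formalized_project_lifecycle:
        return 5
    # Level 4: best practices established throughout the enterprise.
    if enterprise_best_practices:
        return 4
    # Level 3: individual organizations define and apply a methodology.
    if org_defined_methodology:
        return 3
    # Level 2: machine learning and predictive analytics, still ad hoc.
    if uses_ml_and_predictive_analytics:
        return 2
    # Level 1: business intelligence and visualization, ad hoc.
    return 1


# Example: data scientists are on board, but no shared methodology yet.
print(methodology_level(True, False, False, False))
```

Checking from the top down keeps the levels cumulative, which matches how each level above builds on the one before it.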
In my next post, I'll cover the 'data awareness' dimension of the Data Science Maturity Model.