3 Origins of the Data Analysis Process
3.1 Introduction
Data analysis has a long and fascinating history that reflects humanity’s curiosity and desire to understand the world through information.
While no one knows exactly when the first data was recorded, the practice of collecting and analyzing information has existed for thousands of years.
3.2 Historical Foundations
Data analysis is deeply rooted in statistics, which dates back to ancient Egypt.
- The Egyptians used statistical methods during the construction of the pyramids.
- They documented calculations and theories on papyri, which played a role comparable to modern spreadsheets and checklists.
- These early record-keeping practices laid the groundwork for organized data management — a foundation upon which modern data analytics is built.
Today’s data analysts owe much to these early innovators who transformed information into structured systems for decision-making.
3.3 The Modern Data Analysis Process
Data analysis is now a structured discipline that moves systematically from data to decision.
While there is no single universal process, most models share the same core activities: asking questions, preparing data, analyzing it, and acting on the results.
The Google Data Analytics Process, which forms the foundation of this program, includes six key steps:
- Ask – Define the business challenge, objective, or question.
- Prepare – Generate, collect, and manage data.
- Process – Clean the data and verify its integrity.
- Analyze – Explore, visualize, and interpret data.
- Share – Communicate findings clearly and effectively.
- Act – Apply insights to solve problems and drive decisions.
This model provides a practical, repeatable approach for turning data into actionable insights.
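As a rough illustration, the six steps can be sketched as one small function per phase. This is a minimal sketch, not part of the Google framework itself: the sales records, the question, and every function name below are hypothetical and chosen only to show how the output of one phase feeds the next.

```python
from statistics import mean

# Hypothetical raw records: (region, monthly sales as text).
# All values are invented for illustration.
RAW_SALES = [
    ("north", "120"),
    ("south", "95"),
    ("north", None),   # missing value, dropped in the Process step
    ("south", "105"),
    ("north", "130"),
]


def ask():
    """Ask: state the question the analysis should answer."""
    return "Which region has the higher average monthly sales?"


def prepare():
    """Prepare: collect the data (here, a copy of the in-memory list)."""
    return list(RAW_SALES)


def process(records):
    """Process: drop incomplete rows and convert sales to numbers."""
    return [(region, float(value)) for region, value in records if value is not None]


def analyze(clean_records):
    """Analyze: compute the average sales per region."""
    by_region = {}
    for region, value in clean_records:
        by_region.setdefault(region, []).append(value)
    return {region: mean(values) for region, values in by_region.items()}


def share(question, averages):
    """Share: turn the result into a short, readable report."""
    lines = [question]
    for region, avg in sorted(averages.items()):
        lines.append(f"  {region}: {avg:.1f}")
    return "\n".join(lines)


def act(averages):
    """Act: choose a follow-up step based on the finding."""
    best = max(averages, key=averages.get)
    return f"Review what the {best} region does differently."


question = ask()
averages = analyze(process(prepare()))
report = share(question, averages)
decision = act(averages)
print(report)
print(decision)
```

In a real project each phase would involve far more work (and usually other tools), but the hand-off from one phase to the next follows the same shape.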
3.4 Other Models of the Data Analysis Process
3.4.1 EMC’s Data Analytics Process
Developed by David Dietrich at Dell EMC, this model highlights the cyclical nature of analytics projects:
- Discovery
- Pre-processing data
- Model planning
- Model building
- Communicate results
- Operationalize
Key ideas:
- Steps are interconnected and repeat continuously.
- Each phase prepares the groundwork for the next.
- Analysts use checkpoints to ensure data is ready before moving forward.
📘 Reference: Data Science & Big Data Analytics (free access available as PDF).
3.4.2 SAS Iterative Data Analysis Process
Created by SAS, a leading analytics company, this seven-step cyclical process focuses on creating repeatable and predictive results:
- Ask
- Prepare
- Explore
- Model
- Implement
- Act
- Evaluate
Key feature:
This process is shaped like an infinity loop, emphasizing continuous improvement.
The final step, Evaluate, encourages analysts to assess outcomes and, if necessary, return to the Ask phase.
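The loop from Evaluate back to Ask can be sketched in a few lines of Python. This is only an illustration of the control flow, not anything published by SAS: the `run_cycles` function, the `outcome_is_satisfactory` callback, and the `max_cycles` limit are all hypothetical names introduced here.

```python
SAS_STEPS = ["Ask", "Prepare", "Explore", "Model", "Implement", "Act", "Evaluate"]


def run_cycles(outcome_is_satisfactory, max_cycles=5):
    """Walk the seven steps repeatedly until Evaluate reports a satisfactory outcome.

    `outcome_is_satisfactory` is a hypothetical callback: it receives the
    cycle number and returns True when no further pass is needed.
    """
    history = []
    for cycle in range(1, max_cycles + 1):
        for step in SAS_STEPS:
            history.append((cycle, step))
        # Evaluate is the last step; it decides whether to return to Ask.
        if outcome_is_satisfactory(cycle):
            break
    return history


# Assume, for illustration, that the second pass produces an acceptable outcome.
history = run_cycles(lambda cycle: cycle >= 2)
print(len(history))             # 14: two full passes of seven steps
print(history[6], history[7])   # (1, 'Evaluate') (2, 'Ask')
```

The `max_cycles` cap is a design choice for the sketch: it keeps the loop from running forever when the evaluation never succeeds.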
3.4.3 Project-Based Data Analytics Process
Proposed by Vignesh Prajapati, this streamlined five-step process focuses on project execution:
- Identifying the problem
- Designing data requirements
- Pre-processing data
- Performing data analysis
- Visualizing data
Key feature:
This model omits the “Act” phase but still follows the core logic — defining a problem, preparing data, analyzing it, and presenting findings visually.
📘 Reference: Understanding the Data Analytics Project Life Cycle.
3.4.4 Big Data Analytics Process
Developed by Thomas Erl, Wajid Khattak, and Paul Buhler, this model expands on earlier frameworks to address large-scale data systems:
- Business case evaluation
- Data identification
- Data acquisition and filtering
- Data extraction
- Data validation and cleaning
- Data aggregation and representation
- Data analysis
- Data visualization
- Utilization of analysis results
Key feature:
This process breaks the Prepare and Process stages of the Google model into finer-grained steps, reflecting the complexity of managing big data pipelines.
📘 Reference: Big Data Fundamentals: Concepts, Drivers & Techniques.
3.5 Key Takeaways
- Ancient origins: Data analysis began with early civilizations like ancient Egypt, emphasizing record-keeping and organization.
- Shared foundations: Every model—whether from Google, EMC, SAS, or others—follows similar core steps:
- Ask → Prepare → Process → Analyze → Share → Act
- Cyclical nature: Data analysis is iterative — insights lead to new questions and new analyses.
- Diverse models, same goal: Each variation tailors the process to different contexts (e.g., big data, project-based work) but aims to turn data into informed action.
- Modern relevance: Understanding multiple frameworks helps analysts choose the best process for their organization’s goals and data environment.
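To compare the frameworks side by side, the step lists from this chapter can be written down as plain data. The sketch below only restates the steps listed above; the dictionary keys and variable names are shorthand chosen here, not official names.

```python
# Step lists copied from the sections above; keys are informal labels.
MODELS = {
    "Google": ["Ask", "Prepare", "Process", "Analyze", "Share", "Act"],
    "EMC": [
        "Discovery", "Pre-processing data", "Model planning",
        "Model building", "Communicate results", "Operationalize",
    ],
    "SAS": ["Ask", "Prepare", "Explore", "Model", "Implement", "Act", "Evaluate"],
    "Project-based": [
        "Identifying the problem", "Designing data requirements",
        "Pre-processing data", "Performing data analysis", "Visualizing data",
    ],
    "Big data": [
        "Business case evaluation", "Data identification",
        "Data acquisition and filtering", "Data extraction",
        "Data validation and cleaning", "Data aggregation and representation",
        "Data analysis", "Data visualization", "Utilization of analysis results",
    ],
}

# Number of steps in each model.
step_counts = {name: len(steps) for name, steps in MODELS.items()}

# Step names that the Google and SAS models share word for word.
shared_google_sas = set(MODELS["Google"]) & set(MODELS["SAS"])

for name, steps in MODELS.items():
    print(f"{name}: {len(steps)} steps, from '{steps[0]}' to '{steps[-1]}'")
print(sorted(shared_google_sas))
```

Matching on exact step names understates the overlap, since models often use different words for the same activity (for example, "Pre-processing data" versus "Process").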
3.6 Summary
From ancient pyramid builders to modern data professionals, the essence of data analysis remains the same:
Collect, understand, and act on data to make better decisions.
Whether you follow the Google Data Analytics framework or another model, the goal is consistent — using data effectively to generate insights, drive innovation, and create meaningful impact.