12  The Data Life Cycle

Just like living things, data has a life cycle — a series of stages it passes through from creation to destruction.
Understanding this cycle helps data professionals manage information responsibly and effectively at every stage.

The six stages of the data life cycle are:

  1. Plan
  2. Capture
  3. Manage
  4. Analyze
  5. Archive
  6. Destroy

Let’s explore each stage in detail.


12.0.1 Plan

Planning happens before any data is collected or analyzed.
In this stage, organizations determine:

  • What kind of data is needed
  • How it will be managed throughout its life cycle
  • Who will be responsible for handling it
  • What the expected outcomes are

Example:

An electricity provider planning an energy-saving project might decide to collect:

  • Electricity usage per customer
  • Types of buildings being powered
  • Types of electrical devices used

They would also assign team members to handle data collection, storage, and sharing.
This structured planning ensures the rest of the project runs smoothly and efficiently.


12.0.2 Capture

The capture phase is where data is collected from various sources and brought into the organization.
With so much data generated every day, there are countless ways to gather it.

Common methods include:

  • External sources: Public datasets, APIs, or third-party databases.
    Example: A weather analysis project might use data from the National Centers for Environmental Information (NCEI), formerly the National Climatic Data Center (NCDC).
  • Internal sources: Company databases, reports, or transaction logs.

Definition:

A database is a collection of data stored electronically in a computer system.
For example, an electricity company might maintain a customer database tracking energy usage patterns.

During this phase, it’s essential to ensure data integrity, credibility, and privacy — especially when handling customer or sensitive information.
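As a minimal sketch of capturing data from an internal source, the snippet below builds a small in-memory customer database with Python's built-in sqlite3 module and reads the readings back out. The table name, column names, and sample rows are hypothetical and exist only for illustration; a real provider's schema would differ.

```python
import sqlite3

# Hypothetical schema and sample rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE energy_usage (customer_id TEXT, month TEXT, kwh REAL)"
)
sample_rows = [
    ("C001", "2024-01", 420.5),
    ("C001", "2024-02", 398.0),
    ("C002", "2024-01", 910.2),
]
conn.executemany("INSERT INTO energy_usage VALUES (?, ?, ?)", sample_rows)
conn.commit()

# Capture: pull the stored readings into the analysis environment.
captured = conn.execute(
    "SELECT customer_id, month, kwh FROM energy_usage "
    "ORDER BY customer_id, month"
).fetchall()
conn.close()

print(captured)
```

The query uses only the columns needed for the project, which is one simple way to limit exposure of sensitive customer information at capture time.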


12.0.3 Manage

In the manage phase, organizations focus on how data is stored, protected, and maintained.
This involves:

  • Implementing security protocols
  • Using data management tools and systems
  • Ensuring data accuracy and consistency
  • Maintaining compliance with privacy laws and regulations

This stage is closely connected to data cleaning — the process of correcting or removing inaccurate, incomplete, or irrelevant data to ensure reliability.

Proper management helps keep data trustworthy and ready for analysis.
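The data cleaning step described above can be sketched in a few lines of Python. The records and the cleaning rules here are assumptions chosen for illustration: a missing reading counts as incomplete, a negative reading counts as inaccurate, and an exact repeat counts as a duplicate. Real projects define these rules during planning.

```python
# Hypothetical raw readings; None marks a missing value.
raw_readings = [
    {"customer_id": "C001", "kwh": 420.5},
    {"customer_id": "C002", "kwh": None},    # incomplete
    {"customer_id": "C003", "kwh": -15.0},   # assumed invalid: negative usage
    {"customer_id": "C001", "kwh": 420.5},   # exact duplicate
    {"customer_id": "C004", "kwh": 610.0},
]


def clean(readings):
    """Drop incomplete, invalid, and duplicate readings."""
    seen = set()
    cleaned = []
    for reading in readings:
        if reading["kwh"] is None or reading["kwh"] < 0:
            continue  # skip incomplete or invalid values
        key = (reading["customer_id"], reading["kwh"])
        if key in seen:
            continue  # skip records already kept
        seen.add(key)
        cleaned.append(reading)
    return cleaned


cleaned_readings = clean(raw_readings)
print(cleaned_readings)
```

This sketch removes bad records rather than correcting them. Correction, such as re-requesting a missing meter reading, usually depends on going back to the source.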


12.0.4 Analyze

The analyze phase is where data analysts shine.
Here, the collected and cleaned data is examined to:

  • Identify patterns and relationships
  • Draw conclusions
  • Support business goals
  • Drive decision-making

Example:

An electricity company might analyze its usage data to:

  • Identify customers who consume excessive power
  • Recommend ways to reduce energy use
  • Create data-driven programs that promote efficiency

This phase transforms raw data into actionable insights that inform strategic decisions.
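To make the electricity example concrete, here is a small sketch that flags customers whose average monthly usage is well above what is typical. The customer IDs, readings, and the 1.5x threshold are all hypothetical; what counts as "excessive" is a business decision, not something the data decides on its own.

```python
from statistics import median

# Hypothetical monthly usage in kWh per customer.
monthly_kwh = {
    "C001": [420, 398, 405],
    "C002": [910, 950, 1020],
    "C003": [310, 295, 330],
    "C004": [450, 470, 440],
}

# Average usage per customer.
averages = {cid: sum(values) / len(values) for cid, values in monthly_kwh.items()}

# Treat the median customer average as "typical" usage.
typical = median(averages.values())

# Assumed rule: flag anyone above 1.5 times the typical average.
THRESHOLD = 1.5
high_users = sorted(
    cid for cid, avg in averages.items() if avg > THRESHOLD * typical
)

print(high_users)
```

The median is used instead of the mean so that one very large consumer does not raise the baseline it is being compared against.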


12.0.5 Archive

In the archive stage, data that is no longer actively used is stored securely for future reference.
Archiving helps organizations retain valuable information without cluttering active databases.

Key Points:

  • Archived data remains accessible but inactive.
  • It saves storage space and improves system performance.
  • It ensures compliance with data retention policies.

Example:

Once the electricity provider completes its yearly analysis, it can archive old consumption data while keeping current data active for new studies.

Archiving ensures that historical data is preserved without interfering with ongoing analytical work.
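The split between active and archived data can be sketched as a simple partition by year. The records and the cutoff year below are hypothetical, and the sketch keeps both sets in memory. In practice, the archived set would typically be written to separate, lower-cost storage.

```python
# Hypothetical yearly consumption records.
records = [
    {"customer_id": "C001", "year": 2023, "kwh": 4900},
    {"customer_id": "C001", "year": 2024, "kwh": 5100},
    {"customer_id": "C002", "year": 2023, "kwh": 11200},
    {"customer_id": "C002", "year": 2024, "kwh": 10800},
]

CURRENT_YEAR = 2024  # assumed cutoff for "current" data

# Keep current-year records active; move everything older to the archive.
active = [r for r in records if r["year"] >= CURRENT_YEAR]
archive = [r for r in records if r["year"] < CURRENT_YEAR]

print(len(active), "active records,", len(archive), "archived records")
```

No record is discarded here: every record ends up in exactly one of the two sets, which is the difference between archiving and destroying.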


12.0.6 Destroy

The final stage of the data life cycle is destroying data that is no longer needed.
When data becomes irrelevant or reaches the end of its retention period, it must be securely erased or disposed of.

Common destruction methods:

  • Digital data: Secure data-erasure software or disk wiping.
  • Physical data: Shredding paper documents or destroying hard drives.

Example:

The electricity provider might use secure erasure software to wipe customer data from old hard drives and shred paper files to protect privacy.

Secure data destruction is vital for maintaining confidentiality and trust.
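Before anything is destroyed, an organization has to know which data has passed its retention period. The sketch below shows only that check, using a hypothetical seven-year policy and made-up file names. It does not erase anything; secure erasure itself requires dedicated tools and is outside the scope of this example.

```python
from datetime import date

# Hypothetical policy: keep archived files for roughly seven years.
# A year is approximated as 365 days to keep the arithmetic simple.
RETENTION_DAYS = 7 * 365

# Hypothetical archive inventory.
archived_files = [
    {"file": "usage_2015.csv", "archived_on": date(2016, 1, 15)},
    {"file": "usage_2021.csv", "archived_on": date(2022, 1, 15)},
]


def past_retention(archived_on, today, retention_days=RETENTION_DAYS):
    """Return True when a file has been held longer than the retention period."""
    return (today - archived_on).days > retention_days


today = date(2025, 6, 1)  # fixed date so the example is reproducible
to_destroy = [
    f["file"] for f in archived_files if past_retention(f["archived_on"], today)
]

print(to_destroy)
```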


12.0.7 Summary: The Six Stages of the Data Life Cycle

Stage   | Description                                           | Example
--------|-------------------------------------------------------|-------------------------------------------------
Plan    | Decide what data is needed and how it will be managed | Electricity provider plans to collect usage data
Capture | Collect data from internal or external sources        | Gather data from customer meters
Manage  | Store, protect, and maintain data integrity           | Securely store customer records
Analyze | Use data to solve problems and make decisions         | Find patterns in electricity usage
Archive | Store data not currently in use                       | Keep last year’s data for future comparison
Destroy | Securely delete data no longer needed                 | Erase old files to protect privacy

12.0.8 Key Takeaway

The data life cycle provides a roadmap for how data should be handled responsibly —
from initial planning to final disposal.

By understanding each stage (Plan, Capture, Manage, Analyze, Archive, and Destroy),
data analysts can ensure that data remains useful, secure, and ethical throughout its life.


12.1 Variations of the Data Life Cycle

You have learned that there are six stages to the data life cycle.
Here’s a quick recap:

  • Plan: Decide what kind of data is needed, how it will be managed, and who will be responsible for it.
  • Capture: Collect or bring in data from a variety of different sources.
  • Manage: Care for and maintain the data. This includes determining how and where it is stored and the tools used to do so.
  • Analyze: Use the data to solve problems, make decisions, and support business goals.
  • Archive: Keep relevant data stored for long-term, future reference.
  • Destroy: Remove data from storage and delete any shared copies of the data.

Note:
Be careful not to confuse the six stages of the data life cycle (plan, capture, manage, analyze, archive, destroy)
with the six phases of the data analysis process (ask, prepare, process, analyze, share, act).
These frameworks are related but not interchangeable.


12.1.1 How Data Life Cycles Vary Across Sectors

The data life cycle provides a general framework for how data is managed.
However, organizations in different industries often adapt it to meet their unique needs.
Below are examples of how various sectors — government, finance, and education — interpret and implement the data life cycle.


12.1.2 U.S. Fish and Wildlife Service

The U.S. Fish and Wildlife Service uses the following six-stage model:

  1. Plan
  2. Acquire
  3. Maintain
  4. Access
  5. Evaluate
  6. Archive

This version emphasizes data evaluation and long-term access — both critical for environmental monitoring and conservation efforts.

For more details, refer to the U.S. Fish and Wildlife Service’s Data Management Life Cycle page.


12.1.3 U.S. Geological Survey (USGS)

The USGS follows a slightly different approach, focusing on data integrity and accessibility throughout research projects.

USGS Data Life Cycle:

  1. Plan
  2. Acquire
  3. Process
  4. Analyze
  5. Preserve
  6. Publish/Share

In addition, USGS performs several cross-cutting activities during every stage:

  • Describe: Create metadata and documentation
  • Manage Quality: Ensure data accuracy and consistency
  • Backup and Secure: Protect data integrity and availability

For more information, refer to the USGS Data Lifecycle page.


12.1.4 Financial Institutions

Financial organizations, which prioritize compliance, security, and efficiency, tend to use a life cycle focused on data validation and reporting.

According to the article “The Data Life Cycle” in Strategic Finance magazine, the financial sector often follows this model:

  1. Capture
  2. Qualify
  3. Transform
  4. Utilize
  5. Report
  6. Archive
  7. Purge

This version highlights qualifying (verifying data accuracy before use) and purging (securely deleting outdated data), which are vital in financial risk management and compliance.


12.1.5 Harvard Business School (HBS)

A research-driven model from Harvard Business School includes eight stages that reflect the academic focus on exploration, insight, and knowledge creation.

HBS Data Life Cycle:

  1. Generation
  2. Collection
  3. Processing
  4. Storage
  5. Management
  6. Analysis
  7. Visualization
  8. Interpretation

This model extends beyond management to emphasize data storytelling — turning analysis into insights through visualization and interpretation.

For more information, refer to the article “8 Steps in the Data Life Cycle.”


12.1.6 Key Takeaways

  • While the core principles of data management remain the same, industries adapt the life cycle to suit their specific goals.
  • Historical data is crucial for organizations like the U.S. Fish and Wildlife Service and USGS, so their models emphasize archiving and data preservation.
  • Harvard Business School’s model includes visualization and interpretation, which align more closely with research and teaching goals.
  • Financial institutions explicitly include purging to ensure regulatory compliance and security.

Universal Principle:
No matter the variation, one rule applies everywhere — govern data responsibly.
Data must remain accurate, secure, and accessible to meet the organization’s operational and ethical standards.
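The variations covered in this section can also be compared side by side in code. The sketch below records each model's stages exactly as listed above and checks which models name an explicit disposal stage. Treating "Destroy" and "Purge" as the disposal stages is an assumption made for this comparison.

```python
# Stage lists as given in this chapter.
life_cycles = {
    "Six-stage model": [
        "Plan", "Capture", "Manage", "Analyze", "Archive", "Destroy",
    ],
    "U.S. Fish and Wildlife Service": [
        "Plan", "Acquire", "Maintain", "Access", "Evaluate", "Archive",
    ],
    "USGS": [
        "Plan", "Acquire", "Process", "Analyze", "Preserve", "Publish/Share",
    ],
    "Financial institutions": [
        "Capture", "Qualify", "Transform", "Utilize", "Report", "Archive", "Purge",
    ],
    "Harvard Business School": [
        "Generation", "Collection", "Processing", "Storage",
        "Management", "Analysis", "Visualization", "Interpretation",
    ],
}

# Assumption: these two stage names are the ones that mean "dispose of data".
DISPOSAL_STAGES = {"Destroy", "Purge"}

stage_counts = {name: len(stages) for name, stages in life_cycles.items()}
with_disposal = [
    name for name, stages in life_cycles.items()
    if DISPOSAL_STAGES & set(stages)
]

for name, count in stage_counts.items():
    print(f"{name}: {count} stages")
print("Explicit disposal stage:", with_disposal)
```

A model that omits a disposal stage does not necessarily skip disposal in practice. The comparison only shows what each published model chooses to name.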