Beyond data integration: Why data movement is the future

Future uses of data will require an automated data movement platform, not just data pipelines as single point solutions
February 27, 2023

The rapid adoption of the cloud has radically changed the realities of using and handling data. As data sources such as applications, operational databases, file systems, data streams and more have proliferated, so have the tools, infrastructure and processes for handling data. These new approaches often enable automation and real-time turnaround. The upshot is that, beyond centralizing data to support decisions, there are now growing opportunities to monetize and operationalize data and incorporate it into innovative products. As companies grow more reliant on data, they will need data flowing across complex and heterogeneous environments. At the same time, this growing usage of data poses serious organizational, technological and regulatory complexities that must be comprehensively addressed.

In light of this new reality, “data integration” no longer adequately describes the activities that support end uses of data. Instead, a better model for thinking about how to handle data is “data movement.”

Data integration centralizes data in order to establish a single source of truth for analysis. It is fundamentally a unidirectional process, in which data flows from various sources to a central destination. By contrast, for operational or product-oriented uses of data, data must flow in many directions between many kinds of platforms.

An organization might replicate data across operational systems in different regions so that users can reach local servers with less latency. To prevent disruptions when failures occur, an organization might also build redundancy into its operations by replicating data in high volumes and in real time from production servers to failover instances.

For instance, Lufthansa Systems, the IT subsidiary of airline Lufthansa, bidirectionally replicates data between a central data repository and hundreds of distributed data repositories belonging to customers in order to continuously track flight schedules, payload and operational conditions in real time and optimize flight plans. Lufthansa’s airline customers also replicate data to create hot-standby databases to ensure high availability.

Other use cases for data movement require productionizing or activating data by moving data models back into applications and operational systems. Predictive models, for instance, depend on training, testing and validation sets being present in operational systems. But the most straightforward use case for data activation is simply to make data available in real time to people across an organization or to feed it into systems that automate business processes. South African multinational fast-casual dining chain Nando’s moves modeled user data, such as recent orders, from the data warehouse back into marketing tools so that its marketing team can readily identify loyal customers and make them special offers.
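To make the data activation pattern concrete, here is a minimal sketch in Python. It is an illustration only and does not describe how Nando’s or any particular vendor implements this. It assumes an in-memory SQLite database as a stand-in for the data warehouse, and the table name `recent_orders`, the functions `loyal_customers` and `build_sync_payload`, and the "loyal" segment label are all hypothetical.

```python
import sqlite3


def loyal_customers(conn, min_orders):
    """Return (customer_id, order_count) rows for customers with at least min_orders orders."""
    query = """
        SELECT customer_id, COUNT(*) AS order_count
        FROM recent_orders
        GROUP BY customer_id
        HAVING COUNT(*) >= ?
        ORDER BY order_count DESC, customer_id
    """
    return conn.execute(query, (min_orders,)).fetchall()


def build_sync_payload(rows):
    """Shape warehouse rows into records that a marketing tool could ingest (hypothetical format)."""
    return [
        {"customer_id": customer_id, "segment": "loyal", "recent_orders": count}
        for customer_id, count in rows
    ]


# Stand-in warehouse holding a hypothetical modeled table of recent orders.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE recent_orders (customer_id TEXT, order_id INTEGER)")
warehouse.executemany(
    "INSERT INTO recent_orders VALUES (?, ?)",
    [("c1", 1), ("c1", 2), ("c1", 3), ("c2", 4), ("c3", 5), ("c3", 6)],
)

# In a real pipeline this payload would be sent to the marketing tool; here it is only printed.
payload = build_sync_payload(loyal_customers(warehouse, min_orders=2))
print(payload)
```

The point of the sketch is the direction of flow: modeled data leaves the analytical store and lands in an operational tool, the reverse of the classic integration path.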

As organizations expand their usage of data, they must also assume certain obligations and responsibilities concerning data. In order to scale the use of data responsibly, organizations must use tools that can support data governance, security and extensibility.

Data governance is essential for enabling organizations to know, access and protect their data. Data governance features include easy integration with data catalogs, graphical exposure of data model lineage, metadata capture and other auditing tools. Programmatic management of a data platform via an API also enables an organization to govern its data more systematically. Without governance, organizations face the uncontrolled proliferation of data assets (models, dashboards, etc.), accompanied by bloated cloud expenses. They also risk either denying nearly all access to sensitive data or, worse, opening access completely, which invites serious misuse or unwanted exposure. In 2022 there were at least 4,100 data breaches, with non-compliance costing on average $15 million per organization, a figure that does not include revenue lost to reputational damage.

In a similar vein, security features are must-haves for ensuring regulatory compliance, managing brand risk, protecting internal operations and intellectual property, and ethically safeguarding customer information and other business-critical data as it is moved around. Common platform security features include flexible deployment and secure networking options, security compliance certifications for SaaS platforms, end-to-end encryption, data protection and process isolation.

Finally, extensibility features enable an organization to programmatically control a growing ecosystem of data management tools and embed data assets into products. As data needs grow in scale and complexity over time, organizations will need the ability to manage users at scale, integrate with other data operations technologies and construct custom processes and workflows that depend on data.

When it comes to the full range of possible uses for data, the classic analytics use case, supported by data integration, is just the tip of the iceberg. For your organization to be competitive and innovative, you will need to move data in real time in many directions to both analytical and operational platforms. Without such capabilities, your business sacrifices opportunities to innovate, along with the agility and responsiveness that dynamic, rapidly changing markets demand.

