The data mesh is a thoughtful decentralized approach that facilitates the creation of domain-driven, self-service data products.
Data mesh, including data mesh governance, requires the right mix of process, tooling, and internal resources to be effective.
Much as software engineering teams transitioned from monolithic applications to microservice architectures, data teams can move from a monolithic data platform to a data mesh, which is, in many ways, the data platform version of microservices.
As first defined by Zhamak Dehghani in 2019, a data mesh is a decentralized approach that embraces the ubiquity of data in the enterprise by leveraging a domain-oriented, self-serve design.
Self-service functionality: A data mesh abstracts away technical complexity so users can focus on self-serving their individual data use cases, backed by a central platform that includes the data pipeline engines, storage, and streaming infrastructure.
Interoperability and standardization: Underlying each domain is a universal set of data standards that helps facilitate collaboration between domains with shared data, including formatting, data mesh governance, discoverability, and metadata fields, among other data features.
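As a rough illustration of what a universal set of metadata standards could look like in practice, here is a minimal Python sketch. The required field names, the `DataProduct` class, and the `missing_metadata` helper are all hypothetical and chosen only for this example; a real mesh would define its own standard and most likely enforce it in platform tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical shared standard: metadata every domain must declare for a data product.
REQUIRED_METADATA = ("owner", "domain", "description", "schema_version")


@dataclass
class DataProduct:
    """Illustrative stand-in for a domain-owned data product."""
    name: str
    metadata: dict = field(default_factory=dict)


def missing_metadata(product: DataProduct) -> list[str]:
    """Return the required metadata fields that are absent or empty."""
    return [key for key in REQUIRED_METADATA if not product.metadata.get(key)]


# Example: a sales-domain product that has not declared a schema version yet.
orders = DataProduct(
    name="orders_daily",
    metadata={
        "owner": "sales-data@example.com",
        "domain": "sales",
        "description": "Daily order totals",
    },
)

print(missing_metadata(orders))
```

A check like this could run whenever a domain team registers or updates a product, so that shared standards are enforced by the platform rather than by convention.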
As data mesh theory has made its way through the hype cycle, it's become clear that the use case for a data mesh is far narrower than the concept initially suggested.
As fantastically flexible as data mesh is (and we really do love it), below are a few times when a data mesh probably doesn't make sense.
While the primary objective of the data mesh is to federate product ownership across domains, that only works if the domain team in question knows what to do with those data responsibilities once they get them.
Even with all its abstracted technical complexity, a data mesh still requires enough data talent embedded within each domain to make it work.
Without that experience at the helm, your data mesh will be plagued with low-quality, poorly maintained data products that will eventually need to be rebuilt anyway.
Before your data team jumps head-first into a data mesh project, take a minute to consider the context of your organization.
Another time a data mesh might not make sense is when your data products overlap across business domains.
While the idea of a data steward has fallen out of vogue somewhat over the years, this is a great example of augmenting data mesh for a given use case.
First, building a data mesh is expensive, not just in budget but in the critical engineering time of the data team responsible for facilitating the change.
What's more, it's not uncommon for larger organizations that have democratized data ownership to complain that they've become too decentralized. Their democratization has actually created new silos that make it difficult to unify and leverage data across the organization, which is one of the key capabilities a data mesh is intended to deliver.
The primary reason that a data mesh enables centralized data teams to release control of their data products is that they still control the infrastructure that supports them.
That means that for platform teams to effectively regulate a data mesh and enable data to be shared across teams, each domain needs to operate on a single platform with standardized tooling and data mesh governance practices.
Tools like data lineage and data observability can help data leaders understand consumption patterns across their organizations and help them transition toward a more decentralized structure.
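To make the idea of reading consumption patterns from lineage more concrete, the following is a minimal Python sketch. It assumes lineage is already available as simple (data product, owning domain, consuming domain) tuples; that format, the sample data, and the `cross_domain_consumers` helper are hypothetical and not the output of any particular lineage or observability tool.

```python
from collections import defaultdict

# Hypothetical lineage edges: (data product, owning domain, consuming domain).
LINEAGE_EDGES = [
    ("orders_daily", "sales", "finance"),
    ("orders_daily", "sales", "marketing"),
    ("orders_daily", "sales", "sales"),
    ("campaign_spend", "marketing", "finance"),
]


def cross_domain_consumers(edges):
    """Map each data product to the set of domains, other than its owner, that read it."""
    consumers = defaultdict(set)
    for product, owner, consumer in edges:
        if consumer != owner:
            consumers[product].add(consumer)
    return dict(consumers)


usage = cross_domain_consumers(LINEAGE_EDGES)
for product, domains in sorted(usage.items()):
    print(product, sorted(domains))
```

Products read by several other domains are the ones where ownership and shared standards matter most, so a summary like this may help decide which products to decentralize first and which overlap too heavily to hand to a single domain.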
The best thing you can do for your data architecture, whether you choose to democratize or centralize, is to support your data products with high-quality and reliable data.
This Cyber News was published on feeds.dzone.com. Publication date: Tue, 12 Mar 2024 00:13:07 +0000