“Every time I touch a software product, it brings me to data, as it should,” says JGP.
However, many businesses still fail to treat data properly and, as JGP emphasizes, with due respect. According to him, the most common problems leading to poor data treatment are:
JGP is firmly convinced that data contracts are an integral part of the solution to each of these problems.
This term has gained recognition since the publication of Andrew Jones’ book Driving Data Quality with Data Contracts: A Comprehensive Guide to Building Reliable, Trusted, and Effective Data Platforms. However, while the term “data contract” may be relatively new, the concept behind it has been with us for many years. JGP first encountered it in the late 1990s, when he was working on a code generator that used database schemas to generate code. The need to enrich the schema brought him to the data contract. Since then, JGP has encountered data contracts countless times, long before Andrew Jones published his book and the term “data contracts” started to spike in Google searches.
JGP defines a data contract as a link between a data producer and one or many data consumers. It is also a link between the logical view of data and its physical implementation. JGP emphasizes that data contracts, when implemented correctly, lead companies to a shared understanding of data as a product, one that deserves a certain level of care and contributions from the entire team of stakeholders involved with this data. A well-prepared data contract, a document that defines how data is exchanged between different parties, lays the foundation for Data Mesh and enables Agile in the data world. JGP sees data contracts as a way for stakeholders with different levels of data awareness to contribute to data and data treatment practices in iterative cycles, supporting the continuous development of data rules and policies.
While a data contract and an SLA (service-level agreement) pursue common goals, the two concepts should not be confused. According to JGP, a data contract is a much broader concept than an SLA. A data contract defines:
According to JGP, one of the most valuable benefits of a data contract is that it makes data and data governance rules easier to discover. Typically, data engineers document this information in wiki tools such as Confluence, without involving other stakeholders. This leads to a somewhat static approach to data, and any attempt to contribute to data policies becomes more challenging and time-consuming. Data rules and related entries can be hard to find in Confluence, especially if the information is not organized in a structured way. A decent data contract with clear structure and formatting gathers all the information on data and all data policies in one place and creates a feedback loop as people keep adding information to the contract. This builds a culture of data awareness within the company and makes data policies more complete.
JGP argues that the technical part of implementing a data contract is quite easy. The main problems are people and their resistance to change. Building a data contract is much faster than creating a Confluence page with loads of information copied and pasted from multiple sources. A data contract can be pre-generated with proper tooling so that stakeholders only need to fill in the corresponding fields and organize the most critical data policies in a well-structured manner.
However, JGP has faced many challenges with stakeholders unwilling to do this. According to him, the main reasons behind such a situation are:
In JGP’s experience, it takes time for people to understand the meaning and the value of data contracts. He does not blame data engineers (or anyone else). Instead, he highlights the need for team leaders and middle management to give data engineers time, and to encourage and reward those who contribute to the new way of thinking.
Lowering the barrier to entry for creating and completing data contracts is an excellent way to deal with people’s resistance to change and lack of motivation. Here, automation becomes a must.
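As a rough illustration of what such automation could look like, the sketch below pre-generates a contract skeleton from a table schema and reports which entries stakeholders still have to complete. It is only a minimal sketch: the function names, the `TODO` marker, and the field names (`owner`, `business_name`, and so on) are assumptions made for this example and are not taken from any published standard or from JGP’s tooling.

```python
import json

# Hypothetical marker for entries a stakeholder still has to fill in.
TODO = "TODO"


def pregenerate_contract(table_name, columns):
    """Build a contract skeleton from a physical schema.

    `columns` maps column names to physical types, for example as read
    from a database catalog. All field names here are illustrative.
    """
    return {
        "dataset": table_name,
        "owner": TODO,
        "description": TODO,
        "fields": [
            {
                "physical_name": name,
                "physical_type": col_type,
                "business_name": TODO,
                "description": TODO,
            }
            for name, col_type in columns.items()
        ],
    }


def missing_entries(node, path=""):
    """Return the paths of all entries that are still set to TODO."""
    if node == TODO:
        return [path]
    if isinstance(node, dict):
        return [p for key, value in node.items()
                for p in missing_entries(value, f"{path}/{key}")]
    if isinstance(node, list):
        return [p for index, value in enumerate(node)
                for p in missing_entries(value, f"{path}/{index}")]
    return []


# Example schema (made up for this illustration).
contract = pregenerate_contract(
    "customer", {"cust_id": "BIGINT", "email": "VARCHAR(255)"}
)
print(json.dumps(contract, indent=2))
print(missing_entries(contract))
```

The physical details come straight from the schema, so contributors are only asked for what a machine cannot know: ownership, business names, and descriptions. The list of missing entries gives the team a simple way to track how complete the contract is.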
JGP concludes that we are still in the infancy of data contracts and that the opportunities lie ahead of us. He grew the initial open-source project to tens of contributors and ensured that the standard is now part of a Linux Foundation project called Bitol (https://bitol.io), to guarantee wider availability and sustainability.