How do we deliver on our promises from Data Transformations?
As we all know, organisations are undergoing constant change to their Data and System Landscapes.
To obtain funding for larger Data Transformations, impressive benefits are typically promised. These are often exaggerated, and what actually gets delivered may well fall short of them.
The success of Data Transformations depends on many factors. Key among these is the integration of the target state with the organisation’s strategy, functions and operations. In addition, the dependence of successful Data Transformations on corresponding successful organisational transformations is often paid too little attention – if not totally ignored.
We also know that most organisations, certainly the larger ones, have heterogeneous System Landscapes and these can be made yet more complex by Data Transformations that depend on new infrastructure. To ensure that increased System complexity does not equate with increased fragmentation and new data silos, we need to ensure that the new data world is completely integrated with our existing holistic Data Governance and Data Management functions.
A Blueprint for successful Data Transformations
In my opinion, a key tool that helps an organisation derive benefit from its data, successfully and sustainably, is a Reference Data Architecture.
Typically, Data Transformations are specified in Technological terms. Whilst this level of detail is quite obviously required, there is an abstracted level above this which specifies what data behaviours are required. These behaviours must form the design basis of the Data Flows and the Data Persistence elements specified in Solution Architecture or Technical Architecture diagrams.
These Architectural patterns can guarantee successful outcomes from Data Transformations, as they promote standardised integration of the Data Governance and Data Management infrastructure and Metadata.
The Data Gym’s Reference Data Architecture
For the preceding reasons we have developed our Reference Data Architecture to provide a guidance standard that can be used when developing and delivering any technical solution to accelerate delivery and ensure coherence.
It specifies the ‘What?’ rather than the ‘How?’ and so remains technologically agnostic.
Our Reference Data Architecture is defined around the Producer/Consumer concept common to many organisations and is therefore easy for them to adopt, adapt and integrate.
In this model, any Consumer must access data only from Authorised Producers and these must have their Governance Metadata Domains fully documented and ratified, in order to be deemed ‘Authorised’.
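The Authorised Producer rule can be illustrated with a minimal sketch. The domain names in `REQUIRED_DOMAINS`, the `Producer` class and the `read_from` function are hypothetical and exist only for this illustration; the framework itself does not prescribe any particular implementation.

```python
from dataclasses import dataclass, field

# Hypothetical domain names, for illustration only. The actual Governance
# Metadata Domains would be whatever the organisation has defined.
REQUIRED_DOMAINS = frozenset({"ownership", "definitions", "quality_rules"})


@dataclass
class Producer:
    name: str
    # Maps a metadata domain name to True once it is documented and ratified.
    ratified_domains: dict[str, bool] = field(default_factory=dict)

    def is_authorised(self) -> bool:
        """A Producer is 'Authorised' only when every required domain is ratified."""
        return all(self.ratified_domains.get(d, False) for d in REQUIRED_DOMAINS)


def read_from(producer: Producer, dataset: str) -> str:
    """Consumer-side guard: refuse to read from a Producer that is not Authorised."""
    if not producer.is_authorised():
        raise PermissionError(f"{producer.name} is not an Authorised Producer")
    return f"{producer.name}/{dataset}"


# Example: one fully ratified Producer and one with a missing domain.
finance = Producer("finance", {"ownership": True, "definitions": True, "quality_rules": True})
marketing = Producer("marketing", {"ownership": True, "definitions": True})

granted = read_from(finance, "ledger")
try:
    read_from(marketing, "campaigns")
    refused = False
except PermissionError:
    refused = True
```

The point of the sketch is that the authorisation decision is derived from the state of the metadata rather than maintained as a separate flag, so a Producer cannot become 'Authorised' without its documentation being complete.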
The interactions and integration of these Metadata Domains and the actual data are also specified in the framework. They must support the standard Data Governance roles they play over the data in terms of its:
- Control and
The Reference Data Architecture can be used as a check in technical delivery streams for each component being delivered, in terms of the expected data behaviour of that component. Of particular criticality is the way that the component needs to interact with the Governance Metadata Domains.
As an example, if we look at the Transformation component in the schematic, the transformation details need to be defined in the Metadata capability, so they are standardised and transparent for people to review. They are used to control the actual data transformations and any failures or data quality breaches will be monitored and reported through the Metadata capability.
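As a rough sketch of this behaviour, the following shows a transformation that is defined in, executed through, and monitored by a single metadata object. The names `TransformationSpec` and `MetadataCapability`, and the country-code rule, are assumptions made for illustration and are not part of the Reference Data Architecture itself.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TransformationSpec:
    """A transformation definition held in metadata, so it can be reviewed."""
    name: str
    transform: Callable[[dict], dict]
    quality_check: Callable[[dict], bool]


@dataclass
class MetadataCapability:
    """Hypothetical stand-in for the Metadata capability."""
    specs: dict[str, TransformationSpec] = field(default_factory=dict)
    breaches: list[tuple[str, dict]] = field(default_factory=list)

    def register(self, spec: TransformationSpec) -> None:
        self.specs[spec.name] = spec

    def run(self, name: str, records: list[dict]) -> list[dict]:
        """Apply the registered transformation and record any quality breaches."""
        spec = self.specs[name]
        passed = []
        for record in records:
            result = spec.transform(record)
            if spec.quality_check(result):
                passed.append(result)
            else:
                # Breaches are reported through the same capability
                # that holds the transformation definition.
                self.breaches.append((name, result))
        return passed


metadata = MetadataCapability()
metadata.register(TransformationSpec(
    name="normalise_country",
    transform=lambda r: {**r, "country": r["country"].strip().upper()},
    quality_check=lambda r: len(r["country"]) == 2,
))

clean = metadata.run(
    "normalise_country",
    [{"country": " gb "}, {"country": "United Kingdom"}],
)
```

Here the first record passes the check and the second is recorded as a breach, so the definition, the control and the monitoring all sit in one place.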
It looks Monolithic. Is it?
The Architecture in the schematic may appear monolithic and ‘centralised’, but in fact this is only true in a Logical sense.
For example, the data that is persisted in the Semantic Layer may actually be physically stored in multiple technologies, database instances, and even in different data centres for the same technology. The important factor here is that it is defined with the same ‘meaning’ irrespective of the technology or the geo-residency of the data.
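One way to picture this separation is a single logical definition that points at several physical locations. The sketch below is illustrative only: the class names, the example definition of 'Customer' and the instance names are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    technology: str
    instance: str
    region: str


@dataclass(frozen=True)
class SemanticEntity:
    """One logical meaning, however many places the data physically lives."""
    name: str
    definition: str
    locations: tuple[PhysicalLocation, ...]


# Hypothetical entity and instance names, for illustration only.
customer = SemanticEntity(
    name="Customer",
    definition="A party that holds at least one active account.",
    locations=(
        PhysicalLocation("relational_db", "crm-eu-01", "eu"),
        PhysicalLocation("relational_db", "crm-us-01", "us"),
        PhysicalLocation("object_store", "lake-eu", "eu"),
    ),
)


def locations_in(entity: SemanticEntity, region: str) -> list[PhysicalLocation]:
    """Find where an entity is stored in a region; its definition is unaffected."""
    return [loc for loc in entity.locations if loc.region == region]


eu_locations = locations_in(customer, "eu")
```

Whichever location a Consumer reads from, the definition it relies on is the same one.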
How is it used?
The Reference Data Architecture provides a standardised set of patterns that will be used within the more detailed and technology-specific Solution Architectures. We apply the appropriate characteristics of any logical component to its corresponding technical component. For example, a technical transformation component must report any Data Quality breaches, just as its logical counterpart specifies.
In this way we can ensure that wherever we look across the System Landscape, we will see the same data behaviours being applied. For example, the way that transformations are handled across the resulting System Landscape will always ensure that their definitions, control and monitoring conform to the prescribed standards.
Thus we can start to automate and derive many of the Data Governance and Data Management functions, simplifying delivery and increasing its cadence and agility. This approach will result in improved quality and consistency of BAU data and controls – and we can improve the outcomes from Data Transformations while reducing the effort required to deliver them.