How do we deliver our Data Transformation promises?
As we all know, organisations are constantly changing their Data and System Landscapes. Impressive benefits are typically promised to justify the funding of the larger Data Transformations. But these promises are often exaggerated, and what actually gets delivered may well fall short of them.
Obviously, the success of any Data Transformation depends on many factors, but the successful integration of the target state with the organisation’s strategy, functions and operations will make or break it. In addition, the dependence of successful Data Transformations on corresponding organisational transformations is often paid little attention, if not ignored entirely.
We also know that most organisations, certainly the larger ones, have heterogeneous System Landscapes. They are, of course, here to stay, but often can be made yet more complex by Data Transformations that depend on new technical solutions.
To ensure that increased System complexity does not equate with increased fragmentation, the new data world must be effectively integrated with existing holistic data definitions, Data Governance and Data Management functions.
A Blueprint for successful Data Transformations
I strongly believe that developing a Reference Data Architecture is critical for an organisation to guarantee its ongoing success by sustainably deriving benefit from its data.
It also provides a framework that is especially important during any Data Transformations.
Typically, Data Transformations are specified and articulated almost exclusively in technological terms. Whilst this level of detail is quite obviously required, there is an abstracted level above this which specifies what data outcomes are essential. These outcomes must form the Data Architectural definitions of the Data Flows and the Data Persistence elements specified in any Solution Architectural and Technical Architectural diagrams. Such definitions are the patterns that will underpin the success of Data Transformations, as they promote standardised integration of the Data Governance and Data Management infrastructure and metadata.
The Data Gym’s Reference Data Assurance Framework
For the preceding reasons we developed the Data Gym’s Reference Data Assurance Framework.
This provides a blueprint that can be used when developing and delivering any technical solution. It will guarantee that the data demands are made paramount, whilst also accelerating delivery.
We designed the framework around the Producer-Consumer approach common to many organisations. Its familiarity facilitates easy adoption, adaptation and integration.
In the Producer-Consumer approach, any Consumer must access data only from Authorised Producers and these must have their Governance Metadata Domains fully documented and ratified, in order to be deemed ‘Authorised’.
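The Producer-Consumer rule above can be sketched in a few lines of code. This is a minimal illustration only, assuming a hypothetical metadata registry in which each Producer records whether its Governance Metadata Domains are documented and which domains have been ratified; none of these names come from any real framework API.

```python
# Illustrative sketch of the Producer-Consumer rule: a Consumer may only
# read from Producers whose Governance Metadata Domains are fully
# documented and ratified. All names here are hypothetical.

REQUIRED_DOMAINS = {"Data Quality", "Data Retention"}

# Stand-in for a governance metadata registry.
producers = {
    "finance_ledger": {"documented": True,
                       "ratified": {"Data Quality", "Data Retention"}},
    "crm_extract": {"documented": True,
                    "ratified": {"Data Quality"}},
}

def is_authorised(name: str) -> bool:
    """A Producer is 'Authorised' only when its governance metadata is
    documented and every required domain has been ratified."""
    p = producers.get(name)
    return bool(p and p["documented"] and REQUIRED_DOMAINS <= p["ratified"])

def read_from(producer: str) -> None:
    """A Consumer must refuse to access data from unauthorised Producers."""
    if not is_authorised(producer):
        raise PermissionError(f"{producer} is not an Authorised Producer")
    # ... fetch data from the producer ...
```

The point of the sketch is that authorisation is a property of the Producer's metadata, checked at the point of consumption, rather than something each Consumer decides for itself.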
To become Authorised, the Producer must satisfy the constraints imposed by the data control framework. These would include, for example, Data Quality and Data Retention policies. The interactions and integration of these Control Domains with the operational data of an organisation are also specified in the framework. The metadata flows and exchanges must support the governance and management of the data in terms of its:
- Control, and
- Operation.
Our Reference Data Assurance Framework can be used as a design ‘check-list’ in technical delivery streams for each component being delivered, because it specifies the expected data behaviour of that component.
Let’s take a look at ③, the Transformation component in the schematic. This defines the mappings and any other modifications, such as aggregation, applied to the source data so that it conforms to the semantic definitions of the produced data. These transformation details need to be defined in the Metadata Foundation capability so that they can be enforced in the system landscape and be transparent for people to review. They are used to control the actual data transformations. Any failures, or data quality breaches, will be monitored and can be reported through the Data Assurance Metadata capability.
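The behaviour described for the Transformation component can be illustrated with a short sketch. The mapping table and the assurance log below are hypothetical stand-ins for the Metadata Foundation and Data Assurance Metadata capabilities respectively; the key idea is that mappings are metadata-driven and failures are recorded rather than silently dropped.

```python
# Illustrative sketch of a metadata-driven Transformation component.
# 'mappings' stands in for definitions held in the Metadata Foundation;
# 'assurance_log' stands in for the Data Assurance Metadata capability.

mappings = {"cust_name": "customer_name", "dob": "date_of_birth"}  # source -> target
assurance_log = []  # failures are captured here for monitoring and reporting

def transform(record: dict) -> dict:
    """Apply the defined mappings; log any field with no mapping defined."""
    out = {}
    for src, value in record.items():
        target = mappings.get(src)
        if target is None:
            # Mapping failure: record it so it can be monitored and reported.
            assurance_log.append({"field": src, "issue": "no mapping defined"})
            continue
        out[target] = value
    return out

result = transform({"cust_name": "Ada", "postcode": "AB1 2CD"})
```

Because the mappings live in metadata rather than in code, they remain transparent for people to review, and the same definitions can be enforced wherever the transformation runs.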
It looks Monolithic, doesn’t it?
The Architecture in the schematic may appear monolithic and ‘centralised’, but in fact this is only true in a Logical sense. It specifies the ‘What?’ rather than the ‘How?’ and so remains technologically and implementation agnostic.
Thus, for example, the data persisted in the Semantic Layer may actually be physically stored in multiple technologies, database instances, and even in geographically different data centres. The important factor here is that it is defined with the same ‘meaning’ irrespective of the implementation details.
How is it used?
The Reference Data Assurance Framework provides a standardised set of patterns that will be used within the more detailed and technology-specific Solution Architectures. The appropriate characteristics of any logical component will be applied to its corresponding technical component. For example, ensuring that any mapping failures from a Transformation component are monitored and notified to the Data Assurance Metadata capability.
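The 'check-list' use of the framework could be sketched as follows. The component types and expected behaviours below are invented for illustration; the pattern is simply that each logical component type carries a list of required data behaviours, and a delivery stream verifies a technical component's declared features against that list.

```python
# Hedged sketch: using the framework as a design 'check-list'.
# Component types and behaviour names are illustrative, not a real catalogue.

CHECKLIST = {
    "Transformation": {"mappings defined in metadata",
                       "failures notified to assurance"},
    "Semantic Layer": {"definitions registered",
                       "retention policy applied"},
}

def missing_behaviours(component_type: str, declared: set) -> set:
    """Return the expected data behaviours the component has not declared."""
    return CHECKLIST.get(component_type, set()) - declared

# A Transformation component that declares its mappings but not its
# failure notification would be flagged during design review:
gaps = missing_behaviours("Transformation", {"mappings defined in metadata"})
```

Running the same check against every component, in every delivery stream, is what gives the consistent data controls and behaviours described below.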
In this way, we can ensure that wherever we look across the System Landscape, we will see the same data controls and behaviours being applied.
The standardisation of patterns will provide a path to the automated delivery of many of the Data Governance and Data Management functions. This will simplify delivery and increase its cadence and agility. In addition, the approach will improve the quality and consistency of BAU data and controls.
Thus the framework can be used to substantially improve the outcomes of Data Transformations while at the same time reducing the effort required to deliver them.