In the first part of this blog series I discussed the traditional data warehouse and some of the issues that can occur in the period following the initial project.
I concluded by outlining the 5 common challenges that I hear from customers.
This blog will focus on the fifth and final of these issues: LACK OF GOVERNANCE.
Before discussing a lack of governance, let us first consider the meaning of governance.
According to the dictionary, governance refers to the “action or manner of governing a state, organisation, etc.” However, for the purposes of this post we will concern ourselves with the more common IT definition of the term, as neatly summed up by CIO.com:
“Governance means all the processes that coordinate and control an organisation’s resources and actions. Its scope includes ethics, resource-management processes, accountability and management controls.”
What do we mean when we talk about Data Governance?
The Data Governance Institute (DGI) is a well-respected authority on the subject of data governance and provides information and guidance on how to avoid a potential minefield. The definition provided is:
“Data Governance is a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.”
The following diagram is the reference Data Governance Framework provided by DGI.
So how do we relate the framework to the implementation of a traditional data warehouse?
It is not hard to see the value and benefit of having strong data governance within an organisation. At the same time, it is equally easy to see that the delivery of data governance is not solely a technical challenge. In fact, technology is only one piece of the puzzle. Delivering effective Data Governance requires a synergy between your people, the processes they use and the technology that supports those processes.
Where does Data Management fit into Data Governance?
The Data Management Association International (DAMA) lists 10 major functions of Data Management in its DAMA-DMBOK (Data Management Body of Knowledge). Data Governance is identified as the core component of Data Management, tying together the other 9 disciplines, such as Data Architecture Management, Data Quality Management, and Reference and Master Data Management. Data Warehousing & Business Intelligence Management is highlighted in green.
“How you manage your data warehouse and business intelligence is crucial to the success of any data governance initiative.”
But this post is not about data governance as a whole; we will save that for another day. For now, let us concentrate on the impact that a data warehouse has on data governance. Or, to be even more specific, how a traditional data warehouse can not only struggle to support data governance but, in certain circumstances, can undermine it.
The Data Warehouse and Data Governance
In order to provide effective governance, it is essential to have agreed processes and then of course visibility into those processes. Data warehousing is no different. Let us revisit the conceptual architecture we discussed in challenge 2.
If we consider this in the context of a data governance initiative, then there are a few immediate questions that come to mind:
- Are all the source systems included in the organisation’s overall data governance structure?
- Do developers, users and management have access to an easily referenceable, up-to-date list of all the source systems and their ownership?
- Are the necessary data governance standards applied to external as well as internal data sources?
- Do we have a roadmap for the evolution of each of the source systems?
- If the source system is internally developed, who owns it and what succession plans are in place to ensure knowledge is not limited?
- When confronted with proposed changes to source systems, are you able to easily perform impact analysis that includes not only the data warehouse but also the reports beyond?
- Can you easily map and catalogue new source systems without compromising your data governance rules?
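The impact-analysis question in the list above can be made concrete with a small dependency map. The following is a minimal sketch in Python, assuming a hypothetical catalogue in which each source system, warehouse table and report records what consumes it directly. All asset names (`crm`, `dw.orders`, `report.sales_dashboard` and so on) are invented for illustration and do not refer to any real system or product.

```python
from collections import deque

# Hypothetical catalogue: each asset maps to the assets that read from it directly.
DOWNSTREAM = {
    "crm": ["dw.customer"],
    "erp": ["dw.orders"],
    "dw.customer": ["mart.sales"],
    "dw.orders": ["mart.sales", "report.finance_monthly"],
    "mart.sales": ["report.sales_dashboard"],
}


def impacted_assets(changed, downstream=DOWNSTREAM):
    """Return every asset reachable downstream of `changed`, sorted by name."""
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in downstream.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)


if __name__ == "__main__":
    # Which warehouse tables and reports are affected by a change to the ERP system?
    print(impacted_assets("erp"))
```

If a catalogue like this is kept up to date, a proposed change to a source system can be traced through the warehouse to the reports beyond it in a single lookup, rather than by asking around.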
Extract, Transform and Load
- Are all of the individual ETL processes fully documented with identified technical and business owners?
- Can the ETL be reviewed from a business-process-centric rather than a data-centric view?
- Do all the ETL processes produce standard log details that can be easily reviewed and queried?
- Do we have documented standards covering our ETL processes and do they have identified owners?
- Have all the ETL processes been built using consistent methods, approaches and tools?
- Are the ETL processes and their effect on the organisation’s data included within the enterprise-wide data governance model?
- Do we have controls in place to ensure that personal, sensitive or confidential data is processed correctly?
- Can you quickly and experimentally load new datasets without compromising your data governance structures?
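As an illustration of what "standard log details that can be easily reviewed and queried" might look like, here is a minimal Python sketch. It assumes every ETL process writes one JSON line per run with the same fields. The field names, process names and owners are hypothetical and not drawn from any particular ETL tool or standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EtlLogRecord:
    # Field names are illustrative assumptions, not a published standard.
    process_name: str
    technical_owner: str
    business_owner: str
    source_system: str
    rows_read: int
    rows_written: int
    status: str  # e.g. "succeeded" or "failed"


def to_log_line(record):
    """Serialise one record as a single JSON line with a stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


def failed_processes(log_lines):
    """Return the names of processes whose logged status is 'failed'."""
    names = []
    for line in log_lines:
        entry = json.loads(line)
        if entry["status"] == "failed":
            names.append(entry["process_name"])
    return names


if __name__ == "__main__":
    lines = [
        to_log_line(EtlLogRecord("load_customers", "j.smith", "sales_ops", "crm", 1000, 1000, "succeeded")),
        to_log_line(EtlLogRecord("load_orders", "a.jones", "finance", "erp", 500, 0, "failed")),
    ]
    print(failed_processes(lines))
```

Because every process emits the same shape of record, including its technical and business owners, a reviewer can query all runs in one place instead of reading a different log format for each job.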
Store and Optimise
- Is the data warehouse main repository part of the enterprise data governance strategy?
- Are all the data items catalogued and aligned to competent and responsible owners?
- Do we have rules governing the retention of data within the data warehouse and can we measure adherence?
- Does the data warehouse contain rich metadata to support semantic-based discovery?
- Are there strong rules in place (and enforced) to control storage and access to personal or confidential data?
- Can data be segmented or partitioned based on standard rules?
- Are all the data models for the data warehouse standardised and contained within a suitable enterprise data modelling solution?
- Are there suitable levels of sign off in place for any changes to the data warehouse model?
- Can you provide data lineage from the presentation / data mart layer back through to the underlying data sources along with relevant audit and trace information?
- Can you extend data stewardship including data quality responsibility to business users?
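To show how adherence to a retention rule could be measured rather than assumed, the sketch below computes the share of governed rows that are still within their permitted age. It is a simplified Python example under stated assumptions: the categories, retention periods and row structure are all hypothetical, and a real warehouse would run the equivalent check against its own metadata.

```python
from datetime import date

# Hypothetical retention rules: maximum permitted age in days per data category.
RETENTION_DAYS = {"customer_pii": 730, "web_logs": 90}


def retention_adherence(rows, today, rules=RETENTION_DAYS):
    """Return the fraction of governed rows whose age is within their category's limit."""
    governed = [row for row in rows if row["category"] in rules]
    if not governed:
        return 1.0  # nothing is subject to a rule, so nothing is in breach
    compliant = sum(
        1
        for row in governed
        if (today - row["loaded_on"]).days <= rules[row["category"]]
    )
    return compliant / len(governed)


if __name__ == "__main__":
    sample = [
        {"category": "web_logs", "loaded_on": date(2023, 12, 1)},
        {"category": "web_logs", "loaded_on": date(2023, 6, 1)},
        {"category": "customer_pii", "loaded_on": date(2023, 1, 1)},
    ]
    print(retention_adherence(sample, today=date(2024, 1, 1)))
```

A figure like this gives data owners something they can track over time, which is the difference between having a retention rule and being able to measure adherence to it.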
So what’s the answer?
Is it possible to address the requirements for a flexible, agile data warehouse while still meeting your data governance obligations?
Can you deliver robust data governance processes within a data warehouse even if they don’t exist within the wider organisation?
Can you alleviate the core concerns around a lack of data governance without embarking upon a huge enterprise-wide project?
The answer is: Yes (Of course…).
The Modern Data Platform
The Modern Data Platform delivers on all the requirements for a next-generation data warehouse. It enables organisations to radically simplify their existing legacy or overly complex solutions in order to lower running costs, improve agility and gain breakthrough performance that delivers real business value, without compromising on their governance requirements.
In fact, the modern data platform can form the foundation of a data governance strategy that can be extended across the organisation.