Missing the functional piece in a data project puzzle
The financial industry is going through a disruptive phase, in which buzzwords such as blockchain, big data and deep learning are enticing financial institutions to ride the technological wave. Solid data management is the foundation of these developments. Financial institutions not only have an internal drive to create this foundation, as this improves analysis and decision-making, but also face challenging regulations imposed by national and international supervisors. Regulatory focus has become more stringent and more data-intensive, thereby challenging the capabilities of financial institutions to perform timely and accurate data aggregation while maintaining consistent risk and finance reporting. A failure to produce timely and accurate risk and finance reports can ultimately lead to financial penalties or additional capital charges that directly impact the profitability of the firm.
Functional data management: What makes data projects excessively costly and never-ending?
Quite often, data transformation projects are seen as a technical exercise to bring data from source systems to end users. This undercuts the focus required for the functional part of the process, where a lot of added value can be gained.
In this paper, we advocate a functionally driven data flow throughout the whole reporting chain. In addition to the technical perspective, the functional perspective ensures focus on the long-term strategy and business as usual, while providing a foundation that can adapt to new regulations and changing business priorities.
Furthermore, we outline the principles for a foundation on which to build a resilient and robust finance and risk data landscape. A clear functional data flow is defined, with in-depth analyses of the application and implementation of the flow.
Assessment of current data landscape

The increased focus on quantifying risk substantiated with reliable data, and even more on being in control of the risk figures, has required financial institutions to move their focus to a more data-driven environment. The road to this robust data landscape is theoretically sound and rational, but the execution is always difficult as financial institutions face several challenges:
• A landscape full of legacy systems and long (manual) data chains, making change difficult.
• Ever-changing regulatory requirements that frequently derail the strategic roadmap.
• A multidisciplinary set of end users, each with their own very specific requirements and definitions.
• Increasing integration over different functional domains, which emphasizes the need for consistent data across different end users.
This can also be seen, for example, with the implementation of PERDARR (BCBS 239)1. PERDARR is a principles-based guideline for banks, emphasizing the importance of data-related topics such as achieving the desired data quality, data definitions, data availability and data accountability, as well as the data storage and retrieval process.2
From small local banks to the global systemically
important banks (G-SIBs), countless programs and
projects have been initiated in order to tackle the data
management challenges financial institutions are
facing. But countless programs and projects have also
been terminated before objectives were met.
On the flip side of these huge challenges, there are
also huge benefits. Regulators have pushed financial
institutions to align the use of data within their orga-
nization over different departments. Recent examples
are the alignment between credit risk and finance (e.g.
IFRS 9) and ALM/market risk and finance (e.g. IRRBB,
EBA stress test). However, compliance with regulatory
requirements is not the only driver for a solid data
foundation. Institutions with such a foundation achieve perfect reconciliation of data, while end users obtain better insight into their risk positions, spend far less time on periodic reconciliation and are less prone to operational risks.
Furthermore, they are far better positioned to adopt and
implement the next (regulatory) change in their organi-
zation, benefit from increased client analysis potential
and improved input for management decisions. Growth
is achieved more easily on a scalable data landscape, as
is evidenced by the emergence of fintechs, which have the luxury of not having any legacy systems.
The case for functional data management

The key to achieving these benefits is to ensure
involvement of key persons with functional knowledge
in setting up the IT landscape (systems, applications,
databases, etc.) which can support the entire risk and
finance reporting and analytics data chain.
Functional knowledge is essential in the design phase
of data models and data flows to create a resilient
data landscape which is scalable and flexible enough
for future developments in regulations and changes
in business strategy of the firm. This enables financial
institutions to swiftly adapt to new regulations such as
IFRS 9 or IFRS 17, new requirements for the stress tests
or a data request for AnaCredit. Our belief is supported
by the latest assessment of EBA on the progress of
banks adopting the PERDARR guidelines.3
Three of the key features that the EBA identifies in the
failure to comply with PERDARR are:
1. Incomplete integration and implementation of bank-wide data architecture and frameworks (e.g.
data taxonomies, data dictionaries, risk data policies)
This is a direct consequence of not having a holistic
and functional view over the complete chain.
Alignment between all layers in the chain is bound to fail if different data taxonomies and risk data policies are used, and if different quality standards are adhered to between layers or even between business units.
2. Flaws in data quality controls (e.g. reconciliation,
validation checks, data quality standards)
The business must be involved in data quality controls, extending technical data quality controls with functional data quality controls based on business logic. The EBA states that data quality is often deemed insufficient for regulatory reporting.

“On the flip side of the huge challenges, there are also huge benefits”

1 Source: The Principles of Effective Risk Data Aggregation and Risk Reporting - BCBS 239, Bank for International Settlements, January 2013, https://www.bis.org/publ/bcbs239.pdf
2 For more information about the introduction of BCBS 239, see also: Why is implementing BCBS 239 so challenging? https://zanders.eu/en/latest-insights/why-is-implementing-bcbs-239-so-challenging/
3 Source: Progress in adopting the Principles for effective risk data aggregation and risk reporting, Bank for International Settlements, March 2017, https://www.bis.org/bcbs/publ/d399.pdf
3. Over-reliance on manual processes and interventions to produce risk reports
As a result of the second point, many manual processes and interventions are created in order to produce risk reports. Consequently, the quality of the end-to-end reporting cannot be guaranteed if even a slight change is made at the start of the chain, a so-called snowball effect. In addition, due to the many manual adjustments by different users in the chain, numbers in end reports can no longer be reconciled.
These findings can be attributed to a missing functional perspective in data management and the lack of a holistic view of the entire risk and finance reporting chain. The reporting unit has thorough knowledge of the regulatory requirements and the creation of the risk and finance reports, while the architects and developers creating the data landscape have in-depth knowledge of data management from a technical point of view. This creates a gap in overall data management, increasing the risk of misalignment. Positioning key persons with a functional background to oversee the entire data chain will bridge this gap.
Involvement of these key persons starts at the founda-
tion, when defining the single source of truth (SSOT).
We apply a framework with five crucial principles to set
a solid foundation for a robust data landscape and
address the key drivers of failures identified by EBA.
Key principles of the single source of truth

The key principles are the starting points when defining the data landscape and are to be adhered to throughout the whole reporting chain. The most essential element in this chain is the introduction of the single source of truth, a Generic Data Layer (GDL) that forms the basis for all data deliveries to end users. The benefit of an SSOT is that all reports and analytics are based on a single version of the truth, hence there are no reconciliation, definition or timing differences between reports.
Involvement of functional knowledge starts when the
internal and external data requirements and definitions
are defined. This is crucial for building a robust and
resilient SSOT. This functional view creates a level of
comprehension on how to structure and design the data
landscape. The requirements must provide a clear over-
view of the known and expected future risk and finance
attributes. Moreover, as development of the regulatory
landscape is always ongoing, an effective SSOT in line
with the key principles will be able to absorb the changing requirements without affecting the chosen setup
and structure.
At Zanders we have defined the following key principles
for an effective SSOT:
1. Data in the GDL must be stored at the lowest possible
level of granularity, making aggregation and derivation of calculated information further down the line
more structured and interpretable across reporting
purposes. This avoids the inclusion of redundant
information in the SSOT.
2. Generic external data must be stored in the GDL.
This entails data such as economic variables
(e.g. interest rates and bond prices), ensuring that
the same market data is used throughout the
organization.
3. A unique key must be generated for all loans, counterparties and other instruments in the GDL. This is paramount in the data lineage process and reporting alignment (a minimal sketch of such a key follows after this list).
4. The source systems are the owners of the data and
not the GDL. No corrections are executed in the GDL; data quality issues or data gaps are resolved at the source.
Data enrichments in subsequent layers are owned by
the specific layer.
5. The GDL combines all requirements from all end
users, resulting in a generic setup of definitions,
dimensions and dictionaries that is understood by all
the end users and, more importantly, is accepted by
the end users. Involvement of key persons with functional knowledge is crucial in this step.
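To make principles 1 and 3 more tangible, the following minimal sketch (in Python, purely for illustration) shows one possible way to store a position at the lowest level of granularity and to derive a deterministic unique key from the source system and its native identifier. The field names and the hashing scheme are illustrative assumptions, not a prescribed data model.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def gdl_key(source_system: str, source_id: str, as_of: date) -> str:
    """Derive a deterministic unique key from the source system, its native
    identifier and the reporting date (illustrative scheme, not a standard)."""
    raw = f"{source_system}|{source_id}|{as_of.isoformat()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


@dataclass(frozen=True)
class GdlPosition:
    """A single position stored at the lowest level of granularity (one
    facility, one reporting date), rather than a pre-aggregated balance."""
    gdl_key: str          # unique key (principle 3)
    source_system: str    # the source remains the owner of the data (principle 4)
    source_id: str        # native identifier in the source system
    as_of: date
    counterparty_key: str
    notional: float
    currency: str


# Example: two deliveries of the same loan from the same source system on the
# same date map to the same key, so downstream layers can reconcile on it.
loan_key = gdl_key("loan_origination", "LN-000123", date(2017, 12, 31))
position = GdlPosition(loan_key, "loan_origination", "LN-000123",
                       date(2017, 12, 31), "CP-42", 1_000_000.0, "EUR")
print(position.gdl_key)
```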
These key principles should be governed by a clear interdepartmental operating model, and roles, responsibilities and accountabilities must be made explicit to have an effective data management process.
Graph 1. A functional data flow from source to output
4 Note: Functional work flow in this context means the logical flow of data and information, which is independent of any particular storage technology or data warehouse and its technical implementation.
“The single source of truth is a Generic Data Layer (GDL) that forms the basis for
all data deliveries to end users”
Ultimately, depending on the organization structure, the chief data or risk officer is the owner of, and accountable for, the data and data quality of the whole reporting chain.
With these key principles in mind, the functional data flow4
can be further outlined when setting out the data landscape.
Functional data flow

In a typical data work flow for finance and risk, five distinct layers can be distinguished, each with its own purpose.
1. Source layer
The source layer contains raw data from all sources
of all assets, liabilities and off-balance sheet items.
The data is source system-specific and often does
not align with other source systems, which limits the
possibilities to perform calculations directly on the
data. Hence, all source data should be loaded into
the GDL. A distinction can be made between data
from within the bank, i.e. from its own IT systems, and
external data from market data vendors or
subsidiaries.
• Internal
Position data: This layer contains all position data
regarding the asset and liability portfolios of a bank
that are administered in front office systems such as
loan origination systems and deal capture systems.
• External
Product-related data: External instrument-related data, e.g. market prices and trade volume.
Subsidiary data: For larger institutions it is common
that subsidiaries, entities specialized in fields such
as real estate, leasing or securitizations, have their
own data landscape which needs to be consolidated
with the institutions’ balance sheet.
Generic external data: In addition to external position data, other generic data is required, such as interest
rates, FX rates or macroeconomic indexes, e.g.
CBOE Volatility Index (VIX) and the Gross Domestic
Product (GDP).
2. Generic Data Layer (GDL)
All data from the source systems is transformed to fit the target data model and is integrated into the data
layer. Limiting the data flow to one recipient, the GDL,
creates clarity and decreases the operational burden
for the source systems, while also ensuring minimum
vulnerabilities in the distribution of data. The setup
focuses on durability and stability as this is the core
of the data landscape and changes will be difficult
and costly to implement.
To achieve clear communication between the delivering parties and the GDL, a set of agreements must be in place. This ensures that both parties know what to expect and that the delivery to the GDL is the only expected delivery. This set of agreements is based on the input requirements throughout the data chain. A clear owner of the agreements must be specified.
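As a purely illustrative sketch of such a set of agreements, the snippet below encodes the expected fields and types for one hypothetical source system delivery and checks an incoming record against it. The agreement contents and field names are assumptions, not a prescribed standard.

```python
from datetime import date

# Illustrative delivery agreement between one source system and the GDL:
# which fields are expected, their types, and whether they may be empty.
LOAN_DELIVERY_AGREEMENT = {
    "source_system": "loan_origination",
    "frequency": "daily",
    "fields": {
        "source_id": (str, False),      # (expected type, nullable)
        "as_of": (date, False),
        "counterparty_id": (str, False),
        "notional": (float, False),
        "currency": (str, False),
        "maturity_date": (date, True),
    },
}


def validate_delivery(record: dict, agreement: dict) -> list[str]:
    """Return a list of violations of the agreement for one delivered record."""
    issues = []
    for field, (expected_type, nullable) in agreement["fields"].items():
        if field not in record:
            issues.append(f"missing field: {field}")
        elif record[field] is None and not nullable:
            issues.append(f"field may not be empty: {field}")
        elif record[field] is not None and not isinstance(record[field], expected_type):
            issues.append(f"wrong type for {field}: {type(record[field]).__name__}")
    return issues


record = {"source_id": "LN-000123", "as_of": date(2017, 12, 31),
          "counterparty_id": "CP-42", "notional": "1e6", "currency": "EUR"}
print(validate_delivery(record, LOAN_DELIVERY_AGREEMENT))
# ['wrong type for notional: str', 'missing field: maturity_date']
```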
At the GDL the key principles are essential. This is
where the data is stored at the lowest possible level
of granularity and a unique key is generated for all loans, counterparties and other instruments available. The GDL is not the owner of the data itself, as this remains with the source system; however, the GDL is the owner of data definitions, dimensions, dictionaries and transformations. All historical data, as per availability,
is stored. In the setup of the GDL, the alignment
between the definitions across sources should be
monitored and tested, ensuring all sources provide
the same data. It is essential that the definitions and
dimensions as defined in the GDL are well documented, understood and clear to the entire chain.
Ambiguity at this stage leads to inconsistent results
further downstream in the data chain.
3. Business Information Layer (BIL)
For each specific internal and external requirement,
specific data requirements exist which can range from
different categorization of counterparties to specific
risk metrics that aren’t relevant in other regulatory
reports. Each data hub in the BIL is filled with only
the relevant data attributes from the GDL, preparing
the data for the calculation or reporting layer.
Within the BIL, data is enriched and specified to the requirements, which also implies that calculations may take place within the BIL to prepare risk factors. The BIL is the owner of, and responsible for, all data enrichments in this layer. There might be overlapping factors and enrichment steps between data hubs; these should be shared in order to ensure consistent treatment.
An exception to the above is, for example, the internal and external stress test requirements. This stress test overlay depends on multiple data hubs to perform the overall risk calculations. The external regulator has its own set of requirements for the calculations, but the input data should align with the BILs for reconciliation purposes and alignment of the results. The macroeconomic factors are the drivers of the shocks and are, therefore, added separately to the regular data stream and the calculator for stress test purposes.
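As a minimal sketch of the enrichment step within a BIL data hub, the snippet below selects only the attributes a hypothetical credit risk hub needs from the GDL and derives one illustrative risk factor. The attribute names and the derivation rule are assumptions for illustration only.

```python
# A GDL extract: full-granularity records with many attributes (illustrative).
gdl_records = [
    {"gdl_key": "a1f3", "counterparty_key": "CP-42", "notional": 1_000_000.0,
     "collateral_value": 800_000.0, "currency": "EUR", "sector": "Real estate"},
    {"gdl_key": "b7c9", "counterparty_key": "CP-77", "notional": 250_000.0,
     "collateral_value": 0.0, "currency": "EUR", "sector": "Retail"},
]

# Attributes that this particular credit risk data hub actually needs.
CREDIT_HUB_ATTRIBUTES = ["gdl_key", "counterparty_key", "notional", "collateral_value"]


def to_credit_hub(record: dict) -> dict:
    """Copy only the relevant GDL attributes and enrich the record with a
    derived risk factor (loan-to-value), which is owned by this BIL hub."""
    hub_record = {attr: record[attr] for attr in CREDIT_HUB_ATTRIBUTES}
    collateral = record["collateral_value"]
    hub_record["loan_to_value"] = (
        record["notional"] / collateral if collateral > 0 else None
    )
    return hub_record


credit_hub = [to_credit_hub(r) for r in gdl_records]
print(credit_hub[0]["loan_to_value"])  # 1.25
```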
4. Calculation Layer
In the calculation layer, specific risk models are applied to calculate the required metrics, for example the lifetime expected credit loss calculator for IFRS 9. The input to the calculation and the output that is expected from this calculation engine are the driving force behind the data requirements. Without the correct level and accuracy of the input data, the quality of the output cannot be guaranteed, potentially leading to greater risks or actual losses.
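As a strongly simplified illustration of such a calculation engine, the sketch below computes a lifetime expected credit loss as the discounted sum of marginal PD × LGD × EAD per period. The input values are fabricated and the formula deliberately ignores staging, multiple scenarios and other IFRS 9 details.

```python
def lifetime_ecl(marginal_pd: list[float], lgd: float,
                 ead: list[float], discount_rate: float) -> float:
    """Sum of discounted expected losses per period:
    ECL = sum_t PD_t * LGD * EAD_t / (1 + r)^t  (simplified, single scenario)."""
    assert len(marginal_pd) == len(ead)
    return sum(
        pd_t * lgd * ead_t / (1.0 + discount_rate) ** (t + 1)
        for t, (pd_t, ead_t) in enumerate(zip(marginal_pd, ead))
    )


# Illustrative 3-year profile for one exposure.
ecl = lifetime_ecl(
    marginal_pd=[0.02, 0.015, 0.01],          # probability of default per year
    lgd=0.40,                                  # loss given default
    ead=[1_000_000.0, 800_000.0, 600_000.0],   # exposure at default per year
    discount_rate=0.03,
)
print(round(ecl, 2))
```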
5. Reporting Layer
The final layer of the data flow is the reporting layer, consisting of a reporting cube and a reporting engine. The reporting cube is filled with the results from the calculation layer and directly from the BIL if no calculations are required. Reports, both internal and external, are compiled by the reporting engine on the reporting cube. In all cases, reports should solely be based on the reporting cube and should not be filled with data from a different layer. This ensures transparency of the chain and consistency in the reports.
Relevant information, which is (re)used in other risk
calculations or financial processes, such as IFRS 9
provisions or the predicted cash flows of certain
instruments, is fed back to the GDL or BIL.
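The sketch below illustrates the reporting-cube constraint: reports are compiled purely by aggregating records that are already in the cube, without reaching back into the GDL or BIL. The dimensions and figures are illustrative assumptions.

```python
from collections import defaultdict

# Illustrative reporting cube: pre-calculated results keyed by report dimensions.
reporting_cube = [
    {"portfolio": "Mortgages", "stage": 1, "exposure": 5_000_000.0, "ecl": 12_000.0},
    {"portfolio": "Mortgages", "stage": 2, "exposure": 1_200_000.0, "ecl": 45_000.0},
    {"portfolio": "SME",       "stage": 1, "exposure": 2_500_000.0, "ecl": 30_000.0},
]


def report_by(dimension: str, measure: str) -> dict:
    """Compile a report solely from the reporting cube by aggregating one
    measure over one dimension (no data is pulled from other layers)."""
    totals = defaultdict(float)
    for row in reporting_cube:
        totals[row[dimension]] += row[measure]
    return dict(totals)


print(report_by("portfolio", "ecl"))   # {'Mortgages': 57000.0, 'SME': 30000.0}
print(report_by("stage", "exposure"))  # {1: 7500000.0, 2: 1200000.0}
```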
As a minimum requirement, data validation and quality
controls should be in place in the first three layers
(source systems, GDL and BIL). When data reaches
the calculation engine, the data is already validated
and checked for specific calculation purposes. At each
layer, the nature of data quality controls can differ, as
data quality controls at the BIL are undeniably more
functional in nature with more business logic. Note that
data quality issues are not solved in each layer, a pitfall of the current landscape mentioned above, but should be looped back to the source. Furthermore, data
quality controls between all layers should be in place
checking on the completeness, correctness and lineage
of the data.
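As one possible shape of such a control between two adjacent layers, the sketch below reconciles record completeness and a summed measure on the shared unique key, and reports rather than fixes any differences. The thresholds and field names are assumptions for illustration.

```python
def reconcile_layers(upstream: list[dict], downstream: list[dict],
                     key: str = "gdl_key", measure: str = "notional",
                     tolerance: float = 0.01) -> dict:
    """Completeness: every upstream key must reach the downstream layer.
    Correctness: the summed measure must match within a small tolerance.
    Issues are only reported here; fixes are looped back to the source."""
    up_keys = {r[key] for r in upstream}
    down_keys = {r[key] for r in downstream}
    up_total = sum(r[measure] for r in upstream)
    down_total = sum(r[measure] for r in downstream)
    return {
        "missing_downstream": sorted(up_keys - down_keys),
        "unexpected_downstream": sorted(down_keys - up_keys),
        "measure_difference": down_total - up_total,
        "within_tolerance": abs(down_total - up_total) <= tolerance,
    }


gdl = [{"gdl_key": "a1f3", "notional": 1_000_000.0},
       {"gdl_key": "b7c9", "notional": 250_000.0}]
bil = [{"gdl_key": "a1f3", "notional": 1_000_000.0}]

print(reconcile_layers(gdl, bil))
# {'missing_downstream': ['b7c9'], 'unexpected_downstream': [],
#  'measure_difference': -250000.0, 'within_tolerance': False}
```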
Implementation of the functional data flow

A key factor in the implementation of a functional data
flow for a reporting chain is being able to maintain an overview of the entire chain while considering all the principles defined in the previous section. Alignment with all the stakeholders is one of the major hurdles that needs to be tackled, and an understanding of all parties' needs should be clearly in scope. Implementation of the functional data model and flow requires all stakeholders, functional and technical, to collaborate
and bridge the gaps in terms of understanding the requirements. While risk and finance should be closely involved in drafting the requirements for the data model, flow and controls, the IT manager should be more involved in the actual purpose of the data and have more functional knowledge about its usage.

Key elements for a successful data project
• Add functional knowledge to the team, specifically to govern the entire data chain
• Actively involve the end user in the setup of a data model
• Set up strict data governance rules and follow through
• Create a single source of truth and assign data owner(s)
• Implement data validation and quality controls between all layers
• Create a set-up that is adaptable to regulatory changes or new end users
• Create clear requirement documentation starting from the end-user perspective
This implementation process is not straightforward
and does imply a considerable amount of effort from
all stakeholders prior to the realization of potential
benefits.
Besides bridging the functional gap between stakeholders, the introduction of functional and technical data validation and quality controls at each of the layers will ensure that the data is in line with the requirements. This provides a direct lineage overview of where the data comes from and of any alterations or business logic applied.
To ensure a form of standardization and a feasible level of implementation, principles need to be set for the data
flow as new issues and challenges will arise and will
require solutions within the existing data landscape.
It is important to keep the five layers consistent with the
intended purpose of each layer.
The last piece of the puzzle

The pitfall of current large transformation projects is
the lack of functional knowledge throughout the whole
reporting chain. Functional knowledge is key at all
levels and layers: on a strategic level where the
strategic roadmap of the IT landscape is defined, but
also on the lowest level when determining the
requirements for the GDL or a specific BIL.
Zanders believes that functional knowledge over the
whole reporting chain is the last piece of the puzzle for
completing the finance and risk architecture and
corresponding IT landscape.
Jasper van [email protected]
Scott Lee
Vincent [email protected]
Save time and money

Do you want to get in control, stay in control, and save time and money on your (big) data processes?
Contact us:
The added value of Zanders
Our track record and expertise can help you overcome
challenges in the areas of:
• Managing large transformation projects
• Improving functional knowledge through the entire
data flow
• Translating regulatory guidelines into functional
requirements and data models
• Creating risk processes and reports
• Adaptation to changing regulatory requirements
• Optimizing risk and finance models
• Modeling assets and liabilities at financial institutions
• Data validation and data quality controls