Virtualization and Cloud executives share their predictions for 2016. Read them in this VMblog.com series exclusive.
Contributed by Wayne Applebaum, Vice President of Analytics & Applied Innovation at Avalon
Data Governance in the World of Cloud and Big Data
The Gartner Group published a study last year that stated:
"... by 2017, 33 percent of
Fortune 100 organizations will experience an information crisis, due to their
inability to effectively value, govern and trust their enterprise information."
Gartner cites the cause of this crisis as the ever-increasing amount of data that is being
collected, stored and analyzed.
Users expect their information sources to resemble a utility: turn the tap and the water
flows. No need to worry about which source the water came from, the pipes that
carried it to you or the treatment plant it went through. The water provided is
reliably assumed to be from a trusted source. With information, however, we are
not there yet: we cannot reliably assume that it comes from a trusted source.
Gartner is correctly describing a situation that has been brewing for the past 15-20
years, a situation that is now accelerating because of Big Data and
Cloud technologies. In the
case of Big Data, we are storing and sourcing a wider variety of information.
It is also information that may have gone through a number of transformations.
Think about what we have to do to score tweets for customer sentiment analysis.
The scoring rules we create to accomplish this determine the results we will
get. If two groups create different
scoring rules for the same phenomenon, then their results will not be
comparable. If there is a desire to compare results across products or time
periods, then consistency is a necessary component.
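To make the comparability problem concrete, here is a minimal sketch in Python. The keyword lists, weights and tweets are all invented for illustration, and real sentiment scoring is far more involved. The only point is that two teams scoring the same tweets with different rules arrive at different answers.

```python
# Hypothetical scoring rules: each team maps keywords to sentiment weights.
# Neither list comes from a real product; both are assumptions for illustration.
TEAM_A_RULES = {"love": 2, "great": 1, "slow": -1, "broken": -2}
TEAM_B_RULES = {"love": 1, "great": 1, "slow": -2, "broken": -2, "cheap": -1}

# Made-up tweets about the same product.
TWEETS = [
    "love the new phone but it is slow",
    "great price and cheap shipping",
    "screen arrived broken",
]


def score_tweet(text, rules):
    """Sum the weights of every rule keyword that appears in the tweet."""
    words = text.lower().split()
    return sum(weight for keyword, weight in rules.items() if keyword in words)


def average_sentiment(tweets, rules):
    """Average the per-tweet scores produced by one set of rules."""
    return sum(score_tweet(tweet, rules) for tweet in tweets) / len(tweets)


score_a = average_sentiment(TWEETS, TEAM_A_RULES)
score_b = average_sentiment(TWEETS, TEAM_B_RULES)
print(f"Team A: {score_a:+.2f}  Team B: {score_b:+.2f}")
```

With these made-up rules, Team A reports a neutral average and Team B a negative one for exactly the same tweets. Neither number is wrong, but the two cannot be compared until both teams agree on a single set of rules.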
Data governance needs to play a role in providing this consistency and helping us achieve that
"utility" state. Maintaining consistency
of definition and metadata becomes more difficult when users are integrating
this information with other data sources.
While the cloud
adds to accessibility, it also creates a more difficult data governance
situation. Logically, no matter how seamless it may appear to the user, the
more sources, locations and access points there are, the more difficult the data is to control.
A key issue is
that you have to control security, access, metadata and life cycle across multiple systems.
Information needs to be compatible across systems. As analytics needs continue
to grow, so will the need to integrate data across systems to get the business
insight we are seeking.
While some of these issues can be solved by technology, many are highly "people dependent."
For example, while technology can be used to enforce practices, standards and
definitions, we still need people to agree on consistent definitions,
transformations and calculations that form the basis for any effective data
governance. Even without big data and cloud technologies, it is not unusual to find organizations with multiple
data sources that may contain different definitions of basic things like
revenue, customer, sales and fulfillment rate. In addition, companies routinely extract
information from enterprise data sources and manipulate it in Excel spreadsheets.
At that point, data governance goes out the window.
When we add
Big Data and Cloud to the mix, we are taxing our traditional governance
constructs. One of the primary reasons is that data governance has evolved in a
world that focused on two types of data: Transactional and Master. Implementers
of ERP systems focus on these two types of data almost exclusively.
This changes when, instead of just wanting to process transactions (such as orders,
payments, payroll) we want to consolidate and analyze transactional information
to make better business decisions.
Decision data results from the combination of transactional data, master data and
outside data to answer business questions like "What is a product's contribution
margin?" (which may require access to current commodity prices).
Decision data could also be the customer sentiment about a given product
created from consolidating tweets. In
short, any time we combine data to create new information we are creating
Business Decision Data.
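The contribution margin question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layouts, field names, SKU and prices are hypothetical, and contribution margin is taken in its usual sense of revenue minus variable costs. It shows how a single figure draws on master data, transactional data and an outside price feed at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Master data (hypothetical fields)."""
    sku: str
    commodity: str                       # raw material the product consumes
    commodity_units_per_item: float
    other_variable_cost_per_item: float


@dataclass(frozen=True)
class OrderLine:
    """Transactional data (hypothetical fields)."""
    sku: str
    quantity: int
    unit_price: float


def contribution_margin(product, order_lines, commodity_prices):
    """Revenue minus variable cost for one product.

    commodity_prices stands in for an outside data source, so the result
    depends on which price snapshot was used and when it was taken.
    """
    lines = [line for line in order_lines if line.sku == product.sku]
    units = sum(line.quantity for line in lines)
    revenue = sum(line.quantity * line.unit_price for line in lines)
    material_cost = (
        product.commodity_units_per_item * commodity_prices[product.commodity]
    )
    variable_cost = units * (material_cost + product.other_variable_cost_per_item)
    return revenue - variable_cost


widget = Product("WIDGET", "copper", 0.5, 3.0)
orders = [
    OrderLine("WIDGET", 10, 20.0),
    OrderLine("WIDGET", 5, 18.0),
    OrderLine("GADGET", 2, 50.0),
]
margin = contribution_margin(widget, orders, {"copper": 4.0})
print(margin)
```

Change the commodity price, or the way variable cost is defined, and the "same" margin changes with it. That is why the calculation itself needs to be governed along with its inputs.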
The rules and algorithms that are employed to create this type of data often go
undocumented. As the number of big data and analytics projects increases, so
does the amount of business decision data, accelerating the problem.
With Cloud and Big Data, we need to consider data governance in terms of some new
conditions that have resulted from the Big Data movement:
- Data now comes in a wider variety of forms and sources.
- It may be stored in various locations, some outside the firewall or control of the
organization using it.
- Business Decision Data and Big Data transformations create new Data Governance challenges.
While there are some groups, like the Cloud Security Alliance, that are in the process of
establishing standards for Data Governance in a Cloud environment, this remains a relatively new area
in the standards arena. Developing
ways of leveraging repeatable methods to implement data governance in the world
of the Cloud and Big Data will be key to the competitiveness of your
enterprise in the future. This is a
critical topic that I work on daily with my clients, and I'm encouraged
by the attention it is receiving. There is much work to be done still, but the
future is bright for those who truly embrace Data Governance in this context.
About the Author
Wayne Applebaum, Vice President of Analytics & Applied Innovation at Avalon
Wayne Applebaum is Avalon
Consulting, LLC's Vice
President of Analytics and Applied Innovation. He is responsible for delivery
of Avalon's Analytics Services as well as its Big Data, Search and Semantics
practices. He has more than thirty years of experience in data analytics and
enterprise consulting. Prior to joining Avalon, Dr. Applebaum was a Principal
Business Architect at SAP, responsible for designing and leading the
implementation of analytic solutions. Dr. Applebaum holds an MA and Ph.D. in
Statistics from the University of Pittsburgh; he earned a Bachelor of Science
in Psychology from The City College of New York.
About Avalon Consulting, LLC
Avalon Consulting, LLC transforms data investments into actionable business results
through the visioning and implementation of Big Data, Web Presence, Content
Publishing, and Enterprise Search solutions. We are the trusted partner to over
one hundred clients, primarily Global 2000 companies, public agencies, and
institutions of higher learning. Avalon is known for providing clients a
superior engagement experience through a combination of business acumen,
intellectual curiosity, collaborative work style, and strong partnerships with
award-winning vendors. Avalon's deep technical expertise mitigates project risk
and reduces total cost of ownership for our clients. Headquartered in Plano,
Texas, Avalon also maintains offices in Austin, Texas; Boulder, Colorado;
Chicago, Illinois; Minneapolis, Minnesota; St. Louis, Missouri and Washington,
D.C. For more information, please visit www.avalonconsult.com.