IMA Data Analytics & Visualization Fundamentals

IMA Data Analytics & Visualization Fundamentals CertificateTM – Course Script

Page | 1

Welcome

Introduction Welcome to the IMA Data Analytics & Visualization Fundamentals Certificate course.

We are at a dynamic time in the profession as new technologies are emerging and creating a

lasting impact on the way we do business. The Digital Age is upon us, and bringing with it are

challenges, but also opportunities. As finance and accounting professionals, we are in a unique

position to harness the power of these technological advancements and expand our roles of value

steward into creating value.

When you're ready click to get started.

Course Overview

This course is based on IMA’s thought leadership found in its research and publications, along

with contributions from various leaders across the organization. You’ll hear straight from industry

experts who share insights into their proficiency and perspectives on topics such as the future of

the profession, emerging technologies, and data analytics and visualization.

Through four modules, you’ll be introduced to the impact of technology and analytics on the

management accounting profession and delve into understanding and applying data analytics and

visualization through a case-based scenario.

The four modules are:

Module 1 - Becoming Data-Driven will set the stage by introducing some of the changes that are

happening in finance and accounting, how business processes will be impacted, and what skills

you will need to succeed. You’ll also be introduced to concepts of data science, data analytics, and

data governance as an overall foundation of the course.

Module 2 - Visualizing the Present & Predicting the Future will discuss how to harness the power

of data through analytics and effectively communicate the data with visualizations. All of this will

be depicted using a fictitious case scenario that describes a dilemma in which data analytics and


Page | 2

effective data visualization is necessary to resolve it. Throughout the module, as the

case progresses, a video series will elaborate on key visualization concepts and

techniques.

In Module 3 - Applying Data Analytics and Visualization, you’ll learn how to apply some data

analytics and visualization concepts covered in prior modules to solve issues presented in

the case-study scenario. After that, you’ll be in the driver’s seat as you take your newly

acquired knowledge on a test drive. You’ll navigate through various exercises relating to the

scenario

described in the Huskie Motor case study and apply the concepts to solve problems. This

practical approach to data analytics and visualization will prepare you for applying this

knowledge to your own organization.

In Module 4 - Conclusion and Final Assessment, we’ll wrap up the course with a summary and

provide you with some valuable resources that will enable you to expand your learnings and

dig deeper into these topics. You will then be presented with a final assessment that will test

your knowledge of all topics discussed throughout the course.

Data analytics can be performed using a multitude of software available in the market. In this

course, you’ll be introduced to many of these software tools. However, in order to perform the

requisite analytics, you are required to have access to Microsoft Excel. It will be necessary for

you to have working knowledge of basic functions, including pivot tables, in Microsoft Excel.

Acknowledgements

IMA would like to give thanks and recognition to Ann Dzuranin, Dean’s Distinguished Professor

of Analytics at Northern Illinois University and co-author of Huskie Motor Corporation:

Visualizing the Present and Predicting the Future case study, and Daniel Smith, the head of

innovation and founder of TheoryLane LLC, for their expertise and contributions to this course.

IMA would also like to acknowledge the other valued speakers and authors of the publications

and resources used in this course. For more information on all of the contributors, refer to the

biographies document within the Resources link above.

Course Requirements

Knowledge check questions are included throughout each module. The purpose of

these knowledge check questions is to assess your grasp of the material. You’ll have

two attempts for each question. You must complete all knowledge check questions

before taking the final assessment.


Page | 3

To earn your Data Analytics and Data Visualization Fundamentals Certificate and NASBA CPE

credit, you are required to pass the Final Assessment in Module 4 with a 70% or higher score,

which is based on content from all preceding modules. After passing the final assessment, you

can access and print your professional certificate, NASBA CPE Certificate and digital badge. The

digital badge can be uploaded to your LinkedIn profile as well as shared on social media.

You have one year from the time you enroll in the course to complete it. You do not have to

complete the course in one sitting. If you exit the course before you finish, your current location

will be remembered, and you can pick up from where you left off when you return to the course.

Course Navigation

You can use the Main Menu as your guide as you navigate through the course. You may use the

Script tab to view the full script to follow along with the narration.

This course is designed to allow you to work at your own pace, either in one sitting, or by

completing some of the course, returning later and continuing right from where you left off.

By clicking on the three horizontal line icon in the upper left corner of the screen, you will be able

to open and close the course outline. Here, you may jump to any aspect of the course outline by

clicking on it. To close this outline, simply click the icon again.

On the bottom right-hand corner of your screen, there are two arrows. In this course, we’ll refer

to the arrow pointing to the right as the “next” button, and the arrow pointing to the left of the

“back” button.

Now, click the next button to continue.

Course Resources

The Resources link, located to the top left of the screen, includes many of the materials you will

need for this course including:

A course script

The course contributors’ bios

Spreadsheets to download

And other helpful materials You may print the course script provided so that you can take notes as you proceed through the


Page | 4

course.

The spreadsheets found in the Resources link are to be used to complete and answer the

exercises in Module 3. You’ll need to download these files before you begin Module 3.

When you’re ready, click the NEXT button to learn more about this course.

Course Learning Objectives

Let’s take a moment to review the learning objectives for the IMA Data Analytics & Visualization

Fundamentals CertificateTM course. At the conclusion of the course, you should be able to:

Recognize the impact of technology and analytics on the accounting profession.

Demonstrate how data analytics can influence organizational strategy.

Identify ways data visualizations effectively enable appropriate business decisions

These are the overall learning objectives for this course. Each module will begin with a set

of unique goals that align with these overall course learning objectives.

Course Roadmap

As we previously outlined, this course consists of four modules. The beginning of each module

starts with the module goals. These goals outline the topics that will be discussed in that

module.

Each module is divided into lessons. You will see a menu screen as you begin each lesson.

The menu screen will list the sections that are included in that lesson.

We highly recommend that you go through this course in sequential order for an optimal learning

experience. Click Module 1 to proceed to Module 1 Lesson 1.

Module 1: Becoming Data‐Driven

Introduction Welcome to Module 1, Becoming Data-Driven!

This module will introduce changes to the accounting and finance profession that are on the

rise caused by advancements in technology, and discuss the potential lasting impact of these

emerging technologies.



Page | 5

Module 1 Goals

Upon completing Module 1, learners will be able to:

Recognize the impact of technological advancements on the finance and

accounting profession.

Describe the typical life cycle of data.

Explain the impact data governance has on business and its stakeholders.

Define types of data analytics and how each progressively creates value within

an organization.

Module Menu

Module 1 consists of two lessons, Lesson 1: Changes to the Profession and Lesson 2:

Making Sense of Data. After completing the lessons, we’ll wrap up the module with some

concluding thoughts.

Click Lesson 1 to proceed.

Lesson 1: Changes to the Profession

Section 1: Technology is Changing

As technology transforms the management accounting profession at an ever-increasing rate, are

you concerned about having the necessary skills to further your career? Do you want to enhance

your skill set but don’t know where to start? Through this lesson, we will define the essential

knowledge and skills needed to adapt to the modern workplace and set the stage for how you

will be able to develop these skills.

Video: The Future of the Profession Now, let’s hear from IMA’s President and CEO, Jeff Thomson as he discusses his thoughts on

the future of the finance and accounting profession.

Video Transcript (Est. time: 13:28 min):

Hello, I’m Jeff Thomson, president and CEO of IMA, the Institute of Management Accountants.

And I’m proud and honored to be speaking with you about the future of the accountant in

business, especially in a digital age. When you think about the challenge of digitalization of the

value chain, it really is a challenge, but it’s also an opportunity. So think robotics process


Page | 6

automation, think automation in general. Think about machine learning, cognitive computing.

The challenge of this course is to how to harness all of these technologies to add value. But a

challenge also relates specifically to our profession. Will this automation result in fewer jobs,

displaced jobs? I would argue that it will result in different types of jobs. Jobs that will add more

value of foresight and insight to the organization, but only if we commit to upskilling in areas

such as data analytics and data visualization.

So let’s talk briefly about the CFO team today and how it’s evolved over time. And as many of

you know, I was a CFO in industry during this evolution of the role expansion of the CFO and the

CFO team. So in the past, the CFO team had a very, very critical fiduciary responsibility and that

was safeguarding of assets, what we would call value stewardship, protecting value, internal

controls, accurately and fairly reporting on the financial condition of the corporation. But over

time, that

role has expanded to include those table stakes plus additional responsibilities in the area of

value creation, merger and acquisition activity, financial planning and analysis, and much more.

So the evolution has resulted in greater expectations for the CFO and his or her team. In fact, the

CFO is often referred to as the chief futures officer or the chief value officer. In many cases, the

CFO of large and small companies have responsibility for operations, for strategy, for IT, and for

more. So with this expectation, with this challenge comes the opportunity, to add greater

influence and greater relevance as individuals and as professionals in accounting.

So technology and analytics is sweeping us, and moving at such an incredible pace that you

almost can’t keep up with all of the technology, but really we must all embrace technology.

Millennials are digital natives. It comes very, very naturally. But even in the middle of our careers,

as we see some jobs being lost or displaced, it’s important that we learn new skills and new

capabilities, especially in data analytics, data visualization, data governance, and even storytelling.

When we’re trying to influence the organization on a business case or a new merger or a new

product or service, we have to tell a story. We have to tell a story beyond the numbers that’s

futuristic, inspiring, and

leads to great outcomes.

But the other aspect of learning and embracing technology is not just the Millennials and those

mid-career in terms of upskilling and staying current, and harnessing technology, but it’s also

tone at the top. It’s critically important that those at the top embrace analytics as the new science

of winning in the market, of being a competitive differentiator, to know your consumers even

better than they know themselves in terms of needs and wants and desires of future purchases.

So it’s very important that senior leaders embrace technology, not necessarily to become experts

in programming and coding, but to understand the value that analytics and technology can bring

to

a differentiated value proposition to support training, courseware, education, and more so,

many would say that it’s crunch time for finance and accounting. You know, finance talent


Page | 7

models are evolving quickly with a premium placed on data scientists, business analysts, and

storytellers. Everyone’s talking about it, the big four, Downmarket, McKinsey, Accenture,

etcetera, etcetera.

And so this is the trend we’re seeing to embrace technology, but to also step up and upscale in

the area of analytics so that we can add more foresight and insight. And the softer skills of us as

humans-empathy, active listening, professional judgment, professional skepticism-really is not

replaceable by a robot.

So some key messages, automation’s been around for a long time. What’s different now? Look,

we’ve been through multiple industrial revolutions. So why is automation the talk of the town, so

to speak, whether you’re a consumer or a professional. Well simple, the reason it’s different this

time is because we’ve programmed robots to be much smarter to think, to synthesize, to react,

to adapt. However, keep in mind that it’s human beings doing the programming and harnessing

the computing and cognitive power of the machine and not vice versa. Second key message,

look, we have a choice here. We’re at a turning point in our profession. We could choose to be

more

relevant and influential or less relevant and influential. Think about it this way. As automation

over time replaces or displaces more routine, repetitive tasks, it is a reality that some of our jobs,

some of our tasks will go away, especially in auditing and transaction processing. That’s at one

end of

the spectrum. At the other end of the spectrum, if we say, you know what? This advanced

analytics and providing insight and foresight is not what I was trained to do as an accountant. It’s

really not for me. I’m going to outsource the analytics and the deeper thinking, the deeper

judgmental capabilities to a consultant, to a machine, to whatever. Well, my question to you is

what’s left? What’s left of our great profession. So this is a call to action for us to upskill in

strategy management, data science, data analytics, and more. It’s a business imperative critical to

the

future of our profession. So nothing less than our relevance and influence is at stake.

And when you think about the rate of change of technology, I mean just two years ago, who knew

what blockchain is or was. Today we’re talking about more and more use cases of blockchain in

financial services and insurance. And more, five years ago we weren’t even aware of what

robotics process automation is all about. And now there’s hundreds of use cases in organizations

large and

small to create greater operational efficiencies. And then the final point, closer to home as

your CEO, CEO of IMA, is we are absolutely committed to being leaders in preparing you for a

great future. By the way, that future is now.

So I’ve referred to data science several times, and it could be a little bit intimidating. Change is

hard. Upskilling is necessary, but data science is really the overlap, the intersection of three

broad domains that I’d like to describe to you in turn. The first is business context. You know,

whether you are an FPA professional, financial planning and analysis, a statistician, an


Page | 8

econometrician, regardless of your level of analytical horsepower, you have to understand the

business question being asked. What is the problem we’re trying to solve? What new discoveries

are we trying to seek? As we mine through data and understand data, what are the opportunities

to learn something else to put us at a great competitive position? So business context,

understanding how customers behave, understand how value and cash flows in the organization

within your industry vertical is very, very important. To ask the right questions and to seek

answers in the data as a data explorer, business context, critically important. The second element

of data science is the technical modeling, some level of statistics, some level of knowledge of

forecasting tools and techniques. And this can be a little bit intimidating and overwhelming, but it

also is very manageable. If you focus on the problem at hand and the opportunities to be seized,

I’m not suggesting that every finance and accounting professional be a Ph.D. statistician. But I am

suggesting that the more data science, the more statistics you know and can apply, the stronger

you will be in this brave new world of digitalization. There was actually a book written called

exploratory data analysis written by a famous statistician. And that very much is what statistics is

all about. It’s not just reporting averages and standard deviations, it’s to kind of seek new

possibilities by mining through large sets of data and doing forward-looking analysis. In fact,

according to Glassdoor, data scientist is the No. 1-ranked job in terms of satisfaction and entry-

level pay. So we need to infuse more and more of that at a reasonable pace, at a reasonable level

into our profession. And then the third component of data science is the actual applications. It

could be Excel, advanced analytics and data mining and Excel. It could be open source software

like R, visualization, like Tableau, Python, Power BI. So learning these tools and application

capabilities is also critically important.

I made reference earlier to a book that while it’s over 12 years old is timeless and it’s called

Competing on Analytics, The New Science of Winning. It’s one of my favorite books because the title

speaks volumes for what we must be doing in a competitive age. When you think about winning

in the market, you think about apps and things that face consumers, not things that are kind of in

the background but make no mistake about it in a incredibly treacherous competitive

environment with geopolitics at play, nontraditional competition analytics is and can be the new

science of winning. But you know, analytics should not be equated with information technology.

It is the human and organizational aspects of analytical competition that truly differentiates. You

know, there is a language barrier though when we think about analytics, data scientists, IT

professionals, finance and accounting. We’ve got to get over it. We’ve got to solve it. We’ve got to

become data translators. Finance and accounting professionals need to learn and understand

more about IT. IT and information systems professionals need to learn and understand more

about the language of finance and accounting. Because guess what? At the end of the supply

chain, consumers and shareholders are expecting that we’re doing that translation. So there’s

plenty of opportunities to upskill in these areas and to speak the language of the machine and

speak the language of business and who better than management accountants, CMAs, CFO

teams to do that translation to create great, great outcomes. How to prepare for the future of

work.


Page | 9

In summary, look, there is no simple or prescriptive solutions. The environment is moving too

quickly and too rapidly. Technology is evolving very, very quickly and rapidly, and we must not

only keep pace, we must stay ahead. So as we embrace this new future, which is here today, we

need a combination of technological know-how, problem solving, and critical thinking as well as

the human skills, the soft skills as perseverance, collaboration, and empathy. We are absolutely

committed to preparing you today and in the near and future, whether it’s five years out or 10

years out, that’s our obligation to you. Thank you.

Innovation and Change These emerging technologies continue to drive innovation and change throughout the business

landscape. While these specific trends can seem abstract, accounting and finance professionals

do need to have an understanding of what these technologies are.

Arguably more important than any specific technical knowledge, however, is the ability to

leverage these innovations to become strategic partners.

Section 3: The Changing Role of the Management Accountant As management accountants adapt to new technologies, they are well-suited to

increasingly assume the role of facilitator, using the breadth and depth of their knowledge

to find opportunities for improvement and craft a shared vision for product stakeholders.

Management Accountants' Skills Management accountants already possess most of the skills needed to implement advanced

analytics: They have a holistic view of business, they intuitively understand the interrelation

between financial and strategic business decisions, and they have strong written and verbal

communication skills. When management accountants can combine these skills with

technological knowledge and increased aptitude, they have the power to add further value to

their organizations.

In this course, we’ll explore how to build on these on these skills by understanding some of

the current technology trends. These trends lead to the creation of data that can be analyzed

for value-added, strategic decision-making.

Section 2: Keeping Up With the Trends As a management accountant in the digital age, it’s important to have a fundamental

working knowledge of technological trends. Those include robotic process automation,

artificial intelligence, and blockchain.


Page | 10

Important Technology Trends Defined

Let’s take a moment to define each of these.

Robotic Process Automation (RPA) is an advanced and intelligent form of process

automation leveraging software tools (or software robots) to effectively and efficiently

perform tasks traditionally performed by humans.

Artificial Intelligence (AI) is a field of computer science concerned with empowering machines

to think, behave, and act like human beings to draw conclusions and make judgements.

Examples include speech recognition, machine learning, and natural language processing.

Blockchain is a decentralized and distributed ledger of encrypted information. Uses for

blockchain include cryptocurrencies, smart contracts, and transparent proof of work.

Management accountants can leverage emerging technologies such as AI and RPA to streamline

processes and decrease the amount of time spent on repetitive tasks. They can instead focus

their time on strategy and decision making.

Leveraging Technology For Leadership Once accounting and finance professionals are armed with the knowledge of which technology

trends are currently generating attention and attracting investment, there are several

approaches and tactics they should embrace as they leverage technology to elevate themselves

as strategic thinkers and leaders. Click each icon to learn more.

Communicate Implications

Be able to communicate the implications of technology for the organization and its strategic

priorities. Not every new technology platform or tool is going to be equally helpful for every

organization. Regardless of whether it’s RPA, AI, or some other emerging technology, every

tool has its own set of pros and cons. Taking an objective perspective on technology and

articulating the pros and cons of these tools are critical steps for effective leaders to take in an

increasingly digital world.

Link Stakeholders

Link different stakeholders across the company into the conversation. One of the most common

pitfalls that can trip up even the most sophisticated and well-prepared organizations is when

important dialogue occurs in silos. Discussing technology tools and options with only the

accounting and finance team isn’t going to drive change and might very well lead to project

failure. True leaders and strategic management teams understand that involving as many

participants from different departments as possible creates a more diverse team, better

solutions, and more robust use cases.


Page | 11

Identify and Present Easy Wins

Identify and present easy wins. Having a conceptual understanding of, and a vision for, how

technologies will eventually transform a profession, subset, or field is a great start but

doesn’t usually generate actionable ideas for professionals to implement. After

communicating the potential of a technology or process and linking in a variety of

stakeholder groups, presenting

goals that are achievable in the short term and documenting first steps toward achieving them

will get the ball rolling within the organization, increase the likelihood of stakeholder buy-in, and

help to create enthusiasm for future projects that are likely to be more complicated.

Section 4: The Big Picture of Data The best way to begin is to learn the terminology surrounding data. The first term is

“architecture,” and just like when designing a building, the architecture is the overall structure of

whatever it is we’re trying to create. First term is “architecture,” and just like when designing a

building, the architecture is the overall structure of whatever it is we’re trying to create.

The next term is “solutions” because the only 100% consistent aspect of any analytics project is

that we are trying to solve some problem or answer a question. “Solution architecture” may

take many forms from a simple report performing calculations once a week to an automated

artificial intelligence (AI) ultra-high-frequency trading application.

“Patterns” are templates or guidelines that solution architects will reuse frequently.

Architect a Solution In our new data-focused terminology, we hope to architect a solution using an established

pattern. There are different types of solutions that may use different patterns and/or expertise.

At the highest level (a.k.a. “least detailed” or “10,000-foot view”), we typically group solutions into

domains.

Just as a building architecture may require domain solutions in landscaping, interior design,

plumbing, and more, an analytics solution architecture may include domains such as data

storage, data creation, and different business specializations. To architect effective analytics

solutions, we require knowledge and skills in multiple domains of business data.

High‐Level Domain Design Pattern

Next, we use the following high-level domain design pattern: data creation, data storage,

and business domain(s). The business domain may include a single business domain, such


Page | 12

as marketing, or many, such as marketing, supply chain, and finance.

Data Governance, a set of processes and policies that help an organization manage and secure

its data, which we will discuss in later lessons, is an integral part of every domain.

Section 6: Data Driven Culture

There are four essential elements in establishing a data-driven organization. These include data-

savvy people, quality data, appropriate tools, and processes and incentives that support

analytical decision making. Organizations attempting to adopt leading-edge analytics often face

challenges in each of these dimensions. The result is an inability to effectively support managerial

decision making through the use of analytical technologies.

Much of the focus on implementation of advanced analytics has been on the tangible elements of

a successful data-driven organization (people, data, and tools). Less attention has been paid to

the fourth factor -- organizational intent. An organization committed to the goal of being data-

driven will work to develop the people, data, and tools needed to accomplish that objective.

Resolve to Be Data‐Driven

Becoming a data-driven organization requires creating structures, processes, and

incentives to support analytical decision making. It requires the organization to resolve

to be data-driven and define what it hopes to accomplish through the use of Big Data

and analytics. The top leadership of the organization needs to describe how analytics will

shape the business’s performance.

Six Factors For a Data‐Driven Culture

While many organizations are striving to implement a data-driven culture, success isn’t

assured. Achieving this goal requires that certain elements be present. There are six key

factors for successfully establishing a data-driven organizational culture. Click each icon to

learn more.

Having the right tone at the top. Setting the right tone at the top is critical for most

organizational initiatives, and this includes developing a data-driven culture. In most

organizations, executives are championing the use of leading-edge analytics, although in some

companies the initiative is being led from the bottom up, with various departments being first to

embrace it.

In addition to championing the use of leading-edge analytics, executives in a data-driven


Page | 13

organization must consider ethics in the context of technology and analytics. While ethics is

not covered in this course, it is important for the executive team to incorporate ethics policies

for managing technologies.

Having strategies for the effective use of technology. The ability to use leading-edge

analytic techniques effectively is important for a variety of reasons. Companies whose decision

making is reactive to the competition are less likely to have developed strategies for the

effective use of techniques and technologies. Being reactive instead of proactive implies that

these organizations lack the ability to predict trends or to turn customer data into useful

insights that can be used to enhance the organization’s business.

Having a commitment to collecting and using data from both internal and external

sources to support analytics efforts. To harness the potential of leading-edge analytics,

organizations need to utilize a wide variety of data sources. This is especially true when it comes

to strategy development and execution. In this regard, about half of organizations use data from

both internal and external sources. Of concern is that the other half of organizations are only

using

internal data, only using data to validate strategy post-execution, or (in a few cases) not using

data at all! Using a wide variety of data sources yields better insights. Organizations that truly

want to derive value from their data must be comfortable with complexity and remain flexible

enough to respond to what the data tells them.

Using both monetary and nonmonetary rewards to promote analytical decision

making. Slightly more than half of organizations use incentives to promote analytical

decision making. These can be monetary, nonmonetary, or both. Yet nearly half of

organizations aren’t doing so. This may be a mistake: The use of incentives is key to conveying

the importance of developing

enhanced analytics capabilities throughout an organization. Those that do believe in the

importance of developing such capabilities are more likely to create the appropriate culture

by providing incentives to their employees.

Having a willingness to adequately provide resources to the analytics efforts. Organizations

often are facing resource challenges concerning the development of enhanced analytics

capabilities. By far, the most frequently cited challenge is the ability to find staff with the

necessary skill set. The next most common resource challenge is budget. A third challenge, related

to the previous two, is a lack of staffing resources and competing priorities. Clearly, these four

essential elements needed for companies to develop advanced analytical capabilities are


Page | 14

interrelated: data- savvy people, quality data, state-of-the-art tools, and organizational intent.

Alignment of analytics efforts throughout the organization. Responsibility for analytics can

reside in various parts of an organization. It has been argued that CFOs should “own” analytics

as they are regarded as impartial “guardians of the truth.” Most companies seem to agree, with

finance being an owner (although often not the sole owner) of analytics. Other popular options

include analytics being owned by IT, a dedicated analytics group, or operations, or having each

department independently maintaining its own analytics capabilities. Of course, these options

aren’t mutually exclusive, with a variety of possible combinations, the most popular being

analytics jointly owned by finance and IT.

The Benefits are Clear

The benefits of implementing a data-driven culture are clear. Organizations possessing such

cultures more effectively perform key business processes such as strategy formulation and

performance evaluation. In implementing such a culture, establishing processes and

incentives that support analytical decision making (i.e., organizational intent) is critical.

When deciding to venture along the path of implementing leading-edge analytics, evaluate the

extent to which the six factors discussed are present in your organization. By ensuring that

they are, you can improve the chances of successful implementation and achieving the

competitive benefits that come with being data driven.

Section 5: Business Domain Pattern

What does the interrelation of data creation, storage, and business domains look like?

Consider a traditional business selling widgets directly to customers.

Data is created by customer activity such as purchasing widgets or browsing websites. Data

integration moves data generating application and into data storage. Raw data transactions are

cleaned and shaped with other data into an aggregate data store. Aggregate data is used for

descriptive, diagnostic, predictive, and prescriptive analytics by various business domains. The

business domains use the analysis to make business decisions on how to sell more widgets.

Business activity resulting from business decisions influences customer activity.

The entire point of this cycle is for the business to better influence the customer. This could be

through product design, marketing, support, warranty, shipping, inventory, etc. The business

exists to make decisions that result in value to its customers translated into profit. Therefore, the


Page | 15

goal of an analytics solution architect is to accelerate the cycle of customer activity to business

activity.

https://sfmagazine.com/post-entry/january-2019-leverage-technology-to-become-a-strategic-

partner/

Data Science This is data science: the intersection of data, analytics, and business decision making. The current

state of data science is to accelerate the data-to-decision process.

The Management Accountants’ understanding of data science better facilitates the data-to-

decision process for their organizations. In this course, we will focus on the Statistics &

Analytics component of data science.

Enhancing Analytical Capabilities Is Critical to Success

The Digital Age is upon us, and it brings with it challenges and opportunities for businesses.

Management accountants have the opportunity, and need, to develop their data and

technological skills so they can use advanced analytics and glean new insights from their data for

their organizations.

The collection, assessment, interpretation, and use of data are enabling companies to create new

business models and make existing ones more efficient. Most organizations now believe that

enhancing their digital and analytical capabilities is critical to their continued success and survival

and are leaning on management accountants to take the lead in facilitating this strategic

initiative.

As management accountants develop their skills and complete their repertoire of strategic

business competencies, they can share their holistic view of business and the interrelation

between financial and strategic business decisions to communicate the importance of

creating and establishing a data-driven organization.

Lesson 2: Making Sense of Data

Section 1: Introduction to Data

Big Data is top of mind for many finance leaders. The question is: How can we leverage all


Page | 16

this data to drive business success? How can we can turn data into meaningful action?

The progression includes identifying key information from the data, turning that information

into knowledge, creating valuable insight from that knowledge, and taking action.

In this lesson, we will further define the progression and life cycle of data and introduce how

and where data analytics fits in.

Phases of the Data Life Cycle

Data is an essential part of any enterprise. An enterprise that’s agile and innovative requires an

understanding of data as it flows through the organization, interacts within various

departments, and transforms itself. Though an international standard for the data life cycle

doesn’t exist, the following phases, in order, are identified as typical during data life cycle

management. Click each image in the order shown to learn more.

Capture. Business is surrounded by data, but an enterprise needs to capture it in order to

make use of it. Data capture occurs in three major distinct ways:

· Data Entry. Manual or automated entry of data into the data warehouse to create new

data values.

· Data Acquisition. Acquiring or transferring data from an already existing data source

or data warehouse.

· Connected Devices. Internet of Things (or IoT), or the interconnection of computing

devices enabling them to send and receive data, has will continue to transform the way data

is captured by making it real time and continuous as devices listen to and interact with the

environment and each other. These devices capture and transmit the data so it can be stored.

Qualify. Have you ever wondered why month-/year-end close processes are prone to errors or

why reconciliations take such a long time? Inaccurate or incomplete data may lead to major

problems later in the data life cycle. These problems may include critical business processes

being held up, bad decision making, or final reports running afoul of compliance because of

erroneous data values. In this phase, data is assessed for its quality and completeness using a set

of predefined rules.

The Capture and Qualify phases are traditionally seen as under the purview of the IT team,

which sets up the system architecture. But management accountants, with their knowledge of

accounting processes and the way data will be utilized in later phases, have the ability and

responsibility to envision the framework of the system in partnership with IT.


Page | 17

Transform. The advent of Big Data has led to enterprises being able to capture a seemingly

infinite amount of data. Couple this situation with IoT, and soon you can be drowning in data.

Thus, enterprises have to transform, synthesize, and simplify the data so it can be utilized by

functional departments. This phase is commonly called “analytical modeling” in the financial

world. A certain

level of functional expertise is required at this phase as data from different sources is

linked together to find the intrinsic value hidden beneath.

Utilize. The final aim of data is to help enterprises make good business decisions, and, in some

cases, data itself is the final product or service of the enterprise. Either way, the true value of

data is unlocked in this phase, and the previous efforts made in data capture, qualification, and

transformation finally bear fruit. Management accountants act as business partners during the

Transform and Utilize phases. As business partners, we need to translate the data values into

business stories that help enterprise leadership understand the magnitude of their decisions

and their long-term impact.

Report. This phase relates to external reporting. Internal management reporting for decision

making is realized during the Utilize phase in the data life cycle. External reporting could involve

quarterly/yearly financial reports, financial data sent to other vendors for bids, and other

compliance reports. Reporting of data is a key phase of the data life cycle that is ripe for

automation. Since rules, definitions, and requirements for these reports either rarely change or

have slight changes year after year, automated processes designed by management

accountants help to create and publish these reports in a more efficient manner.

Archive. This is the beginning of the end for the data that the enterprise has spent a considerable

amount of time and resources on to unlock its value. Data archiving is the transfer of data from

an active stage to a passive stage so that it can be retrieved and reutilized as needed.

Purge. The final phase of the data life cycle is the removal of the data (and any copies) from the

enterprise. It occurs in the data archive and is sometimes accompanied by a communication

both inside and outside the enterprise.

Financial compliance rules within an enterprise or those imposed by regulatory bodies

normally drive the Archive and Purge phases. And management accountants act as custodians

to ensure that these compliance rules are followed within the enterprise for financial data.


Page | 18

Section 2: Introduction to Data Governance

As users of systems, technology, and data, finance professionals are invested in the integrity

of our information and its sources. IT and data governance can provide the structure and

rules to ensure data accuracy and availability while managing the associated risks.

Every organization with shared data is concerned with data integrity. Data governance, a

specific sub-element of IT governance, parallels the capabilities of corporate governance and IT

governance at the data level. This can be as simple as a set of rules specifying what data (for

example system fields) is to be entered by whom, when, and from what source, to as complex

as you want (as an example multiple levels of data entry, audit, and control structures).

In this lesson, we will identify how organizations can build trust in data by implementing a

data- driven culture with effective and efficient data governance.

Data Governance and the Data Life Cycle

Data governance helps an enterprise administer the data as it flows through the various phases

of the data life cycle discussed in section 1 of this lesson.

During the Capture phase, enterprises need to identify the capture points for the data and

define the data that will be captured.

As data enters the Qualify phase, the rules of data governance act as a check to ensure

that inaccurate data is identified, assessed for completeness, and secured.

At the Transform and Utilize phase, focus shifts toward adherence to transformation rules and

the legal utilization of the data according to regulatory standards for decision-making purposes.

As

the Reporting phase is all about showcasing data to external parties, data governance lists

the steps to take when inaccurate data is reported outside the enterprise.

Archiving data relies on a set of rules that define what occurs, as well as when and how. And in the


Page | 19

Purge phase, it’s critical to set a purge schedule for the data as per the retention period requirements.

Why Data Governance is Needed

Effective and efficient data governance can facilitate powerful analytics and decision making

across an organization. Demand for advanced analytics is increasing, and, as previously

discussed in the course, so, too, is the expectation that management accountants will perform

the analytics.

But how often have we heard of an analysis resulting in a business decision that later proved

problematic because of faulty data? As technology makes it easier for employees to access data,

write reports, and conduct their own analysis, data governance becomes an even more

important safeguard to ensure the integrity of underlying data.

As businesses become more data-driven, data governance provides the foundation for

growth into predictive modeling and automation.

Data‐Driven Culture

When asked, 99 percent of leaders of large organizations say they want a data-driven culture to

maximize the value of data through analytics. As we also previously identified, they aim to

make business decisions faster and more accurately through automation.

Why emphasize culture? The limitation for achieving analytics maturity isn’t usually related to

data or technology but, rather, people’s reluctance to use data and technology to answer

business questions-in other words, using data analytics rather than intuition as a driver for

business decisions. A shift is needed toward a culture that trusts that these data-driven decisions

will be effective.

Governance Attributes

For this analytics process to function effectively, the data inputs (“raw materials”) must be

consistent and reliable for the information outputs (“finished goods”) to be relevant and

comparable. Relevant, reliable, comparable, and consistent are the four desired attributes of

accounting information. Effective governance means that data used in decision making is of

consistent quality and from reliable sources. Efficient governance leverages connectivity and

technology to enable the comparison of data from many different sources and to deliver

relevant analysis.

Governance Problems


Page | 20

Unfortunately, only about one-third of data-driven culture initiatives succeed in larger firms.

Often the reasons for failure stem from insufficient data governance. If an organization suggests

its data quality is insufficient to use for decision making, that signals ineffective data governance.

If the data can’t be accessed, that’s a symptom of inefficient data governance. Click each icon to

learn more.

Ineffectiveness. “I think you forgot to adjust your dates for time zone…”; “I’ve heard ‘churn’ defined

three different ways today…”; “We can’t do that analysis; we don’t have good data….” All common

phrases for emerging data-driven cultures who struggle with effective data governance. Issues

with data and analyses, from little mistakes in calculations to instability in data sets, all build to

create a culture of mistrust in data and, by extension, mistrust in any decision based on that

data. The root cause of these issues is often ineffective data governance.

Inefficiency. Certain phrases signal inefficient data governance: “Will you email me that database

extract?”; “Why can’t I access that reporting database? There’s no way for me to get approval?”;

“It’s going to take more than a week to access?”; “I’ll just get someone to build out a new data

platform for my department.” Inefficient data governance is usually more difficult to resolve than

ineffectiveness. Sometimes it’s necessary due to regulations, e.g., open-source language

restrictions or the EU General Data Protection Regulation (GDPR); in other cases, it’s a symptom

of an organization failing to commit sufficient resources to data governance.

Often the easiest way to govern a data set is to block access. But blocking access can create

inefficiencies: While no access means no one can compromise the data, it also means no one

can use it to improve the business.

Blocked Access. Often the easiest way to govern a data set is to block access. But blocking access

can create inefficiencies. While no access means no one can compromise the data, it also means

no one can use it to improve the business.

Nonconnected Data Sharing is also inefficient. If users are constantly getting data from File

Transfer Protocol (FTP) or email, then it will be difficult for them to create a report that updates

automatically and impossible to automate decisions. Commitment to analytics governance

means giving people access to data in a way that facilitates analytics maturity to create further

value, even though that takes time.

Principles for Implementation


Page | 21

Efficient and effective data governance provides clear ownership and standards for data and

data processes to ensure data quality. Although many approaches exist for data governance

implementation, most share the following principles. Click each icon to learn more.

Accountability. There must be clearly defined ownership of, and accountability for, different

types of data. Interconnected data managed throughout the organization requires consistent

practices in order to maintain its effectiveness and value. In most organizations, data oversight

doesn’t reside within one department. Human resources is the keeper of employee-related

data, for example, while accounting maintains financial data. Shared governance brings

consistency by establishing organization-wide policies and procedures.

Standardization. Data is an asset and must be protected like one. Clear policies on access,

definitions, privacy, and security standards are needed. The committee must define the policies,

and each department head must ensure adherence. This approach will ensure that the

organization is in compliance with regulations, such as GDPR (General Data Protection

Regulation).

Quality. Analysis is a critical tool for decision making and is only as good as the data upon which

it relies. The quality of data should be managed from the time it’s captured. Good data

governance includes defining one set of data-quality standards for the organization and

establishing consistency in how that data quality is measured and recorded.

We live in a fast-paced world where management accountants are asked to provide insight and

foresight through analytics, often on short notice. Data governance helps ensure that our data is

readily accessible and accurate. That’s especially true in situations where data is spread

throughout disparate systems and departments. Often the analysis we’re asked to perform

relies on data that we don’t oversee. By coordinating with other data owners in the organization,

we can

protect the integrity of data and spend more time on value-added analysis than on scrubbing data.

Committee of Sponsoring Organizations'

Standards for data and data processes to ensure data quality are outlined in various frameworks.

The Committee of Sponsoring Organizations’ (or COSO’s) mission is to provide thought leadership

through the development of comprehensive frameworks and guidance on enterprise risk

management, internal control and fraud deterrence designed to improve organizational


Page | 22

performance and governance and to reduce the extent of fraud in organizations.

The COSO Internal Control - Integrated Framework, is a globally recognized framework for

developing and accessing internal controls.

The purpose of the COSO Internal Control – Integrated Framework is to help management

better control the organization and to provide a board of directors’ with an added ability to

oversee internal control. Internal control enables an organization to deal more effectively with

changing economic and competitive environments, leadership, priorities, and evolving

business models. The framework identifies five components that are of comprised of 17

principles. For more information, you can download the framework from the Resources link

above.

The COSO Internal Control ‐‐ Integrated Framework

As a high-level overview of the COSO Internal Control - Integrated Framework, we’ll refer to the

COSO cube and two of its principles that can be applied to data governance. These principles

reside within the Control Activities and Information & Communication components of the cube.

Click each icon to learn more about these principles and how they apply to data governance.

Principle 11 states “The organization selects and develops general control activities over

technology to support the achievement of objectives.”

Ensuring an acceptable system of internal control over systems and processes will protect

stakeholders and the data integrity itself. Management must carefully select appropriate control

activities over the technology infrastructure that help ensure data completeness, accuracy, and

availability of technology processing.

Principle 13 states “The organization obtains or generates and uses relevant, quality information

to support the functioning of internal control.”

To achieve this, information systems should capture internal and external sources of data, and

process and transform relevant data into information that is timely, current, accurate, complete,

accessible, protected, verifiable and retained. Information must be reviewed to assess its

relevance in supporting the internal control components.

Although we pointed out two principles from two different components within this COSO Internal

Control – Integrated Framework, factors relating technology must be considered throughout all five


Page | 23

components and seventeen principles to provide reasonable assurance of appropriate controls. An

effective system of internal control will demonstrate that all components are present and functioning

to the achievement of objectives throughout the entire entity structure.

Section 3: Introduction to Data Analytics Data analytics is the science of examining data with the purpose of creating actionable insight.

It gives you the ability to react quickly to an increasingly complex, volatile, and competitive

environment. Most organizations know that enhancing their analytical capabilities is critical to

their success and survival-helping them gain a competitive advantage or helping them maintain

their current market position by helping managers and organizational leaders make better

business decisions.

Data Analytics and Strategic Management

Data analytics must be anchored in the entire strategic management process: strategic

analysis, strategy formulation, strategy execution, and strategy evaluation.

It is all about quantifying business issues and making decisions with more accurate and fact-

based data.

Business analytics and business intelligence use data mining -- examining large databases to

generate information and extract patterns, statistics, and modeling software to support data-

driven business decision-making.

Data Analysis Models There are many data analytics models available including regression analysis, classification

analysis, customer segmentation, market basket analysis, and others. Let’s take a moment

to define some of these models.

Regression analysis is a statistical model used for obtaining an equation that best fits a set of

data and is used to show relationships between variables.

Classification analysis attempts to find variables that are related to a categorical (often binary)

variable.

Customer segmentation, also known as clustering, groups customers into similar clusters,

based on the values of their variables. This method is similar to classification except that there

are not fixed groups. The purpose of clustering is to discover the number of groups and their

characteristics, based entirely on data.


Page | 24

Market basket analysis tries to find products that customers purchase together in the same

“market basket.” In a supermarket setting this knowledge can help a manager position or price

various products in the store. In banking or other retail settings, it can help managers to cross-

sell (sell a product to a customer already purchasing a related product) or up-sell (sell a more

expensive product than a customer originally intended to purchase).

It is important to note that Artificial Intelligence and machine learning also play an important

role in performing data analytics.

Now you will be presented with an overview of how data analytics can be used to create value

in an organization. organization.

Video: Value Creation Through Data Analytics


Technological advances in gathering and processing data are increasing at an

unprecedented speed. Data analytics includes the extraction and analysis of data using

quantitative and qualitative techniques to gain insights, improve predictions, and support

decision making.

For management accountants, the ability to exploit this data through meaningful data analytics

is a critical component of the accounting and finance profession. In their evolving roles,

management accountants will be required to acquire enhanced skills in data mining, analysis,

and effective communication through data visualization. As business partners, management

accountants become storytellers, providing relevant hindsight, insight, and foresight to their

organization, effectively turning information into intelligence through data analytics.

Let’s take a few moments to review the key aspects of data analytics and their contribution to

the value creation within an organization. Value is added to an organization as the sophistication

level of the analytics being performed becomes more robust.

We begin by leveraging historical data of an organization to provide hindsight into what

happened. This type of analytics is known as descriptive analytics. Many accountants leverage

tools in Excel, such as pivot tables and graphing to perform descriptive analytics. There are also

many other

tools in the market being used to perform this type of analytics.


Page | 25

Although understanding what happened is very important, further hindsight explaining why things

happened is even more valuable. The type of analytics that help us understand why things

happened is referred to as diagnostic analytics. Tools often used to perform diagnostic

analytics may include Excel’s customer segmentation, what-if analysis, and multi-variable

regression. Here again, other software products in the market are also being used.

Up to now, we’ve defined analytics that provide hindsight. What if we could gain insight into what

is likely to happen going forward? Predictive analytics adds value to an organization by

attaining this type of insight into what is likely to happen. Accountants can access Excel tools like

exponential trend smoothing, and solver to perform predictive analytics. As software

sophistication continuously improves, the market will continue to expand upon product offerings

that provide this type of insight.

Taking the value added through analytics even further, Prescriptive analytics help organizations

attain foresight needed to decide what an organization should do or what actions should be

taken to create added value. Applying some of the predictive analytic tools from Excel may also

help in performing prescriptive analytics.

Finally, Adaptive analytics help further gain value by providing additional foresight into how

machine learning can help. As organizations continue to innovate new technologies with

artificial intelligence, more adaptive analytics will be applied.

Advances in technology and analytics is rapidly changing the role of the management

accountant, how their work is done, and which types of accounting functions are becoming

obsolete.

Embracing the changes brought on by artificial intelligence and further technological

achievements, will enable management accountants to play a more proactive role in

providing insight, transparency, and foresight as valued business partners within their

organization.

Understanding what it takes to perform relevant data mining, value added data analytics and

effective communication through data visualization are critical competencies associated with

the future of the accounting and finance profession.


Page | 26

ABC Electronics Data Analytics

You’ve just heard about how management accountants use data analytics to make informed

strategic decisions and increase company value. Data analytics provide hindsight, insight,

and foresight for helping companies achieve their goals.

Let’s learn about how the five types of data analytics can be used to tackle a declining

sales problem using a fictional scenario.

Let’s find out:

How the sales decline was identified and why it was problematic.

What they learned about reasons for the sales decline.

What options they had to choose from for addressing the problem.

Which options they chose to meet their goals.

ABC Electronics – Sales Decline Data Mining

ABC Electronics is a retailer of household electronics products including televisions,

refrigerators, washers and dryers, and more.

Through data mining, they discovered a decline in Brand C television sales.

ABC Electronics has an interactive dashboard that provides insightful visualization for analysis

and decision making. Take a moment to review ABC’s sales dashboard for their line of televisions.

Then click next to continue.

Data Analytics Phases

Use the interactive analytics value chart to learn more about how ABC Electronics made the

most of data mining and modeling to address the declining sales.

Working from hindsight to foresight, click on each analytics type to discover ABC Electronics

data revelations and decisions.

Descriptive Analytics

Descriptive analytics define a business problem. Raw data was cleaned, transformed,

and summarized. ABC then used this data to create their television sales dashboard.


Page | 27

The dashboard’s drill-down capabilities for descriptive information is how the Brand C sales

decline was identified.

The dashboard revealed overall television sales was up 5% but Brand C sales decreased 2%. Diagnostic Analytics

Diagnostic analytics provides possible reasons for the business problem. It tries to find

correlations of the product’s activity using data.

In the case of ABC Electronics, Inc., diagnostic analytics tries to find possible reasons for the

sudden decline in the sales of Brand C televisions by searching for relationships between data

attributes and product activity. For ABC Electronics, attributes are factors that could have an

impact on sales include:

Time of year (seasonality)

New comparable TV from competitor

Product reviews

Product price

Social media mentions

Statistical modeling looks at the relationship between the attributes and sales. Visually, the more

linear the relationship, the better the results.

What did ABC’s analysis find?

Reasons related to the attributes are:

1. The decline happened in July and August, a typical slow sales period.

2. The competitor’s product sells for less.

3. Customer reviews gave average ratings Predictive Analytics

Predictive data analytics can provide insight about what might happen given the current

circumstances. This is done through building analytical models, such as regression or market

basket analysis, to compare with actual results.

Based on their analysis, ABC Electronics’s prediction is that Brand C sales will continue to decline


Page | 28

due to the attributes of new competitor product, pricing, and customer reviews. This predicted

decline will result in leaving ABC with twice as much inventory as desired.

Is that what happened during the next sales cycle?

The results from actuals in the next reporting period matched the prediction. Brand C sales

declined 2%.

With the prediction proven correct, the model will be adjusted by assigning higher weighted

scores to the attributes associated with the decline in sales.

Prescriptive Analytics

As we just learned, predictive analytics are a company’s attempt to foresee how changes they

make might address a business need. Taking it a step further, prescriptive analytics uses complex

modelling to suggest actions to take in order to achieve possible outcomes.

The accuracy of possible outcomes depends on two important factors: data quality and model

quality.

With the prediction of twice as much Brand C inventory than needed, what do you think ABC’s best

options are?

Two possible actions were identified:

1. Offer a 30 % discount which predicts sales will improve 10%.

2. Offer a package discount for purchasing a soundbar with the television because analytics

showed customers frequently buy a soundbar when buying the television.

Adaptive Analytics

Adaptive analytics is another resource to provide foresight about what might happen given what

has happened so far. Machine learning models incorporate actual results and continuously adjust

based on new data received.

Adaptive analytics collect Big Data into one central repository. Included in Big Data is information

related to sales, marketing, email, websites, and content management systems. Machine Learning

models use this data to make more accurate predictions.

Analytics at this stage are useful in validating the accuracy of predictions. For example, if a

prediction recommends a change in marketing strategy, adaptive data analytics tells whether the

change is working and how to adjust the model if it isn’t.


Page | 29

Adaptive analytics can also help find unrelated trends such as customers who recently purchased

homes and bought a television are also likely to buy five additional electronic products.

What did ABC Electronics do?

ABC electronics used adaptive modeling to build customer profiles based on customer behavior

data collected from web ads, emails, text messages, etc. Using the information from these specific

customer profiles, ABC Electronics will plan to offer personalized promotions such as pricing

discounts and package deals to increase sales and effectively reduce inventory.

ABC will continue to adjust their analytical models as new actuals data is received. With

continuous monitoring and adjusting, ABC will be equipped to make the best use of the data

available for making effective strategic decisions.

Analytics Value at Your Organization

Time and more data will tell if ABC Electronics is able to reduce the excess Brand C inventory.

While thinking about your organization and data analytics, what information is already gathered

that can tell a story about a business need? Or what information is not gathered that could be

helpful? How can the types of data analytics help drive decisions in your organization?

Data Analytics Competencies

According to IMA’s Management Accounting Competency Framework, within the “Technology &

Analytics” domain, data analytics competencies encompass extracting, transforming, and

analyzing data to gain insights, improve predictions, and support decision making. While the

required level of competency may vary, at a minimum it must include knowing what types of

analytical models are available and to what business problems they can be applied.

Beyond that, important, perhaps essential, skills include the ability to transform raw, unstructured

data into a form more appropriate for analysis (data wrangling), the ability to mine large data sets

to reveal patterns and provide insights, and the ability to interpret results, draw insights, and

make recommendations based on analysis. At the conclusion of this module, we will begin to work

with data sets and apply analytical models to begin making business decisions.

For more information on the IMA Management Accounting Competency Framework, refer to

“Management Accounting Competencies” under Resources on your player.


Page | 30

Concluding Thoughts

No Longer a Human Endeavor

Accounting is no longer solely a human endeavor. Similar to what has already occurred in

manufacturing, transportation, and medical industries, robots are now performing previously

people-driven accounting and finance tasks, including transaction matching, variance analysis, and

reconciliations.

Exponential Growth

Technology is transforming the management accounting profession at an accelerated rate, but

management accountants should not be concerned about job security or possessing the skills to

adapt to their new roles in the digital age. RPA, AI, blockchain, and any other technological

advancement that contributes to the exponential growth of available data can and should be

properly leveraged to make strategic business decisions.

Strategic Business Partner

When financial data is available within minutes, management accountants have the time to serve

as true strategic business partners and analyze and visualize the data so organizations can

respond more quickly to the marketplace, capitalize on innovation opportunities, ensure

continuous integrity, and, most importantly, uphold stakeholder and consumer confidence.

In the next module, we will focus on the development and practical application of data analytics

and data visualization for accounting and finance professionals.

Module 1 Wrap‐up You have completed Module 1 of this course and should now be able to:

Recognize the impact of technological advancements on the finance and accounting

profession.

Describe the typical life cycle of data.

Explain the impact data governance has on business and its stakeholders.

Define the various types of data analytics and how each progressively creates value within an

organization.


Page | 31

Module 2: Visualizing the Present & Predicting the Future

Course Roadmap

Click Next to proceed to Module 2, Visualizing the Present & Predicting the Future.

Introduction

Welcome to Module 2, Visualizing the Present & Predicting the Future.

The goal of this module is to gain a deeper understanding of data analytics and data visualization

through the fictitious scenario depicted in the case study, Huskie Motors Corporation: Visualizing

the Present & Predicting the Future.

The case study is available in the Resources link of this course. You are not required to read the

case study ahead of time, however, it is available to download if you prefer to print it and follow

along.


Module 2 Goals

Upon completing these lessons, learners will be able to:

Define data visualization.

Describe how data visualization can impact the way data is communicated.

Identify various data visualization tools and their different uses.

Recognize the importance of choosing the right visualizations based on your audience. In Module 2, Knowledge Check questions are dispersed intermittently throughout the lessons. There

are a total of six knowledge check questions among the three lessons. Be sure to answer all of the six

questions in this module before moving ahead to the next module.

Module Menu

Module 2 consists of three lessons:


Page | 32

Lesson 1: Case Study Introduction

Lesson 2: Huskie Motors Operations, and

Lesson 3: The Data Dilemma.


Lesson 1: Case Study Introduction

Section 1: Introduction to the Case Study

In order to effectively work with, analyze, and make business decisions from the abundance of

data made available through automation, we have made it clear that solid data governance is vital

to the success of an organization’s strategy.

To establish a practical understanding of data analytics and data visualization, we will now

examine the opportunities and challenges created by Big Data and demonstrate how

management accountants can apply new competencies to the various data processes of an

organization.

We’ll now begin the case study Huskie Motor Corporation: Visualizing the Present and Predicting

the Future. Before we begin, it’s important to note that Huskie Motor Corporation and all of the

characters that are introduced are fictional and have no relationship to an actual organization.

Big Data

Organizations create and collect massive amounts of data as a result of their day-to-day

operations. Frequently referred to as Big Data, it represents an important asset for the

organization.

Big Data presents both opportunities and challenges for accounting professionals, who are

expected to know how this data is created, collected, stored, and accessed. As the custodians of

the organization’s assets, accountants are expected to understand and implement controls over

the storage and use of the organization’s data.

Big Data and the Management Accountant


Page | 33

As business professionals, accountants are expected to know how to use this vast source of data

to make better business decisions and identify potential risks. Understanding how to use Big Data

to formulate and solve business problems provides an opportunity for the accounting professional

to become a forward-thinking strategic partner in the organization. The challenge for accountants

is to develop the skill set needed to extract value from Big Data through advanced analytics.

Data Visualization

One skill that is becoming increasingly important for analysis of large data sets is

data visualization.

Data visualization is the process of displaying data to provide insights that will support better

decisions. Gartner’s 2017 Magic Quadrant for Business Intelligence and Analytics Platforms states

that “the visual-based exploration paradigm has become mainstream.”

Gartner identifies three platforms as leaders in visualization software: Tableau, Microsoft, and

Qlik. All three products provide relatively easy-to-use data visualization tools. In this lesson, you’ll

also learn about some of the other platforms that are available in the market. As new software is

introduced to the market, there may other data visualization tools that are leading in the industry.

Throughout this module, as you progress through the Huskie Motors case study, you’ll be taken

through a series of videos that discuss data visualization topics such as knowing your audience,

selecting the best visualization in context of the business question, various visualization

software, and much more. Dan Smith, accounting professional turned data scientist and leader

in data analytics, will take you through these important concepts on data visualization.

Video 1: Data Visualization ‐ Why Visualization

Now let’s hear from Dan Smith where he further introduces the concept of data

visualization, discusses why visualizations are important, and emphasizes the role and

function of data visualization in analyzing Big Data. Video Transcript (Est. time: 9:34 min):

In this series of videos. We’ll review the tools often associated with creating data

visualizations and provide a little context of when you would use what tool or what

visualization in what scenario. As the case study progresses, you’ll see situations

where you need to understand for whom you are creating. The visualization for that is

understanding your audience, and understanding your audience means, does this


Page | 34

audience need to explore the data and find their own answers to their own business

questions, or does your audience simply need to be informed of the answer, or do they

want to report only providing a repeatable answer to a well-defined question? We’ll use

these concepts of exploratory, informative, and reporting audience needs as a framework

to talk about all the visualizations concepts associated with those visualizations. We’ll also

dive into the source of the data and where your data visualization tools live. The data

environment concepts, again, that’s both for the data visualization tools and how the

audience receives the information provided by those visualization tools, the visualizations

that you create.

Finally at the end of the case study, we’ll discuss the future of data analytics and the role of

the management accountant in data analysis. With that out of the way, let’s talk about why

visualizations are important. The formal definition of visual analytics is defined as the

creation and study of the visual representation of information or the process of displaying

data to support decision making, but what does that mean? What the creation and study

of a visual representation of information, quote unquote, is really saying is that we’re

simply taking data and using the data, the raw data to create information using a

visualization. Now there’s an important distinction between data and information. Data

are the raw materials for the creation of information data. Our numbers, rows in a table

there, transactions before anyone sees them. Information is processed. Data, data, which

is processed in such a way to make some type of decision or gaining knowledge about the

data or what that data represents.

And what about the need for visualizations? Why are they so important? In modern

analysis, visualizations are closely associated with how humans consume information,

how humans take data and process it as information. You see, humans naturally use

visual indicators to understand their environment. Way back at the beginning of humanity

we would use visual cues to know if a plant was ripe or a saber-tooth tiger was about to

attack us. Well, understanding a bar chart or a line chart is not exactly the same as

understanding the threat of a wild animal visibility into the state of our environment.

Specifically the business environment is nonetheless very important for the modern

human, and because humans are so well adapted to processing massive amounts of

information visually, we can use visualizations to communicate complex ideas and

interactions between groups of data using visual representation of those and ideas and

interactions.

Using visualizations is much more efficient than if we had to communicate that


Page | 35

information through text or speech alone. For example, imagine explaining double-entry

accounting using only your words, no pictures, T tables, ledgers, or whiteboards. I

remember my first accounting professor talked about debits and credits being on the

right hand and the left hand side of a, of a, a T table over the general ledger. He even

stood on the table and had debit and credit written on his shoes to reinforce that visual

concept. You see, he was being a great professor because he understood that we had to

have a shared visual context of what was being explained. Without visualization, without

something tangible, you cannot know what the other person is visualizing in their mind.

It’s very difficult to be on the same page to share a vision if you don’t have something to

see in the first place.

So visualizations, in other words, create shared context between the audience receiving

the information and you the individual that’s communicating this context to them.

Speaking of shared context, let’s do a quick overview of the visualizations that we’re going

to be covering. Now, you may be unfamiliar with some of the terms that I use in this quick

summary, factors, groups, aggregates, distribution trends, etcetera, or unfamiliar with the

visualizations themselves. Don’t worry too much about that. Just make a note. We’ll define

all of these later, either in the videos, in the course content, or through reference material

which I’ll provide to you.

So let’s get started on this quick summary bar charts for comparing single values or

aggregate values between a small number of groups, a small number of groups that all

share a common numeric baseline. Bar charts are very effective when comparing the

composition of groups or factors within a larger group.

A pie chart for a single larger group or a stacked bar chart if you have many larger groups.

Are the visualizations typically used? When we display trends over time, we use lines or

line graphs when we have two separate fields, two separate sets of values, and we want to

explore the relationship between them that’s facilitated by scatterplots or principal

coordinates plots. When we look for the relationship between many groups within those

two values, so multiple factors within a set of two values, we use a colored size or shape

scatterplot or we may use something like a heat map to indicate high vs. low values within

a table. This is known as a heat map just because it looks like there’s hot and cold areas

on your table. You’d say a way of getting a quick visual indicator of hot spots while at the

same time providing raw numerical context.

Going onto more advanced visualizations. When we want to look at statistical measures or


Page | 36

traditionally statistical measures like average mean median values, these are known as

measures of central tendency. We need to understand how the data is distributed around

that central tendency, so we use visualizations like histograms to see the distribution of

data around an average mean, median so that we can see how spread out the data may

be. If we want to compare multiple groups and their central tendencies and distributions,

we would use a box whisker or a candlestick plot. When we have a lot of information that

we want to communicate and perhaps we need to add an interactive component to it,

allowing people to explore that information, we would use a dashboard. A dashboard is a

combination of visualizations oftentimes with either filters which allow us to modify the

underlying data that populates the visualizations or, that is, filter the data.

Either filters or we can include a drill-down component where you have one visualization

interactively modify another visualization. I would be able to click on one so that I could

see more details about that visualization.

That was a very fast summary. We defined visualization as the creation and study of the

visual representation of information. We learned visualization is important because it

assists with informing an audience and the exploration of data. Finally, we introduced

many visualization techniques. Again, don’t worry if you’re unfamiliar with some of the

visuals or the language I mentioned. We’ll go into more detail later and also there may be

some visualizations that you really like or that you’re really interested in that I didn’t

mention, like a Sankey diagram, waterfall chart, or ribbon chart. We’ll dive into the core

visualization concepts used across all of these visualizations, so the underlying shapes

and lines, the mechanisms of how they’re communicating information. This will allow us

to understand the fundamentals of all visualizations, visual analytics, everything from

simple bar charts to complex infographics that you find online on websites. I look forward

to this journey, individual analytics with you.

Next up, we’ll be talking about the tools used to create visualizations. Video 2: Data Visualization ‐ Tools of the Trade

While the case study identifies three platforms as leaders in visualization software,

there are many others to be aware of. Dan will now discuss the various visualization

tools, and the strengths and weaknesses of each.


Page | 37


Let’s begin our discussion on visual analytics by discussing visual analytics tools. In this video we’re going to talk about what, exactly is a visualization tool and why are they becoming more and more prevalent in business. We’ll cover the general concepts applied by visualization tools so you can begin to feel more comfortable using a wide range of tools as opposed to feeling like you have to specialize in just one tool over another. And finally we’ll outline the general categories of types of visualization tools, the spreadsheet- focused ones, the ones that concentrate on visualizations first, the workflow-centric tools, and the code-based tools. In the case study, they talk about how the visual-based exploration paradigm has become mainstream. Now that’s a very consultanty way of saying that people now expect to be able to connect and explore their data in a visual platform. Well, what do I mean by visual platform?

Well, literally anything that can connect to a data source and modify the data in a way that

displays a visualization can be called a data platform. Okay, fine. So what do I mean by

data source, and what do I mean by modify data? Well, we’ll get to all of that towards the

end of the case study when we’re talking about the data and data environment concepts.

But for now, understand that particularly when it comes to Big Data, you’re not going to

be able to pull all the data into a single spreadsheet because the data is simply too large.

But more on that later. Let’s focus for now on the tools themselves. So what, what do I

mean when I say tool? Because any of the tools, Tableau, Power BI, Click, Excel, Jupyter

Notebooks, R Studio, RapidMiner Nine, the list goes on and on.

Any of these tools are able to create pretty much the same visual analytics. They just have

their own way of doing it. And that’s because all these tools apply the same fundamental

concepts. And I’m not talking about just visualization concepts, I’m talking about the

highest-level analytics concepts. You see we tend to use words like tools, concepts, and

technology interchangeably, but technology is the application of concepts using a tool.

Visual analytics tools apply analytics concepts to create useful information. Competency in

a tool is knowing how to build a bar chart in Tableau or in Power BI. Competency in the

concept is knowing when to use a bar chart or should you use a line chart or should you

use something else instead? How does the data need to be shaped so that you can do

that? How do I connect to the data? There’s a bunch of underlying concepts that make

these tools easy. And if you think about it, when you hire a contractor to oversee the

construction of your house, for example, would you care what kind of hammer they used?

Probably not. The tool isn’t as important as the person’s capabilities. Just like when you

first learned to be a management accountant, did you have to learn Excel first? I doubt it. I


Page | 38

don’t remember there being questions on Excel when I passed the CMA. Building an

analytics solution, visualization or otherwise, can be viewed through the same lens. Our

concern is how we are applying the visualization concepts so that they can be understood.

Now, this may seem like an unimportant distinction. Why am I going on and on about

this? Because it’s so important that you understand the difference between a tool and the

concept the tool is applying. That’s how you can see me jumping from Power BI to Tableau

to RapidMiner to all these different tools. It’s not because I’m a genius. No, I’m no smarter

than anyone else. It’s just I know the underlying concepts so I can easily jump from one to

another vs. having to learn specifically how to do something in one tool.

We’ve talked in depth about why concepts are important. What are the concepts? Same

analytics concepts used in data science. You connect to the data, you transform that data

and information, and you represent the data in an understandable way. The specific

concepts will be covered in the visualization concepts video, and the data concepts and

connecting to data concepts will be covered in the data and data environments concepts

video. But at a high level we connect, we transform, we represent. So what are the

categories of these tools? Well, and this is purely my personal experience. I feel there are

roughly four general categories to which visualization tools fall under. The categories that

I found are a spreadsheet-focused, things like Excel, Domo, the, a visualization first, the

ones where they focus on you being able to create a visualization. Those are the famous

ones like Tableau and Power BI.

Then you have ones that are centered more on the workflow, workflow process

management utilities like RapidMiner and Nine. Finally, code-centric tools like Jupyter

Notebooks, Zeplin. Even our studio notebooks like let’s dive into each one of these. So first

are our spreadsheet focus tools and Excel is the bread and butter of accounting. Many

visualization features have been added to Excel. Excel itself is not a terrible visualization

platform, particularly if you’re using pivot tables and pivot charts instead of making new

tabs and new data sets for each visualization. Pivot charts actually work a lot like the other

visualization tools, so if you’re using those, you’re probably comfortable with the

visualization focus tools already. Because of Excel’s market dominance, many apps have

tried to mimic or extend Excel’s functionality. After all if you can’t beat them, join them.

Apps like Domo, Pitch that you can do advanced analytics within Excel, but there’s a lot of

operations happening on the back end.

I’ll put up a little information in the course content to links to some of the platforms. But


Page | 39

I’d wait until you’re finished with all of these videos before you start exploring them so

that you can better understand the data and data environment concepts. I hear people

say that I need to see my data as a common complaint on the other tool categories,

particularly for people that started with Excel. I mentioned before, though, that

sometimes you have terabytes of data. You have so much data, you can’t bring it into your

computer. It’s just not practical to see all of the data. It won’t fit onto your computer. And

even if it could, there’ so much of it that seeing each individual row isn’t going to provide

that much value to you. There are plenty of ways where you can click on a visualization, a

component of visualization, and see what the details are about that point.

That’s called drilling down, which is a component in dashboards. And we’ll talk about that

in depth in a future video. For now, sticking to the tools themselves, one of the primary

ways of creating a dashboard, particularly interactive dashboards, are the visualization-

focused tools. These visualization-focused tools might be more likely called focused

platforms since they tend to encompass both the transformation of data, data workflows,

and the visualizations themselves, however, they’re visualization first and they tend to

abstract a lot of those data transformations. So if I say abstract, it means that the tool is

determining how to perform an operation, not the user, which may not always be ideal,

particularly if you need to communicate how that tool is doing the transformation. Since

you can’t see the back-end operation with visualization tools, though, as their name

implies, they can look gorgeous. Those gorgeous visuals abstract a lot of complex

operations down to relatively simple steps, so the good-looking visuals can be created

quickly because of all that abstraction, you can create interactive data, connected

dashboards.

Those dashboards can be made into reoccurring data, connected reports. They can extract

large data source queries from applications like Hive, Spark, Big Query, Redshift, etcetera,

for those exploratory operations. The major players in visualization-focused platforms are

Tableau, Power BI, Click, lesser-known players like Burst, Pentaho, Spotfire, and if you don’t

have access to one through your workplace and you want to learn, both Tableau and

Power BI have free versions, Tableau Public and Power BI Free License, respectively. In

terms of the capabilities of these platforms, there’s pros and cons to all of them. If you

want to use AR or create some of the more advanced visualizations, Power BI is considered

by some to be a bit better, however, they use an underlying language called Dax, D-A-X, in

order to create functions. So if you need to create custom functions or custom operations

within the platform, there’s a pretty steep learning curve.


Page | 40

Whereas building custom calculations in Tableau, it’s generally easier because they have a

more abstracted, a SQL language that makes it easier to create custom calculations. But

that level of abstraction can make it hard to extend the platform beyond making a custom

calculation. So due to the difficulty in exploring the workflow and the different operations

used to create a visualization or an analysis, tools that are focused more on the workflow,

the transformations that are done to your data set. So if you look at things like

RapidMiner 9, they’re great for exploratory data mining and for creating visuals of the data

exploration. All you really need to know is search the name of the concept you want to

apply and you’ll see a module for it. They can be a little confusing for beginners and they

often require a lot of local processing, but these are a good middle ground between the

abstraction present in visualization tools and the more detailed operations found in code

focus tools. So speaking of the code focus tools, things like Jupyter Notebooks, our

notebooks is that plan. These are visualization tools second, code management tools first.

They are fantastic for exploratory data analysis because you have a clear record of every

transformation that happened to that data. You have a clear record of everything that’s

happened because it’s right there in code. In the notebooks, you can write your research

notes and your findings directly in the notebook and have the code recompile and be

completely reproducible on demand. However, because you need to know the

fundamentals of coding and that’s beyond the scope of this course, we won’t be covering

code-focused tools in these videos. However, I know you are a data scientist and I know

you already know the basics of coding even if you don’t think so.

I highly recommend you watch some introductory videos on BASIC, Python, or R to learn

the concepts. So in conclusion, in this video we covered what is a visualization tool, how

did they apply the same analytics concepts, but focused on different aspects of the

analytics workflow and or/user experience. The analytics concepts they apply are

connecting to data, transforming the data into information, and representing the

information in a usable format. Visualization tools may focus on spreadsheet data, the

visualizations themselves, the workflow to connect, transform, and represent data and

information or the underlying code behind all those data operations. The different tools

all have advantages and disadvantages, and you’ll be able to accomplish much the same

in all of them with varying degrees of difficulty and effectiveness. What’s more important

is how you use the analytics concepts to provide information on solved business

questions as well as explore what they do to find new business questions to solve. We’ll

introduce concepts around communicating results in your various audiences for


Page | 41

visualizations in our next video.

Section 2: Huskie Motor Corporation: Background

Huskie Motor Corporation (HMC) is an automobile manufacturing company with production and

sales throughout the world. Automobile manufacturing and sales is a complex and highly

competitive business. Let’s explore the Huskie Motors background in detail to learn key business

information about the organization.

Huskie Motor Corporation

Although the automotive industry has a broad global reach, only 15 companies produce 88% of

the world’s vehicles. HMC is a new and a smaller player in the automotive manufacturing market.

If it is to survive, the company must fully understand its markets, customer base, and costs to

keep profit margins positive. It has some very popular brands and high customer satisfaction-both

are critical assets at this stage of the game.

HMC: Miranda Albany

Miranda Albany was hired at HMC as a senior cost analyst three years ago, when the company

first began operations after a spin-off from Blue Diamond Automotive, a large auto manufacturing

company. Recently promoted to assistant controller, Miranda is anxious to make a good

impression on her boss.

Communication with HMC Executives

Although Miranda is sure that the data she has collected can help her management team make

better decisions, she does not have the time, or expertise, to figure out how to organize or use the

data effectively.

Miranda communicates with HMC’s executive team on a weekly basis to convey vital information

regarding marketing strategies, sales targets, and production needs. Yet she feels that her

information is often “lost in translation,” as the executive team struggles to digest the numbers.

Miranda believes that data visualization may be a crucial component in helping her effectively

connect with HMC executives.


Page | 42

Video 3: Data Visualization – Know Your Audience

Miranda’s struggle to effectively communicate to her executive team highlights the importance of

understanding the audience’s needs, and how visualizations answer business questions. Dan will

discuss how to select visualizations based on your audience. Video Transcript (Est. time: 13:34 min):

In this video we introduce the concept of understanding your audience interest. That is a

business question and how it determines the types of visualizations you should use. We

may create visualizations for audience interest of exploring data or perhaps informing the

audience or simply reporting a value. In our previous video, we introduced high-level

analytics concepts. In this video we overlay those concepts of connecting to data,

transforming data, and representing the information created by transforming the data,

we overlay those concepts with the data analytics process introduced in the case study as

well as the audience needs. Furthermore, the audience needs, the audience interest for

visualizations also overlaps with analytical maturity concepts. The analytical maturity

concept stages are descriptive, diagnostic, and predictive. Finally, we spend a little time on

what exactly do we mean when we say business question and what does that have to do

with the audience interest. So first let’s talk about the audience. It’s easy to assume that

visualizations and the concepts associated with visualizations are one size fits all.

We will soon learn in an upcoming video that there are rules for what visualizations work

best in a given scenario. So it’s only natural to assume that those scenarios are based on

the underlying data or information only. Who’s receiving that visualization should be

irrelevant, right? And sometimes there are boilerplate best practices. Sometimes there are

universal best practices for visualizations. However in most cases, context matters. The

audience matters.

For example, if the visualization explains information, we discovered in data that is an

informative visualization. The visualization would be very different than the numerous

visualizations we would have used to explore the data in order to discover that new

information that we are now informing someone else about. An informative visualization

is simple. It's to the point. And exploratory visualization is usually quite complex.


Page | 43

Understanding which visualization to use and for whom is an important concept when it

comes to visual storytelling. So visual storytelling is what you’re trying to get across. The

idea of visual storytelling means that your audience is receiving the information that they

need in order to best use that information. So somebody who would need a more

exploratory type of visualization or even an interactive dashboard would be someone like

an analyst, a manager, a consultant, somebody who really understands that data and

needs to understand the entire process of how some insight came about to really have by

and to really understand what’s happening. Alternatively, someone like an executive, a VP,

a director wouldn’t really care that much about how you reached some conclusion. They

would just want to know what that conclusion was. You would still need the underlying

evidence of how you reached the conclusion, of course, but it’s just the facts. Tell me what

matters. Those are simple informative visualizations, and when the exploration has

informed people so much that they now are repeatedly asking the same question, now

that becomes something more along the lines of a report, very similar to informative but

now we’re talking about something that’s reoccurring, something that’s about stability. So

these would be, for even an external audience, would be a reporting-focused audience,

but often it would still be the same people as exploratory and informative, just a very, very,

very, simple type of visualization.

And if you notice that I talked about that progression, that general process of taking data

and creating information for an audience, well that’s similar to the data analytics process

that we outlined in the case study. You’ll see all these steps identifying the business

question connecting to data, cleaning it, etcetera. All of those involve exploratory

visualizations. Once we’ve gotten the data and explored it, cleaned it, explored it again,

and believe we have an indication of the answer to the business question. Well, then we

use more informative visuals. Informative visuals that clearly explain the answer that

we’ve determined through our exploratory effort. Many times the information that we’ve

obtained isn’t a one-off answer. The information can provide an ongoing solution. For

example, many of the financial metrics which we’ve all learned in earning our CMAs or our

basic finance courses when we first started our job as a financial analyst. Those metrics,

those measures, like price-to-equity ratio, like CAPM, like earnings before interest, debt

and taxes, even debt-to-equity ratio.

These are not metrics or KPI, that, KPI that have always existed. Somebody originally

discovered that they are an indicator of business performance so whether informative

visualizations answer a question that is asked repeatedly, often that informative


Page | 44

visualization will become a report. A report again is a visualization, a reporting

visualization. Even a table is a visualization, needs to be extremely precise and

communicates the answer to a business question over and over and over again reporting.

Like financial reporting and visual reporting these are very rules-driven, it’s very data-

driven and requires more capability around consistency and the inference structure of the

data environment rather than necessarily the competency of how effective the

visualization may be at communicating the report. The effectiveness of the visualization

has already been demonstrated in the exploratory and informative steps and

conceptually the idea of reporting informative, exploratory, etcetera, that also aligns with

the concepts of predictive capability, descriptive, informative, the steps of analytic

maturity in analytics maturity is a business theory indicating that as an organization

becomes more capable of using their data to make informed business decisions, i.e.,

analytics, they will gain more value from their data.

The steps in analytics maturity are descriptive, diagnostic, and predictive. Many of you will

have also seen this before with prescriptive at the end, beyond the scope of this video but

it’s out there when you look it up. Going back descriptive analytics, descriptive analytics

merely communicates some historical event and leaves interpretation wholly up to the

end user. Just like a financial report. Consistency and speed are typically the most

important aspects of descriptive analytics. Diagnostic analytics, this goes a step further

than descriptive. It’s not just describing some attribute of data, it informs the audience

with additional information either to help the audience reaching more informed

conclusion or to inform the audience about the analytics conclusion. As you can see, this

is similar to informative visualizations. You will come across this in financial analytics.

When they’re taking a metric found in a diagnostic measure and they’re adding additional

information to provide context as to why that metric may have gone up or down.

Predictive analytics uses statistical models to forecast new information. Given historical

information such as changes in a KPI and the additional data around it, it’s using that to

provide a forecast of what could be a future value.

Obviously, financial forecasting is a good proxy or is a good example of predictive

analytics. Now, let’s talk about business questions. A few times I’ve said understanding the

business question or defining the business question, what does that mean exactly? Pretty

much what it sounds like. It’s the question we need to answer to help achieve our

objectives, the objectives usually solving some problem or fixing a situation. But as we

know, a simple definition, a simple question, can hide a great deal of complexity. A

business question can be interpreted as what problem are we trying to solve? And in


Page | 45

many cases, though, if we can define the problem, somebody has already figured out how

to solve it. Identifying the business question is by far the hardest part of this entire

process. Case in point, I once had a client in the course of six months, this client rejected

two different visual analytics contractors I had placed at their firm.

I personally went to that client at my company's expense, shadowed the third contractor

for a week, and during that time I found that while the client had asked for a highly

technical resource, somebody that was proficient in the visualization platform, they

weren’t really defining their business questions. The client wasn’t defining what they

needed out of the visualization platform. Their process of defining their business

questions was just a kind of have a roundtable discussion with the contractor about what

their problems were in general. The meeting was more of an airing of grievances rather

than a proper request requirements-gathering session. This was my mistake. I was young

and I thought if somebody was asking for specific skills, for example, a Tableau developer,

then they may only need those skills. I looked only at the problem as it was defined, not

the root problem itself. The lesson here is that just because somebody communicates

their problem does not mean they are communicating the problem.

The question that needs answering, this is the real skill of an analyst, is identifying what

question needs answering, not just the problem as it’s stated. Trying to solve a problem vs.

determining the actual problem which needs solving is the difference between a junior

developer and a senior analyst. It’s the difference between a purely technical skill and a

broader awareness of the business. It’s the difference between being reactive and

proactive. That may seem obvious to you but that’s because you’re probably one of those

senior resources, you’ve already accomplished the hard part. In this course it’s assumed

the root business question is already defined. It’s like how introductory physics assumes

everything is in a vacuum. Assuming the root business question is defined is a necessary

assumption for you to learn the technical competencies to answer said business question.

And many of you are taking this course because you feel the need to increase your

technical analytic skills.

Those skills are important. They are the price for entry. Take it from me, management

accountants are probably the best people I’ve ever met at understanding the real problem,

at defining the actual question. So use my hard-earned experience, if you’ve already

mastered the hard part of this process, and that’s understanding the bigger picture, the

bigger needs of the business, you’re already well on your way to being a data scientist, a


Page | 46

data analyst, whatever you want to call yourself. In conclusion, in this video we’re

beginning to categorize our visualizations based on the intended audience for said

visualizations. We may create visualizations for exploring data, informing our audience, or

simply reporting a value. The audience for visualizations overlaps with analytical maturity

concepts. Exploratory visuals are often used in predictive and diagnostic analytics, while

informative visualizations are primarily in descriptive and diagnostic analytics. Finally,

because of the goal of analytics is ultimately to either answer a business question for your

audience or identify new business questions to answer, in which case you are often the

audience, we discussed what a business question actually means. In our next video, we’ll

dive deeper into those visualization concepts we keep talking about. Stay tuned.

Miranda Hires a Consulting Firm

To help her utilize the massive amounts of data at her disposal, Miranda has interviewed

consulting firms that specialize in information technology (IT) and data engineering.

Ultimately, Miranda chose D & A Consulting because of its automotive industry expertise and its

focus on data analytics and visualization.

Video 4: Data Visualization – Visualization Concepts

D&A Consulting’s expertise in the automotive industry will enable them to choose the best

visualizations to effectively communicate their findings. Now we’ll hear from Dan as he discusses

some basic visualization concepts and the various ways data can be visualized. Video Transcript (Est. time: 6:32 min):

Now that we have some awareness around how our visualizations will be received by

different audiences, let’s move our discussion towards the visualizations themselves. This

video concentrates on visualization theory, namely how shapes, lines, and points are used

to display information. Shapes represent a single value or groups of values. Lines

represent change or connection. Points represent the intersection of two or more values.


Page | 47

Just as even the most advanced statistics are still based on addition, multiplication, and

division, even the most complex visualization is formed through some combination of

shapes, lines, and points, and in this video I’ll be white boarding, because white boarding

is how we often start defining questions or creating visualizations. In practice it’s good for

us to see some of those real-world principles in action, and yes, that means that [we’ll]

basically be drawing shapes in this video, and that may make visual theory seem a little

childish at first, but I assure you we are not in elementary school again because we will

quickly move from basic shapes to much more advanced concepts.

Let’s get started. We’ll begin our discussion on visualizations with a little theory, followed

by more practical examples in the upcoming videos. Generally we have shapes like bars,

blocks, or circles, sometimes separated into segments. These shapes tend to be single or

single aggregate value and aggregate value as in sum or total. The shapes, the pattern of

data within the category, or the relationship between the categories is not important. We’ll

talk more about what that means when we cover distributions and relationships

respectively. Next, we have lines. Lines tend to represent a change or difference between

values most often associated with a change over time. Lines are visually effective at

showing that one point is lower or higher than another when on the same axis.

It’s also useful to reinforce size differences or position differences when shapes alone may

be ambiguous due to their ability to indicate direction or trend. Lines are useful to simplify

complex patterns as seen in forecasting and regression. Due to their ability to show

difference and changes between shapes, they’re useful for things like data distributions as

we see in histograms. For points, the position of a point in a visualization represents the

intersection of two values, which is why points are most often used in scatterplots. That

level of granularity is useful for recognizing larger patterns within the category or group,

as well as between the categories or values. As I mentioned, you’ll see lines used to

simplify trends or relationship within a collection of points like with a scatterplot and a

regression line. In fact, you may see shapes or colors used to distinguish patterns

between groups of points.

You can see that we mix the concepts of different shapes and lines and colors within

visualizations, so as a gentle introduction to some of the more complex visualizations, let’s

look at how points, lines, and shapes are used together to visualize more complex

patterns within and between groups of data. Visualizing patterns, the data itself is called

looking at the distribution of your data. Distribution visualization such as histograms and


Page | 48

box whisker plots combine both lines and shapes. There are even points showing outliers

at times, and when I say distribution of your data or frequency of data, it means how often

a given value appears within a given data element, or a group, or category, or column. It

can be the sale price of cars, age of customers, time between purchases, etcetera. The

frequency distribution or shape of data gives us more insight into the underlying data

which would not be apparent if we were only given the average or median data value.

The boxes in a box whisker plot represent the middle 50 percentile of your data with a line

that represents the median or the middle value of your data. Now you’ll also notice that

there’s a line above and below that isn’t a line so much as it is a T shape or whisker. Those

whiskers represent typically the top and bottom 25% of your data, but you may also see

lines between box whisker plots to show how different one box whisker is to another, so

you get a sense of how much overlap there are between the two. And keep in mind,

histograms and box whisker plots represent the same thing, the shape or distribution of

your data. You can look at box whisker plots as a histogram simply turned on its side.

Alright, in this video we covered how simple geometric objects can be combined to

represent complex information. Shapes indicated some aggregation of values. Lines

showed change over time and/or trends, points and intersection of values and shapes, lines

and points all combined show distribution of values with a data set. Moving forward, we’ll

see how the data visualization tools apply the concepts within the context of your case

study and look in detail on specific visualization types, both informative and exploratory.

HMC Staff

Miranda asked Megan Martinez, a senior staff accountant at HMC, and Adam Green, a staff

accountant at HMC, to work with D & A on the project.

HMC: Megan Martinez

Megan has been with HMC for two years and has recently relocated to the corporate

headquarters in Dearborn, Mich. Megan’s corporate transition is almost complete, and she is

anxious to move forward in her current position.

HMC: Adam Green


Page | 49

Adam is a young, aggressive employee who began with the company eight months ago, since he

graduated college. He has a good sense of judgment and is eager to make a good impression on

upper management. Miranda believes the two employees will provide a good mix of experience,

dedication, and teamwork.

Section 3: D&A Consulting Group: Background

D & A Consulting was started by Doug Chan and Arlo Paxton five years ago. Doug and Arlo have

been friends since college, having graduated with accounting degrees from the same university 15

years ago.

Although they initially went to work at different accounting firms, they both followed similar career

paths. After becoming managers at their respective firms, Doug and Arlo decided that their real

passion was in teaching clients how to use data to make better business decisions.

They started their own consulting firm with one primary focus: helping clients better understand

their businesses via the use of data analytics and data visualization.

Video 5: Data Visualization – Visualization Selection

D&A Consulting’s primary focus is to help clients understand their business better by using data

analytics and visualization. Dan Smith will now discuss how to determine the best visualizations in

the context of the business question. Video Transcript (Est. time: 5:20 min):

In this video, we’re going to look at the different types of visualizations and how to select a

given visualization for a given business problem, and provide a simple flowchart to aid

you in visualization selection. All visualizations have a function. You may simply be

comparing two different collections of objects such as total revenue vs. total expenses.

You may be looking at the pattern of a single value over time, such as profit over time, or

you may be exploring a complex relationship between many different elements such as

total revenue vs. expenses by country by month. Regardless, all visualization functions are

ultimately used to answer some form of business question, and you can use a simple

decision tree once you know what types of questions you need to solve.

So let’s dive deeper into defining these patterns in visualization functions. The pattern that

we see in most, if not all, visualization decision tree diagrams, it starts with a question of

are we trying to visualize a comparison, composition, relationship, or distribution?


Page | 50

And are we looking at a single point in time or a change over time, are more informative.

Visualizations are the comparative and composition visuals. Comparison is when we want

to compare differences between our groups as a whole, so the aggregates of profit

between countries or between even sales channels, something along those lines. The

standard bar charts are typically associated with comparisons and are probably the most

common visualization you’ll see. So many of you may already have experience using them

when comparing categories, factors, or groups. Again, those terms tend to be used

interchangeably. When, when comparing those categories over time, particularly when you

have many categories, it’s almost always a good idea to use a line chart. We do often see

bar charts used when it’s a small number of time periods like in financial statements from

year to year, but if we have a lot like weeks or days, a line chart is far more effective at

representing those more granular changes.

Composition, visuals on the other hand, these are typically shapes or combinations of

shapes as well. They show the categories within a larger group, think population per city,

within a state or sales per region within a company. You’ll also see composition visuals

associated with pie charts. We can talk about those. They have their use, but we’ll dive into

that more in the details of our informative visualizations.

Let’s move on to the relationship and distribution visuals. Relationship visualizations tend

to focus on how the individual values of one group change given the values of another.

The last visualization family are distributions. These are scatterplots, histograms, and box

whisker plots we already mentioned. We’ll dive a lot deeper into the relationship and

distribution visualizations when we talk about exploratory data, but let’s talk about theory

again. Why do we use these visualizations when we do? Who decided these rules?

Well, we did. It’s simply the way that humans process information. Decades ago, a famous

study by Cleveland et al tested many, many people with a series of visualizations including pie charts, bar charts, line charts, etcetera, all the things we just talked about. The study

found a few general trends in the way people interpret information visually. Visualizations

that share a baseline that start from the same point which are only viewing a single

distance or direction, for example, a bar chart, which is ultimately representing the distance

of that bar from a baseline. Those tend to be the least confusing, confusing visuals. While

people struggle to interpret the differences between area that is one circle bigger than the

other and even more so, they struggle with interpreting difference between color, shades

of color in particular. So Cleveland has a hierarchy of which visualization to use, when. We

will dive deeper into that in the upcoming videos.


Page | 51

In conclusion, in this video, we saw how taking the basic concepts of visualizations and

how we can use that to find the best visualization for most business questions. Using a

simple flowchart, we reviewed how research established using these visualizations

resulted in more interpretability and less human error. In our next two videos, we’ll take

an in-depth look at the informative visualizations, line, pie charts, as well as the

exploratory visualizations, scattered tree bubble and histograms. Stay tuned.

D&A's Business Purpose

Though data analytics is not a new concept in the business world, the amount of data available and

the number of sources from which it can be captured has skyrocketed. Driven by lower storage

costs and more “user-friendly” analysis, software businesses have vastly increased the amount of

data they collect and store. Yet finding the talent needed to transform that data into useful insights

is what businesses find challenging. D&A Consulting helps companies fill that void.

D&A Consulting

Doug and Arlo are excited about the opportunity to work with HMC. They have assigned their

automotive industry expert, Kevin Lydon, as the project lead, along with a D & A new hire, Jan

Morrison. Kevin has been with D & A nearly as long as the company has been in existence. He and Doug had

met each other on a consulting project where Kevin was working as an IT engineer. As project lead, Kevin is enthusiastic about mentoring Jan on her first assignment. He is equally

enthusiastic about the potential for improvement at HMC. He reassures Jan that this client will be a

great opportunity for her to test her data analytics and data visualization skills.

HMC: Miranda Albany, Continued

She has advocated a “data-driven” strategy for decision making at the company by capturing a

vast number of product-specific details relevant for both production and marketing.

The problem is that the company has grown so quickly that Miranda is having a difficult time

keeping up with the massive amounts of data that continue to accumulate.

To further complicate matters, there is a growing need for reporting detail and in-depth analysis

of product lines given the availability of additional data.

Video 6: Data Visualization – Informative Visualizations


Page | 52

Next, to further understand how to communicate ideas and discover business opportunities

through data visualization within analytics, Dan will now go through informative visualizations and

how they can ultimately tell a new story for the business. Video Transcript (Est. time: 9:46 min):

This video covers visualizations more focused on informative and reporting functions,

namely bar, pie, and line graphs. In our visualization selection flowchart, bar, pie, and line

focused visuals are associated with comparative composition and time series visualizations,

respectively. Before we begin exploring these informative and reporting visualizations,

recall that different audiences require different types of information communication, so our

first thought with visualizations should be to consider if our task is exploratory, informative,

or repeated reporting. In your case study, they mentioned the data analytics process.

Determining the correct visualization begins the same way. We need to first understand the

business question before we can determine the correct visualization. However, you may

find there is no clear business question or you are the only one who is supposed to

determine the business question, but for now let,s start with the assumption that we have

discovered a clear objective and the question we need answered is well defined.

Clear, well-defined question with quantifiable answers means that we should approach

our visualization from an informative or a reporting perspective. And most well-defined

business questions involve comparing the performance of one group to another. So let’s

look very quickly at the theory of why we use the visualizations we do for comparative

visualization. We talked about Cleveland before. In a further analysis using the concepts

that Cleveland uncovered, researchers found clear differences in people’s ability to

interpret comparative visualizations. The diagram on your screen shows distributions of

errors based on the perceived differences in comparative visuals. Looking at distance

from a common baseline. That's the first grouping of perceived errors, the lowest amount.

Next we saw difference in length or radius such as we would see with a, with a pie chart.

And the final one with the most error were visualizations that had a difference in shape

without a common baseline.

So these are things like bubble plots or tree, tree graphs. And this reinforces the concept

that we should use bar charts for comparative visuals whenever possible. And this is to

reduce errors. Whereas bubble and tree map visuals may look cool, but they’re not ideal for

business critical reports. It helps us to find some rules of thumb for our visualizations. So

now we can finally move into some examples of visualizations, such as if our business

question is what percent each region contributed to total sales. Now, because there are


Page | 53

only three regions, we could look for a pie or bar chart. We could pick either one, but if the

question was how much did each country contribute because there are more than four,

only a bar chart is really appropriate here unless we do some type of additional grouping.

Bar charts and pie charts are used when comparing single or cumulative values between

groups. As a general rule, use a bar chart over a pie chart, particularly when you’re talking

about actual values. We probably want to know the absolute value or actual value of a given

country in these examples rather than the percentage of total. If we were

comparing one country to another, then potentially the pie chart would be useful, but we

want to know the actual values so a bar chart makes more sense. So some rules of thumb,

you always want to use a bar chart. If you have more than five groups, if you have more

than 10 groups, consider combining some of the groups or creating a detailed

visualization that we’ll talk about in dashboarding. So now let’s look at when we are

focused not just on comparison but changing from one group to another.

If the business question is “How much did each region/country of their revenue or

contribution margin change from year to year?”, it is possible to use a stacked bar chart or

a hundred percent stacked bar if it’s percent contribution. However, given the time series

nature of the business question and just the gross quantity of them, it becomes very

difficult. It’s been shown in all the research that we’ve listed that lines connecting points

over time are more effective at communicating change. Therefore, a visualization with

connecting lines is preferred. So you can also say and suggest that of an area chart. I’ve

heard some people say this as well. Area charts don’t work very well with negative values.

So you’d see a much better representation of a change over time with the line. If the

business question is comparing group contribution to a whole over time, because this is a

compound question, that is, what is the total amount contributed by group and how

much did that amount change over time?

My preference here is to use two separate visuals, a bar chart to show total amount and

align charge. This showing percent change of total. In this situation it’s okay to use a stacked

bar but more as a visual cue to our audience. We need to consider the groups making up

our total and this is so they have a better frame of reference for the associated percent of

total visualization. Also notice I placed the first visual above the second. In the case, the

second visual is difficult to interpret without context from the first when visual

interpretation depends on another visual. While some cultures may read left to right and

others read right to left, most if not all read top to bottom. So put your simplest, highest-

level visual on top and the details on the bottom.


Page | 54

Moving onto more reporting-scoped business questions. What if you need to know how

much of this year’s variable cost budget has been used year to date? More generally, how is

a given KPI performing against its target value? Now, for the visualization alone, some would

only want to report the value using something complicated like a gauge. People as soon as

it gets into a KPI, they want to stick a gauge on there. However, in my opinion, the next

question when we’re talking about any KPI with a performance target is almost always, well

what was it doing last period? What about last year? So why not just start with a time series

graph and insert a line for the target budget. This way you know both how

it’s performing now and if it’s above or below that target. Whatever data manipulation you

have to do to insert a line will be exactly the same as with a gauge or other visualization,

plus you’re already answering questions about historical performance.

Now as a final piece of advice when talking about informative vs. reporting visualizations,

when the business question is established and requires repeated refreshing-a refresh just

means updating the data behind the visualization-this is almost certainly a report rather

than an informative analysis. Look for keywords like KPI, SLA, metric, defined business rules.

When these are spoken about the needs of the visualization, your organization may have a

dedicated reporting team. This reporting team may specialize in maintaining such reports.

If you are not part of that group, consider reaching out to them to maintain the informative

analysis that you now want to turn into a report. Personally, I would leave that up to the

professionals.

All right. We covered a lot in a brief period of time. Some key takeaways from this video.

Bar graphs work well when comparing values between a few groups on the same scale. Pie

charts may be okay when showing percent of total for a few groups; however, a bar chart

can show the same information answered both as a comparative and composition visual.

Use line charts for change over time. Don’t be afraid to combine visualizations if you need

to tell a more complicated story. And finally, your organization may have a data and

reporting specialist to help standardize your information into a more consistent report. Our

next video, we’ll move into more complicated exploratory analysis and the associated

relationship and distribution visualization.

Video 7: Data Visualization – Exploratory Visuals

As a next step, we’ll now hear from Dan as he discusses best practices and tips for using

exploratory visuals, relating back to the business question, so one can learn about the data and

find new stories to tell.


Page | 55


Now let’s move on to exploratory visualizations. In this section, we will look at more

advanced visualizations such as scatterplots, tree maps, bubble charts, principal

coordinates, plots, histogram, and box whisker plots, and let’s start with some financial

and sales metrics. What if we want to know what is the relationship of material costs to

after-tax profits?

Well, as a starting point, even the most advanced analyst will use scatterplots to start

visualizing patterns. Here you can see the material cost vs. the after-tax profits with a point

for each VIN number, so that’s what did it cost to make the car in terms of raw materials

and how much were we able to sell that car for. This demonstrates how scatterplots show

us the intersection of two related series of values. Think about two values in the same row,

but different columns. In Excel, you have a row of data, you have your column, and then

you have another column. That’s the same way you create a scatterplot. In a pivot chart, we

may be able to observe a general upward or downward trend in data points between the

two factors, or we may be able to observe groups in the data set. In the example on your

screen, you can see Bose, you can see the relationship between material costs and after-tax

profits, as you expect. Generally, we see profits decrease as material costs increase, that big

downward slice in the middle. However, there are two major exceptions. There is a group

in the upper right with really high material cost, but also high after-tax profits. And then you

see a group in the lower left with negative profits but also unusually, extremely low material

costs. So how do we examine this? Well, within the scatterplot we can start adding colors or

other shapes to start trying to pick out patterns. We can look at a model and see that the

CHARE model, C- H-A-R-E, constitutes the entirety of that upper-right-hand cluster. So we

can imagine that this is the luxury brand with high margins and high material costs.

Unfortunately, the lower left is a little more ambiguous. That looks like there's a bunch of

other models

within that grouping. So you see a scatterplot may not always tell enough of a story. In

fact, it rarely does.

So let’s look at some other factors that might be influencing these unusual patterns in

groupings of a cost vs. profit. We saw there were business questions related to channel,

so let’s explore viewing multiple channels against models against their profits. Let’s

explore viewing groups in tree maps. These are similar to pie charts but with squares

instead of slices. So I guess it’s more of like a brownie or a lasagna chart that I don’t know.

If you look at seeing after-tax earnings by channel one, two, three, and model, that’s a lot

of information. There are some groups here and there that may stand out. Um, but it’s not

that descriptive. Similar to a bubble chart, which a bubble is sized by the value. And then


Page | 56

colors delineate the different channels or models. So for each slice you get a bubble. And

the value is the size of that. I’ll be honest with you. These charts aren’t very useful by

themselves. They’re good for exploring things in a dashboard because you can mark them.

However, by themselves they don’t typically tell much of a story. It’s often more effective

to use a bar chart for this type of stand-alone visualization. But when exploring in a

dashboard, they can be rather effective. More on that later.

So as strange as it sounds coming from me for even an exploratory visualization, but also

one that can be very effectively moved from exploratory to informative, I feel a table with

a heat map is a better visualization for this scenario, and often in many scenarios, the

example on your screen, you just see a simple, well, basically, a pivot table. It’s your sales

channel by model, by the average after-tax profits. This represents the same thing you

saw on all those other diagrams. Just it’s more condensed. You can still see the same

information based on the color, but you also see a value. You see a raw number. This is

tangible and in fact, speaking of exploratory visualizations and an example of how you

would combine these so that you can explore but also inform, you can create a simple

dashboard of a drill down, and we’ll describe what a drill down means in the

dashboarding section, where you can look at your scatterplot and see how the different

clusters or values within that scatterplot relate to another visualization. Like in this case,

the table diagram that we created enough about dashboarding and groupings in

scatterplots. Let’s move on to say a business question about does an increase in sales for

one sales channel take away sales in another? So does an increase in sales for one sales

channel take away sales in another. This is a common question in the real world, and it’s

an extremely complex question.

This question is probably better served by machine learning and some complex modeling,

but for the sake of visualization, we only want to compare the percentage of sales in one

channel vs. these percentage of sales in another and how that changes over the course of

months. We want to look at it month to month in this example because we are exploring

the relationship between two groups with multiple occurrences. In this case, the

differential of percentage sales in one channel vs. the another. For a given month, we can

use a special variation of a line chart, a principle components plot, so the specific criteria,

it has to be the same scale. Two groups, some relation, and this tells us how one moves vs.

the other. In fact, there’s some additional research on this showing that a principle

components plot is better interpreted in highly ambiguous scenarios. Where you’re trying

to show there is not much of the relationship vs. places where there’s a strong

relationship. A scatterplot is well interpreted on the fringes. A principle components plot

helps people not make false assumptions of relationships. Scatterplots are very easy to do


Page | 57

that. If you draw a line through it, you kind of draw a picture in your head. It’s very hard to

make a fake assumption or false assumption of relationship when there isn’t one with a

principle components plot.

Off my soapbox for principal components plots and now onto my soapbox for measures

of central tendency. Now, in many cases I’ve seen people use bar charts or even pie charts

to communicate average value between groups. But an average isn’t just a single value. It

isn’t a value that, that means a single thing. So like some total sales, I can sum up my sales.

There’s no misinterpretation about what that value means. It’s a bunch of numbers put

together as a total average or arithmetic mean is a measure of central tendency. It’s

describing a group of values average only suggest the central value, but it doesn’t tell the

whole story. Average can be skewed, meaning the same average can represent vastly

different underlying values. Look at a normal distribution vs. a skewed distribution.

I’m going to show you my process of looking at a distribution for a few different factors as they relate to contribution margin. And when we look at the frequency of our contribution margin, we see a lot of hills. It looks like a mountain range. We see when we see a multi- peaked distribution like this, we refer to it as multimodal. Multimodal means that there are some values that occur very frequently.

Multimodal suggests there are multiple categories within this trust distribution that there

are groups, so we may see effects similar to what we saw in that scatterplot. We may see

clustering groupings of things that vary by country, region, sales channel, etcetera. Our

distribution may not be comprised of a single group of data or information. There may be

multiple blobs of, of information, of groups that we need to tease out. You can see that

some of these patterns emerge when we overlay different categories in contribution

margin distribution, but first I need to take a step back and talk about what is a

distribution.

What do I mean by distribution? You can see a histogram, but what does that really talk

about? Let’s, let’s simplify the scenario and let’s say we have two car models, A and B. Let’s

say for model A, we have contribution margin of 100, 95, 90, 87, 85 or whatever. It’s on your

screen as model B. We have similar contribution margins and this vastly simplified example,

we can see the average for model A is around 83. The average from model B is around 73.

However, if we add some additional values and model A, say something that’s well below

the, the existing average, we’re in model B, we pick just two values that are similar to the

average. Our new average is completely different. If we just look at the average for model A

vs. model B in the scenario, we would think that model B had a far higher contribution

model than model A. Whereas it’s only these outlying values that are driving things, driving


Page | 58

things down for model A, which is one of the primary reasons why histograms and box

whisker plots are so important.

What we’re looking at when we see a histogram are not the numbers themselves. It’s a

frequency distribution of frequency of how often those numbers occur within given

groups. We call those groups bins. In this example, let’s say we have bins of 190, 80, and

so on and we can look at the frequency count in model A and model B, we can see that

there is an outlier. Two outliers in this case, way down at the bottom for those values of

10 and five histograms are so important for an exploratory analysis because we see these

patterns. We can see when there’s outliers, we can see when there’s different patterns in

data. And the example I gave it with some outlying value that could have been from, I [don’t

know], a different country, it could have been from some new sales initiative. Uh, it could

have been from who knows where, but there’s definitely something going on that wasn’t

represented with just the data that we saw. And finally, when you see yourself clicking

through a bunch of different histograms or a bunch of different scatterplots or a bunch of

different stuff and just exploring, exploring, there are other visualizations that you can use,

visualization specifically designed to tease out these patterns. One of the more powerful

visualizations for this. In fact, an analytics technique is a decision or regression tree.

However, that is a little beyond the scope of this course. Exploratory visuals can be a lot of

fun. You know, I love them personally. They help you see your data in a brand-new way.

We learned a lot, talked about histograms, talked about, uh, scatterplots, understanding

how these exploratory visualizations are about looking at the data itself to find new

patterns, and we even did a little dashboarding. And in our next session, we’ll cover the

underlying dashboarding concepts to help you become the best you can at data

exploration and informed visualizations.

Lesson 2: Huskie Motors Operations

Section 1: Huskie Motor Corporation: Operations

HMC is currently selling in 15 countries in three regions: North America, South America, and

Europe. Table 1 provides a breakdown of the countries within each region.

Huskie Motor Corporation: Operations, Continued


Page | 59

Automobile manufacturing and sales is a complex and highly competitive business. Though the

automotive industry has a broad global reach, only 15 countries produce 88% of the world’s

vehicles.

HMC Models

HMC is a fairly new and small player in the automotive manufacturing market. HMC currently

offers three brands: Apechete, Jackson, and Tatra. Each brand has several models as detailed in

Table 2. The models available fall within seven segments of vehicle types: compact, sub-compact,

full-size, mid-size, luxury, minivan, and sports utility.

HMC's Manufacturing Options

HMC offers several series for each model for a total of 34 different series. A breakdown of the

available series offered by model is provided. See Table 3. Each model is available in various body

styles, engines, drive configurations, transmissions, trim, color, and seat types.

Since various engine and transmission builds, see Table 4, come from one division and finishing is

done in another division, these options are described in different tables.

Click each icon to learn more.

Packages and Options

Like many automotive manufacturers, HMC offers a variety of packages and options that buyers can

add to their vehicles. Packages include a specific set of bundled options, or buyers can choose

individual options separately. Table 6 provides a list of the six packages along with the detail of

products and services contained in each package. Table 7 provides information regarding options

that may be purchased ala carte. At least one option, however, is contained in each available

package.

Click each icon to learn more.

HMC Data Movement

Recall the data life cycle from the previous module and consider how Huskie Motor Corporation’s

data ages.

To again explain how HMC gets and uses all their data, let’s take a look at a short video to further

illustrate how data is created and stored in business.

Video 7: Movement of Data


Page | 60

To conclude the topic of data movement, we will now move to a presentation from Dan Smith

where he further illustrates how data is created and stored in business.


First, we’re going to look at data creation and storage. And I’ll start way back in 1967 when

Dr. Melvin Conway identified that any system designed by an organization will match the

communication structures of that organization. What that means in plain English is that a

business will create processes and solutions based around how they are structured-

marketing, supply chain, accounting. They’ll all produce their own applications and

typically their own data as well. And who were the first people to start storing and

recording data all the way back in Sumeria, the accountants. In fact, I would argue that our

current data architecture is largely founded on accounting practices. Accountants use

transactional data created by customers and business operations produce reports that

are then used to inform stakeholder decisions. So what does this look like? Well, there’s three primary domains, business data, the

creation, storage of data, and the business decisions. Now, how is it, where does this data

live? Well the customer activity generates transactional or raw data, that data is stored in a

transactional database, which is then and then shaped into a reporting data set. The

reporting data is used to make business decisions, and the resulting activity from the

business decisions create more transactional data. Now that we’ve seen how that data is

created, let’s look at how it gets from one place to another. You can see the data

movement cycle over here on the right of your slide. When that data is first created, the

information messages are typically transferred through a bunch of layers in applications,

so load balancers, temporary cashes, firewalls, etcetera. The process of getting activity

data into your data system is typically called integration and it moves data into a

transactional data store. Now, once it’s in its raw form or sometimes as part of integration,

the data is shaped or cleaned into a form usable by people in analytic applications. This is

generally called ETL, extract transform load, and the data output is stored in the reporting

database, data warehouse, or data lake. The transformation of data into business-usable

information is called analytics. We will talk about a lot about this in the next module.

Although Guy said, simply put, analytics is what facilitates data informed decision making.


Page | 61

And finally, business decisions attempt to influence customer activity. Making this one

little arrow right here, influencers in the upper-left-hand corner of the business data

creation and movement cycle. The entire reason that business exists and the entire

reason that the data analytics cycle exists to help the business make more informed

business decisions. So let’s look at this process in action.

Lesson 3: The Data Dilemma

Section 1: Data Dilemma

Prior to calling Miranda at HMC, Kevin meets with Jan to go over some basics regarding the project.

Kevin uses a five-step model as a data analytics framework. As a first step, he explains this model

to Jan, because he believes it will help her understand his process.

Data Analytics Process

Kevin: “So, Jan, this should be an interesting client and project. I know you are a bit nervous about

your lack of experience with data analytics design, but I have been studying this stuff for a long

time, both from an applied perspective as well as a theoretical perspective. There is a great book

on this subject, Data Analytics, by Warren Stippich and Bradley Preber. In this book, the authors

describe a five-step approach to the data analytics process:

Step 1: Define the question

Step 2: Obtain the data,

Step 3: Clean and normalize the data

Step 4: Analyze the data and understand the results, and

Step 5: Communicate the results. I think that understanding and following these steps will help you tremendously as you find your

footing on this engagement. Our client, HMC, would like to have the project finished within a four-

to six-week time frame. How do you feel about that?”


Page | 62

The Data Dilemma

Jan: “That seems like a pretty short window, Kevin. Do you think we can finish within that time

frame?”

Kevin: “Well, I do not think we can set a realistic time frame without knowing exactly what our

challenges are and what the data looks like. Remember, we have to be able to trust our numbers.

We do not know yet whether the data is clean. We will have to do some validity testing first.

Remember, follow the steps, think it through, and keep a calm head.”

The Data Dilemma, Continued

Jan: “What exactly is our goal for HMC?”

Kevin: “Basically, HMC captures very detailed transaction-specific data. For example, HMC has

extensive cost data, marketing information, and plan data for each car sold. The amount of data it

has is overwhelming for both the HMC management team and the executive board. Our job is to

help Miranda and her team figure out how to use the data to better understand costs and

profitability by vehicle model. They also need detailed, relevant feedback regarding sales volume

and sales location for planning purposes. Miranda would like to be able to predict sales at least

three quarters out so that the management team can better plan production schedules.”


Kevin: “An equally important goal is to help them understand the benefits of data visualization

and give them some ideas about how to present the data to their executive board. As you know,

Jan, data visualization software, such as Tableau and Qlik, allows us to turn large volumes of raw

data from various sources into easily comprehensible graphical representations of information.

Data can be accessed live or extracted from some other source. Data can also be presented in

summary fashion while preserving the underlying detail, which can be instantaneously viewed as

desired. This type of technology could be instrumental in helping HMC improve and maintain its

competitive advantage.”


Jan: “How does the management team store the data, and how will we be able to integrate it with

the visualization software?”


Page | 63

Kevin: “I believe that the plan and forecast data are on Excel spreadsheets. The actual sales data

will likely come from its enterprise resource planning (ERP) system. We will need to confirm that at

our initial client meeting. We should be able to get transaction-specific information at the vehicle

identification number (VIN) level, since the VIN is a unique identifier. I am not sure how much

information is being collected at the individual vehicle level, but that could be a rich data source

for us if we can get it.”


Jan: “Is the data they have ready to be used, or will we need to verify and organize it?”

Kevin: “We will see just how clean it is once we start pulling it in. We will initially build the analysis

using a representative sample of HMC’s data. The sample represents about 25% of the total

company data. The four- to six-week time frame should be feasible if we receive clean data. Our

contact at HMC, Miranda, is assigning two accountants from the controller’s department to pull

the sample and help clean the data. Once the client approves of the proposed analytics and the

dashboard, we can roll it out using all the data.”

Data Sampling and Analysis

Jan: “Will using a sample be enough to convince the client of the value of the analysis?”

Kevin: “Absolutely. The sample allows us to determine which type of analyses we can provide, by

looking not only at the data, but at the format of the data as well. We can use the sample data to

create demonstration dashboards for the client. The analyses we perform with the sample can

then be recreated for the full data set.”


Jan: “I am not sure I understand what you mean by a ‘dashboard.’”

Kevin: “Basically, a dashboard is a screen that consolidates visualizations, graphs, charts, and so

on to concisely display the metrics and key performance indicators for a business. Summarized

data can come from a variety of sources and can even be presented in real time.”


Page | 64

Jan: “I cannot wait to get started! This is going to be a tremendous learning opportunity for me.”

Video 8: Data Visualization – Dashboarding

As Kevin helped Jan understand the purpose of a dashboard, we will again turn to Dan Smith as he

further defines dashboarding and introduces concepts and best practices when using dashboards. Video Transcript (Est. time: 18:22 min):

With the popularity of data-connected visualization tools like Tableau and Power BI, data

visualization is becoming almost synonymous with data dashboarding. Well, some, myself

included, would dispute that association. It’s still very important that we include

dashboarding when learning about data visualization. In this video, we will define

dashboarding, explore dashboarding concepts, and recommend some dashboarding best

practices.

But first, what is dashboarding? When someone thinks of a dashboard, they usually think

of a car or other device full of gauges, numbers, meters, things that tell us the immediate

state and relevant information required to understand how something is doing related to

data and information. Dashboarding can be defined as an information management tool

used to organize and display information typically with an interactive element and

connected to an underlying data source for rapid updating. And let’s, let’s think about why

in your case study a dashboard seemed like such an appealing prospect for Huskie

Motors and consider also if it’s the right choice for the overarching problem statement

and the requirements of our stakeholders. In our case study we’re asked to determine

how data will be represented for several different stakeholders. Let’s explore how we

could use a dashboard to display this information, the dashboarding concepts in each,

and if we should use a dashboard to display this information.

Our first task is related to the dashboard concept of displaying KPI and metrics. We’re

tasked with displaying the overall performance analytics, a question such as how Huskie

Motor Corp. is performing globally, how are the various brands performing in various

channels, and what are the least profitable models. Here we can apply a core concept of

dashboarding displaying KPIs and metrics. I mentioned before that lines are very effective

for displaying KPIs. Presumably there are a set of KPIs, key performance indicators, that

they can use to define performance. I would assume these were defined by somebody,

and we can include these in a dashboard that was connected to a data source that had


Page | 65

these KPIs defined. And let’s think of our audience for these high-level performance

metrics. Anyone who, who is interested at metrics at this level are likely going to be

executives, senior vice presidents, maybe senior directors, depending on the

organizational structure and title responsibility. In other words, metrics at this level are

foreign audience unlikely to have the time or inclination to go digging around their own

data? They have trusted advisors to give them guidance on the pertinent pieces of

information. In other words, these are informative visualizations, probably more reporting

than anything, so a dashboard may be appealing and a dashboard and the purest

definition of data connected and immediate results, but it’s not going to be an interactive

dashboard or if it is interactive, it’ll be for someone reporting results with the dashboard,

not for exploration. By our reporting results, I mean sitting in a presentation and clicking

on things in a predefined script, so the dashboard will probably be for a presentation, not

something that somebody is going to be regularly accessing online.

Our next concept, filtering and drilling down the next series of questions. In your case

study, the financial analytics here, we’re looking at prices per model and how they

changed over time or which model has the most variability in variable cost, etcetera. While

these are well-defined questions, there is a degree of slicing and dicing exploration

available. If someone wants to look at different packages or regions or country or models,

sometimes there will be complicated questions which may be hard to represent in a single

visualization. Therefore, the financial analytics questions have more opportunity for

exploration. So a more interactive dashboard is probably useful in this scenario. And by

the way, by slicing and dicing, I mean a dashboard where a user may wish to view the data

in a different way than represented by a static dashboard or table. For example, they may

wish to highlight an area of one visualization to learn more about the data point

contained in that visual. That would be a drilling-down operation that’s often used in

scatterplots and other complicated visuals like a map and tree charts to give more

information about a given data point or points. That would be like a dicing of data.

And filters are often used to change the dashboard. In contrast to the drill down, filters

typically exclude all the data for portions of the dashboard. They exclude the underlying

data. A filter is exactly the same as a filter in a pivot chart or pivot table. And a pivot table

is also an excellent example of slicing and dicing data. You just pull in different elements

and you see data in a different aggregation or context than you would, and it’s an original

form. And fundamentally, if there is a demand for data exploration, just a bit of caution,

you will never be able to satisfy all the data exploration to band with simple visualizations,


Page | 66

you have to include some interactivity. So a dashboard makes a lot of sense for the

questions from the finance team as a, for the operational analytics questions, those are

much more complex. They’re asking what are the top values overall in multiple groups.

Now, asking what are the top sellers over is easy. You can just sort it. But when you start

comparing the top sellers from one group to a top seller in another group, your analysis

becomes much more difficult because you’re not just ordering a column in a table. You

have to select the top individuals or regions or models or whatever it is from a group and

adding to that complexity, the calculation for the number of model options. How many

days are the various models on the lot sale? Uh, you have the potential for a very

complicated dashboard. The questions ask for by the operational analytics folks. Those

will likely require custom calculations and/or data transformations. Uh, we’ll talk about

those concepts shortly. But this is something when people start asking those types of

questions that you would probably want to call in a data expert, to help you create that

type of dashboard and those visualizations forecasting our final set of options where we

look at the forecast.

I know I'm giving a lot of warnings here. Primarily because this is an introductory to

visualizations course. So I put out cautions when people are, what they think are

engagement, where they’re building a visualization, but it’s actually a mathematics data

problem, when we start talking about forecasting, this is another one of those yellow flags.

So as a caveat, I would not, as a warning, I would not suggest a business, particularly one

the size of Huskie Motor Corporation make a significant business decision based solely on

the forecast that is built into a utility of a visualization tool. Not to get into the details, but

there’s a lot of additional information that can be included in that type of mathematical

model. That being said, the built-in forecasting utilities can still be informative for

identifying patterns over time in your data and creating a degree of confidence in those

types of pattern predictions. In a forecast, we are able to show confidence intervals above

and below the predicted values. This is the case of a visualization so we can visualize our

confidence intervals and that’s similar to a box whisker plot where you have the predicted

values 80% of the time vs. the actual 80% of the values. The high/low markers are

equivalent to the whiskers and a box whisker plot if you will.

Moving on from forecast, let’s talk about connecting to data. Well, you always hear the

term data-connected dashboard. Let’s expand the dashboard conversation to introduce

what a dashboard is doing, a dashboard is doing under the hood. The first concept is

connecting to data. A dashboard works best when the dashboard can update as its


Page | 67

underlying data set. Also updates, we’ll cover this more in our data platform video, but for

now understand the visualization. A tool does not and in most cases should not change

the underlying data. So here we can connect to an online Google sheet. Here we can

connect to an O-data online data source or a JSON data source. So these are web data

sources. They’re separate.

Contrast that to something like Excel where Excel houses the data. In many cases,

especially the cases we’re accustomed to working with, Excel is the data you can type in

and modify the data. In many cases you have to modify the data to create a visualization.

So you end up with a lot of disconnected data tables, pivot tables, and pivot charts are

more like other visualization tools as long as you don’t touch and edit the underlying data.

In fact, I would say if you can use a pivot table effectively, you can probably use most of

the other visualization tools with minimal training, aggregates, and values. This concept is

probably the most important concept when building a dashboard. We’re at least using a

visualization-focused dashboarding tool and that is the concept of categorical values and

numerical values. Tableau calls them dimensions and measures, Power BI columns,

measures or aggregates, and measures in a pivot table or called rows or columns and vs.

values whenever you want to call them. Your categories are what determines the number

of slices in a pie chart, whereas measures determines the size of each slice category

determines the number of bars. Measure is the height of the bar. Again, categories or

factors. These are the same concepts as when you use an Excel table column to make

separate rows in a pivot table measures the same concept as values in a pivot table or

some mathematical operation.

Typically, sum is applied to the values corresponding to the pivot table. Row values in the

data video, I’ll talk about more about this concept. Even including the SQL structured

query language aggregation code that’s being created by the visualization platform. That

sounds a little scary. I’ll walk you through it. It’s not going to be as bad as it sounds.

Calculated values, all dashboarding tools let you create custom calculations for

aggregating your values. These can get pretty complicated. In this example, I’m relatively

simple when I’m creating a new column to create an address and I’m merging a bunch of

other columns concatenating them as what that’s called to create an address value for

geocoding. Yeah, that’s a simple example. And the hand calculated values are one of the

hardest aspects of learning a dashboard tool because calculated values, in particular,

calculated values like rolling average or cumulative totals, tend to either obscure or be

very specific to that platform to get the exact calculation I use to create the value you see

on the screen and honestly calculated values, especially the nuances of them. They aren’t


Page | 68

something you need to understand for this class. But as you learn dashboarding and

because it’s an important element of visualization, keep in mind that the measures you

see visualized in the dashboarding tool are calculated not only using the underlying data

but they will also change based on the visualization, visualization used and the categories

used in the visualization. Yeah, I know, I’m sure that didn’t make any sense. It takes a

whole class so you don’t need to worry too much about this concept. I’m telling it to you

so you’re aware of it. This is a potential stumbling block, and it may be something that you

struggle with cause I struggled with it when I started. So as you’re learning and, and

honestly in general, it’s a good idea to have a, your data team how, validate your

calculations or have predefined calculations, even have a data set. If you have a

particularly complex calculation you want to put into a visualization tool, have some data,

people will create a new data set for the analysis instead of relying on the built-in

dashboard capabilities.

Alright. Moving on. Dashboarding best practices. The first best practice is to know you do

not always need a dashboard. Yeah. Think about it. How many of you had a boss that

would not look at a presentation unless it was shown to them at a meeting? You could

write an email saying, next week's winning lottery numbers. We’re in the attachment on

that email and they still wouldn’t open it. It’s reasonable to assume if they will not click an

attachment to open a presentation, they are not going to click a link in your email, log into

our reporting server, assume they remember the username, and then navigate to the

dashboard themselves in a dashboard. You want to keep, infer it. Keep your informative

visuals at two or three major points at most, so you’ll be tempted to cram everything on

one dashboard. Look at that information density. Oh, you’ll say, and no one will know

what you mean. Or maybe you will. Somebody will be confused, and you just go up to

them and you say, Oh, you just need to, you know, click this. Then this, then this, so

highlight that. Pick this filter and voilà. Here’s your answer. No, and informative

dashboard allows exploration, but it should not require it. The line chart dashboards are

great examples. They may seem simple, and they answer a question very simply and

succinctly, but there’s little room for misinterpretation and you can still drill down and

filter. If you want to compare that to say like the geographic dashboard that we saw

earlier. It looks cool, don’t get me wrong, but it doesn’t say much without digging around

in it and it doesn’t even follow some best practices about high-level visualizations basically,

form over function. We should be doing function over form, my next best practice.

Make your own darn dashboard. I’ve seen people spend weeks working with consultants


Page | 69

to get a dashboard, and they’re still not happy. I’ve seen that time and time again, and it’s

because there’s always another question. That’s the whole point of exploration. You want

to find new questions. So if you find yourself working hours and hours with a consultant

or contractor or someone to build the dashboard, consider it spending that time watching

free training videos, all the platforms provide them, that way you can build the dashboard

you want. And if there are still complex data operations, like joining tables, rolling time

series aggregations, work with a data expert or reporting expert to create a new data set

or fix that one calculation rather than trying to read your mind or predict the future about

what questions you’re gonna ask because you don’t even know what they are yet, it’s

really not that hard. I promise they’re there. They’re easy to make. [inaudible] That’s what

they’re designed for.

Next best practice. Keep data operations to a minimum. Just because you can do complex

calculations doesn’t mean you should. Dashboards and tools like Tableau, Power BI, even

Excel, can be unclear how a value is calculated. When I was talking about that before in

the calculated values. Then when you throw in different terminology, what they’ll call it,

partition window table calculations, those are calculations that are aggregated over

several categories or dimensions within the visualization. It gets confusing and isn’t

transparent. So again, leave data to the experts. A data team-approved data set with

verified calculations that’ll ensure proper governance. It will ensure procedures are

followed. Give you the peace of mind that you’re not making a small and opaque

calculation error that could lead to $1 million incorrect decision.

In conclusion, in this video, we use dashboarding as an information management tool used

to organize and display information. Typically, a dashboard has an interactive element, and

it’s connected to an underlying data source for rapid updating. In our next video, we’ll give

you a little more visibility into the underlying data environment so you’re able to

communicate more effectively with your data stakeholders and you are better able to

understand what’s happening under the hood of popular visualization tools.

2.64 Section 2: The Data Dilemma, Continued

After reviewing the project proposal and meeting with Jan, Kevin calls Miranda to introduce

himself and confirm their initial meeting the following week.


Page | 70

Kevin: “Hi, Miranda. This is Kevin from D & A Consulting. I just wanted to touch base with you to

set up our initial meeting.”

Miranda: “Hi, Kevin. I am really looking forward to your team’s visit. We need help as soon as

possible. I have so much data coming in that I barely have time to look at it, much less analyze it.

Also, our data is located in various databases, spreadsheets, and our ERP system, making it

difficult to integrate and fully utilize. We have data visualization software available on our server,

but we have not really tapped into its capabilities just yet. I am told that we can easily pull together

large amounts of data from various sources with user-friendly, adaptable output using

dashboards. Is that something you can help us with?”


Kevin: “Absolutely. What kind of data visualization software do you have?”

Miranda: “Tableau. It is supposed to be very user-friendly. I am hoping you can get a dashboard

up and running in a couple weeks.”

HMC Dashboard Needs

Kevin: “Do you know what you want on the dashboard?”

Miranda: “I know that we want to be able to see profitability by brand and model, since solid

profit margins are crucial if we want to stay in business. We also want to keep a pulse on sales by

country and region. Ultimately, we want to do a better job of planning our production schedule,

but as you know, this requires up-to-date information on many moving parts.”


Kevin: “I cannot say whether a couple of weeks will be enough time until we take a look at the

data and map out the specific decisions you hope the dashboard will enable you to make. I will

have a better idea after our initial meeting next week.”


Page | 71

Miranda: “Okay, fair enough. We’ll see you on Monday at 1 p.m.”

HMC and D&A Meeting

The following week, Kevin and Jan meet with Miranda and her team for their initial meeting.

Kevin: “So, let’s get down to business. Let’s talk about the dashboard. Help me understand what

questions you are trying to answer? Or what story are you trying to tell? Who are the users going

to be? Will different users need different dashboards?”


Miranda: “I have a list of questions that I would like to be able to answer using our data. There are

several areas where I believe we can gain greater insight from looking at our data. Specifically, we

need overall performance analytics to tell us how we are performing globally. We need to know

which of our models is profitable, where we are selling well, and, perhaps, how sales channels are

driving sales volume. It is also important that we are able to see information from a top-level

perspective with the ability to drill down into the detail. The profitability information is crucial for

our executive team: chief executive officer (CEO), chief financial officer (CFO), chief operations

officer (COO), and chief marketing officer (CMO). This team is ultimately responsible for the

direction of HMC and profitability for shareholders.

“I would like for our management accounting group and financial reporting group, including the

CFO, to have a financial analytics dashboard that would give them information about contribution

margins, total costs, and sales volume. Both groups will also need to monitor changes in costs and

contribution margins for all of the models we offer. These are the most important metrics and,

therefore, the ones we need to work on first. We will expand our analysis for the management

accounting group to include other efficiency measures as soon as we can get these initial

dashboards up and running.

“Our CMO and the sales team will need operations analytics to help them understand turnover

and demand. They will need to know which models are selling and how long it takes to sell them.

They will also need to understand how packages and options impact sales. They will need to have

a handle on which of our package/option offerings are popular and which ones are profitable.


Page | 72

“Finally, our budgeting and production teams will need to utilize forecast analytics to predict sales

and margins at least three quarters in advance.

“Do you think it is possible to make the dashboards ‘real time’?”

HMC Data

Kevin: “I think it is possible, but first we need to understand the data that you are currently

collecting. What can you tell me about the extent and magnitude of your data?”

Miranda: “Well, we track details down to the VIN level in our ERP system. So, we have a lot of data.

For each vehicle sold, we know the sales amount, marketing expense, and all variable and fixed

costs. We also track a lot of nonfinancial data.”

2.71 The Data Dilemma, Continued

Kevin: “What kind of nonfinancial data?”

Miranda: “Well, there is vehicle data such as brand, model, model year, series, segment, body

style, drive configuration, engine type, and transmission type. We also have the detail for each

vehicle sold as far as color, trim, and so on, as well as any package or options purchased.”

Jan: “Wow! I can see why the data set is so large!” Miranda: “That is just the vehicle information. We also capture the region and country of sale, the

number of days that any given car was on the lot prior to sale, and the type of marketing

campaign in place at the time of sale. In addition to that, we track sales channel information.”


Page | 73


Jan: “Can you give me a little more detail about what you mean by sales channel information?”

Miranda: “Sure. We identify sales using three sales channel dimensions. Sales channel 1 identifies

whether the sale was made through our dealers, fleet, or retail operations. Sales channel 2

identifies the type of customer account. We have commercial accounts, employee/partner

programs, government accounts, nonemployee accounts, and rental accounts. The third sales

channel identifies whether the sale was cash, financing, or lease.”

Kevin: “Okay, so ideally we will want to use both the financial and nonfinancial data in our

analyses.”

Miranda: “Exactly! Other potentially informative data sources that we have are social media

platforms like Facebook, Twitter, and Instagram. We have just started collecting data from these

sources about our vehicles, but we are still trying to figure out the best way to analyze it.”


Kevin: “Eventually we can bring that data into the analysis and dashboard as well, but for now let’s

stick to the data we already have.”

“So, the actual sales data is in your ERP system. Where is the budget data?”

Miranda: “We keep track of the budget data in an Excel spreadsheet we call ‘plan data.’”

Video 9: Data Visualization – Data Ecosystems

In the case, it is evident that HMC data comes from and is stored in a variety of places, making it

difficult to organize and analyze it. In Dan’s next video, he will go through the concept of data

ecosystems and walk us through the process of aggregating data to see how data is connected


Page | 74

and transformed so it can ultimately be visually presented.


As you become more familiar with creating data visualizations, you will quickly understand

if you haven’t already, that data is just as important as the visualizations themselves. In

fact, data’s probably more important than the visualizations. Furthermore, having taught

hundreds of people, how do you use visualization tools and even more advanced

analytics? I know that understanding how the data operations and data connections are

enabled by those tools, that provides a more conceptually understanding. Understanding

the data concepts allows the students to use their skills across multiple tools rather than

just the one they’re currently learning. So I’m going to let you in on a big secret. Those

complicated calculated values and dimensions that I described in the previous video.

Those are just SQL structured query language or sequel, they’re just SQL aggregations.

They’re so complicated because they have to figure out how to abstract relatively simple

code, but it’s still using a visualization to create code. This is a slide that I use when I teach

people data concepts. For many of you, it may look too technical because it shows a little

code. That’s a common reaction. Please just bear with me. I hope to show you that much

of the SQL, much of the structured query language generated by dashboarding and

visualization tools, that code is similar in function to an ordinary pivot table. And by

extension a pivot chart is just like a dashboarding tool.

So the example that you see on your screen is just a table from a database. I’m glad that it

is a different data set because it shows you that these concepts extend across any data.

So to be consistent, though, with the language in the case study, the two tables here,

these are the difference in what they mean when they say normalized. That is, there is a

unique row for each individual data value. We see the same types of coverage and

education repeated multiple times and our normalized table in this example, if you need

context, it doesn’t really matter what the data pieces are when you’re talking about code.

But here we’re just talking about the coverage and education for different insurance

premiums for folks. And if our task here is to show what the sum of claims for each type

of coverage and we do not want to know the sum of claims for coverage where education

does not equal graduate, if that’s the business question, we would create an aggregate

table with one row for each type of coverage and its associated total claim amount. If this

was in Excel, the simplest solution would be a pivot table. We put coverage for each row

and move our claim amount to the aggregate value. It even defaults to some of that value

filtering the education. We put education into our filters and select all except graduate.


Page | 75

Our SQL statement is exactly the same as the pivot table. It reads like a sentence select

coverage, some claim amount from normalized table where education does not equal

graduate group by coverage.

Looking at the corresponding colored squares between our pivot table and the SQL

statement, we can see that selecting a field or column and placing it in your group by

statement is the same as putting it in a row. Meaning that we have a value or group or

row for each of the associated values within our coverage column. One row for basic, one

for extended, and one for premium for some claim amount. The green box in our SQL

statement makes the same calculation as our aggregate value. Some claim amount in the

pivot table, some sum, the claim amount, and finally where education does not equal

graduate in the purple box is the same as the filter in the pivot table. Although you can’t

see it seeing the side-by-side comparison like this. I would argue that the SQL statement is

actually clearer. It’s more transparent, it’s more understandable in what the data

transformations and filtering are actually doing. It's rather than the uncertain filtering and

source for the pivot table, we don't, we don’t know what the sources for that pivot table by

looking at it, we have to click and sort through all sorts of check boxes for the filter. We

may not even know which exact calculation is being performed on the value.

So remember when I talked multiple times that abstraction was both a strength and

weakness of visualization tools. This is a common example. So you like the hidden filter. In

this case, the education filter in the pivot table may lead to mistakes. If someone was

unaware of that filtering logic, it’s easy to assume that the calculation, your aggregate

table would have all the education in it. Whereas with the SQL statement, you see it right

away. And to reinforce the point, just to bring it up again, the pivot table that you see

there, if this was connected to a database, if we were using power query, this is exactly

what it’s doing. If we had an ODBC connection, open database connection to a database,

which we connected a pivot table to, it would create this SQL statement. That’s all these

things are doing. They’re abstracting the creation of SQL code. When it’s very simple,

there’s not a lot of harm. It’s much faster to use the visualization tools, but when it gets

complicated, it’s typically faster and better governed to have the SQL statement. But

either way, understanding these basic SQL concepts makes it very easy to understand

how to use the different tools.

So that’s on the transformation side. Let’s talk about the data connection side, the

separation of concerns. Why? Why is it so important that we pass this code to a data

source anyway? We’ve used Excel for decades where we just transform the data in the


Page | 76

spreadsheet. Why all of a sudden is that not okay? Well, there’s a few reasons. I already

mentioned the data is often too big to fit in a single spreadsheet, but you know, memory

is pretty big now and most data sets that we use still aren’t more than a few gigs. So now

why is everyone saying we shouldn’t pull our data into Excel? First it boils down to

governance. Accountants are at their core data managers. We respect financial data and

as you know in any general ledger, you don’t erase information. You don’t erase a

transaction, you reverse a transaction. You want to keep that data unchanged. If you’re

creating a financial statement in a spreadsheet and you just backspace that number,

you’ve just changed it. You’ve deleted information, you’ve deleted data, the ability to easily

audit all the transformations to the data as we created that information and the

opportunity to share that, create an information rapidly, securely into a wide audience.

That’s why you keep the data separate from the visualization tool and the transformations

we need to be able to audit it. We need to have a consistent mechanism of transforming it

and we need to leave that data unchanged. With Excel, it’s extremely hard to ensure that

data pooled into a spreadsheet remains unchanged. Even with all its cells locked down, it’s

still possible to copy that information to a new sheet, make a new report, and then

distribute all that information. If all that data remains in the original sheet, there’s no

assurance. Somebody who’s not gonna update the original report as well or the other

report as well. You have two versions of the truth, so when all the data stays in one

database, instead of getting shipped around through email attachments, the analysis,

we’ll always update any changes to the database or any updates I should say to the

database. We’ll also update any analysis and if there’s an error, we can either update an

error in the data or update an error in that singular analysis sheet that we have that

everyone else is just connecting to.

So see that wasn’t that hard. Right. Okay. Yeah, that’s a little difficult. I know. I know. It’s a

lot to take in. Unfortunately. It’s just impossible to talk about data visualization without

talking about data. However, it’s my strong belief that a conceptual understanding, just

the simple conceptual understanding. If you watch this video a few more times and just

really get your head around those basic SQL statements and the concept of keeping data

separate from the transformations to that data, you are going to have such an easy time

picking up all the different data visualization tools. It’s kind of that light switch that gives

you the skill set to, to just understand what these things are trying to accomplish as

opposed to just feeling like you’re clicking something and just memorizing a process

rather than learning a solution. If you can connect to a data source in Tableau, you can do

it in Power BI. If you understand that the dimensions in Tableau are the same as a group


Page | 77

by statement in SQL, and if you understand that the measures in Tableau are the same as

some or select average in SQL, then you’re going to understand that in Power BI as well,

you can understand it in any platform.


Kevin: “Okay, if you give Jan and me a sample, meaning at least 25% of the most recent data, we

will make sure that the data is ‘clean’ by running some data validity tests, and then we can start

putting together some analytics. Once we have a better feel for the data, we can give you a better

estimate of how long it will take to develop the dashboards.”

Miranda: “Great! My team will make sure you get all the data you need, and they will be available

to help as well.”

Video 10: Data Visualization – Data Wrap‐up

We’ll now move to the last video in Dan Smith’s Data Visualization video series where he

summarizes key takeaways pertaining to data, data analytics, and data visualizations. Video Transcript (Est. time: 3:12 min):

Congratulations if you’re watching this, it means you’ve completed all the videos on data

analytics, visualizations. We covered a lot of material over the course of these videos. We

started with why visualizations are important for analytics. Following that, we looked at

some of the tools of the trade, the leaders of creation software and platform for creating

data visualizations.

Next, we looked at the different types of audiences and how visual analytics answers the

different types of business questions that members of that audience may have.

Exploratory visuals for audiences looking to get a better grip, a better understanding of

data, and find new ways in which they can use that data to help the business informative

visualizations for the higher-level questions or for demonstrable proof of how those

visualizations of how those exploratory visualizations can help the business. And finally,

reporting visualizations to keep track of ongoing activities and answer the repeated


Page | 78

business questions which we’ve already demonstrated can help the business.

We looked at the low-level concepts of visualizations such as how shapes can be used to

represent an aggregate value such as a sum, how align represents changes over time, and

how a point represents the intersection of two or more values such as in a Cartesian plot,

a scatterplot. We saw how combining those low-level concepts of shapes, lines, and points

can create much more complex visualizations like histograms and tree charts, bubble

charts, ribbon charts, etcetera. We walk through a flowchart which allowed us to select the

correct visualization based on our business question, such as are we looking at a

comparison of groups or a composition of the whole? Are we looking at time series or a

single point in time or do we have multiple factors which we want to include in our

visualization?

From there, we took a closer look at the types of visualizations that we use for exploratory,

informative in reporting business questions and audiences. Then we looked at some of

the best practices on how a dashboard can provide our end user, our audience with data

connected visualizations and add an exploratory component to those visualizations. And

finally, we explore the data component of data analytics visualizations, looking at some of

the data operations performed by data visualization tools and the data environment in

which those tools live.

I hope you all enjoyed learning these data visualization concepts as much as I have

enjoyed teaching them. Again, thank you so much for watching. I'm Daniel Smith.

Goodbye.

HMC Conclusion

You have now completed the case study, Huskie Motor Corporation: Visualizing the Present and

Predicting the Future and finished the series of 10 videos with Dan Smith.

Module 2 Wrap‐up You have completed Module 2 of this course and should now be able to:


Page | 79

Define data visualization.

Describe how data visualization can impact the way data is communicated.

Identify various data visualization tools and their different uses.

Recognize the importance of choosing the right visualizations based on your audience.

Module 3: Applying Data Analytics and Visualization

Course Roadmap

Click Next to proceed to Module 3, Data Analytics and Visualization Practicum.

Introduction

Welcome to Module 3, Applying Data Analytics and Visualization. The goals of this module are:

To explore the data analytics and visualizations discussed in the Huskie Motors case study,

and

To apply acquired knowledge of data analytics and data visualization to complete scenario-

based exercises.

In the previous module, you were taken through a scenario of Huskie Motors Corporation’s

dilemma of the abundance of data available to them and the need to use it to make business

decisions. In this module, you’ll begin in Lesson 1 with a tutorial of how some of the Huskie

Motors analytics can be performed. Through a series of five videos, you will be walked through

how the data can be loaded, transformed, and analyzed, and ultimately visualized for effective

communication and decision-making. Following each video, there will be a set of knowledge

check questions that you’ll be required to answer before moving on.

After that, in Lesson 2, you will have the opportunity to apply the knowledge gained by

performing in-depth, scenario-based exercises on data analytics and data visualization. Before

you begin Lesson 2, you’ll need to access the data files that correspond to Module 3 from the

Resources link. You can download them directly onto your computer. Read the question

presented in each exercise, use the data files to figure out the best answer, and then come

back to the question and select your answer. You are allowed 2 attempts per question. If you

still do not get it correct, you will be able to click the ‘Show Answer’ button to view the correct

answer. All exercises in this lesson are based on the Huskie Motors case study reviewed in


Page | 80

Module 2.

When you are ready, click next to continue.

Module Menu

Module 3 consists of two lessons,

Lesson 1: The Data Analytics of Huskie Motors Corporation, and

Lesson 2: Data Analytics & Visualization Practicum


Lesson 1: The Data Analytics of Huskie Motors Corporation

Huskie Motors Data Analytics

Dr. Ann Dzuranin, co-author of the Huskie Motor Corporation case study, will present a series of

videos showing how to perform some of the data analytics required by Miranda and her team in

the case study.

Dr Dzuranin will take you step by step, using Microsoft Excel, through a demonstration of how to

work with the large amounts of financial and non-financial data that the Huskie Motor

Corporation is tasked to analyze.

In the video series, you will see how to:

Identify, extract, load, and verify the data;

Analyze the data to identify the most profitable brands;

Calculate profit margins, contribution margins, cost trends over time and more.

Huskie Motors Data Analytics: Video 1

Let’s now take a look at a video on identifying the issue and the data, extracting and loading the


Page | 81

data, and then verifying the data.


Hello, welcome to the Huskie Motors data analytics Video 1, ”Extract, Load, and Verify

Data.” Evaluating source data. I am Ann Dzuranin, and I will be discussing the material in

this video.

In video one, the issue that we have identified is that we need to load Huskie motorcars

2015 sales transaction data and confirm gross sales by region. The data that we’ve

identified we need is the 2015 sales transaction data. To extract that data, we contacted

the client and asked them to provide us with that data. The client has provided us with a

file, and the file is in a .TXT format, so it is in text file format. This is not an unusual way

that data will be downloaded and I should say extracted from databases, for example. We

need to take that data and import it into our evaluation or analytic software, in this case is

going to be importing that text file into Microsoft Excel. Once we have it in Excel, we’re

going to have to verify that all the data that the client had told us was in that file is actually

there and has actually loaded correctly into our Excel file. We’ll confirm that by comparing

what we’ve loaded into our Excel file to check figures that were provided by the client.

So let’s get started. First thing that we need to do is take a look at the text file that the

client has sent to us. So if you notice here we have what’s called the HMC client data file.

When you try to open that, it will open up a notepad and in this format it’s not really very

useful for us to work with. So instead we’re going to close that. We’re going to go ahead

and open up Excel and open a new workbook. Once I’m in Excel, now I go to the data tab

and I go over here where it says, all the way to the left, where it says get data. You click on

that get data, and notice you have a lot of choices from which you can bring data into

Excel. We know that we have a text file and so we are going to click on ”From Text\CSV” file

format. This will take us to our computer where we need to then go to our files and find

where we had saved that client data. Go ahead and import that.

Now if you have 2000 Excel, 2016, or a newer [version] you will have the option to actually

transform the data before you load it into your Excel spreadsheet. This can be a really

useful tool to use and I’ll show you on how this works. So if I look at my data right now, I

can click on, I can kind of page through it. It only shows me the first 200 rows just so I get

a feel for what’s come over. If I click on transform data, what that will do is, essentially

Excel will look at each of the data points and identify what type of data that is. So for each

column, for example, we would identify the VIN number as a text. ABC stands for a text


Page | 82

and that makes sense, right? Because it’s not a continuous number. The same thing for

brand model. Notice model year is just a number and that makes sense also because we

actually have, that’s really a category. It’s not the year when it was sold. Same thing for

seat and then series is a category segment, body style, etcetera. And so as we go through

these, you know, look and see if everything kind of makes sense, and it does make sense

to me, you know, month, day, and year. We don’t need those to be deep fields because we

have a sold date field here. Again, as we go through it makes sense that sales volume is a

number. It makes sense that the options and things like that are categorical or text fields.

And if we want to be sure though that all of our financial information is accurate.

So if we haven’t been through this, we haven’t been through this file before. Um, this file is

essentially sales for all of 2015 for Huskie Motor Car Company. And it starts with VIN

numbers who each row is a specific transaction, right? So it represents a sale of a unique

vehicle to an into a, uh, an individual. And all of the information that’s in here, um,

essentially describes everything about the vehicle that was sold. If we scroll all the way to

the left, we will eventually get to our financial data. And I wanted to point out a couple of

things that happen when we click on that transform, bringing over this text file into an

Excel. Um, we see that we have, um, currency is for gross sales and marketing, net sales,

labor tooling materials with and look at option costs. Again, the option costs. And we can

see that Excel has identified this as a text field, which initially doesn’t make a lot of sense.

Same thing for packaged cost, but notice what’s happening because the data set when it

was downloaded from the client’s database, but if there was a Z, instead of putting zeros, it

put dashes. And when it brings it over as a currency field with a dash in it instead of a zero,

Excel doesn’t know what to do with that. And so what Excel does is it says, no, this is not

currency because clearly we have text fields within here. We can in this transform

mode change this data type to be currency. And what will happen is we have the choice to

either add another column. So essentially duplicate this as the new column type or I, we

can replace it.

So I’m going to go ahead and click replace so you can see what happens.

So notice now that wherever there had been blanks or dollar signs with dashes, the word

error comes up. If I do the same thing for package costs and change that to currency, I will

get the same issue and we’ll talk about that in another, in just a minute. I just want to

quickly cruise through, the rest of these looks like tariffs, also was identified not as

currency. And so we’re going to go ahead and change that data type to currency and say,

okay, and everything else looks alright. So now that I’ve done this, I can go up here and

say close and load. And what will happen is all of those places where it said error, I’m

going to get a notification over here, a query that says, hey look, you’ve got 72 errors.


Page | 83

What you would do at this point is you would click on that and you would look and see

what the errors are. So it just brings over the rows where there are errors. And if you

scroll across, you will eventually find where the error is. So remember it was an option,

costs and package costs as well as tariffs. If they didn’t add an option, it should be zero. If

they didn’t add a special package, it would also be zero.

So what you would do at this point is you would actually go talk to the client. You would

confirm whether or not these should be zeros or whether they should actually have dollar

amounts in them. For our purposes today, we are just going to go ahead and load this

anyway and because now what’ll happen is it will just come through and it will just be

instead of an error when it loads into my Excel spreadsheet, it will just show up as a blank.

Okay. All right. So we have our data in Excel now and now the next thing we had wanted

to do, I’ll take you quickly back to the PowerPoint. What we want to do is check to see if

our file came over correctly. So what we want to do is confirm the number of transactions

are 1,308 that the gross sales total, $38,675,912, that the total number of blue cars sold

was 340 that our first sale happened on January 17, 2015, and then our last sale

happened on December 20, 2015. We can use pivot tables as a kind of quick and

convenient way to do a lot of these verifications. So for example, if I click anywhere in my

worksheet and click on pivot table, it being Excel will highlight my entire table range. I’m

going to go ahead and put this in a new worksheet since this has a lot in it already.

Now when I get to this worksheet, the first thing I wanted to do, remember, was

confirmed the total number of transactions. So a really quick way to do that would be to

say, okay, I could just count for that values, the number of VIN numbers in that column,

because each transaction is a unique VIN number. And so when I do that, I see that I get

account of 1,308, which if we glance quickly back is exactly what we want. So it looks like

the number of transactions is correct.

The next thing I would want to do is confirm sales. So what I can do for that is I can

continue to work in this same pivot table if I wanted to, but for an audit trail I’m going to

go ahead and do each confirmation on a different page. And I’m going to call this one

count of transactions. And then I’ll go ahead back to my first sheet and I’m going to

actually rename this so that we know exactly what it is. And this is 2015 sales. And now I’m

going to go ahead and insert another pivot table on a new worksheet. And for this one I

want to look at total gross sales. So I’m going to find my gross sales and I get my, some of

gross sales. I can change the format of that by going into the values, clicking on the down

arrow and picking value field settings. And then number format. I’m gonna go ahead and


Page | 84

change that to currency with zero decimal places. So it’s easier to read. Click OK. So

$38,675,192 and if we look back at our check figures, that is accurate.

The next thing we can using a pivot table is how many blue cars were sold. So we’re going

to go ahead back into our sales transactions. I’m going to go ahead and click anywhere in

there. Insert pivot table. Okay. And I am going to find color passed it. We’d go color. So

these are the various colors that they offer for their cars. And remember I want it to count

how many they I had of each color. Essentially what I really want I accidentally, that went

into rows. I’m going to drag it over to values and notice that I get a count of the VIN

numbers for all of the cars I could if I wanted to also bring color as a filter and then only

choose blue. And you can see that my account of blue cars sold is 340. I can go back to my

PowerPoint and confirm that that is accurate.

Now to confirm the first day of sale on the last day of sale. Probably the easiest way to do

that in the sales table is to make sure that your sold date column is sorted from oldest to

newest, meaning that the first one should be the first sale of the year and the last one

should be the last sale of the year. So the 12/20/2015 was our last sale 1/17/2015 was our

first, and that checks out with what we have over here. So what we’ve done is we have

gone through and we have confirmed the total number of transactions by counting the

number of VINs that we sold. We confirm the total gross sales of $38,675,192 by creating a

pivot table that summed gross sales. We confirmed the number of blue cars sold was 340

by creating a pivot table where we had color as the rows and then we filtered it for only

blue. And then finally we went and looked at our data, made sure that our sold column

was sorted from oldest to newest, and confirm that the first sale date was the 17th of

January and that the last sale was made on the 20th of December.

Remember when we first talked about what we were, our issue was today, one of the

things we wanted to identify was the total gross sales by region. And so after we verified

the data, we needed to provide that, those results. So we verified our data, we feel

comfortable that the data that the client gave us is complete, and accurate, and we went

ahead and totaled gross sales. I did this by using, again, another pivot table. So if we want

to pop quickly back into our Excel spreadsheet that we just created and we want to insert

a new pivot table here, I can pick region and another way that you can quickly find what

you’re looking for, you can start to type in the name of what you’re looking for. So I know

I’m looking for gross sales. I just typed in G-R and took me right to gross sales. And again I

click that down arrow, changed my field settings so that my number format is currency

with zero decimal places and it’s okay. And then we can see Europe, North America, and


Page | 85

South America and our total, which agrees with what we have in our visual here.

So the key takeaways. First, we had to identify your issue, which we did. We were looking

to verify our data and then, um, show the results for sales by region. We identify we were

using 2015 sales data. We extracted it via the client and loaded it into our Excel database.

They, we were given a text file by the client. We imported that into Excel. We looked at

how the transformation options are available in 2016 version of Excel and… and higher.

We talked about the issues of you have blank or missing data. We then verify that we had

the data. We were supposed to have confirmed our results and provided the summary of

sales by region.

So thank you for watching Video 1, “Extract, Load, and Verify Data.”


In the next video, we’ll see how to transform and analyze the data in order to evaluate Huskie

Motor Corporation’s financial performance by brand.


Hello, welcome to the Huskie Motors data analytics Video 2: “Transform and Analyze

Evaluating Brand Performance.” In this video, we are going to identify a business issue

facing Huskie Motors. We will identify the data we need to evaluate that issue. We’re going

to transform the data by creating a measure that’s going to help us address the issue. We

will then analyze the data and then communicate our results.

In this video we’re going to be evaluating the HMC financial performance by brand, and

the data that we’re going to use for that are the 2015 through 2016 sales transactions. The

transformation that we’re going to do is to create a measure that’s going to help us

evaluate financial brand performance, and we’re going to use the profit margin ratio

calculation as that measure. Once we create that ratio, we’re going to go ahead and

analyze the data by preparing a pivot table to summarize profit margin by brand, and that

will allow us to communicate the results as to which brand has the highest profit margin

ratio for 2015 to 2016.

As a review, I’d like to cover what the profit margin ratio calculation is. So we take for a

profit margin ratio, we take operating income and we divide that by sales. There are

probably other measures that you might be thinking would be also good measures of


Page | 86

financial brand performance, but the reason we’re choosing profit margin ratio is because

that helps us eliminate any type of volume differences between the brands. So the profit

margin ratio allows us to break down by brand what their profit margin ratio is regardless

of how many cars they sell of each brand.

So let’s go ahead and open up the Excel spreadsheet that we’re going to be working with

in this video. This is the Huskie Motors sales transactions from 2015 to 2016, and if you’re

not already familiar with the data, I’ll give you a brief overview of it. Each row in this data

set represents a single sale of a vehicle. And so you can see that the VIN number should

be unique in each row of this data. There are approximately 2,674 transactions that we’re

going to be working with. The other information we have are typical things that you would

think about if you were buying a car: the brand name of the car, the model of that car, the

model year, perhaps there’s a series number for that vehicle. Is it a full-size car or a

compact? For example, what is the body style? And then we have sold date information.

We also have, sales volume will be one for each transaction since you can only sell unique

vehicle once, drive configurations, engine, etcetera.

If we continue to page through or scroll over, I should say, we will also see that we have a

lot of information about options that can be added to it. We have the region in which the

car was sold, the country in which it was sold, how many days it was on the lot, etcetera.

So if we’ve continued to scroll over, we will eventually get the financial information, and

that starts with the column of gross sales. I do want to point out that there is also a

column called net sales. And what that column represents is the difference between the

gross sale amount and any variable marketing expenses that were incurred. So these

variable marketing expenses are based on specials and deals that may be happening at

the time that car was sold. So cash rebates, low-interest financing, special lease terms,

those are types of variable marketing campaigns being used. And then this number

represents what that cost was.

We're going to go keep scrolling to the end of our financial data, and you can see I already

created a column for profit margin ratio and we’re going to recreate that so that I can walk

you through how I did that. It’s fairly straightforward so I’m just going to type in profit

margin ratio. So we said that the calculation is operating income divided by sales. So the

equivalent in this data set is going to be our net revenue, and we’re going to take net

revenue instead of after-tax revenue. The reason we’re going to do that is after-tax

revenue is driven by the tax rate that is in the specific countries in which the cars are

being sold. And so we don’t want to muddy the waters by also having to account for

differences in tax rates. So we’re going to go ahead with net revenue, and we’re going to


Page | 87

divide that by gross sales. Now the reason we’re using gross sales instead of net sales is

because gross sales is a better measure of sales. Since net sales is really just net of a

marketing expense. Now if this was a retail company and net sales represented the

difference between gross sales and sales returns, well then you would want the net sales,

but that’s not the case here. So we go ahead and we take our net revenue divided by our

gross sales, and we get our profit margin. Now I would recommend going and changing

this so that it is a little bit easier to read so you can change up to percentage. I actually

probably would add another decimal place to that. Now we can use the autofill feature by

just double-clicking in the corner here and that will autofill all the way down to the end of

our data. What our profit margin ratio is. I do want to point out a word of caution on

autofill. Autofill will only work if there is a corresponding data point in the column

immediately to the left of where you’re filling. If there was a blank, for example,

somewhere in this profit margin ratio column then the autofill would stop there. So what's

a good way you can tell if you’ve got everything? Well, if you notice down here after I did

autofill, it gave me a count. So it gave me a count of how many rows it autofilled that

formula to. And we know from looking at the number of rows we had, right in the

beginning there that we have 2,674 so it looks like it autofilled correctly.

All right, so now we have our profit margin ratio, and what we want to do now is create a

pivot table so that we can evaluate which model has the highest profit margin ratio for

sales from ’15 to ’16. So what we can do if you click anywhere in your worksheet and go up

to insert pivot table, notice that Excel will automatically find the whole parameter of all of

your data. So you just have to be clicked within a cell for that to work. We’re going to go

ahead and put this in a new worksheet because we already have a lot in this worksheet.

So click OK. And now notice that you have your fields here that you can start to drop. And

you’ve got it pretty much got your canvas here that you can start to drop in what you want

in your pivot table. So we’re interested in profit margin ratio by brand. So if I click on

brand over here, you can see that it drops in the Apechete, the Jackson, and the Tatra. We

also want profit margin. So if we look down to the bottom, you can see, and these go in

the order in which you have your columns in your worksheet. Notice I have two profit

margin ratios because remember I had already created one and then we created a second

one as a demonstration. It doesn’t matter which one we use. I’ll go ahead and take the

second one we just created. So now notice what we get. It defaulted to give me the sum of

profit margin. And that doesn’t really make a lot of sense, right? We don’t want to add up

all the profit margin ratios. That really doesn’t give us any kind of number that makes any

sense for us to evaluate. What we really want to do is look at the average. So if I click on

the down arrow and I go to value field settings, you can see here where you get choices of

how you want to summarize those, the fields. So in this case we were going to look at the


Page | 88

average of the profit margin, and I’m just going to click OK.

Now if I want to change this so that it’s in a nicer format and it has the percentages,

instead of, the way it looks right now, I can go back to value field settings and click number

format. And here if I pick percentage, I’m just going to use one decimal place and then I’m

going to click OK, click OK again. And now I have a much neater-looking table here. Grand

total doesn’t necessarily make a lot of sense, either. So if you want to take out grand total,

you can just click on it. And then go to remove grand total. We can also change what we

want to call this if we wanted to call it something different as well. So now you can see

that we have created a pivot table of the average profit margin, and as we can see, the

Apechete is performing the best followed by the Jackson and then by the Tatra.

Now before we call it a day, let’s think about what we talked about in the very beginning

that there might be other measures that you may have been tempted to use to evaluate

performance. Let’s just say for sake of argument that we want to also look at net revenue.

So if I click on net revenue and I’m looking now at total net revenue, some of net revenue,

and I’m going to go ahead and change the format on that. So that’s a little bit easier to

read. So we can see putting in net revenue that the Apechete also has the highest sum of

net revenues. So the highest total net revenue followed by the Jackson and then the Tatra.

But notice that the Tatra has very close revenue to the Jackson, and yet if we look at which

one is more profitable, we can see that it’s the Jackson of those two and that the Tatra

actually has a negative profit margin ratio. So if we go back to our slides, essentially what

we have is the results here on this slide shows you the results that we just calculated in

the pivot table where the Apechete has the highest profit margin followed by the Jackson

and then the Tatra.

So what are our key takeaways? Well, what we did in this video is we identified the issue of

what is the financial performance of the brands. Can we identify which brand is

performing best in the years ’15 to ’16, we took the data, which was the sales data from

2015 and ’16. And we transformed that data by creating a measure of profit margin ratio.

We then took that profit margin ratio and conducted a pivot table analysis, which helped

us to identify that the Apechete was in fact the best-performing brand.

So thank you for watching Video 2: “Transform and Analyze.


Page | 89


Now that you saw how some data analytics are performed using Microsoft Excel, we’ll move on to

some exploratory data visualization using Tableau to understand the Huskie Motor Corporation’s

variable costs. Video Transcript (Est. time: 22:20 min):

Hello, welcome to Huskie Motors Data Analytics Video 3, “Exploratory Data Visualization,”

understanding variable costs. I am Ann Dzuranin, and I will be discussing the material in

this video.

What we’re going to cover in Video 3 is first to identify the issue that we’re going to

examine, which in our case for this video is going to be evaluating Huskie Motor cars’s

variable cost. We have to identify what our key questions are. So what are we going to

explore in regard to Huskie Motor cars’s variable cost. Specifically, we’re going to be

examining which models have the highest variable cost, which costs comprise the largest

percentage of total variable costs, and how have variable costs changed over time.

The data that we’re going to be using is the data we’ve been working with in the other two

videos, the 2015 to 2016 sales transaction data for Huskie Motors. We’re going to explore

the data in two ways. We’re going to use Excel, and we’re going to use pivot tables and

charts to help us investigate the identified questions. And then we’re also going to use a

visualization software called Tableau to investigate the identified questions.

So let’s go ahead and get started. The first thing we’re going to do is get our Excel file

opened up. So this is the sales data for Huskie Motors. It’s the same data we worked with

in the last couple of videos. We’re going to be using this data to create three visualizations.

We’re going to explore it using both pivot tables and some pivot charts. What we want to

do is identify model variable cost. So this is our goal, to create this data, pivot table, and

this visualization. So that’s our first step. So let’s go ahead and start a new worksheet, and

we will go ahead and do that. So we want to know by model what total variable costs are.

To do that, we can go back to our sales data and click anywhere in the data sheet and go

to insert pivot table. So Excel will automatically highlight all the area, and create the table

that you’re going to be using. We’re going to go ahead and say to put it in a new


Page | 90

worksheet. So we get to our new worksheet. And now we have our canvas here to work

with to develop our pivot table. And from that pivot table we will develop a pivot chart.

So we know that we want to look at variable costs by model. So let’s go ahead and bring

model down to rows and then let’s bring years sold into columns. We’ve pretty much set

up how we want our rows and columns to look. We want each model listed as an

individual row. And then we want to look at the cost in 2015 and 2016. So now what I

need to do is grab that cost information and bring it over to values. So I’m going to go

ahead and scroll down until I come to total variable cost. And I’m going to bring that down

to values. It automatically creates a sum, and that’s okay. We’re interested in sum this

time, but we do want to make those fields, the number format look a little bit better so we

can better read these numbers.

So I’m going to go ahead and choose currency, and we don’t need decimal places for this

analysis, and I’ll hit OK. And then once I hit OK again, it’ll format my numbers. It looks

much easier to understand. So pivot tables are really useful. We’ve summarized all of our

total costs by model for each year, but you know, it does get to be a little confusing to now

try to look through here and make sense of, okay, which one is highest, which one is

lowest, which ones changed the most? So often visualizing the data is going to enable you

to get to those answers a lot more quickly.

So we can go ahead and work with what we have and we can actually create a pivot chart

if we would like to. So notice I’m in pivot table analyze and I have this choice over here for

pivot chart. If I click on that, it’s going to open up a box for me here to decide what type of

chart that I want. I’m okay with the column chart for now. So we’re going to go ahead and

click OK. And now we have our column chart. So as you can see, I’m going to close our

field tables over here. As you can see, we now can much more quickly identify which

model in 2015 had the highest total variable costs. Just by visually looking. We can see that

it’s Jespie and we hover on that point, we can see how much that is in total for that year.

And then if we wanted to see which one had the highest for 2016, we can again visually

pick that out pretty quickly, and it is the model called Pebble. We can also see visually

which ones are going up or down, right? And which ones have stayed the same. So very

quickly we can get to the information that we would like to understand better just by

taking that pivot table and changing into a pivot chart. That’s the power of using data

visualization for exploratory analysis.


Page | 91

The other question that we wanted to answer was which costs comprise the largest

percentage of total variable cost. So for that we want to look at the model variable cost,

but we want to look at it broken down instead of by model. We want it broken down by

the type of cost. So we know from the data set that our costs, that go into our total

variable costs include labor, materials, overhead, freight, warranty, package cost, and

option cost. So let’s go ahead and create a pivot table with those and then we can use that

to create this pivot chart. So again, I’m going to go back to the sales data set and I’m going

to click anywhere within my worksheet and go to insert pivot table. Now you might be

thinking, why don’t I just start with a pivot chart? And you could do that. You could just go

right to pivot chart, but it often is, it’s good to have the numbers in one place and then you

can manipulate the chart from those numbers vs. continually having to change the chart.

I’m going to go ahead and click OK to put this on a new worksheet. All right, so remember

now we want to look at the breakdown, by year of variable costs of the components of

variable costs. So what I would like to do is find each of my components and bring those

over. So let’s start with, let’s see, we said that it was labor, materials, option costs, package

cost, freight, overhead, and warranty are all in my variable cost. So notice now it’s just

putting a sum here. I also wanted it to be by year as well, right? So, what I would want to

do then is say, maybe I want these as rows. So I just drag that from columns, the values

from column to rows. What that did, this value represents all of the values that I’ve put

over here. And then I’m going to take years sold, which I believe is up here somewhere.

Right? And bring that over to my columns. So now I have all of my variable costs

components and how much they are for each of the two years and how much they

comprise of grand total. Now we could go back into each one of our fields here and go

ahead and change the format of those. Or we can change the format here. The benefit of

doing it within the values is then it stays within that. If you change your pivot chart around,

add things or move things, if we just change it here, the way we would do that, if we were,

in any spreadsheet on the home sheet, in the number section, obviously that will work but

again, if we change or add a column to this, that column would not be formatted at the

same way. Okay.

So we’ve got all of our components. And again, now how do we want to visualize this to be

able to see what comprises most of the variable costs. I think even just from the grand

total, we can see that that’s going to be materials which makes sense in the production of

cars. But let’s go ahead and click in your table, anywhere in your pivot table, on your top

ribbon pick pivot table analyze, which should be highlighted once you’re in that pivot table.

And go ahead and pick pivot chart. And again, we have a clustered, column is the default


Page | 92

that Excel is suggesting for us. We could choose another one. If we would like, it doesn’t

really always give us the best choices. So typically what I would do is stick with the

clustered bar. And so then you can see what we’ve done is we’ve brought this, I’ll bring this

over here and close our fields. So again, you know, we can kind of confirm quickly that

materials does make up the majority of our variable cost, and we can see followed really

closely by labor and option costs, warranty costs. We can also see some, a little bit of a

trend on which ones are going up and which ones are going down.

Okay. So now that we’ve, taken a look at this in Excel, why don’t we think about how we

might look at this in a different, well actually before we even move from Excel, let’s go

ahead and do that comparison. The year-to-year comparison. We just want to see, how

things- our third question was how have variable costs changed over time? So there’s a

couple of ways you could do that, right? We can look at it in total. And so let’s just go

ahead and create a new pivot table that looks at how variable costs have changed

from ’15 to ’16, let’s say. So let’s go ahead and again, click anywhere in our data, go to

insert pivot table, and now we can see that we can go ahead and hit a new worksheet. So

what do we want to know here? Here we wanted to know by years sold, don’t want to sum

the years sold. What is our total variable cost?

Okay. So, if I wanted to go ahead and change this so that it was formatted in currency, I

can go ahead and do that. I’ll do zero decimal places. So now we have basically our total

variable costs and we can see what it is each year. Go ahead and click OK, so it changes

the format. What if I wanted to create a pivot chart of this so that I could compare year to

year. So again, click anywhere in your pivot table and then click on pivot chart and you will

get your option here for your pivot chart. And so we can see that, very quickly that total

variable costs have gone up from 2015 to 2016. I want to point out a couple of things in

the default visualizations for Excel. Notice that we’re not starting at zero. That can always

be a very confusing type of visualization to create. So, this would not be considered best

practice for visualization. But again, we’re just using this for exploratory. We’re not using it

to explain or communicate our results yet. We’re just using this as a way for us to better

understand what’s happening. So it’s okay for that reason. So we can see that we’ve got an

increase in total variable costs, but what drives variable costs, sales volume, right? So we

really need to compare this to sales volume so we can go ahead and create another

visualization that gives us sales volumes.

So let’s go ahead, back to our data again, click anywhere in the data, click on insert pivot


Page | 93

table. Now in this case, I don’t want to put it in a new worksheet because I want it in the

worksheet I was just working in so I can compare the two. So I’m going to click on existing

worksheet location and then click in that location box. And what you need to do there is

go back to the sheet that we were on, that had that new visualization and it was this one

and click on a cell in there for it to go to and say, okay. All right, so now we’ve got another

option here for another pivot table. What I would like to do here again is look at sales

volume now by year. I’m going to go ahead and pick my year sold. I want a sum of my

years sold. And let’s go ahead and look at sales volume.

Now we had a sales volume number, but there’s another way that we can do that. We can

actually take the VIN number into values and it’ll automatically do a count for us. Okay, so

it’s counting the VIN numbers for us. So now we have the sales volume. I’m going to go

ahead and click within that pivot table and create a pivot chart and click OK. So just briefly

looking at this, and again, our scales are different though, right? But we can see that at

least we know that sales volume went up. In addition, right to our total variable costs. But

wouldn’t it be nice if we could see all of this in one visualization. I’m going to show you

how to use a visualization software to create an image like that and also to get a better

understanding of what your total variable costs might be.

So I’m going to go ahead and use a program called Tableau. Now I’m going to open up the

Tableau file that I’ve already started for this. So in this Tableau worksheet this looks very

similar as you can tell to a pivot table, right? We’ve got areas where we can drag our rows,

areas where we can drag our columns to create our visualization. The power of a

visualization software is that generally it has many more visualizations to choose from.

And in addition, you can manipulate things, a lot more easily. So let me show you as an

example. So one of the things we want to see was total variable costs. We’re going to see a

breakdown of total variable costs, right? So if I were to take my individual cost, I could do

that by again just clicking on each of those items. Now in this particular program, it will

alphabetize the data vs. having it in the order it was in the worksheet. There’s pros and

cons to that I suppose.

So we know that we want, freight, and if you double-click, it’ll bring it over. We want labor.

We want materials, we want option cost, overhead, packaged costs, and warranty. Now

notice we get something that we’re not really sure if this is exactly the visual we would like

to look at. We also want to be able to look at it, by year as well, right? So if I drag my sold

date up to my filter and pick years, then I can choose which years I want. In this case I


Page | 94

want both. So I'm going to hit OK. I can also drag that over to be in columns, let’s say. So

that then I’ve got, all of my information in here that I would like, now I can just go to the

show me button and pick a visualization that I would like to see. So what I’ve done is I’ve

picked this column chart and I can see in 2015 which items made up the majority of my

costs. And in 2016 which items made up the majority of my costs. We knew from our

previous analysis that it was going to be materials, but again, very quickly, visually we can

see that. Now, back to our question of, well, what if I wanted to see, 2015 total variable

costs and compared to volume. So let’s go ahead and start a new sheet and let’s think

about how we would want to do that. Well, we know that we would want to look at total

variable cost. We also know we would like to look at it by year. And we also know, that we

would be interested in seeing what our total sales volume was as well. So we can see I

have total variable costs and I have sales volume, and it’s not really all that much more

meaningful to me except I can see they’re both going up. But if I wanted to, I could, take

my sales volume number and I can use that as a dual access. So if I click within this second

visualization, I can pick dual access. Now what this does is allows me to look at sales

volume and variable costs. These are two different measurement scales, but it puts them

on the same plane so that I can see if my variable costs are rising at the same rate as my

volume.

And so after doing that, I can feel pretty comfortable, right? That they are moving in sync,

which is what we would expect variable costs to do. So again, just to summarize, this was

our total variable cost by modeling year. This was the one that we prepared in Excel, and

again confirming that we had the Jespie was the highest. Jespie was the highest in 2015,

and Pebble was the highest in 2016. We then looked at a breakdown of variable cost in

Excel so that we could see which of the variable costs were contributing the most of the

total. And that clearly came out as some of the material costs. We also split it by years that

we could see which ones are increasing or decreasing. And then we wanted to look at the

change over time and we did it initially in Excel, but then decided that really to evaluate

whether or not your variable costs are rising at the same rate as your volume, you need to

see both together. And so we used the visualization software Tableau to take both sales

volume and variable costs for both years. And using a dual access, we were able to plot

them both together in the same graph.

For our key takeaways today, we identified our issue, which was to evaluate Huskie

Motors’ variable cost. We identified the questions, specific questions, which models had

the highest variable cost. It was the Jespie in ’15, and the Pebble in ’16. Which costs


Page | 95

comprise the largest percentage of total variable costs. And that was materials for both

years, changed over time. And although they’ve increased from ’15 to ’16, they’ve done so

in sync with sales volume. And then we explore the data in both Excel and Tableau.

So thank you for watching Video 3, “Exploratory Data Visualization.”


In the next video, we’ll perform some statistical analysis to evaluate total variable costs and

prepare a histogram. Video Transcript (Est. time: 23:02 min):

Hello, welcome to Huskie Motors data analytics video for statistical analysis,

“Understanding Variable Costs.” I am Ann Dzuranin, and I will be discussing the material in

this video. So let’s go over what we’re going to cover in the video. The first thing we’ll do is

identify the issue. We’ll then identify the data we need, we’ll analyze the data and then

evaluate our results.

So the issue that we’re tackling today is the use of descriptive statistics to evaluate our

total variable cost. The data that we’re going to use is the 2015-2016 sales transactions.

We’re going to prepare descriptive statistics and prepare a histogram of variable cost as

well. And then we’ll summarize our key findings. So let’s go ahead and get started. We’ll

look at our Excel file that we’re going to be working with today and that’s the sales data

from 2015 to 2016. And what we’re trying to understand better are our total variable cost.

And we’re going to do that by preparing some descriptive statistics and a histogram of the

distribution as well. So this is what our goal is going to be to create these descriptives and

to create a histogram that gives us an idea of what our distribution looks like.

So what we’re gonna do first is create this, these descriptive statistics. Each of these

statistics can tell us something really important about our data. So let’s just go ahead and

get started in and recreate this. If we go to the 2015-’16 sales data, and I’m going to scroll

over until I get to my total variable cost, that’s what I’m analyzing in this video, total

variable cost. So I’m going to go ahead and copy this column of data and put it in a

separate worksheet so that I don’t have it, all in one large spreadsheet. Go ahead and


Page | 96

copy that and put it into a new sheet. So this is the data that we’re going to be working

with in this video. A total variable cost. Each row represents a specific car that was sold

and the variable costs associated with that vehicle. Now descriptive statistics can be done

using the data analysis tool in Excel. So if I click on data in the toolbar, I can see in my

ribbon up here that in my data ribbon, that there is a section called analysis and I have a

tool called data analysis. It’s possible that you don’t have that in your Excel. It’s possible

you need to add it. So if you don’t have it and you need to add it, what you’re going to do

is click on file and then go all the way down to options. Once you’re in the options screen,

you want to go down to add-ins. Once you’re in add-ins you want to go to the bottom

where it says manage Excel add-ins and click go. Now you should see a pop-up box here

that gives you the ability to add, some analysis things to your data ribbon. So make sure

you have analysis tool pack checked, that’s the first box. And then click OK. It should pop

up right in your data tab. If it doesn’t, then just refresh. Go out of your data tab and go

back to your data tab and it should be there. So once you have the data analysis tool, go

ahead and I’m going to click on that and show you what’s in here.

So this data analysis tool essentially summarizes all of the statistical functions that Excel

has available for to use. We’re going to be working with today two of these, we’re going to

work with the descriptive statistics tool and then also the histogram tool. So go ahead and

click on descriptive statistics and click okay. And now you get, a box here for you to fill in

the information to calculate the descriptive statistics. So your input range is going to be

the range of data that you would like to, get the descriptives on. So I always like to take

the column name because then it returns that in my results. I’m going to click on the first

row and I am going to highlight all the data in that row. Because I picked the label, I’m

going to make sure I click labels in first row. If you don’t do that, then Excel is going to tell

you there’s non-numeric data in the range. Because there are words at the top. We’re

going to go ahead and keep that in this worksheet. So I’m going to click on output range

and then click in that box and I’m just going to go ahead and scroll back up to the top here

where I want to put my results. Then I’m gonna click summary statistics. Otherwise I won’t

get any printout at all. Once I’ve done all that, I can click okay. And now what you see is a

report of variable cost. One thing that I like to do, because sometimes it can get confusing,

there’s many decimal places and no commas. I at least like to go to my formatting here for

numbers and click common style, just makes it easier to see the, the statistics, and read

the numbers more easily. Okay.

So, what I’d like to do is kind of point out what are the important, results in this variable


Page | 97

cost our most important descriptives. So if we look at these, I’m going to go ahead and

highlight them, as I talk about them. So the mean is the first one I want to talk about. We

all understand what a mean is, right? It’s a sum of all the observations divided by the

number of observations that we have. But one thing to keep in mind with a mean is that it

also can be influenced by outliers. So you have, you know, one or a couple of very high

numbers, it’s going to pull that mean toward that high number. Conversely, if you have a

few, extremely low numbers compared to the rest of yours, it’s going to pull the mean

down. So you always want to keep that in mind. The other measure of location that we’re

going to talk about is the median. So the median, essentially, is what is determined by

taking all of the values arranged from smallest to largest. And then the median is that

middle value. So the median is not influenced by outliers. And that’s an important thing to

keep in mind. If there are outliers in your data, then you probably want to go with the

median instead of the mean. The mode is also in here, I wouldn’t exactly highlight that as,

as something that’s important for this data set use of the mode is, it’s essentially the

observation that occurs most frequently. It’s really best used with small data sets that

have a small number of unique values and not really useful if you have a data set that has

a lot of repeating values. So those are the measures of location that that we’re going to be,

using or interpreting in our data. Measures of dispersion are also important. And so

measures of dispersion include the standard deviation and the variance. So standard

deviation and variance and range would also be considered, to be measures of dispersion.

Of these three the one that’s going to be most frequently used is going to be the standard

deviation. The standard deviation is an indication of how the observations in our data set

are spread out from the mean. So if we have a low standard deviation, then that would

imply that the observations are all close to the mean. If we have a large or high standard

deviation, that would indicate that the observations are more spread out, we get to the

standard deviation by calculating the variance. So the variance is the average of the

squared deviations from the mean. But the reason that we generally use the standard

deviation is the standard deviation, it’s more practical to use because it’s the same

measure dimension as the data. So in other words, 5,573.89 is 5,573 and 89 cents of

variable cost. Whereas the sample variance really is not that interpretable. The range is

also an indication of dispersion. Obviously the larger the range, the more possibility you

have for your data being dispersed. However, it’s really not a good measure; as good a

measure as using standard deviation.

The other things that we can, glean from our descriptive statistics are the measures for

the coefficients for Kurtosis and skewness and what these measures do, I'm going to


Page | 98

highlight them in different colors. What these measures tell us skewness and kurtosis and

skewness tell us is the shape of our data. So skewness is an indication of how symmetrical

our data is, how, what’s the symmetry or lack of symmetry of the data. So if we have data

that is symmetrical, then that would mean that is all spread around the mean. If we have

data that is not symmetrical, it’s either going to be positively skewed or negatively skewed.

And with that suggest is that there’s more data at one end or the other of the distribution,

kurtosis is a measure of how peaked our data is. So what we mean by that is we look at,

and when we look at the histogram, we would have seen this. That gives us an idea if we

have one area of highly peaked data and then all the rest is, spread from there. So it gives

us an indication of whether or not there’s a somewhat flat dispersion or a wide degree of

dispersion. That compared to whether or not there is a peaked amount of dispersion.

Okay.

So, the other interesting thing to look at here is if we look at the minimum, the maximum,

these are going to be useful for when we’re trying to determine what our bins are going to

be or what our groupings are going to be for our histogram. The minimum is the smallest

observation in the data, and the maximum is the largest observation. And then finally

count as an important thing to look at because it tells us very quickly whether or not we

have all of the data observations that we are supposed to have. So we know, that we are

supposed to have 2,674 observations, and we do have that. So it gives us some

confidence that we’ve got all of our data. All right, so now we’ve done, we have these

descriptives. What we can tell is that the average or the mean variable cost for a vehicle is

$17,841, the median is $15,516. We can also tell if there’s a standard deviation from that

mean of about $5,573. So in determining whether or not a standard deviation is large or

small is relative to the amount of your mean. So I would say that if we’ve got a mean of

$17,841 and our standard deviation is $5,573, there’s a pretty good amount of deviation

from the mean. So what we want to do next is we want to go ahead and do a histogram of

our total variable cost to get a feel for where we have buckets of data, of observations.

The way you want to do a histogram, generally you don’t want to have more than eight

bins in a histogram. We’re going to go ahead and create our bins over here, which is

essentially the columns of data that we’re going to have on the column bar chart. So to

create bands, you generally want to start close to where the minimum is and then

increase it by equal amounts until you get to the maximum. So that’s where the

minimum/maximum come in handy.

I’m going to go ahead and start my bins at 9,000 and then I’m going to increase those by


Page | 99

5,000 for each bin. So if I put in 9,000 and then 14, which is the increase, I can just drag

that down until I get up 39,000. These are the bins that I’m going to use to create my

histogram and the groupings of variable cost observations. So let’s go ahead and create

that histogram. If we go back to our data tab, and we click on data analysis, and then we

click on histogram and click OK. What Excel is asking us to do is tell us, okay, what is it you

want me to put in your histogram? So that’s our variable costs. Again, I’m going to pick the

label and I'm going to pick all the data in there. The Bin range is going to be those bins

that I just established. So I want to see my groupings. I’m starting at 9,000 and going up by

5,000. And I’m going to put the output range here on this worksheet so that we can see it

right next to what we’ve created. Now down here is where you’re gonna want to chart

your output so you actually do get the histogram. Go ahead and click OK. Now you can

see what I get. I get my bins repeated and then the frequency, how often I see an

observation that is in that range. So 9,000, that essentially is that minimum one of 8,912.

Notice that I’ve got the highest amount in the bin between 14 and 19,000, which makes

sense because our mean is 17. Our median is 15. So it makes sense that this would be the

highest, the highest amount of observation. So it’s basically saying that variable costs

between 14 and 19,000 occurred 889 times. So it gives us a really good indication of

where, we have our, most of our observations distributed. It also gives us maybe a hint

that perhaps this is an outlier. We might want to go back and check and be sure that data

is accurate. And then when we’ve got only three that are above the 39,000, we might want

to make sure that data is accurate as well. So I’m going to go ahead and call this sheet

descriptives.

So our goal though is to understand how total variable costs are, per model. And so we

know now that we’ve got a pretty significant standard deviation, we know what our overall

mean is for all of our vehicles that we sold, but we really want to get a little bit more

detailed. So we really want to understand better what’s happening by each model. So to do

that, we can create a pivot table. And what we’re going to do is go back to our data and

again, as we did and in other videos, we’re going to click anywhere in that data, go up to

insert pivot table, and we’re going to go ahead and put that on a new worksheet. And then

click OK. So what I’m interested in is looking at by model what my total variable costs are.

But I want to think of it in the same way that I just did my, other variable cost analysis. I

want to look at the average or the mean and the standard deviation. So if I click on total

variable cost, it brings that down into my values as a sum. And I’m not interested in sum

right now, I really want to understand better what’s happening by looking at the average.

I’m going to click on average, and I’m going to change my number format so it’s easier to


Page | 100

read to currency with no decimal places and click OK. So now I have my average. I also

want to look at the standard deviation. So I’m gonna drag that variable costs down again.

Again, it’s going to give me a sum, and I don’t want it to be a sum, I want to change that to

be the standard deviation. So I have that option there. Again, change the number format

to currency with no decimal places. And click OK and click OK again. So now I can see by

model what the average variable cost is and what the standard deviation is. This is helpful

information because it can help me to determine if I have a specific or a group of models

that maybe have a much higher standard deviation than others indicating that there’s a

very large variance in what’s happening with my total variable cost. That would help me

take that to that next step, where I want to really start diving into that model and looking

at the individual variable costs that are going into those models to see what’s happening

there. Perhaps there’s a correlation between the types of options, that the buyer puts in

and the variable cost. So there’s still a lot of research to be done, but this will help me at

least narrow it down to where I should start my investigation.

So although it’s all great in a table, it really would be useful to see this in a visualization. So

again, if I click anywhere in my pivot table and go to pivot table, analyze and click on pivot

chart, I will get the option to create a chart out of that data. So notice that Excel is going to

give me the option to do a clustered column, which doesn’t really help me all that much.

Obviously I can see by looking at which models have the a higher standard deviation, but

they might also have a higher average total variable cost. Another useful way to look at

this would be to use a combo chart. The default that it’s giving me is bars for average, total

variable and align for standard. Let’s look at this as to, as a stacked column and then click

OK. So I’m going to make this bigger so we can see it a little bit better. So now we can see

by this visual, we can right away see the total of this or the total average variable costs and

then we can see the standard deviation of that. So again, we could see this one, they were

side by side as well, but we can clearly see that the Chare has the highest variable costs as

well as the highest variance, the highest standard deviation. So we may want to

go back and look at the Chare model. Look at the Island model and perhaps look a little

more deeply into, either Pebble or the Jespie. This is just one step in our analysis journey,

but it definitely helps us to narrow it down.

To summarize what we’ve done in going through, we identified our issue, our data, and we

analyzed it. This was the analysis of our descriptive statistics. This was our histogram

where we were able to see that. In our descriptives, we see that we have a mean of 17,000

in variable costs per model. The median was 15,000, but we have a pretty large standard


Page | 101

deviation, as well. It’s about, a little more than a third of the mean. So we then took that

information that we wanted to see. Well, let’s see what our distribution looks like, right?

Let’s see what our dispersion looks like. And we can see the majority of the observations

are between the 14 and the 24, which coincides with what our descriptive statistics told us.

And then we took it one step further and we wanted to see by model what the

descriptives looked like. And, so we really wanted to get an understanding of which

models seem to have the highest variable cost and combined with the highest standard

deviation. And we can see that the models that stand out to us are the Chare, the Island,

and the Pebble. The ones with the highest standard deviation is definitely the Chare. So

our next step will likely to be to go back and look at the models that seem to have the

highest variable cost, average variable cost and the highest standard deviation, and see if

we can dig down a little bit deeper to see what costs within those variable costs are

perhaps driving that or maybe some correlations with what options the buyer bought.

So our key takeaways, we identified that we were looking at trying to better understand

total variable cost using the 2015 to ’16 sales data. We did descriptive statistics on total

variable costs, and then we broke that down by model, so that we could evaluate which

specific model had the highest average variable costs and the largest standard deviation.

So thank you for watching this video for statistical analysis.


In the final video of this data analytics video series, Dr. Dzuranin will show you how to use Tableau

to bring it all together to create a dashboard with key performance indicators (KPIs) of the Huskie

Motor Corporation. Video Transcript (Est. time: 24:41 min):

Hello, welcome to Huskie Motors data analytics Video 5, “Explanatory Data Visualization,”

creating decision useful dashboards with key performance indicators. I’m Ann Dzuranin,

and I will be discussing the material in this video. What we will cover in this video is we’ll

first identify what the issue is. And the issue is that we want to create decision-useful

dashboards with key performance indicators. So what we need to do is identify the

dashboards that are going to be needed and then the relevant key performance


Page | 102

indicators that we would like to have on those dashboards. So when reading through the

Huskie Motors case, we’ve identified that the Huskie Motors management is interested in

understanding and being able to quickly evaluate region performance, brand

performance, model performance, and overall performance.

As we discussed in Video 2, deciding on performance measurements can, there can be a

variety of those, but what we’ve determined is that, we’ve determined that the KPIs that

we’re most interested in are going to be gross sales, sales volume, net revenue, and profit

margin. The data we’re going to use to create the dashboards is going to be the 2015 to

2016 sales transactions for Huskie Motors. We’ll use data visualization software. In this

video I’ll be demonstrating the software Tableau to analyze the data and evaluate that

with the visualizations that we create. We’ll then communicate our results by preparing

interactive dashboards, allowing management to be able to filter the dashboards to see

down to the most granular level of information that they’re interested in.

So let’s go ahead and get started. As I mentioned, the software I’m going to use to

demonstrate the visualizations in this video is called Tableau. Any visualization software

can be used to do something like this. But we’re going to work with Tableau today. I’ve

already loaded Tableau data into the Tableau file and that’s that 2015-’16 actual sales data.

So we can go right to the first worksheet and start to create visualizations that we then

use in our dashboards. So we identified that one of the things we’re interested in is gross

sales. So lets just look at gross sales right now by region. So I can go over and I can quickly

pull region into my columns and then I’ll find gross sales and drag that into rows. So now

you can see I’ve got Europe, North America, and South America, my three regions, and I

have gross sales, and this represents total gross sales, but I actually want to look at gross

sales separate 2015 and 2016, so the other thing that I need to bring over is going to be

the sold date. So I bring over the sold date into columns I’m going to get then the 2015-

2016 data.

Tableau, like a lot of visualization software packages, will give you a default that they think

is best. That’s not always the case. I want to look at this in column, bars so that I can kind

of look and see visually from one year to the next what’s happening with my data. I can

quickly change the name of my sheets so that I remember what I’m working with, and

we’re going to go ahead and call this gross sales by region. The other thing I want to think

about is how is it that I want to be able to view this data? Am I interested in region and

year, just years, when I have a color scheme that’s identified years as two different colors.


Page | 103

So ’15 is blue and ’16 is this orange color. That may be good if I’m just concerned about

year-to-year comparisons. I’m actually concerned about region to region and then within

that year. So I actually would prefer to have region as my color. So what I’m going to do is

take region and drag that into my colormark. Now I have colored my regions. So now any

of the visualizations that I use if I choose to use region is my color scheme then I can

quickly identify, you know, orange is North America, red is South America, blue is Europe. I

could also, if I wanted to, see the exact number of sales instead of having to look across at

the axis, I could also just drag gross sales into the label and that’ll put it on top of all of my

bars. Now that’s great, except it’s a little bit large. So I can format that number first of all,

so that it’s currency. And then I can also change the display unit. So then I just have it say

in millions. So now we can see very quickly what our sales are. We could actually even

remove this gross sale access over here if we wanted to as well to take away some of the

additional ink on the page that we may not need. I also might be interested in actually

seeing this with the highest gross sale, region first and then filtered from there. So what I

can do is go to region and I can sort this, and I can sort this based on the field, which

would be the sum of gross sales. And I’ll do it in descending order so that it’s from highest

to lowest. So now I can quickly see that North America, even though I could tell that from

the other version of it, at least this way I can see them in order as to which is contributing

the most to sales, gross sales by region.

So the power of doing visualizations is not only to be able to change the type of chart that

you’re using, but also to be able to make it interactive. And the best way to make

something interactive is to be sure that you have the information in here down to the

most granular region. So, for example, right now I have gross sales by region. I know

eventually I’m going to want to look at gross sales by brand and by model. So what I can

do is be sure that I have those in here as filters. So if I drag brand to filters right now, I

want all of them in there and I’m going to click OK. If I drag model to filters again, I'll

choose all right now and click oK. So what this allows me to do is if I only wanted to see

how the brand Tatra was doing in all the regions, I could just pick Tatra and click OK. And

now I have my visualization has changed and I can just see how the Tatra brand is doing.

So that can help to make something really powerful. And so this way, if I want to just

quickly use my visualization in a dashboard, now we’ll be able to filter down all the way to

the granular level for what we’re looking at right now is model. So this is a demonstration

of how we actually work with visualizations.

Let’s go ahead and look in a completed Tableau file that has put, where we’ve put together


Page | 104

the dashboards for the areas that we identified that we would like to have dashboards

ready to help us make decisions. So I’m going to switch over to the completed Tableau file

and we’ll start again with region performance. So what I’m going to do is put this in view, a

display mode. So this is a visualization that was created for region performance. So

remember we said we’re interested in, at revenue, we’re interested in sales volume, we’re

interested in profit margin. So the visualizations that were created for this particular, um,

dashboard show us net revenue by region, the increase and decrease in sales volume by

region, and then our profit margin by region. So very quickly in this dashboard I can see

that I’ve got profit margin, North America is decreasing. That could be driven and probably

is largely driven by my decrease in sales volume. I have a slight decrease in, um, South

America from ’15 to ’16. So that one I can see, although I have an increase in net revenue,

my profit margin is going down a bit. Now my sales volume line up. So this could be some,

somewhat due to perhaps increased expenses that’s driving my net revenue down. And

then when I look at my Europe, I can see that Europe actually is improving. So my profit

margin went from 10% to 13.6% from ’15 to ’16. And I could see my net revenue reflects

that. A large increase as well. And I can see that my sales volume is up there as well. I keep

in mind you can also use this to just, um, filter based on one, um, aspect of it. So I just

clicked on Europe, and then I have my other visualizations changed to only show me

Europe. So if I don’t want to look at what else is happening, if I’m just interested in looking

at Europe today, I just go in, I click on Europe, and that’s the only ones that pop up for me.

If I click on that again, it’ll bring them all back. And the same is true if I just clicked on

North America or I just clicked on South America.

So this is our dashboard that we’re going to use for region performance. Now we also said

another dashboard we’re interested in is brand performance. So I’m going to show you a

brand performance dashboard. So this dashboard, again, we’re looking at profit margin,

net revenue, and sales volume. I could also have brought gross sales over into the

visualization. Um, but you probably want to keep your visualizations to three and maybe

four visuals within the dashboard. Makes it easier for the decision maker to kind of zero in

on what they’re interested in. So now I’m looking at it by brand and I can see my brand

Apechete is decreasing on profit margin. Jackson has increased and my Tatra, now this is a

problem, ’cause I can see that that brand actually has a negative profit margin. It’s

increased a little bit from ’15 to ’16, but something is going on here that I really need to

look further into. If I look at brand performance, net revenue, again, I can see that the

Apechete, even though net revenue has gone up a bit, our profit margin is going down.

When I look at my Jackson, my net revenue has gone up, as has my profit margin. And


Page | 105

when I look at the Tatra, my net revenues going up a bit, but you know, that’s still in the

negative area here. So I really need to start thinking about what’s going on with the Tatra.

Um, and another thing that you can do in visualizations is you can create what’s called a

dual axis. So I’m interested in looking at how sales volume has changed from one year to

the next within my brands. And I can see, I can look at sales volume and I could have

another chart that just looks at the sales volume change, or I can put the two together. So

that gives me kind of an indication of the magnitude of the increase or decrease in my

sales volume. So if we look at the Apechete, for example, we had a sales volume in 2015

of 371 and that increased to 395. That represents a 6.47% increase. If I look at what

happened in Jackson, Jackson went up just a bit at 2.54%, and then I can see that sales

volume for Tatra is pretty high overall. But it did have an increase, and that increase was

about 4.2%. It may have helped a little bit of what’s happening over here. But I still have

some things I’ll really need to think about what that Tatra. The other dashboard that we

said we’re interested in are the other decision-making tools we’re interested in using is

something that can help us understand what’s happening with the model itself. And so we

created a dashboard here that combines both brand and model. And what’s interesting

about this is so there’s three types of visualizations that we’ve used here. Again, looking at

profit margin only. And so I’ve looked at profit margin ratio by brand and model. And what

I’ve done here is created a chart that if I look at the darkest blue, that’s the highest profit

margin, median profit margin ratio, and the darkest red is going to be the lowest. So if I

look at the Apechete right away, I see that I’m all in the blue, right? So I’ve got all positive

profit margins. When I look at the Jackson, I can see I’ve got a couple of brands, a couple

of models that I’m a little bit concerned about because they are showing a negative profit

margin, right? So they’re in losses. And when I look at the Tatra, I can see that I’ve got

some really strong models here, but I’ve got two models that are really dragging down

what’s happening overall at the Tatra. If I want to see just one model and see how that

profit margin is impacted in what region those are in, for example. So let’s say I really

want to see what’s happening with this, Jespie. And so if I click on 2015 for Jespie. I can see

really quickly how that’s affected in my regions, right? So even though my profit margin is

negative in 2015, you know what, let’s just look at 2016 instead, even though my profit

margin is negative I’m still doing okay in South America with that one. But you know, I’m

barely, I’m barely breaking, you know, positive margin in Europe, and I’ve got a huge

negative profit margin ratio in North America. So if I wanted to see which models were

doing best, so that would be the advantage. The advantages at 36.3% profit margin,

clearly, it’s driving a lot of North America. It’s doing well in Europe, although not nearly as

well. And it’s doing well in South America as well, although again, definitely performing


Page | 106

best in North America. So what’s interesting about being able to do this is, again, you can

quickly filter, and see how everything changes. Right now, I’m looking at both years. I do

have a filter up here where I can change that and I can only look at ’16. So that helps to,

allow me to very quickly just look at what’s happening in one year. And so if I look at just

that one year, then I can see again, the Mortimer is actually doing worse than Jespie in

that year. And I can also quickly see, that the one that’s doing best is still the advantage.

So this is a nice way to bring together both brand and model performance, to be able to

understand better what’s happening within our regions, our brands, and our models, and

from year to year.

Now, the last dashboard that I’m going to illustrate for you is that overall performance

dashboard. So this dashboard, what we want to do is really be able to make this very

interactive for the user. So they’re able to look at overall performance and then drill that

down by model or by brand. So what we’ve brought into this visualization is net revenue

by region, profit margin by region, and then growth sales by country so we can get a

better understanding of what’s happening within those regions. So the regions are still

color-coded as they were in my example where we’ve got North America as this orange

color, Europe is blue, and the red is South America. So if we look at this visualization on

the bottom, this is what’s called a tree map and what this represents, the larger the area,

right, the higher the amount of sales. And the color represents the region. And then I have

added the detail here where you can see what that number is in millions. So we can see

just by looking at this that the U.S. comprises right North America I should say, comprises

the largest portion of all of our sales or half of it, looks about half of our sales. And within

North America, the U.S. is by far the largest. If we look in South America, we’re actually

fairly close between Columbia, Venezuela, and Brazil. Then followed by Chile, Argentina,

and Bolivia. And when we look in Europe, we can see that we’ve got the U.K. and France

are equal, Germany, followed by Spain and then Poland and Sweden, which are the same.

So this is all well and good. I can see how this is all happening by country. If I just wanted

to click on the U.S. and see what the U.S. net revenue is and what the U.S. profit margins

are, I just click in U.S. and then I can just look at what’s happening there. If I’m interested,

for example, in just how France is performing, I click on France. I can see what my gross

sales are. I can see what my net revenue is year to year, and I can see what my profit

margin is as well. If I wanted to use this down and go down to an even more granular level,

remember we were concerned about what was happening with the Tatra. So if I picked

just Tatra, now my visualizations all change and it just shows me the results for just that

brand. So it’s that brand, but all the models within that brand and I can see that that


Page | 107

brand does not do very well in Europe and in South America, although this revenue is high

and the sales are high in the U.S., we are running at a loss as far as profit margin. So we

really need to think about kind of what’s happening there as well.

So, then if I just click on the U.S. and that will again show me what’s happening just within

the U.S. for the Tatra. Click back on there and bring it all back. We know that the Tatra has

some really high-performing brands and some really poorly performing brands. So if I go

and look at, for example, we knew that the, I think it was the Jespie was not doing so well.

We can see that this Jespie is what’s driving a lot of the negative profit margin, right?

Because we can see we’ve got a loss year to year for that model and we can see also

what’s happening over here in North America, right, with the loss, although we can see

that it is performing better in Europe in ’15, but declining in ’16 and improving in South

America. So if I look down and wanted to see, you know, well, hey, where is it doing well in

South America, I go down here and I can see that really Argentina is where I have sales.

And again, this is all relative, this profit margin is relative to what’s happening in Argentina

in South America. If I just clicked on Argentina, I can see that Argentina has had an

increase in sales of the Tetra in net revenue I should say from year to year. And I can see

that it’s had a very large increase in profit margin as well. So being able to make this

dashboard interactive, and I’ll go ahead and change this back now to all brands and all

models. When I have everything in here, it gives me that overall performance evaluation.

And then if I want to really dig into what’s driving some of my more concerning issues, for

example, my declining profit margin, then I can dig down deeper by looking at what’s

happening with each of the brands. And that automatically changes all my visualizations,

making it a very powerful way to explain what’s happening with your operations.

So we’re going to go back and kind of summarize what we’ve done here. And what we’ve

done is gone through and created dashboards for each of the areas that we want to

evaluate performance and identify the KPIs using the 2015-16 data and using those

visualizations to prepare that interactive dashboard. We put together first region

performance where we summarized what was happening in our region by looking at net

revenue, increase or decrease in sales volume, and then mean profit margin, median

profit margin by region. We then delved a little bit deeper by looking at what was

happening with our brands. We have three brands in our business. We’ve got the

Apechete, the Jackson, and the Tatra. And we looked again at what was happening with

profit margin, revenue, and then sales volume. And for this one, instead of just looking at

the increase or decrease, we looked at volume as well as what the percentage change was

from year to year in each of the brands. We took that one step further and looked at


Page | 108

brand and model performance. And there we created a visualization where we could look

at profit margin by growth, brand, and model. And we could make this, and we made this

interactive so that we could look at it either by both years, which is what this static view is

by both years. Or we could have changed and looked at one year at a time. And we could

look and see very quickly which profit margins are somewhat concerning. So we’ve got the

Jespie, the Mortimer, and the Crux, and these were for both years. And then we can look

at it by region over here. And we saw that we could look at just ’16. We could look at just

one particular brand or model, and this was color-coded so that the more the lower the

profit margin, if it was negative, it was in the red. And if it was positive it was in the blue,

and the darker the color, the more extreme either way. And then we fit, we wrapped it up

by looking at overall performance where we made this interactive so that we could, if we

wanted to, look at it just overall. So this all models, all brands, all models for all areas. We

looked a little bit deeper to see within the regions how the countries are performing. And

then we showed how we could change all of the visualizations at the same time by simply

going through and picking different models and brand combinations so that we could see

how those were doing.

So the takeaways, again, we identified our issue, which was to create useful dashboards

for decision making. We identified the KPIs we were interested [in] for that. We use the

data from 2015 and ’16 to analyze that data and communicated our results in a variety of

interactive dashboards that would allow the decision maker to change those dashboards

to help identify whatever areas they were interested in examining further.

So thank you for watching Video 5, “Explanatory Data Visualization.”

Huskie Motors Conclusion

This concludes the Huskie Motors data analytics video series. Up next, you will have the

opportunity to perform some data analytics yourself, as you attempt the exercises based on

the Huskie Motors case study. You’ll now need to access the data files from the Resources

tab and save them to your computer. You’ll use these files to perform the analytics required

for the exercises in the following lesson.

Please note: To perform the exercises, you will be asked to load raw data into Microsoft Excel.

Based on the version of Excel you are using, please refer to the following links for guidance


Page | 109

on loading raw data:

For Office 365: https://support.office.com/en-us/article/data-import-and-analysis-options-

3ea52160-08bc-45ac-acd9-bc4a11bcc2a2

For Excel 2010 – 2016: https://support.office.com/en-us/article/text-import-wizard-c5b02af6-

fda1-4440-899f-f78bafe41857#ID0EAAEAAA=Office_2010_-_Office_2016

Module 3 Wrap‐up

You have now completed Module 3, Applying Data Analytics and Visualization. In this module, you

were able to

Explore the data analytics and visualizations described in the Huskie Motors case study, and

Apply this knowledge to complete the exercises.

You will now advance to the final module of this course, Module 4 Conclusion and Final

Assessment

Module 4: Conclusion and Final Assessment

Course Roadmap

Click Next to proceed to Module 4, Conclusion and Final Assessment.

Summary

Welcome to Module 4, Conclusion and Final Assessment! In this module, we will conclude the Data

Analytics & Visualization Fundamentals Certificate course by reviewing the overall course learning

objectives, and then moving to the Final Assessment.

Course Objectives Review

Let’s revisit the learning objectives for this course. In the preceding modules, you learned to:

Recognize the impact of technology and analytics on the accounting profession.

Demonstrate how data analytics can influence organizational strategy.

Identify ways data visualizations effectively enable appropriate business decisions.

https://support.office.com/en-us/article/data-import-and-analysis-options-3ea52160-08bc-45ac-acd9-bc4a11bcc2a2

https://support.office.com/en-us/article/data-import-and-analysis-options-3ea52160-08bc-45ac-acd9-bc4a11bcc2a2

https://support.office.com/en-us/article/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857#ID0EAAEAAA=Office_2010_-_Office_2016

https://support.office.com/en-us/article/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857#ID0EAAEAAA=Office_2010_-_Office_2016


Page | 110

If you are unsure of your understanding of any of these objectives, please review the modules

before attempting the final assessment.

We Are Here to Support You

IMA is here to support you as advancements in technology and analytics transform the finance

and accounting profession. As mentioned by Jeff Thomson, IMA President and CEO, in Module 1,

IMA is committed to preparing you today for changes that are happening now as well as those

that are coming in the near future.

We offer many online courses, podcasts, webinars, and live events with relevant topics in the

realm of technology and analytics, and you can expect to see more added. This compilation of

resources is curated by IMA to help you navigate through this dynamic time and prepare you for

success.

Click below to navigate to IMA’s Technology & Analytics Center to access the resources and

recommendations to further educate yourself on these topics.

Final Assessment Introduction

Next, you will progress to the Final Assessment. Remember to click the “Submit” button, located

at the top-right of your screen, after choosing your answer. If you pass the Final Assessment with

a score of 70% or higher, you will earn your Data Analytics and Visualization Fundamentals

Certificate, CPE, as well as a digital badge that you can upload to your professional profile on

social media.

Good luck!

Course Wrap‐Up

To access your professional certificate, digital badge, and NASBE CPE certificate, go to the “My

Transcript” tab in IMA’s Learning Management System. To download the certificates, click on

“Certificates.” To download your Digital Badge, click on “Digital Badge.”

We need your feedback! Please click the link displayed to complete a brief survey on your learning

experience. If you haven’t already taken the survey, please click the link below.

Thank You

Thank you for taking the IMA Data Analytics & Visualization Fundamentals CertificateTM course.

IMA Data Analytics & Visualization Fundamentals CertificateTM – Course

Script

Page | 111

Please contact us if you have questions about this course or if you are interested in other IMA

courses.

Documents

IMA Data Analytics & Visualization Fundamentals