MBA Super Notes © M S Ahluwalia Sirf Business Version 1.0 Data

MBA Super Notes: Statistics: Data

MBA SUPER NOTES

Statistics

Disclaimer

Copyright © 2014 by M S Ahluwalia.

Trademarks: Super Notes, Sirf Business and the MSA logo are trademarks of M S Ahluwalia in India and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. M S Ahluwalia is not associated with any product or vendor mentioned in this book.

Limit of liability/disclaimer of warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. This book should not be used as a replacement of expert opinion. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that internet websites listed in this work may have changed or disappeared between when this work was written and when it is read.

This document contains notes on the said subject made by the author during the course of studies or general reading. The author hopes you will find these ‘super-notes’ useful in the course of your learning. In case you notice any errors or have any suggestions for the improvement of this document, please send an email to [email protected].

For general information on our other publications or for any kind of support or further information, you may reach us at http://SirfBusiness.blogspot.com.

Data

Types of data

• Primary and Secondary

• Qualitative and Quantitative

• Discrete and Continuous

• Nominal, Ordinal, Interval and Ratio

• Cross-sectional, Temporal and Spatial

Variables

• When we measure an attribute of an object, we obtain a value that varies between objects.

• Example: consider the people in a class as objects and their height as the attribute. Height varies between objects, hence such attributes are collectively known as variables.

Data

• The collective name for the values of a variable under study

• The basic source for descriptive and inferential statistics

The types of data are explained further in the following pages.

Primary and secondary data

Primary data

• Collected for some specific purpose or study

• Collected using methods such as personal investigation or questionnaires

Secondary Data

• Has its roots in primary data

• Data disseminated through media like reports and agencies

The same data may be primary or secondary depending on the frame of reference. Example: for the person who collects data through surveys, it is primary data. Once a report containing the data is published, it is secondary data for anyone who takes it from the report.

Collection of primary data

• Personal investigation

• Questionnaire

• Step 1: Design questionnaire

• Step 2: Collect data by conducting survey

Collection of secondary data

• Publications by governments, regulatory bodies

• Publications by industry associations

• Research on the internet

Qualitative and quantitative data

Qualitative/Categorical data

• Characterized in terms of names or labels

• Nominal and ordinal level data are collectively known as qualitative data

• Ex: variables with yes/no or male/female as responses

Quantitative/Numerical data

• Characterized in terms of numeric values

• Interval and ratio level data are collectively known as quantitative data

• Ex: variables such as height, income, marks

Types of data

• Qualitative/categorical (discrete): Nominal, Ordinal

• Quantitative/numerical (discrete or continuous): Interval, Ratio

Discrete and continuous data

Discrete

• Arises when counting is involved

• Assumes only specified values in a given range

• Ex: Number of fruits in a box

Continuous

• Arises when measurement is involved

• Can assume any value within a given range

• Ex: Height

Nominal, ordinal, interval and ratio data

Nominal data

• Data are measured at the nominal level when each case is classified into one of a number of discrete categories

• Ex: Colors, Roll numbers, Subject codes

Ordinal data

• Data are measured on an ordinal scale if the categories imply order

• Data that is nominal and has order

• The difference between ranks is consistent in direction, but not magnitude

• Ex: Military rank, Hotel ratings

Interval data

• Quantitative data that can be measured on a numerical scale

• The zero point is arbitrary and does not mean the absence of what is being measured

• Ex: Temperature in °C or °F

Ratio data

• Quantitative data that can be measured on a numerical scale

• Zero point means the absence of what is being measured

• This is the most common scale of measurement

• Ex: Height, sales
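The difference between interval and ratio data can be checked with simple arithmetic. The sketch below is illustrative only: the temperatures and heights are made-up values, and `celsius_to_kelvin` is a helper introduced here, not something from the notes. It shows that a ratio of Celsius readings is not meaningful, while a ratio of Kelvin readings or heights is.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert an interval-scale reading (Celsius) to a ratio-scale one (Kelvin)."""
    return celsius + 273.15


# Hypothetical readings, chosen only for illustration.
t_cold_c, t_warm_c = 10.0, 20.0

# Interval scale: differences are meaningful, ratios are not.
difference_c = t_warm_c - t_cold_c   # 10 degrees warmer: a valid statement
naive_ratio = t_warm_c / t_cold_c    # 2.0, but 20 °C is not "twice as hot" as 10 °C

# Ratio scale: Kelvin has a true zero, so this ratio is meaningful.
kelvin_ratio = celsius_to_kelvin(t_warm_c) / celsius_to_kelvin(t_cold_c)

# Height is ratio data: 180 cm really is 1.5 times 120 cm.
height_ratio = 180.0 / 120.0

print(f"Celsius ratio: {naive_ratio:.2f}, Kelvin ratio: {kelvin_ratio:.3f}")
print(f"Height ratio: {height_ratio:.2f}")
```

The Celsius ratio changes if the same temperatures are expressed in Fahrenheit, which is why it carries no physical meaning. The height ratio stays 1.5 in any unit.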

Cross-sectional, temporal and spatial

Cross-sectional data

• Values of a variable recorded at the same point or period of time for many subjects

• Ex: age of all students in a particular year, stock prices of a set of companies in a given year

Temporal/Time-series data

• Data about a subject over a period of time

• Ex: Inflation of a country in last 5 years, students taking admission in last 3 years

Spatial data

• Data organized and viewed by geography

• Ex: population of state capitals, revenues of a company across geographies
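The three views above can be illustrated by slicing one small dataset in different ways. This is a minimal sketch: the company names, regions and revenue figures are all hypothetical, and `total_by` is a helper written for this example.

```python
from collections import defaultdict

# Hypothetical records: (company, region, year, revenue). Values are made up.
records = [
    ("Acme", "East", 2013, 4.0), ("Acme", "West", 2013, 6.0),
    ("Acme", "East", 2014, 5.0), ("Acme", "West", 2014, 7.0),
    ("Bolt", "East", 2013, 3.0), ("Bolt", "West", 2013, 2.0),
    ("Bolt", "East", 2014, 3.5), ("Bolt", "West", 2014, 2.5),
]


def total_by(rows, key_index):
    """Sum revenue (column 3) grouped by the column at key_index."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)


# Cross-sectional: many subjects (companies) at one point in time (2014).
cross_sectional = total_by([r for r in records if r[2] == 2014], 0)

# Time series: one subject (Acme) followed over several years.
time_series = total_by([r for r in records if r[0] == "Acme"], 2)

# Spatial: one subject in one year, viewed by geography.
spatial = total_by([r for r in records if r[0] == "Acme" and r[2] == 2014], 1)

print(cross_sectional)
print(time_series)
print(spatial)
```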

Graphical descriptive statistics

Graphs in statistics

• Graphs and, to a lesser extent, tables give a visual summary of a variable

• Ideally there is an indication of the central (or “average”) value of the variable as well as an indication of the amount and pattern of variability (“spread”)

• The level of data restricts the type of graphs and/or tables that can be used

Classification and tabulation

Frequency distributions

Classification

• Bringing together items that are similar in some respect

Tabulation

• Condensing the data into tabular form to make it easier to comprehend

Frequency distribution

• Tabular representation of the number of times an item occurs by grouping them into numerically ordered categories

• Data recorded at either Ordinal, Interval or Ratio levels are summarized by frequency distributions

• Some information is lost in the process of generating the distribution

• Number of intervals used is decided by the user

• Class width is determined after the number of intervals is decided upon (preferably widths should be same)

• Class intervals should not overlap

Relative frequency distribution

• Tabular representation of the proportion of times an item occurs
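The rules above (a user-chosen number of intervals, equal class widths, no overlap) can be sketched in a few lines. The values, the lower limit of 400, the width of 100 and the function name `frequency_distribution` are assumptions made for illustration. Each class includes its lower limit and excludes its upper limit, which is one common way to avoid overlap.

```python
def frequency_distribution(values, lower, width, n_classes):
    """Count values in n_classes equal-width classes [lo, hi) starting at lower."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{v} falls outside the chosen classes")
        counts[index] += 1
    classes = [(lower + i * width, lower + (i + 1) * width) for i in range(n_classes)]
    return classes, counts


# Illustrative values (hypothetical data).
values = [470, 480, 510, 510, 550, 550, 590, 620, 630, 650,
          670, 680, 690, 700, 710, 740, 750, 850, 870, 890]

classes, counts = frequency_distribution(values, lower=400, width=100, n_classes=5)

# Relative frequency: the proportion of observations in each class.
relative = [c / len(values) for c in counts]

for (lo, hi), count, share in zip(classes, counts, relative):
    print(f"{lo}-{hi}: frequency {count:2d}, relative frequency {share:.2f}")
```

Note that the table no longer distinguishes 510 from 590: this is the loss of information mentioned above.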

Stem and leaf diagram

• A stem and leaf plot is a useful way of ordering data so that its characteristics can be studied. It simultaneously organizes the data for further analysis and presents it in both table and chart form

• It is essentially an ordered array, a frequency distribution and a histogram all in one

• The display retains all the information in the data, and also makes it easy to see critical aspects such as where the highest and lowest values are, where most of the values lie and where the middle of the values is

• An occasional drawback is the difficulty of deciding on stem values

• Ordering the leaves is not necessary, but it is preferable because it makes the representation more insightful

Sample stem and leaf diagram

Stem unit: 100

Stem | Leaves      | Corresponding values (rounded off to nearest 10)
4    | 7 8         | 470, 480
5    | 1 1 5 5 9   | 510, 510, 550, 550, 590
6    | 2 3 5 7 8 9 | 620, 630, 650, 670, 680, 690
7    | 0 1 4 5     | 700, 710, 740, 750
8    | 5 7 9       | 850, 870, 890
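A stem and leaf display of this kind can be built programmatically. The following is a minimal sketch, assuming a stem unit of 100 and a leaf unit of 10; the raw values are hypothetical and `stem_and_leaf` is a name chosen for this example.

```python
from collections import defaultdict


def stem_and_leaf(values, stem_unit=100, leaf_unit=10):
    """Return {stem: sorted leaves}; each value is first rounded to leaf_unit.

    Note: Python's round() sends exact halves to the nearest even number.
    """
    plot = defaultdict(list)
    for v in values:
        rounded = int(round(v / leaf_unit)) * leaf_unit
        stem, remainder = divmod(rounded, stem_unit)
        plot[stem].append(remainder // leaf_unit)
    return {stem: sorted(leaves) for stem, leaves in sorted(plot.items())}


# Hypothetical raw observations before rounding.
raw_values = [468, 482, 512, 507, 549, 553, 591]

plot = stem_and_leaf(raw_values)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Sorting the leaves inside the function gives the ordered form that the notes recommend.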

Visual representation of frequency distributions (1/2)

Histograms

• A graphical representation of the frequency distribution

Frequency polygons

• The class frequencies are plotted above the midpoint of each class interval and connected by straight lines

• An alternative to the histogram best suited for comparing two or more frequency distributions

• Unlike an ogive of relative frequencies, a frequency polygon is not restricted to values between 0 and 1

Ogive

• A graph of a cumulative relative frequency distribution

• It has value = 0 at the lower limit of the first class and value = 1 at the upper limit of the last class

• Used to estimate the approximate proportion of observations above or below a given value

• Ex: proportion of calls shorter than 60 seconds
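The ogive calculation can be sketched as follows. The class limits and counts are hypothetical (the open-ended classes are assumed to run from 0 to 20 and from 60 to 80 seconds), and the function names are chosen for this example. Reading a value off the ogive is done here by linear interpolation between class limits.

```python
from itertools import accumulate


def ogive_points(limits, counts):
    """Return (class limit, cumulative relative frequency) pairs for an ogive."""
    total = sum(counts)
    cumulative = [0.0] + [c / total for c in accumulate(counts)]
    return list(zip(limits, cumulative))


def proportion_below(points, x):
    """Estimate the proportion of observations below x by linear interpolation."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the class limits")


# Hypothetical call durations: class limits in seconds, counts in thousands.
limits = [0, 20, 40, 60, 80]
counts = [2, 4, 3, 1]

points = ogive_points(limits, counts)
share_under_60 = proportion_below(points, 60)
share_under_30 = proportion_below(points, 30)

print(f"Calls shorter than 60 s: {share_under_60:.0%}")
print(f"Calls shorter than 30 s: {share_under_30:.0%}")
```

The estimate for 30 seconds assumes calls are spread evenly within the 20 to 40 second class, which is the usual assumption when reading an ogive.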

[Figures: three charts titled "Call duration". x-axis: duration in seconds (<20, 20-40, 40-60, >60). y-axis: number of calls in thousands (first two charts) and number of calls as a ratio of total (third chart).]

Visual representation of frequency distributions (2/2)

Cross tabulation

• It summarizes the data for two variables simultaneously
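A cross tabulation can be built by counting pairs of values. The sketch below uses hypothetical survey responses, and `cross_tabulate` is a name chosen for this example.

```python
from collections import Counter

# Hypothetical survey responses: (gender, owns_car). Made up for illustration.
responses = [
    ("male", "yes"), ("female", "no"), ("female", "yes"),
    ("male", "yes"), ("male", "no"), ("female", "yes"),
]


def cross_tabulate(pairs):
    """Return a nested dict: table[row_value][column_value] -> count."""
    counts = Counter(pairs)
    rows = sorted({row for row, _ in pairs})
    cols = sorted({col for _, col in pairs})
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}


table = cross_tabulate(responses)
for row, cells in table.items():
    print(row, cells)
```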

Line graph

• Visual representation of a set of data joined by straight lines

Lorenz curve

• It represents the extent of inequality in a data-set

• The area between the line of perfect equality and the actual curve indicates the extent of departure from equality
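The area described above can be computed directly. This is a sketch under stated assumptions: the holdings are hypothetical, the function names are chosen for this example, and the area under the curve is approximated with trapezoids. Twice the resulting area is commonly known as the Gini coefficient.

```python
def lorenz_points(values):
    """Return (cumulative share of holders, cumulative share of total) pairs."""
    ordered = sorted(values)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points


def area_from_equality(points):
    """Area between the line of perfect equality and the Lorenz curve."""
    area_under_curve = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return 0.5 - area_under_curve  # the area under the equality line is 0.5


# Hypothetical holdings for four people.
equal = area_from_equality(lorenz_points([5, 5, 5, 5]))
unequal = area_from_equality(lorenz_points([0, 0, 0, 10]))

print(f"Equal holdings: {equal:.3f}, one person holds everything: {unequal:.3f}")
```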

Scatter diagram

• Graphical representation of relationship between two quantitative variables

[Figures: a line graph titled "Sales" with Q1 to Q4 on the x-axis, and a scatter diagram.]

Graphical representation of qualitative data

Pie charts

• Graphical representation of the proportion of times an item occurs

• A circle divided into sectors, each representing the relative magnitude of a component

• Effective when the aim is to display the relative size of the categories

Bar charts

• Graphical representation of the number of times an item occurs

• The frequency for each category is represented by a bar

• Can show raw or relative frequencies

• Bars are separated by spaces

[Figures: a pie chart titled "Sales" with sectors for the 1st to 4th quarters (sector labels 59%, 23%, 10% and 8%), and a bar chart titled "Sales" with Q1 to Q4 on the x-axis.]

Types of Bar charts

Component Bar chart

• Also known as a percentage bar chart

• A rectangular equivalent of a pie chart (when there is only one x-axis value)

• Used for comparison of percentage composition

Subdivided bar chart

• Each bar is further divided into components

Multiple bar chart

• Multiple bars, one per series, for each point on the x-axis

Pareto chart

• Bars are arranged in order of height (frequency) from largest to smallest

• Facilitates identification of the most frequent occurrences or causes of an event or phenomenon
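The ordering behind a Pareto chart can be sketched as a small table. The complaint categories and counts are hypothetical, and `pareto_table` is a name chosen for this example. The cumulative share column is a common companion to the bars, although the notes above describe only the ordering.

```python
from collections import Counter


def pareto_table(observations):
    """Return (category, count, cumulative share) rows, largest count first."""
    counts = Counter(observations).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for category, count in counts:
        running += count
        rows.append((category, count, running / total))
    return rows


# Hypothetical customer complaints, made up for illustration.
complaints = (["late delivery"] * 12 + ["damaged item"] * 5
              + ["wrong item"] * 2 + ["billing"] * 1)

rows = pareto_table(complaints)
for category, count, cumulative in rows:
    print(f"{category:15s} {count:3d}  {cumulative:.0%}")
```

In this made-up data the top two causes account for 85% of complaints, which is the kind of insight the chart is meant to surface.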

[Figures: four bar charts titled "Sales". Three show the series East, South and West for Q1 to Q4, one of them on a percentage axis from 0% to 100%. The fourth shows a single series with bars ordered Q4, Q1, Q3, Q2.]

Do you have any questions or some feedback to share?

Send an email to

[email protected]

Thank You!
