Mb0040 Statistics for Management Final


  • 8/3/2019 Mb0040 Statistics for Management Final

    1/16

    MBA SEMESTER 1

    MB0040 STATISTICS FOR MANAGEMENT

    Assignment Set- 1

    Q 1. (a) Statistics is the backbone of decision-making. Comment.

    Ans:- Advanced communication networks, rapid changes in consumer behaviour, the varied expectations of different consumers and new market openings leave modern managers with the difficult task of making quick and appropriate decisions. They therefore need to depend more upon quantitative techniques such as mathematical models, statistics, operations research and econometrics.

    A general manager who studies data in order to solve a problem and to increase profits is using Statistics.

    Decision making is a key part of our day-to-day life. Even when we wish to purchase a

    television, we like to know the price, quality, durability, and maintainability of various

    brands and models before buying one. As you can see, in this scenario we are collecting

    data and making an optimum decision. In other words, we are using Statistics.

    Again, a company that wishes to introduce a new product has to collect data on market potential, consumer preferences, availability of raw materials and the feasibility of producing the product. Hence, data collection is the backbone of any decision-making process.

    Many organisations find themselves data-rich but poor at drawing information from the data. Therefore, it is important to develop the ability to extract meaningful information from raw data in order to make better decisions. Statistics plays an important role in this respect.

    Statistics is broadly divided into two main categories. Figure 1.1 illustrates the two

    categories. The two categories of Statistics are descriptive statistics and inferential

    statistics.

    S.Konar (Roll No.521135105) 1


    Divisions in Statistics

    Descriptive Statistics: Descriptive statistics is used to present the general description

    of data which is summarised quantitatively. This is mostly useful in clinical research,

    when communicating the results of experiments.

    Inferential Statistics: Inferential statistics is used to make valid inferences from the

    data which are helpful in effective decision making for managers or professionals.

    Statistical methods such as estimation, prediction and hypothesis testing belong to inferential statistics. Researchers draw conclusions from the collected sample data about the characteristics of the larger population from which the samples are taken. In this sense, Statistics is the backbone of decision-making.


    Q.1. (b) Give plural meaning of the word Statistics?

    Ans:- Plural of the word Statistic:

    The word statistics is used as the plural of the word statistic, which refers to a numerical quantity such as the mean, median or variance calculated from sample values. For example, if we select 15 students from a class of 80 students, measure their heights and find the average height, this average is a statistic.

    In the plural sense, the word statistics refers to numerical facts and figures collected in a systematic manner with a definite purpose in any field of study. In this sense, statistics are aggregates of facts expressed in numerical form, for example statistics on industrial production or statistics on the population growth of a country in different years.

    Q 2. a. In a bivariate data on x and y, variance of x = 49, variance of y = 9

    and covariance (x,y) = -17.5. Find coefficient of correlation between x and

    y.

    Ans:- We know that the coefficient of correlation is

    $$r = \frac{\mathrm{Cov}(x, y)}{\sigma_x \, \sigma_y}$$

    Given Var(x) = 49, Var(y) = 9 and Cov(x, y) = -17.5, we have \sigma_x = \sqrt{49} = 7 and \sigma_y = \sqrt{9} = 3, so

    $$r = \frac{-17.5}{7 \times 3} = -0.833$$

    Hence, there is a high negative correlation between x and y.
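    The figures given in Q 2(a) can be checked with a few lines of Python. This is only a minimal sketch: the function name correlation_from_moments is an illustrative choice, not something from the course material, and it assumes that the variances and the covariance were computed with the same divisor.

```python
import math


def correlation_from_moments(var_x: float, var_y: float, cov_xy: float) -> float:
    """Coefficient of correlation from the two variances and the covariance."""
    if var_x <= 0 or var_y <= 0:
        raise ValueError("variances must be positive")
    # r = Cov(x, y) / (sigma_x * sigma_y)
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))


# Values from the question: Var(x) = 49, Var(y) = 9, Cov(x, y) = -17.5
r = correlation_from_moments(var_x=49, var_y=9, cov_xy=-17.5)
print(round(r, 4))  # -0.8333
```

    A value of r close to -1 indicates that y tends to fall as x rises.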

    Q 2. b. Enumerate the factors which should be kept in mind for proper

    planning.

    Ans:- Planning a Statistical Survey

    The relevance and accuracy of the data obtained in a survey depend upon the care exercised in planning. A properly planned investigation can lead to the best results with the least cost and time. The steps involved in the planning stage are as follows.

    Step-1: The nature of the problem to be investigated should be clearly defined in an unambiguous manner.

    Step-2: The objective of the investigation should be stated at the outset. The objective could be to:

    Obtain certain estimates.

    Establish a theory.

    Verify an existing statement.

    Find a relationship between characteristics.

    Step-3: The scope of the investigation has to be made clear. The scope of the investigation refers to the area to be covered, the identification of the units to be studied, the nature of the characteristics to be observed, the accuracy of measurement, the analytical methods, and the time, cost and other resources required.

    Step-4: Whether to use data collected from primary or secondary sources should be determined in advance.

    Step-5: The organisation of the investigation is the final step in the process. It encompasses determining the number of investigators required, their training, the supervision needed and the funds required.

    Q 3. The percentage sugar content of Tobacco in two samples was represented

    in table 11.11. Test whether their population variances are same.

    Table 1. Percentage sugar content of Tobacco in two samples

    Sample A 2.4 2.7 2.6 2.1 2.5

    Sample B 2.7 3 2.8 3.1 2.2 3.6


    Ans:-

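    Required values for Sample A (deviations taken from the assumed mean 2.5)

    X        d = X - 2.5    d^2
    2.4      -0.1           0.01
    2.7       0.2           0.04
    2.6       0.1           0.01
    2.1      -0.4           0.16
    2.5       0             0
    Total    -0.2           0.22

    Required values for Sample B (deviations taken from the assumed mean 3)

    X        d = X - 3      d^2
    2.7      -0.3           0.09
    3         0             0
    2.8      -0.2           0.04
    3.1       0.1           0.01
    2.2      -0.8           0.64
    3.6       0.6           0.36
    Total    -0.6           1.14

    $$S_1^2 = \frac{1}{n_1 - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n_1}\right] = \frac{1}{4}\left[0.22 - \frac{0.04}{5}\right] = 0.053$$

    $$S_2^2 = \frac{1}{n_2 - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n_2}\right] = \frac{1}{5}\left[1.14 - \frac{0.36}{6}\right] = 0.216$$

    $$F = \frac{S_2^2}{S_1^2} = \frac{0.216}{0.053} = 4.08$$

    with (5, 4) degrees of freedom. This is less than the tabulated 5% value of F for (5, 4) degrees of freedom, which is 6.26, so the difference between the two variances is not significant.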
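    The variance-ratio test in Q 3 can also be checked directly from the two samples. The sketch below uses only the Python standard library. The constant F_CRITICAL is an assumption read from a standard F table (upper 5% point for 5 and 4 degrees of freedom) and would have to be looked up again for other sample sizes.

```python
from statistics import variance

sample_a = [2.4, 2.7, 2.6, 2.1, 2.5]
sample_b = [2.7, 3.0, 2.8, 3.1, 2.2, 3.6]

# statistics.variance uses the n - 1 divisor (sample variance).
var_a = variance(sample_a)
var_b = variance(sample_b)

# The larger variance goes in the numerator so that F >= 1.
(var_big, n_big), (var_small, n_small) = sorted(
    [(var_a, len(sample_a)), (var_b, len(sample_b))], reverse=True
)
f_ratio = var_big / var_small
df = (n_big - 1, n_small - 1)

# Assumption: tabulated upper 5% point of F for (5, 4) degrees of freedom.
F_CRITICAL = 6.26
significant = f_ratio > F_CRITICAL

print(f"var_a={var_a:.3f} var_b={var_b:.3f} F={f_ratio:.2f} df={df}")
print("significant" if significant else "not significant")
```

    Because the variances are computed from the raw data rather than from deviations about an assumed mean, this also serves as a check on the hand arithmetic.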

    Q 4. a. Explain the characteristics of business forecasting.

    Ans:-Characteristics of business forecasting

    Based on past and present conditions

    Business forecasting is based on the past and present economic condition of the business. To forecast the future, various data, information and facts concerning the economic condition of the business in the past and present are analysed.

    Based on mathematical and statistical methods

    The process of forecasting includes the use of statistical and mathematical methods. By using these methods, the actual trend which may take place in the future can be forecasted.

    Period

    The forecasting can be made for the long term, short term, medium term or any specific period.


    Estimation of future

    Business forecasting estimates the future with regard to probable economic conditions.

    Scope

    The forecasting can be physical as well as financial.

    Q 4. b. Differentiate between prediction, projection and forecasting.

    Ans:-Prediction, projection and forecasting

    A great amount of confusion seems to have grown up in the use of the words forecast, prediction and projection.

    Forecasts are made by estimating future values of the external factors by means of prediction, projection or forecast, and from these values calculating the estimate of the dependent variable.

    Q 5. What are the components of time series? Bring out the significance of

    moving average in analysing a time series and point out its limitations.

    Ans:-Components of Time Series

    The behaviour of a time series over periods of time is called the movement of the time series. The time series is classified into the following four components:


    i) Long term trend or secular trend

    ii) Seasonal variations

    iii) Cyclic variations

    iv) Random variations

    Method of moving averages

    The moving averages method is used for smoothing a time series. That is, it smooths out the fluctuations in the data by replacing each value with an average of neighbouring values.

    When period of moving average is odd

    To determine the trend by this method, follow the procedure in the figure below.

    [Figure: Procedure for determining the trend when the period of the moving average is odd]

    By plotting these trend values (if desired) you can obtain the trend curve with the help

    of which you can determine the trend whether it is increasing or decreasing. If needed,


    you can also compute short-term fluctuations by subtracting the trend values from the

    actual values.

    When period of moving averages is even

    When the period of the moving average is even (such as 4 years), we compute the moving averages by using the steps in the figure below.

    [Figure: Procedure for determining the trend when the period of the moving average is even]
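    Both cases can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the procedure from the study material: the function name centred_moving_average and the yearly figures are hypothetical. For an even period it centres the averages by taking the mean of each pair of successive moving averages.

```python
def centred_moving_average(values, period):
    """Return trend values aligned with `values`; None where no trend exists."""
    n = len(values)
    if period < 1 or period > n:
        raise ValueError("period must be between 1 and len(values)")
    # Simple moving average of every window of `period` consecutive values.
    simple = [sum(values[i:i + period]) / period for i in range(n - period + 1)]
    trend = [None] * n
    half = period // 2
    if period % 2 == 1:
        # Odd period: each average sits against the middle value of its window.
        for i, avg in enumerate(simple):
            trend[i + half] = avg
    else:
        # Even period: centre by averaging successive pairs of moving averages.
        for i in range(len(simple) - 1):
            trend[i + half] = (simple[i] + simple[i + 1]) / 2
    return trend


# Hypothetical yearly figures, used only for illustration.
sales = [2, 4, 6, 5, 7, 9, 8]
trend_3 = centred_moving_average(sales, 3)
trend_4 = centred_moving_average(sales, 4)

# Short-term fluctuations: actual value minus trend value, where a trend exists.
fluctuations_3 = [
    None if t is None else actual - t for actual, t in zip(sales, trend_3)
]

print(trend_3)         # [None, 4.0, 5.0, 6.0, 7.0, 8.0, None]
print(trend_4)         # [None, None, 4.875, 6.125, 7.0, None, None]
print(fluctuations_3)  # [None, 0.0, 1.0, -1.0, 0.0, 1.0, None]
```

    The None entries at both ends show that a moving average of period p leaves p // 2 values at the start and p // 2 at the end without a trend value.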

    Merits and demerits of moving averages method

    Merits

    i) This is a simple method.

    ii) This method is objective in the sense that anybody working on a problem with this method will get the same results.

    iii) This method is used for determining seasonal, cyclic and irregular variations besides the trend values.

    iv) This method is flexible enough to add more figures to the data because the entire calculations are not changed.

    v) If the period of moving averages coincides with the period of cyclic fluctuations in the data, such fluctuations are automatically eliminated.

    Demerits

    i) There is no functional relationship between the values and the time. Thus, this method is not helpful in forecasting and predicting the values on the basis of time.

    ii) There are no trend values for some years in the beginning and some in the end. For example, for a 5 yearly moving average, there will be no trend values for the first two years and the last two years.

    iii) In case of nonlinear trend, the values obtained by this method are biased in one or the other direction.

    iv) The period selection of moving average is a difficult task. Hence, great care has to be taken in period selection, particularly when there is no business cycle during that time.

    Q 6. List down various measures of central tendency and explain the

    difference between them?

    Ans:- Measures of Central Tendency

    Several different measures of central tendency are defined below.

    1 Arithmetic Mean


    The arithmetic mean is the most common measure of central tendency. It is simply the sum of the numbers divided by the number of numbers. The symbol μ is used for the mean of a population. The symbol M is used for the mean of a sample. The formula for μ is shown below:

    $$\mu = \frac{\sum X}{N}$$

    where \sum X is the sum of all the numbers and N is the number of numbers. The formula for M is the same, applied to the numbers in the sample. As an example, the mean of the numbers 1, 2, 3, 6, 8 is 20/5 = 4 regardless of whether the numbers constitute the entire population or just a sample from the population.

    The table, Number of touchdown passes (Table 1: Number of touchdown passes),

    shows the number of touchdown (TD) passes thrown by each of the 31 teams in the

    National Football League in the 2000 season.

    The mean number of touchdown passes thrown is 20.4516 as shown below.

    Number of touchdown passes

    Although the arithmetic mean is not the only "mean" (there is also a geometric

    mean), it is by far the most commonly used. Therefore, if the term "mean" is used


    without specifying whether it is the arithmetic mean, the geometric mean, or some

    other mean, it is assumed to refer to the arithmetic mean.

    2 Median

    The median is also a frequently used measure of central tendency. The median is the

    midpoint of a distribution: the same number of scores are above the median as below

    it. For the data in the table, Number of touchdown passes (Table 1: Number of

    touchdown passes), there are 31 scores. The 16th highest score (which equals 20) is

    the median because there are 15 scores below the 16th score and 15 scores above

    the 16th score. The median can also be thought of as the 50th percentile. Let's return to the made-up example of the quiz on which you scored a three, discussed previously in the module Introduction to Central Tendency and shown in Table 2: Three possible datasets for the 5-point make-up quiz.

    Three possible datasets for the 5-point make-up quiz

    For Dataset 1, the median is three, the same as your score. For Dataset 2, the

    median is 4. Therefore, your score is below the median. This means you are in the lower half of the class. Finally for Dataset 3, the median is 2. For this dataset, your

    score is above the median and therefore in the upper half of the distribution.


    Computation of the Median: When there is an odd number of numbers, the median is

    simply the middle number. For example, the median of 2, 4, and 7 is 4. When there is

    an even number of numbers, the median is the mean of the two middle numbers.

    Thus, the median of the numbers 2, 4, 7, 12 is (4 + 7)/2 = 5.5.

    3 Mode

    The mode is the most frequently occurring value. For the data in the table, Number of

    touchdown passes (Table 1: Number of touchdown passes), the mode is 18 since

    more teams (4) had 18 touchdown passes than any other number of touchdown

    passes. With continuous data such as response time measured to many decimals, the

    frequency of each value is one since no two scores will be exactly the same (see discussion of continuous variables). Therefore the mode of continuous data is

    normally computed from a grouped frequency distribution. The Grouped frequency

    distribution (Table 3: Grouped frequency distribution) table shows a grouped

    frequency distribution for the target response time data. Since the interval with the

    highest frequency is 600-700, the mode is the middle of that interval (650).

    Grouped frequency distribution
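    The three measures can be compared with Python's standard statistics module. In the sketch below the mean and median examples reuse the numbers quoted in the text. The lists td_passes and grouped are hypothetical stand-ins, since the original tables are not reproduced here. They are chosen only so that the modes agree with the values 18 and 650 quoted above.

```python
from statistics import mean, median, mode

# Numbers quoted in the text.
mean_example = mean([1, 2, 3, 6, 8])     # 20 / 5
median_odd = median([2, 4, 7])           # the middle number
median_even = median([2, 4, 7, 12])      # mean of the two middle numbers

# Hypothetical counts, used only to illustrate the mode of discrete data.
td_passes = [18, 21, 18, 14, 18, 21, 33]
mode_example = mode(td_passes)

# Hypothetical grouped frequency distribution: (lower, upper) -> frequency.
grouped = {(500, 600): 3, (600, 700): 6, (700, 800): 5, (800, 900): 2}
# The mode of grouped data is taken as the midpoint of the modal interval.
modal_low, modal_high = max(grouped, key=grouped.get)
grouped_mode = (modal_low + modal_high) / 2

print(mean_example, median_odd, median_even, mode_example, grouped_mode)
```

    The mean uses every value and is pulled by extreme values, the median depends only on the ordering, and the mode depends only on frequencies, which is why the three can differ for the same data.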

    Proportions and Percentages

    When the focus is on the degree to which a population possesses a particular

    attribute, the measure of interest is a percentage or a proportion.


    A proportion refers to the fraction of the total that possesses a certain

    attribute. For example, we might ask what proportion of women in our sample

    weigh less than 135 pounds. Since 3 women weigh less than 135 pounds, the

    proportion would be 3/5 or 0.60.

    A percentage is another way of expressing a proportion. A percentage is equal

    to the proportion times 100. In our example of the five women, the percent of

    the total who weigh less than 135 pounds would be 100 * (3/5) or 60 percent.

    Notation

    Of the various measures, the mean and the proportion are most important. The

    notation used to describe these measures appears below:

    μ: Refers to a population mean.

    x̄: Refers to a sample mean.

    P: The proportion of elements in the population that has a particular attribute.

    p: The proportion of elements in the sample that has a particular attribute.

    Q: The proportion of elements in the population that does not have a specified

    attribute. Note that Q = 1 - P.

    q: The proportion of elements in the sample that does not have a specified

    attribute. Note that q = 1 - p.

    Q 6 b. What is a confidence interval, and why it is useful? What is a confidence

    level?

    Ans:-

    Confidence Intervals

    In statistics, a confidence interval (CI) is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval (i.e. it is calculated from the observations), in principle different from sample to sample, that frequently includes the parameter of interest, if the experiment is repeated. How frequently the observed interval contains the parameter is determined by the confidence level or confidence coefficient.


    A confidence interval with a particular confidence level is intended to give the assurance that, if the statistical model is correct, then taken over all the data that might have been obtained, the procedure for constructing the interval would deliver a confidence interval that included the true value of the parameter the proportion of the time set by the confidence level. More specifically, the meaning of the term "confidence level" is that, if confidence intervals are constructed across many separate data analyses of repeated (and possibly different) experiments, the proportion of such intervals that contain the true value of the parameter will approximately match the confidence level; this is guaranteed by the reasoning underlying the construction of confidence intervals.
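    The repeated-sampling meaning of the confidence level can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not part of the original answer: the data are normally distributed, the population standard deviation sigma is known (so a z-interval applies), and the names z_interval and coverage, as well as all parameter values, are hypothetical.

```python
import random
from statistics import NormalDist


def z_interval(sample, sigma, level=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    centre = sum(sample) / len(sample)
    half_width = z * sigma / len(sample) ** 0.5
    return centre - half_width, centre + half_width


def coverage(true_mean=50.0, sigma=10.0, n=30, trials=5000, level=0.95, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma, level)
        hits += lower <= true_mean <= upper
    return hits / trials


observed = coverage()
print(f"share of 95% intervals containing the true mean: {observed:.3f}")
```

    Each simulated interval either contains the true mean or it does not. It is the long-run share of intervals that do which should come out close to the chosen confidence level.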

    A confidence interval does not predict that the true value of the parameter has a particular probability of being in the confidence interval given the data actually obtained. (An interval intended to have such a property, called a credible interval, can be estimated using Bayesian methods; but such methods bring with them their own distinct strengths and weaknesses).

    The confidence level sets the boundaries of a confidence interval. It is conventionally set at 95% to coincide with the 5% convention of statistical significance in hypothesis testing. In some studies wider (e.g. 99%) or narrower (e.g. 90%) confidence intervals will be required, depending upon the nature of your study. You should consult a statistician before using CIs other than 95%.

    You will hear the terms confidence interval and confidence limit used. The confidence interval is the range Q-X to Q+Y where Q is the value that is central to the study question, Q-X is the lower confidence limit and Q+Y is the upper confidence limit.

    Familiarise yourself with alternative CI interpretations:

    Common

    A 95% CI is the interval that you are 95% certain contains the true population value as it might be estimated from a much larger study.

    The value in question can be a mean, difference between two means, a proportion etc. The CI is usually, but not necessarily, symmetrical about this value.

    Pure Bayesian

    The Bayesian concept of a credible interval is sometimes put forward as a more

    practical concept than the confidence interval. For a 95% credible interval, the value of interest (e.g. size of treatment effect) lies with a 95% probability in the interval. This interval is then open to subjective moulding of interpretation. Furthermore, the credible interval can only correspond exactly to the confidence interval if the prior probability is so-called "uninformative".


    Pure frequentist

    Most pure frequentists say that it is not possible to make probability statements, such as the common CI interpretation, about the study values of interest in hypothesis tests.

    Neymanian

    A 95% CI is the interval which will contain the true value on 95% of occasions if a study were repeated many times using samples from the same population.

    Neyman originated the concept of CI as follows: If we test a large number of different null hypotheses at one critical level, say 5%, then we can collect all of the null hypotheses that are not rejected into one set. This set usually forms a continuous interval that can be derived mathematically, and Neyman described the limits of this set as confidence limits that bound a confidence interval. If the critical level (probability of incorrectly rejecting the null hypothesis) is 5% then the interval is 95%. Any values of the treatment effect that lie outside the confidence interval are regarded as "unreasonable" in terms of

    hypothesis testing at the critical level.
