Comparing categorical data - Haese Mathematics (Chapter 18)


  • Chapter 18: Comparing categorical data

    Contents:

    A Categorical data

    B Examining categorical data

    C Comparing and reporting categorical data

    D Data collection

    E Misleading graphs

    IB MYP_3


    Y:\HAESE\IB_MYP3\IB_MYP3_18\369IB_MYP3_18.CDR Wednesday, 28 May 2008 11:52:44 AM PETER

  • HISTORICAL NOTE: FLORENCE NIGHTINGALE

    Florence Nightingale (1820 - 1910) was born into an upper class English family. Her father believed that women should have an education, and she learnt Italian, Latin, Greek and history, and had an excellent early preparation in mathematics.

    She served as a nurse during the Crimean War, and became known as "the lady with the lamp". During this time she collected data and kept systematic records.

    After the war she came to believe that most of the soldiers in hospital were killed by insanitary living conditions rather than dying from their wounds.

    She wrote detailed statistical reports and represented her statistical data graphically. She demonstrated that statistics provided an organised way of learning and this led to improvements in medical and surgical practices.

    OPENING PROBLEM

    A construction company is building a new high-rise apartment building in

    Tokyo. It will be 24 floors high with 8 apartments on each floor.

    The company needs to know some information about the people who will

    be buying the apartments. They prepare a form which is published in all

    local papers and on-line:

    HANAKO CONSTRUCTIONS NEW APARTMENTS

    ¥70 to 400 million

    Please respond only if you have some interest in owning your own residence in this prestigious new block.

    Name: ....................

    Current address: ....................

    Phone number: ....................

    Marital status: married / single

    Age group: 18 to 35 / 36 to 59 / 60+

    Desired number of bedrooms: 1 / 2 / 3

    The statistical officer receives 272 responses and these are typed in coded form.

    Marital status: Married (M), Single (S)

    Age group: 18 to 35 (Y), 36 to 59 (I), 60+ (O)

    Apartment size: 1, 2 or 3 bedrooms

    370 COMPARING CATEGORICAL DATA (Chapter 18)


  • The results are:

    MY1 MI3 MI2 MO2 MY2 MO2 MO2 MY2 MO2 MI2

    SY1 MO2 MY1 MI3 MO2 SO1 MI3 SO2 MO2 MO2

    MI3 SO3 SO2 MI3 MI1 MO3 SI3 MO2 SO2 SO1

    SO1 MI3 MO2 SO1 SY2 MO1 MY1 MI2 MO1 MO1

    MO2 SO1 SO2 MI3 MO1 MI3 SI1 SI2 MO2 MO1

    SO1 MO2 MI3 MI3 MO1 MI2 MO2 MO2 MO1 MO1

    MO2 MI3 SY2 MO3 MO1 MI3 MI3 MI3 MO1 SO3

    SO1 MO2 SI2 SO1 MO3 MI3 SI2 MO1 MI3 MO1

    MO2 MO1 MI3 MY2 MY3 MI3 MI1 MY1 SY2 MI3

    SO1 MY2 MI3 MO1 SI3 SI1 SY3 MO1 MO1 SO1

    MY1 MI3 MI3 MI3 MY2 MO3 MO2 SO2 MI3 MO1

    MO1 MI1 SI2 MO3 MI1 MI3 MI3 MY3 MO2 MO1

    MO2 MY2 SO2 MY2 SO1 SI2 SO3 MO3 MI3 MI3

    SO2 MI3 MI3 SO1 MY2 MI3 SY2 MO1 MI2 MI3

    SO1 SO2 MI3 MO3 SO2 SY1 SO2 SI1 MY2 SI1

    MI2 MI3 MI3 MY2 MY2 MI3 MO2 MO3 MO1 MI3

    MO1 SO1 MO1 MO2 MO2 SO2 MI3 SO1 MI3 SI1

    MI2 MY2 MI3 SI1 MI3 MO2 MI3 MI3 MO1 MO2

    MI3 SI1 MI3 MI3 SY2 SO2 MO1 SI2 SO2 SO1

    SO1 MI2 MO2 MO2 MO1 MI3 MI3 MI3 MO3 MO2

    MI2 MI3 MO1 MI3 SO1 SO2 SI2 SO1 SI2

    SO1 MI3 MI3 MO3 MO2 MY1 MO2 MI3 MO3

    MI1 SY2 MO3 SO1 MY2 SI2 MI2 MI3 SI1

    MO1 MO2 MO3 MI3 MO1 SO1 MI2 MI3 MO2

    MI3 MI3 MI3 SO1 MI3 MI3 SY2 SI3 MO2

    MI1 SO1 MI3 MY2 SY3 MI3 MI2 SO2 MO2

    SO1 MI3 MI3 MY1 MI1 MO2 MY1 MI2 MO3

    MI1 MI3 MI3 SI1 MO3 MO1 SI1 SO1 SI1

    Things to think about:

    What problems are the construction company trying to solve?

    Is the company's investigation a census or a survey?

    What are the variables?

    Are the variables categorical or quantitative?

    What are the categories of the categorical variables?

    Can you explain why the construction company is interested in these categories?

    Is the data being collected in an unbiased way?

    Why were the names, addresses and phone numbers of respondents asked for?

    Can you make sense of the data in its present form?

    How could you reorganise the data so that it can be summarised and displayed?

    What methods of display are appropriate here?

    Can you make a conclusion regarding the data and write a report of your findings?
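One possible way to reorganise the coded responses is to split each code into its three variables and tally each variable separately. The sketch below is only an illustration: it uses just the first row of the results, and the names `decode`, `MARITAL` and `AGE` are chosen here for the example and do not come from the text.

```python
from collections import Counter

# First row of the coded responses only; the full data set has 272 codes.
codes = "MY1 MI3 MI2 MO2 MY2 MO2 MO2 MY2 MO2 MI2".split()

# Coding key from the opening problem.
MARITAL = {"M": "married", "S": "single"}
AGE = {"Y": "18 to 35", "I": "36 to 59", "O": "60+"}


def decode(code):
    """Split a three-character code such as 'MY1' into its three variables."""
    return MARITAL[code[0]], AGE[code[1]], int(code[2])


decoded = [decode(code) for code in codes]

# One tally (frequency table) per variable.
marital_tally = Counter(marital for marital, _, _ in decoded)
age_tally = Counter(age for _, age, _ in decoded)
bedroom_tally = Counter(bedrooms for _, _, bedrooms in decoded)

for title, tally in [
    ("Marital status", marital_tally),
    ("Age group", age_tally),
    ("Bedrooms", bedroom_tally),
]:
    print(title)
    for category, frequency in tally.most_common():
        print(f"  {category}: {frequency}")
```

Extending `codes` to all 272 responses would give the frequency tables needed for a column graph or pie chart of each variable.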


  • A CATEGORICAL DATA

    Statistics is the art of solving problems and answering questions by collecting and analysing data.

    The facts or pieces of information we collect are called data.

    A single piece of information is known as a datum (singular), whereas many pieces of information are known as data (plural).

    A list of information is called a data set. If it is not in organised form it is called raw data.

    VARIABLES

    There are two types of variables that we commonly deal with:

    A categorical variable describes a particular quality or characteristic. The data is divided into categories, and the information collected is called categorical data.

    Examples of categorical variables are:

    Getting to school: the categories could be train, bus, car and walking.

    Colour of eyes: the categories could be blue, brown, hazel, green, grey.

    A quantitative variable has a numerical value and is often called a numerical variable. The information collected is called numerical data.

    Quantitative variables can be either discrete or continuous.

    A quantitative discrete variable takes exact number values and is often a result of counting.

    Examples of quantitative discrete variables are:

    The number of people in a household: the variable could take the values 1, 2, 3, ...

    The score out of 30 for a test: the variable could take the values 0, 1, 2, 3, ..., 30.

    A quantitative continuous variable takes numerical values within a certain continuous range. It is usually a result of measuring.

    Examples of quantitative continuous variables are:

    The weight of newborn babies: the variable could take any positive value on the number line but is likely to be in the range 0.5 kg to 8 kg.

    The heights of 14 year old students: the variable would be measured in centimetres. A student whose height is recorded as 145 cm could have exact height anywhere between 144.5 cm and 145.5 cm.
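The height example for continuous variables can be checked with a short calculation. This is a minimal sketch, assuming measurements are recorded to the nearest whole unit; the function name `possible_range` is made up for this illustration.

```python
def possible_range(recorded, unit=1.0):
    """Return the lowest and highest exact values that would be
    recorded as `recorded` when measuring to the nearest `unit`."""
    half_unit = unit / 2
    return recorded - half_unit, recorded + half_unit


# A height recorded as 145 cm, measured to the nearest centimetre.
low, high = possible_range(145)
print(f"Exact height lies between {low} cm and {high} cm")
```

A discrete variable such as a test score needs no such range, because a score of 17 means exactly 17.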


  • CENSUS OR SAMPLE

    The two methods of data collection are by census or sample.

    A census involves collecting data about every individual in a whole population.

    The individuals in a population may be people or objects. A census is detailed and accurate

    but is expensive, time consuming, and often impractical.

    A sample involves collecting data about a part of the population only.

    A sample is cheaper and quicker than a census but

    is not as detailed or as accurate. Conclusions drawn

    from samples always involve some error.

    A sample must truly reflect the characteristics of the

    whole population. To ensure this it must be unbiased

    and large enough.

    Just how large a sample needs to be is discussed in

    future courses.

    In a biased sample, the data has been unfairly influenced by the collection process.

    It is not truly representative of the whole population.
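The difference between a census, a random sample and a biased sample can be illustrated with a small simulation. Everything in this sketch is hypothetical: the population of 1000 people, the 300 single people in it, and the sample size of 100 are invented for the example.

```python
import random

# Hypothetical population: 300 single people followed by 700 married people.
population = ["single"] * 300 + ["married"] * 700


def proportion_single(group):
    """Fraction of the group whose marital status is 'single'."""
    return group.count("single") / len(group)


# Census: every individual in the population is counted.
census_result = proportion_single(population)

# Random sample: 100 individuals chosen so that each is equally likely.
rng = random.Random(1)  # fixed seed so the run can be repeated
sample = rng.sample(population, 100)
sample_result = proportion_single(sample)

# Biased sample: the first 100 people on the list, who are all single.
biased_sample = population[:100]
biased_result = proportion_single(biased_sample)

print("Census:       ", census_result)
print("Random sample:", sample_result)
print("Biased sample:", biased_result)
```

The random sample usually lands near the census value but rarely matches it exactly, which is the error that conclusions from samples always involve. The biased sample is far from the census value because of the way it was collected.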

    STATISTICAL GRAPHS

    Two variables under consideration are usually linked by one being dependent on the other.

    For example: The total cost of a dinner depends on the number of guests present.

    The total cost of a dinner is the dependent variable.

    The number of guests present is the independent variable.

    When drawing graphs involving two variables,

    the independent variable is usually placed on the

    horizontal axis and the dependent variable is

    placed on the vertical axis. An exception to this

    is when we draw a horizontal bar chart.

    [Figure: a pair of axes with the independent variable on the horizontal axis and the dependent variable on the vertical axis]

    Acceptable graphs which display categorical data are:

    Vertical column graph

    Horizontal bar chart

    Segment bar chart

    Pie chart

    The mode of a set of categorical data is the category which occurs most frequently.
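As a rough illustration of finding the mode and drawing a horizontal bar chart, the sketch below tallies a small set of categorical data and prints one bar per category. The ten "getting to school" answers are invented for this example, and `bar_chart` is a hypothetical helper, not something defined in the text.

```python
from collections import Counter

# Hypothetical answers to "How do you get to school?"
travel = ["bus", "car", "bus", "train", "walking",
          "bus", "car", "bus", "train", "car"]

tally = Counter(travel)

# The mode is the category which occurs most frequently.
mode, mode_count = tally.most_common(1)[0]


def bar_chart(tally):
    """Return a text horizontal bar chart, most frequent category first."""
    width = max(len(category) for category in tally)
    lines = []
    for category, frequency in tally.most_common():
        lines.append(f"{category.ljust(width)} | {'#' * frequency} {frequency}")
    return "\n".join(lines)


print(bar_chart(tally))
print("Mode:", mode)
```

Here the categories run down the vertical axis and the frequencies run across, which is the exception to the usual axis placement that a horizontal bar chart makes.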
