UNIT 4 Data Collection

Page 1: data collection

UNIT 4 Data Collection

Page 2: data collection

• When deciding on a method of data collection, keep two types of data in mind:

• Primary Data

• Secondary Data

Page 3: data collection

• Primary Data: data that are collected fresh and for the first time, and are thus original in character.

• Secondary Data: data that have already been collected by someone else and have already been passed through the statistical process.

Page 4: data collection

Comparison Factors    Primary Data                 Secondary Data
Collection Purpose    For the problem in hand      For some other problem
Collection Process    Requires high involvement    Requires less involvement
Collection Cost       High                         Relatively low
Collection Time       Long                         Short

Page 5: data collection

Collection of Secondary Data

Secondary Data
• Internal
  • Ready to use
  • Requires further processing
• External
  • Published material
    • From general business sources
    • Government sources
  • Computerized database
    • Offline
    • Internet
    • Intranet
  • Syndicate services

Page 6: data collection

• Internal: data available within the organization.

• Ready-to-use data: can be used in the form in which they are available.

• Sometimes, however, internal data require further statistical processing to sort out the information required.

Page 7: data collection

• External: data collected from some outside source.

• Published material: published data are usually available in:
  • Various publications of central, state or local governments
  • Publications of international bodies
  • Technical and trade journals
  • Books, magazines and newspapers
  • Reports and publications of various associations connected with business and industry, banks and stock exchanges
  • Reports prepared by research scholars

Page 8: data collection

• Computerized Database:
  • Intranet: requires a computer terminal and a telecommunication network.
    • The advantage is that the data can be accessed only by a limited number of users or by authorized users.
  • Internet: also requires a computer terminal and a telecommunication network.
    • The data are available to the entire group of users.

Page 9: data collection

• Offline: data are available on CDs, DVDs, pen drives, etc.
  • The advantage is that they can be used at any location without the help of any telecommunication network.

• Syndicate Services: companies that collect and sell common pools of data of known commercial value, designed to serve a number of clients.
  • These are marketing research firms that collect, package and sell their data to many clients (each client receives the same information).

Page 10: data collection

Characteristics of Secondary Data

• Reliability

• Suitability

• Adequacy

Page 11: data collection

• Reliability: it can be tested by finding out the following about the data:
  • Who collected the data?
  • What were the sources of the data?
  • Were they collected using proper methods?
  • At what time were they collected?
  • What level of accuracy was desired, and was it achieved?

Page 12: data collection

• Suitability:
  • Data that are suitable for one enquiry may not necessarily be suitable for another.
  • It is better to scrutinize the definitions of the various terms and units of collection used.
  • The objective, scope and nature of the original enquiry must be studied.

Page 13: data collection

• Adequacy:
  • If the level of accuracy achieved in the data is found inadequate for the purpose of the present study, the data will be considered inadequate.
  • Data are also considered inadequate if they relate to an area that is either narrower or wider than the area of the present study.

Page 14: data collection

Collection of Primary Data

Page 15: data collection

Primary Data
• Qualitative
  • Direct
    • Focus Group
    • Depth Interview
  • Indirect
    • Projection Technique
      • Association
      • Completion
      • Expression
• Quantitative
  • Interview/Survey
    • Personal Interviewing
    • Telephonic Interviewing
  • Observation
    • Structured
    • Unstructured
  • Questionnaire
  • Schedule
  • Other methods

Page 16: data collection

• Qualitative: used in exploratory research.

• Direct: the purpose of the research is disclosed to the respondents.

• Focus Group Interview: conducted by a trained moderator in a non-structured and natural manner with a small group of respondents from the appropriate target market.

• Depth Interview: an unstructured, direct, personal interview in which a single respondent is probed by a highly skilled interviewer to uncover underlying motivations, attitudes and feelings on a topic.

Page 17: data collection

• Indirect: the purpose of the research is not disclosed to the respondents.

• Projection Technique: Participants are asked to project their feelings and thoughts onto other things.

• For example: If Coca-Cola was an animal, which animal would it be?

Page 18: data collection

• Association: used to identify the words that respondents associate most strongly with a given word.

• The respondent is asked to mention the first word that comes to mind, without thinking, as the interviewer reads out each word from a list.

• Frequently used in advertising research.
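
The tallying step of a word-association test can be sketched in a few lines of Python. This is only an illustration: the function name `strongest_associations`, the stimulus word and the replies are all hypothetical, and the sketch assumes that "strongest association" simply means the most frequently mentioned first word.

```python
from collections import Counter

def strongest_associations(responses, top_n=3):
    """Tally first-word replies for one stimulus word; return the most frequent."""
    # Normalise case and spacing, and skip respondents who gave no reply.
    cleaned = [r.strip().lower() for r in responses if r.strip()]
    return Counter(cleaned).most_common(top_n)

# Hypothetical replies from eight respondents to the stimulus word "tea"
replies = ["Fresh", "morning", "fresh", "Warm", "fresh", "morning", "", "break"]
top = strongest_associations(replies, top_n=2)
print(top)  # [('fresh', 3), ('morning', 2)]
```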

Page 19: data collection

• Completion: an extension of the association technique.

• Informants may be asked to complete a sentence such as “Persons who wear khadi are ……”

• The aim is to find the association of khadi clothes with certain personality characteristics.

• Analysis of the replies from an informant reveals his or her attitude towards the subject, and the combination of these attitudes across all sample members is then taken to reflect the view of the population.

Page 20: data collection

• Expressive: respondents are asked to comment on or explain what other people do.
  • For example: Why do people wear designer clothes?
  • The answer may reveal the respondent’s own motivation.
  • Subjects may also be asked to act out a situation in which they have been assigned various roles, while the researcher observes various traits.

Page 21: data collection

• Quantitative: used in descriptive research.

• Interview: involves the presentation of oral-verbal stimuli and replies in terms of oral-verbal responses.

• The interview has been called ‘a conversation with a purpose’ and, more formally, ‘a purposeful discussion between two or more people’.

Page 22: data collection

• Telephonic Interview: contacting the respondents over the phone.
• Merits:
  • More flexible than the mailing method.
  • Faster.
  • Cheaper than a personal interview.
  • High rate of response.
  • No field staff is required.
  • Recall is easy; callbacks are simple and economical.
• Demerits:
  • Less time is given to respondents for considered answers.
  • Restricted to respondents who have a phone.
  • Extensive geographical coverage may be restricted by cost considerations.
  • Questions have to be short and to the point.

Page 23: data collection

• Personal Interview: requires an interviewer asking questions, generally in face-to-face contact with the other person.

• This method is usually carried out in a structured way and involves the use of a set of predetermined questions.

• Merits:
  • More in-depth information is obtained.
  • Non-response is very low.
  • The language of the interview can be adapted to the ability or educational level of the person interviewed.

Page 24: data collection

• Demerits:
  • Very expensive.
  • More time-consuming, especially when the sample is large.
  • Selecting, training and supervising the field staff is more complex.
  • The presence of the interviewer on the spot may over-stimulate the respondents.

Page 25: data collection

• Observation:
  • The most commonly used method in behavioral science.
  • Information is sought through the investigator’s own direct observation, without asking the respondents.
• Advantages:
  • Subjective bias is eliminated.
  • The information obtained relates to what is currently happening.
  • The method is independent of respondents’ willingness to respond.
• Disadvantages:
  • It is an expensive method.
  • The information provided by this method is very limited.

Page 26: data collection

• Structured Observation: characterized by a careful definition of the units to be observed, the style of recording the observed information, standardized conditions of observation, and the selection of useful data only.
  • Mainly used in descriptive studies.

• Unstructured Observation: observation carried out without advance planning of the aspects listed above.
  • Mainly used in exploratory studies.

Page 27: data collection

• Participant and Non-Participant Observation:
  • This distinction depends on whether or not the observer shares the life of the group being observed.
  • If the observer observes by becoming, more or less, a member of the group so as to experience what its members experience, the observation is called participant observation.
  • When the observer observes as a detached person, without attempting to experience through participation what others feel, it is known as non-participant observation.

Page 28: data collection

• Questionnaire:
  • A list of questions sent to a number of persons to answer. It secures standardized results that can be tabulated and treated statistically.
• Purpose:
  • To collect information from respondents who are scattered over a vast area.
  • To achieve success in collecting reliable data.

Page 29: data collection

Types of Questionnaire
• On the basis of structure
  • Structured
  • Unstructured
• On the basis of questions
  • Open ended
  • Close ended
  • Mixed
  • Pictorial

Page 30: data collection

• Structured: questionnaires in which there are definite, concrete questions. The questions are presented with exactly the same wording and in the same order to all respondents.

• Unstructured: works as a guide for the interviewer. The interviewer is free to arrange the form and timing of the inquiry. The main advantage is flexibility.

Page 31: data collection

• Open ended: the respondent is free to express his or her views and ideas rather than being limited to stated alternatives.

• Close ended: the responses are limited to the stated alternatives, such as Yes or No.

• Mixed: a combination of both open-ended and close-ended questions.

• Pictorial: pictures are used to promote interest in answering the questions.

Page 32: data collection

• Advantages:
  • It can be used as a method in its own right or as a basis for an interview.
  • Can be posted, emailed or faxed.
  • Can cover a large number of people.
  • Wide geographic coverage.
  • Relatively cheap.

Page 33: data collection

• Disadvantages:
  • Design problems.
  • Questions have to be relatively simple.
  • Low response rate.
  • Time delay.
  • Assumes no literacy problem.
  • No control over who completes it.
  • Not possible to give assistance if required.

Page 34: data collection

• There is no strictly scientific method for framing questions, but some general guidelines for framing a questionnaire can be followed.

Page 35: data collection

Formulation Of Questionnaire

• Step 1: Specify the information needed.
  • Try to make a dummy table.
  • Categorize the problem.
  • Decide the statistical tool in advance.
  • Select the parameters.
  • Address all the different components of the problem.
  • Define the target group.

Page 36: data collection

• Step 2: Specify the type of interview method. It can be:
  • Personal: complex questions can be used.
  • Telephonic: questions of medium complexity.
  • Mail: simple questions.

Page 37: data collection

• Step 3: Determine the content of each individual question.
  • Is the question necessary?
  • Are several questions needed instead of one?
  • Try to avoid double-barreled questions.

Page 38: data collection

• Step 4: Design the questions to overcome the respondent’s inability and unwillingness to answer.
  • Overcoming inability to answer:
    • Is the respondent informed?
    • Can the respondent articulate a response?
  • Overcoming unwillingness to answer:
    • Legitimate purpose
    • Sensitive information

Page 39: data collection

• Step 5: Choose the structure of the questions.
  • Open ended: respondents are free to give any answer of their choice.
  • Close ended: choices are given.
    • Multiple choice: more than two alternatives.
    • Dichotomous: only two alternatives.
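
The question structures listed in Step 5 differ only in how many alternatives are stated, which a short Python sketch can make explicit. The function name `question_structure` and the sample questions are hypothetical; treating a single stated alternative as an error is an assumption of this sketch, not something the slides specify.

```python
def question_structure(alternatives):
    """Classify a question by the number of alternatives stated with it."""
    n = len(alternatives)
    if n == 0:
        return "open ended"                      # respondent answers freely
    if n == 1:
        # Assumption: one alternative offers no real choice, so reject it.
        raise ValueError("a close-ended question needs at least 2 alternatives")
    if n == 2:
        return "close ended (dichotomous)"       # only two alternatives
    return "close ended (multiple choice)"       # more than two alternatives

print(question_structure([]))                              # open ended
print(question_structure(["Yes", "No"]))                   # close ended (dichotomous)
print(question_structure(["Daily", "Weekly", "Monthly"]))  # close ended (multiple choice)
```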

Page 40: data collection

• Step 6: Choose the question wording.
  • Don’t use ambiguous words.
  • Questions should be simple and easy to understand.
  • Avoid leading and biased questions.

Page 41: data collection

• Step 7: Determine the order of the questions.
  • Basic: questions that address the basic problem only.
  • Classification: questions related to sociographic and demographic variables of the respondents.
  • Identification: questions related to the identity of the individual, such as address, phone number, etc.
  • Don’t ask sensitive or difficult questions at the beginning.
  • Questions should be placed in a logical order.

Page 42: data collection

• Step 8: Format and layout.
  • The format should be standard and proper because it affects the brand image of the organization.

• Step 9: Printing of the questionnaire.
  • The printout should be clear and of good quality.
  • The questionnaire should look impressive.

Page 43: data collection

• Step 10: Pretesting.
  • Usually a small number of respondents are selected for the pre-test. The respondents selected for the pilot survey should be broadly representative of the type of respondent to be interviewed in the main survey.
  • Protocol analysis: respondents are allowed to think aloud while answering so that problems can be identified.
  • Debriefing: suggestions are taken from respondents after they have filled in the questionnaire.

Page 44: data collection

Questionnaire vs. Schedule

• Questionnaire: generally sent through mail to informants, to be answered as specified in a covering letter but otherwise without further assistance from the sender.
  Schedule: generally filled in by the research worker or enumerator, who can interpret the questions when necessary.

• Questionnaire: data collection is cheap and economical.
  Schedule: data collection is more expensive, as money is spent on enumerators and on training them.

• Questionnaire: non-response is usually high, as many people do not respond and many return the questionnaire without answering all the questions.
  Schedule: non-response is very low, because the schedule is filled in by enumerators who are able to get answers to all the questions.

Page 45: data collection

• Questionnaire: it is not clear who replies.
  Schedule: the identity of the respondent is known.

• Questionnaire: the method is likely to be very slow, since many respondents do not return the questionnaire.
  Schedule: information is collected well in time, as the schedules are filled in by enumerators.

• Questionnaire: no personal contact is possible, as questionnaires are sent to respondents by post and returned by post.
  Schedule: direct personal contact is established.

Page 46: data collection

• Questionnaire: can be used only when respondents are literate and cooperative.
  Schedule: information can be gathered even when the respondents are illiterate.

• Questionnaire: a wider and more representative distribution of the sample is possible.
  Schedule: there remains the difficulty of sending enumerators over a relatively wide area.

• Questionnaire: the risk of collecting incomplete and wrong information is relatively high.
  Schedule: the information collected is generally complete and accurate, as enumerators can remove any difficulties respondents face in correctly understanding the questions.

• Questionnaire: success depends more on the quality of the questionnaire itself.
  Schedule: success depends on the honesty and competence of the enumerators.

Page 47: data collection

Overview of the Stages of Data Analysis

Page 48: data collection

Editing

• Editing is the process of reviewing the data to ensure maximum accuracy and clarity.

• Editing should be conducted as the data are being collected. This applies to the editing of the collection forms used for pretesting as well as those for the full-scale project.

Page 49: data collection

• Careful editing early in the collection process will often catch misunderstandings of instructions, errors in recording, and other problems at a stage when it is still possible to eliminate them for the later stages of the study.

• Early editing has the additional advantage of permitting the questioning of interviewers while the material is still relatively fresh in their minds.

Page 50: data collection

• Editing is normally centralized so as to ensure consistency and uniformity in treatment of the data.

• If the sample is not large, a single editor usually edits all the data to reduce variation in interpretation.

• In those cases where the size of the project makes the use of more than one editor mandatory, it is usually best to assign each editor a different portion of the data collection form to edit.

• In this way the same editor edits the same items on all forms, an arrangement that tends to improve both consistency and productivity.

Page 51: data collection

Types of Editing

1. Field Editing: preliminary editing by a field supervisor on the same day as the interview to catch technical omissions, check legibility of handwriting, and clarify responses that are logically or conceptually inconsistent.

2. In-house Editing: editing performed by central office staff; often done more rigorously (that is, more thoroughly and carefully) than field editing.

Page 52: data collection

• Legibility of entries.

• Legibility is the quality of being clear enough to read.

• Obviously the data must be legible in order to be used.

• Where an entry is not legible, it may be possible to infer the response from other data collected; but where any real doubt exists about its meaning, it should not be used.

Page 53: data collection

• Completeness of entries.

• On a fully structured collection form, the absence of an entry is ambiguous.

• It may mean that the respondent could not or would not provide the answer, that the interviewer failed to ask the question, or that the collected data were not recorded.

Page 54: data collection

• Consistency of entries.

• Inconsistencies raise the question of which response is correct.

• Discrepancies may be cleared up by questioning the interviewer or by making callbacks to the respondent.

• When discrepancies cannot be resolved, discarding both entries is usually the wisest course of action.

Page 55: data collection

• Accuracy of entries.

• An editor should keep an eye out for any indication of inaccuracy in the data.

• Of particular importance is the detection of any repetitive response patterns in the reports of individual interviews.

• Such patterns may well be indicative of systematic interviewer bias or interviewer/respondent dishonesty.
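
Two of the editing checks described above, completeness and repetitive response patterns, lend themselves to a small automated first pass. The Python sketch below is illustrative only: the function names, the item names and the threshold of five items are hypothetical, and a flagged form still needs an editor's judgment.

```python
def missing_entries(form, required_items):
    """Completeness check: list the required items with no recorded entry."""
    # An absent key, None or an empty string all count as "no entry".
    return [item for item in required_items if form.get(item) in (None, "")]

def repetitive_pattern(ratings, min_items=5):
    """Accuracy check: flag a form whose rating items all carry the same answer."""
    # Assumption: identical answers on min_items or more items deserve a closer look.
    return len(ratings) >= min_items and len(set(ratings)) == 1

# Hypothetical completed form
form = {"age": "34", "occupation": "", "q1": 3, "q2": 3, "q3": 3, "q4": 3, "q5": 3}

gaps = missing_entries(form, ["age", "occupation", "income"])
suspicious = repetitive_pattern([form[q] for q in ("q1", "q2", "q3", "q4", "q5")])
print(gaps)        # ['occupation', 'income']
print(suspicious)  # True
```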

Page 56: data collection

Coding

• Coding is the process of assigning responses to data categories; numbers are assigned to identify the responses with their categories.

• Pre-coding refers to the practice of assigning codes to categories and sometimes printing this information on structured questionnaires and observation forms before the data are collected.

• Post-coding refers to the assignment of codes to responses after the data are collected. Post-coding is most often required when responses are reported in an unstructured format.

Page 57: data collection

• Once a complete code has been established, after post-coding, a formal coding manual or codebook is often created and made available to those who will be entering or analyzing the data.
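
A codebook can be pictured as a lookup table from response categories to numbers. The Python sketch below shows one possible form; the categories, the code numbers and the catch-all code 9 for unlisted answers are hypothetical choices made for the example.

```python
# Hypothetical codebook: one mapping of category -> numeric code per question.
CODEBOOK = {
    "occupation": {"farmer": 1, "trader": 2, "salaried": 3},
    "owns_phone": {"yes": 1, "no": 2},
}
OTHER_CODE = 9  # assumed catch-all for answers not listed in the codebook

def code_response(question, response, codebook=CODEBOOK):
    """Return the numeric code for a response, or OTHER_CODE if it is unlisted."""
    categories = codebook[question]
    return categories.get(response.strip().lower(), OTHER_CODE)

coded = [code_response("occupation", r) for r in ["Farmer", "salaried ", "Fisherman"]]
print(coded)  # [1, 3, 9]
```

Answers that land in the catch-all code are natural candidates for post-coding: once enough of them accumulate, a new category and number can be added to the codebook.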

Page 58: data collection

TABULATION:

The mass of data collected has to be arranged in some kind of concise and logical order.

Tabulation summarizes the raw data and displays them in the form of statistical tables.

Tabulation is an orderly arrangement of data in rows and columns.

OBJECTIVE OF TABULATION:

1. Conserves space & minimizes explanation and descriptive statements.

2. Facilitates process of comparison and summarization.

3. Facilitates detection of errors and omissions.

4. Establishes the basis of various statistical computations.

Page 59: data collection

• BASIC PRINCIPLES OF TABULATION:

• Tables should be clear, concise & adequately titled.

• Every table should be distinctly numbered for easy reference.

• Column headings & row headings of the table should be clear & brief.

• Units of measurement should be specified at appropriate places.

• Explanatory footnotes concerning the table should be placed at appropriate places.

• Source of information of data should be clearly indicated.

Page 60: data collection

• The columns & rows should be clearly separated with dark lines

• Demarcation should also be made between data of one class and that of another.

• Comparable data should be put side by side.

• Percentage figures should be rounded before tabulation.

• Figures, symbols, etc. should be properly aligned and adequately spaced to enhance readability.

• Abbreviations should be avoided.

Page 61: data collection

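• Univariate (one characteristic: occupation)
  Columns: Occupation | No. of persons

• Bivariate (two characteristics: occupation and sex)
  Columns: Occupation | No. of persons: Male, Female

• Multivariate (three characteristics: occupation, sex and marital status)
  Columns: Occupation | No. of persons: Male (Married, Unmarried), Female (Married, Unmarried)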
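
The three kinds of table differ only in how many characteristics are counted together, as the following Python sketch suggests. The five records and the helper name `tabulate` are hypothetical and serve only to show the counting.

```python
from collections import Counter

# Hypothetical survey records, one dict per person
records = [
    {"occupation": "farmer", "sex": "male",   "marital_status": "married"},
    {"occupation": "farmer", "sex": "female", "marital_status": "married"},
    {"occupation": "farmer", "sex": "male",   "marital_status": "unmarried"},
    {"occupation": "trader", "sex": "male",   "marital_status": "married"},
    {"occupation": "trader", "sex": "female", "marital_status": "unmarried"},
]

def tabulate(rows, *characteristics):
    """Count the persons falling into each combination of the given characteristics."""
    return Counter(tuple(row[c] for c in characteristics) for row in rows)

univariate = tabulate(records, "occupation")
bivariate = tabulate(records, "occupation", "sex")
multivariate = tabulate(records, "occupation", "sex", "marital_status")

print(univariate[("farmer",)])                       # 3
print(bivariate[("farmer", "male")])                 # 2
print(multivariate[("farmer", "male", "married")])   # 1
```

Each cell of a printed table corresponds to one key of the counter, so the totals across any table equal the number of records.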