Project Bollywood - Final Report



  • 8/7/2019 Project Bollywood - Final Report


    Cross Tabulation

    One of our objectives was to find out the preferences of the respondents. We therefore
    cross-tabulated these preferences against the respondents' satisfaction level and the
    number of times they watch a movie in a theatre each month. The preferences examined
    were:

    1) Which kind of director's philosophy do they like in a movie?

    2) What kind of actors do they prefer to watch for a particular script or director?

    3) Which kind of music do they prefer in a movie?



    The cross-tabulation shows that respondents who are satisfied with or neutral about
    current Bollywood movies prefer movies that are close to reality. They feel that
    established actors are best suited to portray the characters, and that romantic or
    soft numbers add value to such movies. Interestingly, respondents who are currently
    not satisfied, and those who watch a movie more than once a month, share the same
    preferences.


    Preference of the respondents who are ... & Very Dissatisfied

    Actor      Established         59   15   54
    Music      Soft                28
               Romantic            29   11   30
    Director   Realistic           34    8   29
               Sarcastic comedy     7

    About 92% of the respondents watch movies for entertainment, yet they also want the
    movie to be close to reality. This is also clear from the type of script they prefer.
    From the graph below, we can see that the largest number of respondents want the
    script of the film to be close to reality or to current issues prevalent in society.



    This does not mean that other scripts or philosophies will not work. For example, Dhoom
    was a trendy and stylish movie with fast music numbers and a comparatively new face.
    However, movies like 3 Idiots and Taare Zameen Par were preferred more and drew a
    repeat audience. The main USP of 3 Idiots was all the factors as preferred by the


    The above survey does not indicate that our target group (TG) wants a real-life story
    told in a serious manner. They want this philosophy to be conveyed in a light or comic
    way. This is clear from the graph below: the largest number of respondents preferred
    comedy over other genres such as romance, action and thriller.

    On this front too, 3 Idiots scores well. The movie presented the problems of
    graduating youth in a light, humorous way, which was well accepted by our TG.



    The cross-tabulation shows that the largest number of respondents preferred comedy.
    However, many respondents liked more than one genre, such as action, comedy and
    romance. Combining these genres, we found that among respondents with multiple genre
    choices, the majority preferred either romantic comedy or action comedy. A few
    preferred action romance comedy.

    Multiple Regression to find the relationship between the Satisfaction level of the
    Respondent and the main attributes of the film

    Variables Entered/Removed(b)

    Variables Entered                                                    Variables Removed   Method
    Location, Hero, Film Title, Genre, Director, Heroine, ..., Story(a)   .                   Enter

    a All requested variables entered.
    b Dependent Variable: Satisfaction

    Model Summary

    Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
    1       .340(a)   .116       .041                .794





    Model            B        Std. Error   Beta     t         Sig.
    1 (Constant)     3.589    .433                  8.296     .000
      Genre          -.056    .069         -.084    -.804     .423
      Film Title     -.074    .075         -.104    -.990     .324
      Story          .212     .084         .346     2.520     .013
      Hero           -.072    .084         -.098    -.854     .395
      Heroine        -.066    .075         -.095    -.886     .378
      Director       -.210    .086         -.270    -2.451    .016
                     .138     .083         .182     1.667     .098
      Songs          -.068    .074         -.102    -.909     .365
      Location       .048     .070         .070     .681      .497

    The aim of the multiple regression was to find the relationship between the
    respondents' satisfaction level and the various attributes of a movie, such as genre,
    title, actor and director.

    However, the SPSS output shows that the independent variables listed above explain
    only 11.6% of the variation in the satisfaction level (R Square = .116). Hence, there
    may be other variables affecting the satisfaction level of the respondents.
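    As a rough cross-check of the coefficient table, each t value should equal B divided
    by its standard error, and a coefficient is conventionally called significant when
    its Sig. value is below 0.05. The sketch below assumes that 5% threshold (the report
    does not state one) and uses hypothetical names; the row whose label is missing in the
    source is kept as "(unlabeled)".

```python
# Rows copied from the coefficient table: (label, B, Std. Error, reported t, Sig.).
# The label of one row is missing in the source, so it is kept as "(unlabeled)".
ROWS = [
    ("(Constant)", 3.589, 0.433, 8.296, 0.000),
    ("Genre", -0.056, 0.069, -0.804, 0.423),
    ("Film Title", -0.074, 0.075, -0.990, 0.324),
    ("Story", 0.212, 0.084, 2.520, 0.013),
    ("Hero", -0.072, 0.084, -0.854, 0.395),
    ("Heroine", -0.066, 0.075, -0.886, 0.378),
    ("Director", -0.210, 0.086, -2.451, 0.016),
    ("(unlabeled)", 0.138, 0.083, 1.667, 0.098),
    ("Songs", -0.068, 0.074, -0.909, 0.365),
    ("Location", 0.048, 0.070, 0.681, 0.497),
]
ALPHA = 0.05  # assumed significance level, not stated in the report


def check_rows(rows, alpha=ALPHA):
    """Recompute t = B / SE and flag rows whose Sig. value is below alpha."""
    results = []
    for label, b, se, t_reported, sig in rows:
        t_recomputed = b / se
        results.append((label, round(t_recomputed, 3), t_reported, sig < alpha))
    return results


results = check_rows(ROWS)
# Predictors (excluding the intercept) that are significant at the assumed level.
significant = [label for label, _, _, is_sig in results
               if is_sig and label != "(Constant)"]

for label, t_new, t_old, is_sig in results:
    print(f"{label:12s} recomputed t = {t_new:7.3f}  reported t = {t_old:7.3f}  significant: {is_sig}")
print("Significant predictors:", significant)
```

    The recomputed values differ slightly from the reported ones because the table shows
    B and the standard error rounded to three decimals.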



    On checking for multicollinearity, we found that there was not much relation between
    the independent variables, which indicates that the variables are independent of each
    other.


    Analysis of Q 10 - Promotional activities influencing the respondents to watch a movie

    To find out whether variables such as TV promo, music, PR activities, critics'
    reviews, word of mouth etc. have an equal impact in influencing the respondents to
    watch a particular movie, we first applied a one-way ANOVA test.

    One-way ANOVA Test



    Null Hypothesis :- All the variables mentioned above have an equal impact in
    influencing people to watch a movie.

    Alternative Hypothesis :- The promotional efforts do not all have the same impact in
    influencing people to watch a movie.

    ANOVA

    Source of Variation   SS       df    MS        F           P-value   F crit
    Between Groups        99.25    5     19.85     24.406578   1E-22
    Within Groups         580.7    714   0.81330
    Total                 679.95   719

    As F(observed) > F critical, the null hypothesis is rejected: the promotional efforts
    do not all have the same impact in influencing people to watch a movie.
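    The F value in the table can be reproduced from the sums of squares and degrees of
    freedom. The sketch below is only an arithmetic check under that reading of the
    table; the function name is hypothetical, and the F critical value is not recomputed
    because it is not legible in the source.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values taken from the ANOVA table for Q 10.
ms_b, ms_w, f_obs = anova_f(99.25, 5, 580.7, 714)
print(f"MS between = {ms_b:.2f}, MS within = {ms_w:.5f}, F = {f_obs:.4f}")
```

    The null hypothesis is rejected when this F exceeds the critical F for (5, 714)
    degrees of freedom at the chosen significance level.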

    To find out which of these promotional variables should be focused on most because
    they influence people strongly, we carried out a factor analysis.

    Factor Analysis

    1. Variables such as TV promo, music, PR activities, controversies, word of mouth
    etc. were used as the attributes on which factoring was done.

    2. The principal component method was used for extraction, and the factor scores were
    saved as variables to form the basis for the cluster analysis.



    The Output and Analysis

    From the Total Variance Explained table, we found only two factors, which together
    explain about 53% of the cumulative variance of the given variables. This is mainly
    because the variables do not differ much from one another. Hence all the promotional
    variables can be grouped into two factors, namely Direct Promotion and Indirect
    Promotion.

    Total Variance Explained

    Component   Initial Eigenvalues                      Extraction Sums of Squared Loadings      Rotation Sums of Squared Loadings
                Total   % of Variance   Cumulative %     Total   % of Variance   Cumulative %     Total   % of Variance   Cumulative %
    1           1.766   29.440          29.440           1.766   29.440          29.440           1.611   26.857          26.857
    2           1.411   23.511          52.951           1.411   23.511          52.951           1.566   26.094          52.951
    3           .917    15.291          68.242
    4           .811    13.519          81.760
    5           .566    9.442           91.202
    6           .528    8.798           100.000

    Extraction Method: Principal Component Analysis.


    Rotated Component Matrix(a)

                       Component
                       1        2
    TV_Promo           .756     -.013
    Music              .825     -.070
    PR_Activities      .587     .303
    Critic_reviews     -.034    .752
    wordofmouth        .034     .726
    controversies      .110     .613

    Extraction Method: Principal Component Analysis.
    Rotation Method: Varimax with Kaiser Normalization.




    Direct Influence :- As seen from the Rotated Component Matrix, TV promo, music and PR
    activities are the variables that load on this factor.

    Indirect Influence :- As seen from the Rotated Component Matrix, critic reviews, word
    of mouth and controversies are the variables that load on this factor.
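    The grouping above can be retraced from the two SPSS tables. The sketch below assumes
    that the factors were retained by the eigenvalue-greater-than-one rule (the report
    does not say which rule was used) and that the six variables were standardized, so
    the total variance equals the number of variables. All names are hypothetical.

```python
# "Total" column of the Initial Eigenvalues in the Total Variance Explained table.
EIGENVALUES = [1.766, 1.411, 0.917, 0.811, 0.566, 0.528]

# Rotated Component Matrix: variable -> (loading on component 1, component 2).
LOADINGS = {
    "TV_Promo": (0.756, -0.013),
    "Music": (0.825, -0.070),
    "PR_Activities": (0.587, 0.303),
    "Critic_reviews": (-0.034, 0.752),
    "wordofmouth": (0.034, 0.726),
    "controversies": (0.110, 0.613),
}


def variance_explained(eigenvalues):
    """Percent of total variance per component (total variance = number of variables)."""
    n = len(eigenvalues)
    return [100 * ev / n for ev in eigenvalues]


def retained(eigenvalues, cutoff=1.0):
    """Components kept under the assumed eigenvalue > 1 rule."""
    return [ev for ev in eigenvalues if ev > cutoff]


def assign_factor(loadings):
    """Assign each variable to the component with the largest absolute loading."""
    return {
        var: 1 + max(range(len(vals)), key=lambda i: abs(vals[i]))
        for var, vals in loadings.items()
    }


pct = variance_explained(EIGENVALUES)
kept = retained(EIGENVALUES)
cumulative = sum(pct[:len(kept)])
factors = assign_factor(LOADINGS)

print(f"{len(kept)} factors retained, explaining {cumulative:.2f}% of the variance")
for var, factor in factors.items():
    print(f"{var:15s} -> factor {factor}")
```

    Under these assumptions, the first three variables fall under factor 1 (direct) and
    the last three under factor 2 (indirect), matching the grouping described above.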

    Cluster Analysis

    We carried out hierarchical clustering using an agglomeration schedule and used the
    dendrogram plot to identify the clusters. Ward's method of linkage was used, with
    squared Euclidean distance as the distance measure.
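    For readers unfamiliar with Ward linkage, the sketch below shows the idea in plain
    Python: at every step, merge the two clusters whose union gives the smallest increase
    in within-cluster sum of squares, which depends on the squared Euclidean distance
    between their centroids. It is an illustration, not the SPSS procedure. The
    respondents' factor scores are not reproduced in this report, so the six points are
    made up, and all function names are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean distance
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(points, k):
    """Merge the cheapest pair of clusters repeatedly until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


# Made-up (factor 1, factor 2) scores for six hypothetical respondents.
scores = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.9), (-1.0, -0.8), (-1.1, -1.0), (-0.9, -1.2)]
groups = ward_clusters(scores, k=2)
for number, members in enumerate(groups, start=1):
    print(f"cluster {number}: {members}")
```

    A statistics package records the cost of each merge as the coefficient in the
    agglomeration schedule; a large jump in that coefficient suggests where to stop
    merging.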



    Agglomeration Schedule

    Stage   Cluster Combined        Coefficients   Stage Cluster First Appears   Next Stage
            Cluster 1   Cluster 2                  Cluster 1   Cluster 2

    1 55 113 .000 0 0 67

    2 77 110 .000 0 0 7

    3 97 108 .000 0 0 111

    4 100 101 .000 0 0 29

    5 93 94 .000 0 0 75

    6 20 92 .000 0 0 15

    7 18 77 .000 0 2 32

    8 34 68 .000 0 0 79

    9 57 66 .000 0 0 19

    10 44 56 .000 0 0 46

    11 3 47 .000 0 0 78

    12 31 33 .000 0 0 40

    13 28 29 .000 0 0 90

    14 12 27 .000 0 0 45

    15 19 20 .000 0 6 43

    16 1 4 .000 0 0 65

    17 52 114 .000 0 0 39

    18 39 116 .001 0 0 50

    19 57 62 .002 9 0 86

    20 6 8 .004 0 0 56

    21 43 72 .007 0 0 73

    22 79 87 .010 0 0 44

    23 40 83 .013 0 0 27

    24 46 112 .015 0 0 76

    25 64 89 .018 0 0 58

    26 9 80 .021 0 0 48

    27 7 40 .025 0 23 52

    28 96 120 .029 0 0 41

    29 23 100 .032 0 4 47

    30 98 103 .036 0 0 68

    31 10 76 .040 0 0 39



    Dendrogram using Ward Method

    (Dendrogram figure: the horizontal axis runs 0, 5, 10, 15, 20 and the rows are case
    label numbers.)


    Based on the coefficients and the dendrogram, we found two clusters with different
    characteristics.




    Cluster   TV Promo   Music   PR Activities                            No. of respondents
    1         4.39       4.47    3.92            4.13    4.68    4.11     38
    2         4.30       4.66    3.66            3.13    3.71    3.11     56

    The above table shows the mean value of each variable for each cluster. Cluster 2,
    which has the larger number of respondents, has a higher mean value for music than
    cluster 1. The mean values for TV promo and PR activities do not differ much between
    the two clusters.

    From the cluster analysis, we can interpret that direct influencing variables such as
    TV promo, music and PR activities should be given more importance when designing a
    promotion campaign to influence people to watch movies.



    Analysis of Q 5 - Factors influencing the respondents to watch a movie

    To find out whether variables such as:

    Genre, Film Title, Story, Lead Hero, Lead Heroine, Director

    have an equal impact in influencing the respondents to watch a particular movie, we
    first applied a one-way ANOVA test.

    One-way ANOVA Test

    Null Hypothesis :- All the variables mentioned above have an equal impact in
    influencing people to watch a movie.

    Alternative Hypothesis :- The factors do not all have the same impact in influencing
    people to watch a movie.

    ANOVA

    Source of Variation   SS            df     MS           F        P-value
    Between Groups        55.56296296   8      6.94537037   5.0650
    Within Groups         1468.6        1071   1.37124183
    Total                 1524.162963   1079

    As F(observed) > F critical, the null hypothesis is rejected: the factors do not all
    have the same impact in influencing people to watch a movie.



    To find out which of these variables should be focused on most because they influence
    people strongly, we carried out a factor analysis.

    Factor Analysis

    1. Variables such as Genre, Story, Lead Hero etc. were used as the attributes on
    which factoring was done.

    2. The principal component method was used for extraction, and the factor scores were
    saved as variables to form the basis for the cluster analysis.

    The Output and Analysis



    From the Total Variance Explained table, we found only three factors, which together
    explain about 60% of the cumulative variance of the given variables. This is mainly
    because the variables do not differ much from one another. Hence all the variables
    influencing people to watch a movie can be grouped into three factors, namely Cast,
    Crew & Content, Film Identity and X-factor.

    Cast, Crew & Content :- As seen from the Rotated Component Matrix, Story, Lead Hero,
    Lead Heroine and Director are the variables that load on this factor.

    Film Identity :- As seen from the Rotated Component Matrix, Film Title and Production
    House are the variables that load on this factor.

    X-factor :- As seen from the Rotated Component Matrix, Songs and Sets/Location are the
    variables that load on this factor.

    Cluster Analysis

    We carried out hierarchical clustering using an agglomeration schedule and used the
    dendrogram plot to identify the clusters. Ward's method of linkage was used, with
    squared Euclidean distance as the distance measure.



    Agglomeration Schedule

    Stage   Cluster Combined        Coefficients   Stage Cluster First Appears   Next Stage
            Cluster 1   Cluster 2                  Cluster 1   Cluster 2

    1 100 101 .000 0 0 38

    2 48 49 .000 0 0 57

    3 43 46 .000 0 0 62

    4 28 29 .000 0 0 107

    5 67 84 .004 0 0 72

    6 15 117 .016 0 0 21

    7 61 64 .029 0 0 102

    8 80 94 .046 0 0 88

    9 74 114 .066 0 0 32

    10 12 16 .090 0 0 82

    11 7 23 .116 0 0 45

    12 13 95 .145 0 0 16

    13 2 31 .175 0 0 22

    14 17 33 .212 0 0 56

    15 86 112 .255 0 0 32

    16 4 13 .303 0 12 43

    17 5 11 .350 0 0 69

    18 14 90 .399 0 0 67

    19 51 91 .449 0 0 27

    20 27 89 .501 0 0 60

    21 3 15 .554 0 6 30

    22 2 75 .608 13 0 56

    23 9 120 .662 0 0 42

    24 98 105 .722 0 0 89

    25 38 42 .783 0 0 70

    26 93 99 .845 0 0 65

    27 19 51 .921 0 19 60

    28 81 115 .996 0 0 64

    29 10 56 1.072 0 0 46

    30 3 72 1.156 21 0 85

    31 34 96 1.241 0 0 39




    Hierarchical Cluster Analysis

    Dendrogram using Ward Method

    (Dendrogram figure: the horizontal axis is Rescaled Distance Cluster Combine, marked
    0, 5, 10, 15, 20, and the rows are case label numbers.)


    Based on the coefficients and the dendrogram, we found three clusters with different
    characteristics.




    Cluster   Genre   Film Title   Story   Lead Hero   Lead Heroine   Director   Production House   Songs   Sets/Location
    1         3.25    2.86         4.20    3.63        3.58           3.82       3.14               3.20    2.41
    2         3.00    2.95         3.41    3.64        3.36           2.91       3.09               4.32    4.18
    3         2.10    3.55         1.45    2.30        2.30           2.45       2.95               1.95    2.65

    The above table shows the mean value of each variable for each cluster. Cluster 1,
    which has the largest number of respondents, has higher mean values for Genre, Story,
    Lead Hero, Lead Heroine, Director and Production House than cluster 2.

    From the cluster analysis, we can interpret that the Cast, Crew and Content factor
    should be given more importance when making a movie in order to influence people to
    watch it.



    Q 11 - Satisfaction Level with the Movie

    Here we used a one-sample t-test to assess the respondents' satisfaction level with
    the movies released in the recent past.

    We constructed the null and alternative hypotheses as follows.

    Null Hypothesis :- mean is greater than or equal to 3

    Alternative Hypothesis :- mean is less than 3

    The critical value for alpha = 0.05 in a one-tailed t-test is 1.6577.

    The calculated value of t is -1.806.

    The test is left-tailed, and |t calculated| > t critical (1.806 > 1.6577), that is,
    -1.806 < -1.6577.

    So we reject the null hypothesis and accept the alternative hypothesis.

    This means that the mean satisfaction level is significantly below 3: on average, the
    respondents in the sample are not satisfied with the kind of movies released in the
    recent past.
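    The decision rule for this left-tailed test can be written out as a short sketch.
    The respondents' ratings are not reproduced in the report, so the ten toy ratings
    below are made up purely to show how the t statistic is formed. The function names
    are hypothetical, and 1.6577 is the critical value quoted in the text.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t = (sample mean - mu0) / (sample standard deviation / sqrt(n))."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # uses the n - 1 denominator
    return (mean - mu0) / (sd / math.sqrt(n))


def reject_left_tailed(t_calc, t_crit):
    """Reject H0 (mean >= mu0) when t falls below the negative critical value."""
    return t_calc < -t_crit


# Decision for the values reported in the text.
reported_decision = reject_left_tailed(-1.806, 1.6577)
print("Reject H0 for the reported t:", reported_decision)

# Made-up ratings on a 1 to 5 scale, only to illustrate the calculation.
toy_ratings = [2, 3, 3, 2, 4, 3, 2, 3, 1, 3]
toy_t = one_sample_t(toy_ratings, mu0=3)
print(f"t for the toy ratings = {toy_t:.3f}")
```

    With a left-tailed alternative, a positive t of the same size would not lead to
    rejection, which is why the sign of t matters as well as its magnitude.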