UNIT 7: Bivariate Statistics

    Introductory activity
    In Kabeza village, UMULISA made 9 observations about farming. 
    She saw that in every house observed, where there are x cows 
    there are also y domestic ducks, and she recorded the following 
    (x, y) pairs: (1,4), (2,8), (3,2), (4,12), (5,10), (6,14), (7,16), 
    (8,6), (9,18).
    a) Represent this information graphically in the (x, y) 
    coordinate plane.
    b) Choose two points, find the equation of the line joining 
    them and draw it on the same graph. How are the remaining 
    points positioned relative to this line? 
    c) From your graph in (a), explain in your own words whether 
    there is any relationship between the variation of the number x 
    of cows and the number y of domestic ducks.
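
    Part (b) can be checked numerically. The sketch below is only one possible illustration: the choice of the two points (1, 4) and (9, 18) is an assumption made here, and the helper names `line_through` and `position` are hypothetical. It finds the line through the two chosen points and reports whether each observed point lies above, on, or below it.

```python
from fractions import Fraction

# (x, y) pairs from the introductory activity: x cows, y domestic ducks
points = [(1, 4), (2, 8), (3, 2), (4, 12), (5, 10),
          (6, 14), (7, 16), (8, 6), (9, 18)]


def line_through(p, q):
    """Return (slope, intercept) of the line through p and q (different x values)."""
    (x1, y1), (x2, y2) = p, q
    slope = Fraction(y2 - y1, x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def position(point, slope, intercept):
    """Say whether a point is above, on, or below the line y = slope*x + intercept."""
    x, y = point
    y_on_line = slope * x + intercept
    if y > y_on_line:
        return "above"
    if y < y_on_line:
        return "below"
    return "on"


# Assumed choice for illustration: the first and last observed points
slope, intercept = line_through((1, 4), (9, 18))
positions = {pt: position(pt, slope, intercept) for pt in points}

print(f"line: y = ({slope})x + ({intercept})")
for pt, where in positions.items():
    print(pt, where)
```

    With this choice the line is y = (7/4)x + 9/4, and the remaining points fall on both sides of it, some above and some below.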


    Until now, we have determined measures of central tendency 
    for one variable. In this unit, we will use those measures for 
    two quantitative variables, known as a double series. In statistics, 
    the study of a double series covers techniques for analyzing data in 
    two variables, where we focus on the relationship between a dependent 
    variable y and an independent variable x. The linear regression method 
    will be used in this unit. The estimation target is a function of the 
    independent variable, called the regression function, whose graph 
    will be a straight line.
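
    To make the idea of a regression function concrete, here is a minimal sketch that fits a straight line y = ax + b to the pairs from the introductory activity. It assumes the usual least-squares definition, a = cov(x, y) / var(x) and b = mean(y) - a * mean(x); the function name `regression_y_on_x` is hypothetical.

```python
from fractions import Fraction

# Data from the introductory activity: x cows, y domestic ducks
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [4, 8, 2, 12, 10, 14, 16, 6, 18]


def mean(values):
    """Arithmetic mean as an exact fraction."""
    return Fraction(sum(values), len(values))


def regression_y_on_x(xs, ys):
    """Least-squares line y = a*x + b (assumed definition: a = cov(x, y) / var(x))."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    cov_xy = Fraction(sum(x * y for x, y in zip(xs, ys)), n) - mean_x * mean_y
    var_x = Fraction(sum(x * x for x in xs), n) - mean_x ** 2
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


a, b = regression_y_on_x(xs, ys)
print(f"y = ({a})x + ({b})")
```

    For these nine pairs the sketch gives a = 4/3 and b = 10/3, so the fitted line passes through the point of means (5, 10).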

    Objectives
    By the end of this unit, a student will be able to:
    • find measures of variability in two quantitative 
    variables;
    • draw the scatter diagram of a given statistical 
    series in two quantitative variables;
    • determine the linear regression line of a given 
    series;
    • calculate the linear coefficient of correlation of a 
    given double series and interpret it.


    In the case of two variables, say x and y, there is another important 
    quantity called the covariance of x and y, denoted cov(x, y), which is a 
    measure of how these two variables change together. 
    If the greater values of one variable mainly correspond to the 
    greater values of the other variable, and the same holds for the 
    smaller values, i.e. the variables tend to show similar behavior, 
    the covariance is positive. In the opposite case, when the greater 
    values of one variable mainly correspond to the smaller values of 
    the other, i.e. the variables tend to show opposite behavior, the 
    covariance is negative. If the covariance is zero, the variables are 
    said to be uncorrelated, meaning that there is no linear 
    relationship between them.
    Therefore, the sign of the covariance shows the tendency of the linear 
    relationship between the variables. The magnitude of the covariance 
    is not easy to interpret because it depends on the units of x and y.
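
    The sign behavior described above can be seen on a small example. The sketch below assumes the population form of the covariance (the mean of the products of deviations from the means) with every frequency equal to 1. The series `opposite` is a hypothetical one built here only to show a negative covariance.

```python
def covariance(xs, ys):
    """Population covariance: average product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Data from the introductory activity: x cows, y domestic ducks
cows = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ducks = [4, 8, 2, 12, 10, 14, 16, 6, 18]

# Hypothetical series that moves in the opposite direction to `ducks`
opposite = [20 - d for d in ducks]

cov_similar = covariance(cows, ducks)      # large x tends to go with large y
cov_opposite = covariance(cows, opposite)  # large x tends to go with small y

print("cov(cows, ducks)    =", round(cov_similar, 3))
print("cov(cows, opposite) =", round(cov_opposite, 3))
```

    Here the first covariance is 80/9 (about 8.889) and the second is its negative, which matches the description of similar and opposite behavior.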

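    The covariance of variables x and y, where the summation of frequencies 
    is \sum_i f_i = n, is
    $$\operatorname{cov}(x, y) = \frac{1}{n}\sum_{i} f_i (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n}\sum_{i} f_i x_i y_i - \bar{x}\,\bar{y}$$
    where \bar{x} and \bar{y} are the means of x and y.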


    Method of ranking
    Ranking can be done in ascending or descending order.
    Example 7.6
    Suppose that we have the marks, x, of seven students in this order:
    12, 18, 10, 13, 15, 16, 9
    We assign the ranks 1, 2, 3, 4, 5, 6, 7 such that the smallest value 
    of x is ranked 1.
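
    The ranking in Example 7.6 can be reproduced with a short sketch. It assumes all marks are distinct (no tied values), and the function name `ascending_ranks` is hypothetical.

```python
# Marks of the seven students, in the order given in Example 7.6
marks = [12, 18, 10, 13, 15, 16, 9]


def ascending_ranks(values):
    """Rank 1 goes to the smallest value; assumes no two values are equal."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


ranks = ascending_ranks(marks)

# Descending ranking (largest value ranked 1) follows by reversing the scale
descending = [len(marks) + 1 - r for r in ranks]

for mark, rank in zip(marks, ranks):
    print(f"mark {mark:2d} -> rank {rank}")
```

    In ascending order the marks 12, 18, 10, 13, 15, 16, 9 receive the ranks 3, 7, 2, 4, 5, 6, 1 respectively.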


    a) It is required to estimate the value of X for Nepal, where the 
    value of Y is 450.
    i) Find the equation of a suitable line of regression. 
    Simplify your answer as far as possible, giving the 
    constants correct to three significant figures.
    ii) Use your equation to obtain the required estimate.
    b) Use your equation to estimate the value of X for North Korea, 
    where the value of Y is 858. Comment on your answer.
    Solution 
    a) i) Neither variable has been controlled in the given data. 
    Since you are required to estimate the life expectancy, X 
    years, when the GDP per head, Y dollars, is 160 dollars, it is 
    sensible to use the regression line of X on Y.

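    The regression line of X on Y has equation
    $$X - \bar{X} = \frac{\operatorname{cov}(X, Y)}{\sigma_Y^2}\,(Y - \bar{Y})$$
    where \bar{X} and \bar{Y} are the means of X and Y and \sigma_Y^2 is the variance of Y.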
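
    For illustration, the sketch below computes a regression line of x on y, that is, a line x = cy + d used to estimate x from a known y. It is run on the pairs from the introductory activity, not on the life expectancy data. It assumes the usual least-squares form c = cov(x, y) / var(y) and d = mean(x) - c * mean(y); the name `regression_x_on_y` and the value y = 12 used for the estimate are hypothetical.

```python
from fractions import Fraction

# Pairs from the introductory activity (not the life expectancy data)
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [4, 8, 2, 12, 10, 14, 16, 6, 18]


def regression_x_on_y(xs, ys):
    """Least-squares line x = c*y + d (assumed definition: c = cov(x, y) / var(y))."""
    n = len(xs)
    mean_x = Fraction(sum(xs), n)
    mean_y = Fraction(sum(ys), n)
    cov_xy = Fraction(sum(x * y for x, y in zip(xs, ys)), n) - mean_x * mean_y
    var_y = Fraction(sum(y * y for y in ys), n) - mean_y ** 2
    c = cov_xy / var_y
    d = mean_x - c * mean_y
    return c, d


c, d = regression_x_on_y(xs, ys)

# Estimate x for a hypothetical observed value y = 12
x_estimate = c * 12 + d

print(f"x = ({c})y + ({d})")
print("estimate of x when y = 12:", float(x_estimate))
```

    On these pairs the line is x = (1/3)y + 5/3. Dividing by the variance of y rather than the variance of x is what distinguishes this line from the regression line of y on x.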

    Unit Summary


    End of Unit Assessment

    1. For each set of data, find:

    a) the equation of the regression line of y on x;

    b) the equation of the regression line of x on y.


    r = 0.95
    Find the equations of both regression lines. Also estimate the value of 
    y for x = 30.

    5. The following results were obtained from records of age (x) and 
    systolic blood pressure (y) of a group of 10 men:

    a) Calculate Spearman's coefficient of rank correlation 
    between position in the league and average attendance.
    b) Comment on your results.
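
    As a reminder of the method, the following sketch computes Spearman's coefficient of rank correlation with the formula 1 - 6 * sum(d^2) / (n(n^2 - 1)), where d is the difference between paired ranks. It assumes there are no tied values. It uses the pairs from the introductory activity as sample input rather than any league data, and the function names are hypothetical.

```python
def ascending_ranks(values):
    """Rank 1 goes to the smallest value; assumes no two values are equal."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's coefficient via 1 - 6*sum(d^2) / (n*(n^2 - 1)); no ties assumed."""
    n = len(xs)
    rank_x = ascending_ranks(xs)
    rank_y = ascending_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Sample input: pairs from the introductory activity (x cows, y domestic ducks)
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [4, 8, 2, 12, 10, 14, 16, 6, 18]

rho = spearman(xs, ys)
print("Spearman coefficient:", round(rho, 3))
```

    For this sample the sum of squared rank differences is 40, so the coefficient is 1 - 240/720 = 2/3, a moderate positive rank correlation.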
    21. A company is to replace its fleet of cars. Eight possible models 
    are considered and the transport manager is asked to rank them, 
    from 1 to 8, in order of preference. A saleswoman is asked to use 
    each type of car for a week and grade them according to their 
    suitability for the job (A-very suitable to E-unsuitable). 

    The price is also recorded:

    23. To test the effect of a new drug, twelve patients were examined 
    before the drug was administered and given an initial score (I) 
    depending on the severity of various symptoms. After taking the 
    drug they were examined again and given a final score (F). A 
    decrease in score represented an improvement. The scores for 
    the twelve patients are given in the table below:



