<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta content="text/html; charset=UTF-8" http-equiv="content-type" />
<meta http-equiv="Content-Language" content="en" />
<meta name="generator" content="Pressbooks 5.18.1" />
<meta name="pb-authors" content="Jiwon N. Speers" />
<meta name="pb-editors" content="Kristin Conlin, Harvey Sky, and Nett Smith" />
<meta name="pb-translators" content="" />
<meta name="pb-reviewers" content="" />
<meta name="pb-illustrators" content="" />
<meta name="pb-contributors" content="Salih Binich and Russell Almond" />
<meta name="pb-title" content="Analytic Techniques for Public Management and Policy" />
<meta name="pb-language" content="en" />
<meta name="pb-cover-image" content="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/12/Speers_cover-scaled.jpg" />
<meta name="pb-primary-subject" content="JHBC" />
<meta name="pb-copyright-holder" content="Jiwon N. Speers" />
<meta name="pb-book-license" content="cc-by-nc-sa" />
<meta name="pb-copyright-year" content="2021" />
<meta name="pb-audience" content="adult" />
<meta name="pb-publication-date" content="1614124800" />
<title>Analytic Techniques for Public Management and Policy</title>
</head>
<body lang='en' >
<div id="half-title-page"><h1 class="title">Analytic Techniques for Public Management and Policy</h1></div>
<div id="title-page"><h1 class="title">Analytic Techniques for Public Management and Policy</h1><h2 class="subtitle"></h2><h3 class="author">Jiwon N. Speers</h3><h3 class="author">Salih Binich and Russell Almond</h3><h4 class="publisher"></h4><h5 class="publisher-city"></h5></div>
<div id="copyright-page"><div class="ugc">
<div class="license-attribution"><p><img src="https://ubalt.pressbooks.pub/app/themes/pressbooks-book/packages/buckram/assets/images/cc-by-nc-sa.svg" alt="Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License" /></p><p>Analytic Techniques for Public Management and Policy by Jiwon N. Speers is licensed under a <a rel="license" href="https://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>, except where otherwise noted.</p></div>
</div></div>
<div id="toc"><h1>Contents</h1><ul><li class="front-matter introduction"><a href="#front-matter-introduction"><span class="toc-chapter-title">Preface</span></a></li><li class="part display-none"><a href="#part-main-body">Main Body</a></li><li class="chapter standard"><a href="#chapter-correlation"><span class="toc-chapter-title">Chapter 1. Correlation</span></a></li></ul></div>
<div class="front-matter introduction" id="front-matter-introduction" title="Preface"><div class="front-matter-title-wrap"><h3 class="front-matter-number">1</h3><h1 class="front-matter-title">Preface</h1></div><div class="ugc front-matter-ugc"><p>It took 10 years to organize this e-book. I was introduced to general linear models as a teaching assistant in 2012 in the measurement and statistics program at Florida State University (FSU). At the time I was struck by the power of the analytic techniques: a suite of statistical knowledge and methods for drawing analytical conclusions from a set of numeric values. At the same time, I was struck by the fundamental limitations (e.g., regression assumptions) on the applicability of the techniques, especially in the field of public management and policy. I left the program with the impression that statistics was attractive but ultimately not very useful for social scientists, vowing to monitor progress in the field.</p> <p>I returned to the program in 2015 and have been studying analytical techniques in depth ever since. I earned my Ph.D. in measurement and statistics in 2020, having previously earned a Ph.D. in public management and policy. The past decade has been an exciting time in my academic journey as computer technology has developed rapidly, particularly in the area of psychometrics. These developments have been coupled with recent advances in econometrics and the ever-increasing quality of quantitative research in the social sciences. <em>Analytic Techniques for Public Management and Policy</em> was written in the hope that these techniques can be used effectively in evidence-based research and that the book might encourage public management and policy researchers to inform more effective governance. This e-book, which centers on ordinary least squares (OLS) regression, draws mostly on three resources: Dr. Russell G. Almond&#8217;s statistics classes, Dr. Salih Binich&#8217;s measurement classes, and Dr. 
Tom Cook&#8217;s quasi-experimental design workshop. I am very grateful to Dr. Almond and Dr. Binich at FSU, and Dr. Cook at Northwestern University.</p> </div></div>
<div class="chapter standard" id="chapter-correlation" title="Chapter 1. Correlation"><div class="chapter-title-wrap"><h3 class="chapter-number">1</h3><h2 class="chapter-title">Chapter 1. Correlation</h2></div><div class="ugc chapter-ugc"> <div class="contents"><p class="import-Normal">In the Basic Statistical Analysis course (e.g., PUAD 628), we dealt with a single variable, or univariate data. Another important type of statistical problem is identifying the relationship between two or more variables. The simplest case involves bivariate data. For instance, economists are often interested in the relationship between pairs of variables such as:</p> <p style="padding-left: 40px">(1) Education and wages,</p> <p style="padding-left: 40px">(2) Salaries and CEO performance, and</p> <p style="padding-left: 40px">(3) Aid and economic growth.</p> <p class="import-Normal">In these problems, we are interested in whether one variable tends to increase or decrease as the other increases, and in how pronounced that trend is. Once such a relationship is identified, it can inform business strategy, investment strategy, economic policy, and educational policy.</p> <p class="import-Normal">Correlation analysis and regression analysis are both methods for analyzing the relationship between two variables. Correlation analysis focuses on how strong the relationship between the two variables is. Regression analysis, on the other hand, focuses on expressing the relationship between the two variables as a specific equation. 
Accordingly, in correlation analysis the two variables are treated symmetrically, both as random variables, whereas in regression analysis one of the two is regarded as an independent variable with fixed values, so only the dependent variable is treated as a random variable.</p> <p class="import-Normal">In other words, from the perspective of correlation analysis, the levels of the variables are not under the control of the researcher because both variables are random samples from the population. From the perspective of regression, however, one variable is clearly an outcome we want to predict or understand. In regression, the dependent variable is treated as a random variable, but the independent variables (predictors) are treated as fixed (i.e., the observed predictor values are the only values of interest in the study, as if their levels were under the control of the researcher).</p> <p class="import-Normal">The two methods complement each other in identifying the relationship between variables. In this chapter, we look first at correlation analysis; regression analysis follows in the next chapter. 
To describe the association between two variables, we use their joint distribution, as in the following figure:</p> <p>&nbsp;</p> <p class="import-Normal"><img class="aligncenter wp-image-74 size-large" src="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Multivariate_Gaussian-1024x530.png" alt="Graph of joint distribution analysis" width="1024" height="530" srcset="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Multivariate_Gaussian-1024x530.png 1024w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Multivariate_Gaussian-300x155.png 300w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Multivariate_Gaussian-768x397.png 768w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Multivariate_Gaussian-65x34.png 65w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Multivariate_Gaussian-225x116.png 225w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Multivariate_Gaussian-350x181.png 350w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Multivariate_Gaussian.png 1156w" /></p> <p class="import-Normal">Here, the independent variable, labeled <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" />, is placed on the x-axis (the horizontal axis), while the dependent variable, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" />, is mapped to the y-axis (the vertical axis). 
The height represents the frequency of observations (or cases).</p> <p class="import-Normal">The joint distribution above is a normal distribution, which has three characteristics:</p> <p class="import-Normal" style="padding-left: 40px">(1) Bell-shaped,</p> <p class="import-Normal" style="padding-left: 40px">(2) Symmetric, and</p> <p class="import-Normal" style="padding-left: 40px">(3) Unimodal.</p> <h2>I. Scatterplot</h2> <p class="import-Normal">To explore relationships between two variables, we often employ a scatterplot, which plots two variables against one another.</p> <p class="import-Normal" style="text-align: center"><img class="aligncenter wp-image-75 size-full" src="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/scatterplots.png" alt="Four scatterplots depicting different relationships" width="936" height="758" srcset="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/scatterplots.png 936w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/scatterplots-300x243.png 300w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/scatterplots-768x622.png 768w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/scatterplots-65x53.png 65w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/scatterplots-225x182.png 225w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/scatterplots-350x283.png 350w" /></p> <p class="import-Normal">We typically begin describing the relationship between two variables (i.e., the X-Y relationship) with a scatterplot, a bivariate plot that depicts three key characteristics of the relationship.</p> <p class="import-Normal" style="padding-left: 40px">(1) <strong>Strength</strong>: How closely related are the two variables? (Weak vs. strong)</p> <p class="import-Normal" style="padding-left: 40px">(2) <strong>Direction</strong>: Which values of each variable are associated with the values of the other variable? (Positive vs. 
negative)</p> <p class="import-Normal" style="padding-left: 40px">(3) <strong>Shape</strong>: What is the general structure of the relationship? (Linear vs. curvilinear or some other form)</p> <p class="import-Normal">By convention, when we intend to use one variable as a predictor of the other variable (called the criterion or outcome variable), we put the predictor on the x-axis and the criterion or outcome on the y-axis.</p> <p class="import-Normal">When we want to show that a certain function (here, a line) can describe the relationship and that that function is useful as a predictor of the <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> variable based on <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" />, we include a regression line—the line that best fits the observed data.</p> </div> <p><img class="aligncenter wp-image-110 size-full" src="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Linear_regression.png" alt="Example of linear regression line on scatterplot" width="947" height="613" srcset="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Linear_regression.png 947w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Linear_regression-300x194.png 300w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Linear_regression-768x497.png 768w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Linear_regression-65x42.png 65w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Linear_regression-225x146.png 225w, 
https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Linear_regression-350x227.png 350w" /></p> <div class="contents"><p class="import-Normal">As with other characteristics of distributions that we wish to describe, the association between two variables is described by parameters (for the population) and statistics (for the sample). The most commonly used statistic is the Pearson product-moment correlation, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c409433a9e2dfcdb83360a974d243f18_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#114;" title="Rendered by QuickLaTeX.com" height="8" width="8" style="vertical-align: 0px;" />, which estimates the population parameter <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-da039068127cf2ec5fc05123d4d3546f_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#114;&#104;&#111;" title="Rendered by QuickLaTeX.com" height="12" width="9" style="vertical-align: -4px;" /> (rho). The correlation coefficient captures the three aspects of the relationship depicted in the scatterplot.</p> <p class="import-Normal" style="padding-left: 40px">(1) <strong>Strength</strong>: How closely related are the two variables? The absolute value of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c409433a9e2dfcdb83360a974d243f18_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#114;" title="Rendered by QuickLaTeX.com" height="8" width="8" style="vertical-align: 0px;" /> ranges from 0 for no linear relationship at all to 1 for a perfect relationship (whether positive or negative).</p> <p class="import-Normal" style="padding-left: 40px">(2) <strong>Direction</strong>: Which values of each variable are associated with the values of the other variable? 
A positive sign, or no sign, in front of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c409433a9e2dfcdb83360a974d243f18_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#114;" title="Rendered by QuickLaTeX.com" height="8" width="8" style="vertical-align: 0px;" /> indicates a positive relationship, while a negative sign indicates a negative relationship.</p> <p class="import-Normal" style="padding-left: 40px">(3) <strong>Shape</strong>: What is the general structure of the relationship? Correlation <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c887d991d16cf585e280f1d5c2b07362_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#114;&#41;" title="Rendered by QuickLaTeX.com" height="19" width="20" style="vertical-align: -5px;" /> always depicts the fit of the observed data to the best-fitting straight line.</p> <p class="import-Normal">Note that if a scatterplot does not show a linear relationship, we should not interpret the correlation coefficient, because it describes only linear relationships. In other words, a statistical software program will generate a numeric value for a correlation whenever you input data for two variables, but that value is a meaningful correlation <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c887d991d16cf585e280f1d5c2b07362_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#114;&#41;" title="Rendered by QuickLaTeX.com" height="19" width="20" style="vertical-align: -5px;" /> only when the relationship between the two variables is linear. 
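</p> <p class="import-Normal">A small hypothetical example (values chosen purely for illustration, not taken from any data set in this book) shows why. Let X = (-2, -1, 0, 1, 2) and Y = X<sup>2</sup> = (4, 1, 0, 1, 4), so that Y is completely determined by X. The means are 0 and 2, and, using the covariance formula given in the Covariance section later in this chapter, the products of the paired deviations cancel exactly:</p> <p class="ql-center-displayed-equation">\[ \sum_{i=1}^{5}(X_i - \overline{X})(Y_i - \overline{Y}) = (-2)(2) + (-1)(-1) + (0)(-2) + (1)(-1) + (2)(2) = 0 \]</p> <p class="import-Normal">The covariance is therefore 0, and so is the correlation, even though the relationship is perfect but curved. A reported correlation near zero thus cannot be read as &#8220;no relationship&#8221; without first inspecting the scatterplot.</p> <p class="import-Normal">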
If the actual relationship is nonlinear, the correlation value generated by the statistical software should be disregarded.</p> <p class="import-Normal" style="text-align: center"><img class="aligncenter wp-image-78 size-full" src="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Pearson_Correlation_Coefficient_and_associated_scatterplots.png" alt="Series of graphs showing different correlation coefficients" width="575" height="388" srcset="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Pearson_Correlation_Coefficient_and_associated_scatterplots.png 575w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Pearson_Correlation_Coefficient_and_associated_scatterplots-300x202.png 300w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Pearson_Correlation_Coefficient_and_associated_scatterplots-65x44.png 65w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Pearson_Correlation_Coefficient_and_associated_scatterplots-225x152.png 225w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Pearson_Correlation_Coefficient_and_associated_scatterplots-350x236.png 350w" /></p> <p class="import-Normal">The value of the correlation <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c887d991d16cf585e280f1d5c2b07362_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#114;&#41;" title="Rendered by QuickLaTeX.com" height="19" width="20" style="vertical-align: -5px;" /> lies between -1 and +1. No correlation corresponds to <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-a1b16a01a291a1151674e7b9ae771526_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#114;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="12" width="41" style="vertical-align: 0px;" />. Both -1 and +1 represent the maximum (perfect) correlation, with opposite signs. 
According to Cohen’s rules of thumb, a correlation with an absolute value of about .10 is considered small, about .30 medium, and about .50 or more large.</p> <h2>II. 
Covariance</h2> <p class="import-Normal">An important concept related to correlation is the covariance of two variables, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-5b924862e158e0b6037d1ac1b2e986d1_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#83;&#95;&#123;&#88;&#89;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="34" style="vertical-align: -3px;" />, which measures the joint variability of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" /> and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" />. The covariance reflects the degree to which two variables vary together, or covary. 
The equation for the covariance is very similar to the equation for the variance, except that the covariance involves two variables.</p> <p class="ql-center-displayed-equation" style="line-height: 39px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-a1e967bdc244ea6c2343254c78654d91_l3.svg" height="39" width="239" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#83;&#95;&#123;&#88;&#89;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#123;&#92;&#83;&#105;&#103;&#109;&#97;&#125;&#95;&#123;&#105;&#61;&#49;&#125;&#94;&#123;&#110;&#125;&#123;&#40;&#88;&#95;&#105;&#32;&#45;&#32;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#88;&#125;&#41;&#40;&#89;&#95;&#105;&#32;&#45;&#32;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#89;&#125;&#41;&#125;&#125;&#123;&#110;&#45;&#49;&#125;&#44; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p class="import-Normal">where <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-4ccfb3100fd0179974f316d7de2e7c47_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#98;&#97;&#114;&#123;&#88;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="16" style="vertical-align: 0px;" /> is the mean of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-998dba373bb08119e12a646c873be88c_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="15" width="20" style="vertical-align: -3px;" />, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-f5a92b804d65f29a81bd4ae4c1e41108_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#98;&#97;&#114;&#123;&#89;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="14" style="vertical-align: 0px;" /> is the mean of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-1ce9a015a804a13965be499e5d05726c_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="15" width="15" style="vertical-align: -3px;" />, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-b170995d512c659d8668b4e42e1fef6b_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#110;" title="Rendered by QuickLaTeX.com" height="8" width="11" style="vertical-align: 0px;" /> is the sample size, and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-695d9d59bd04859c6c99e7feb11daab6_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#105;" title="Rendered by QuickLaTeX.com" height="12" width="6" style="vertical-align: 0px;" /> indexes the individual cases. Note that the denominator is <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-f30b71e7fcec69d119f30f67cf09c975_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#110;&#45;&#49;" title="Rendered by QuickLaTeX.com" height="12" width="40" style="vertical-align: 0px;" />, not just <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-b170995d512c659d8668b4e42e1fef6b_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#110;" title="Rendered by QuickLaTeX.com" height="8" width="11" style="vertical-align: 0px;" />. 
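</p> <p class="import-Normal">As a small worked illustration of the formula, consider five hypothetical data pairs (made up for this sketch, not taken from any data set in this book): X = (1, 2, 3, 4, 5) and Y = (2, 4, 5, 4, 5). The means are 3 and 4, so the deviations from the mean are (-2, -1, 0, 1, 2) for X and (-2, 0, 1, 0, 1) for Y. Multiplying the paired deviations, summing, and dividing by n - 1 = 4 gives</p> <p class="ql-center-displayed-equation">\[ S_{XY} = \frac{(-2)(-2) + (-1)(0) + (0)(1) + (1)(0) + (2)(1)}{5 - 1} = \frac{6}{4} = 1.5 \]</p> <p class="import-Normal">The positive result reflects the fact that, in these illustrative data, cases above the mean on X tend also to be at or above the mean on Y.</p> <p class="import-Normal">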
In general, when the covariance is a large, positive number, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> tends to be above its mean when <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" /> is above its mean, and below its mean when the other is below its mean (the paired deviations share the same sign). When the covariance is a large, negative number, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> tends to be above its mean when <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" /> is below its mean, and vice versa. When the covariance is near zero, there is no clear pattern like this: positive products of deviations tend to be canceled by negative ones.</p> <p class="import-Normal">However, there is one problem with the covariance: it is expressed in raw score units, so we cannot tell just by looking at it whether it is large enough to be important. The solution is the same one applied when comparing two means: we standardize the statistic by dividing by a measure of the spread of the relevant distributions. 
Thus, the correlation coefficient is defined as:</p> </div> <div class="contents"><p class="ql-center-displayed-equation" style="line-height: 35px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-af7ccce192101e341c6d7b6128ee0c2a_l3.svg" height="35" width="105" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#114;&#95;&#123;&#88;&#89;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#115;&#95;&#123;&#88;&#89;&#125;&#125;&#123;&#123;&#115;&#95;&#88;&#125;&#123;&#115;&#95;&#89;&#125;&#125;&#44; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p class="import-Normal">where <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-44badabf265343e62f12ec343194dbde_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#83;&#95;&#88;" title="Rendered by QuickLaTeX.com" height="15" width="23" style="vertical-align: -3px;" /> and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-ef93dae578b5c5183ad5bac9c5f098ce_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#83;&#95;&#89;" title="Rendered by QuickLaTeX.com" height="15" width="22" style="vertical-align: -3px;" /> are standard deviations of the <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" /> and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> scores and <img 
src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-5b924862e158e0b6037d1ac1b2e986d1_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#83;&#95;&#123;&#88;&#89;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="34" style="vertical-align: -3px;" /> is the covariance. That is, correlation is standardized covariance.</p> <p class="import-Normal">Because <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-5b924862e158e0b6037d1ac1b2e986d1_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#83;&#95;&#123;&#88;&#89;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="34" style="vertical-align: -3px;" /> cannot exceed <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-5f6b6b106d8facee031927322a119309_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#124;&#123;&#83;&#95;&#88;&#125;&#123;&#83;&#95;&#89;&#125;&#124;" title="Rendered by QuickLaTeX.com" height="19" width="53" style="vertical-align: -5px;" />, the limit of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-bd4e0c47056576d0d5180679cfb949be_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#124;&#123;&#114;&#125;&#124;" title="Rendered by QuickLaTeX.com" height="19" width="14" style="vertical-align: -5px;" /> is 1.00. Hence, one way to interpret <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c409433a9e2dfcdb83360a974d243f18_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#114;" title="Rendered by QuickLaTeX.com" height="8" width="8" style="vertical-align: 0px;" /> is as a measure of the degree to which the covariance reaches its maximum possible value—when the two variables covary as much as they possibly could, the correlation coefficient equals 1.00. 
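</p> <p class="import-Normal">To see the standardization at work, consider a sketch with five hypothetical data pairs (illustrative values only, not drawn from this book): X = (1, 2, 3, 4, 5) and Y = (2, 4, 5, 4, 5), with means 3 and 4. The sums of squared deviations are 10 for X and 6 for Y, and the sum of the products of paired deviations is 6, so with n - 1 = 4 in each denominator</p> <p class="ql-center-displayed-equation">\[ S_{XY} = \frac{6}{4} = 1.5, \quad S_X = \sqrt{\frac{10}{4}} \approx 1.58, \quad S_Y = \sqrt{\frac{6}{4}} \approx 1.22 \]</p> <p class="ql-center-displayed-equation">\[ r_{XY} = \frac{S_{XY}}{S_X S_Y} = \frac{1.5}{\sqrt{2.5 \times 1.5}} \approx 0.77 \]</p> <p class="import-Normal">Here the largest covariance that these two spreads would allow is about 1.94 (the product of the two standard deviations), and the observed covariance of 1.5 reaches roughly 0.77 of it. Multiplying every X value by 100 would multiply both the covariance and the standard deviation of X by 100, leaving the correlation unchanged, which is exactly what standardizing is meant to achieve.</p> <p class="import-Normal">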
Note, however, that we typically do not interpret <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c409433a9e2dfcdb83360a974d243f18_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#114;" title="Rendered by QuickLaTeX.com" height="8" width="8" style="vertical-align: 0px;" /> as a proportion. Rather, the correlation coefficient tells us the strength of the linear relationship between the two variables. If this relationship is strong, then we can use knowledge about the values of one variable to predict the values of the other variable.</p> <p class="import-Normal">Recall that the relationship modeled by the correlation coefficient is linear. Hence, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c409433a9e2dfcdb83360a974d243f18_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#114;" title="Rendered by QuickLaTeX.com" height="8" width="8" style="vertical-align: 0px;" /> describes how well a straight line captures the values of the <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> variable across the range of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" /> values. 
If the absolute value of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-c409433a9e2dfcdb83360a974d243f18_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#114;" title="Rendered by QuickLaTeX.com" height="8" width="8" style="vertical-align: 0px;" /> is close to 1, then the observed <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> points all lie close to the best-fitting line. As a result, we can use the best-fitting line to predict what the values of the <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> variable will be for any given value of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" />. To make such a prediction, we obviously need to know how to create the best-fitting (i.e., regression) line.</p> <h2>III. Principles of Regression</h2> <p class="import-Normal">Recall that the equation for a line takes the form <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-fa9f9b2477cc538e9fa54bf2a089584c_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;&#32;&#61;&#32;&#109;&#88;&#32;&#43;&#32;&#98;" title="Rendered by QuickLaTeX.com" height="14" width="99" style="vertical-align: -2px;" />. However, it is common to use two symbols that are b’s with subscripts. 
I will use the notation</p> <p>&nbsp;</p> <p class="import-Normal"></p><p class="ql-center-displayed-equation" style="line-height: 14px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-e5efe21e85566ee15dcffb8d7cac6f64_l3.svg" height="14" width="106" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#89;&#32;&#61;&#32;&#123;&#98;&#95;&#48;&#125;&#32;&#43;&#32;&#123;&#98;&#95;&#49;&#125;&#88; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p>&nbsp;</p> <p class="import-Normal">We need to show whether <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> is an actual score or our estimate of a score. We will put a hat (^) over the <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> to indicate that we are using the linear equation to estimate <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" />. 
Also, we subscript the <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" /> with <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-695d9d59bd04859c6c99e7feb11daab6_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#105;" title="Rendered by QuickLaTeX.com" height="12" width="6" style="vertical-align: 0px;" /> to index the scores for the <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-ff2a929cbe391ddfc9c5592455e07070_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#105;&#94;&#123;&#116;&#104;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="19" style="vertical-align: 0px;" /> case. 
The line is</p> <p>&nbsp;</p> <p class="ql-center-displayed-equation" style="line-height: 19px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-13a6e41a23d12593337fb179fb28c8b7_l3.svg" height="19" width="112" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;&#32;&#61;&#32;&#98;&#95;&#48;&#32;&#43;&#32;&#123;&#98;&#95;&#49;&#125;&#123;&#88;&#95;&#105;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p>&nbsp;</p> <div class="contents"><p class="import-Normal">This is called a “fitted or estimated regression line.” The components or parameters in the equation are defined as follows:</p> <p>&nbsp;</p> <div><span style="font-size: NaNpt;color: #;text-decoration: none"><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-7b4968740fce1409caf0a25034aafd3e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="19" width="15" style="vertical-align: -3px;" /> </span>is the value of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> predicted by the linear model for case <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-695d9d59bd04859c6c99e7feb11daab6_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#105;" title="Rendered by QuickLaTeX.com" height="12" width="6" style="vertical-align: 0px;" />.</div> <p class="import-Normal"><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-09babc59bd325328ae8045c729d0cd72_l3.svg" class="ql-img-inline-formula 
quicklatex-auto-format" alt="&#98;&#95;&#49;" title="Rendered by QuickLaTeX.com" height="15" width="14" style="vertical-align: -3px;" /> is the slope of the regression line (the change in <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-7b4968740fce1409caf0a25034aafd3e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="19" width="15" style="vertical-align: -3px;" /> associated with a one-unit difference in <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" />).</p> <p class="import-Normal"><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-13037b1ccc713463df34f375292ca76e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#98;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="15" style="vertical-align: -3px;" /> is the intercept (the value of <span style="font-size: NaNpt;color: #;text-decoration: none"><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-7b4968740fce1409caf0a25034aafd3e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="19" width="15" style="vertical-align: -3px;" /> </span>when <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-993c7c0aa11080ca552c4fededbcfe76_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="12" width="49" style="vertical-align: 0px;" />).</p> <p class="import-Normal"><img 
src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-998dba373bb08119e12a646c873be88c_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="15" width="20" style="vertical-align: -3px;" /> is the value of the predictor variable for case <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-695d9d59bd04859c6c99e7feb11daab6_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#105;" title="Rendered by QuickLaTeX.com" height="12" width="6" style="vertical-align: 0px;" />.</p> <p class="import-Normal">There are several other versions of the model. The one above represents the predicted scores but we can also write the model in terms of the observed scores:</p> <p>&nbsp;</p> </div> <p class="ql-center-displayed-equation" style="line-height: 14px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-2422e3bcc6923b0ce96daab7cfebc58e_l3.svg" height="14" width="151" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#89;&#95;&#105;&#32;&#61;&#32;&#98;&#95;&#48;&#32;&#43;&#32;&#123;&#98;&#95;&#49;&#125;&#123;&#88;&#95;&#105;&#125;&#32;&#43;&#32;&#101;&#95;&#105;&#32;&#46; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p>&nbsp;</p> <div class="contents"><p class="import-Normal">Note that we’ve added an error term <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d33c164f455b97af0a78c1c0eaac4383_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#101;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="11" width="13" style="vertical-align: -3px;" /> and now <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by 
QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> does not have a hat. This is equivalent to writing the observed score as the predicted score plus an error, or residual:</p> <p>&nbsp;</p> <p class="ql-center-displayed-equation" style="line-height: 19px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-076eb33fb027a10ae994ec231a75d047_l3.svg" height="19" width="95" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#89;&#95;&#105;&#32;&#61;&#32;&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;&#32;&#43;&#32;&#101;&#95;&#105;&#32;&#46; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p>&nbsp;</p> <p class="import-Normal">Note that the first model is an equation for the line itself, whereas the other two are equations for the observed points that fall around the line. That is, the first equation describes points lying exactly on the line, and the other two describe the actual data points:</p> <p class="import-Normal"><img class="alignnone wp-image-112 size-full" src="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-1a.png" alt="Scatterplot graph demonstrating statistical analysis" width="875" height="607" srcset="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-1a.png 875w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-1a-300x208.png 300w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-1a-768x533.png 768w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-1a-65x45.png 65w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-1a-225x156.png 225w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-1a-350x243.png 350w" /></p> <p class="import-Normal">We will use a variety of notations for regression, and different books do not all use the same notation. 
I use the hat (^) over Y to indicate an estimated score, <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-7b4968740fce1409caf0a25034aafd3e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="19" width="15" style="vertical-align: -3px;" />.</p> <p class="import-Normal">We also use the hat over Greek symbols such as <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-b6a7605b1bcca8f1b416eaf733f34e08_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#98;&#101;&#116;&#97;" title="Rendered by QuickLaTeX.com" height="17" width="11" style="vertical-align: -4px;" /> to indicate estimates of population parameters. One source of confusion that we will need to deal with later concerns “beta weights” or “standardized coefficients,” which some books denote using Greek letters even though they are sample estimates. I will call these <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-36dce5e85a5519815bf3ab9580ae848d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#98;&#94;&#42;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" />.</p> <p class="import-Normal">Our task is to identify the values of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-13037b1ccc713463df34f375292ca76e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#98;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="15" style="vertical-align: -3px;" /> and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-09babc59bd325328ae8045c729d0cd72_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#98;&#95;&#49;" title="Rendered by QuickLaTeX.com" height="15" width="14" style="vertical-align: -3px;" /> that produce the best-fitting linear function. 
That is, we use the observed data to identify the values of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-13037b1ccc713463df34f375292ca76e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#98;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="15" style="vertical-align: -3px;" /> and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-09babc59bd325328ae8045c729d0cd72_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#98;&#95;&#49;" title="Rendered by QuickLaTeX.com" height="15" width="14" style="vertical-align: -3px;" /> that minimize the distances between the observed values (<img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" />) and the predicted values (<img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-7b4968740fce1409caf0a25034aafd3e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="19" width="15" style="vertical-align: -3px;" />). 
However, we can’t simply minimize the sum of differences between <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-82606c3098bb09002088b0f6f9ffbb2a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#89;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;" /> and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-7b4968740fce1409caf0a25034aafd3e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="19" width="15" style="vertical-align: -3px;" /> (recall that <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-83a3bca99ba4134cbb3cddd9d15c3bcd_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#101;&#95;&#105;&#32;&#61;&#32;&#123;&#89;&#95;&#105;&#125;&#32;&#45;&#32;&#123;&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;&#125;" title="Rendered by QuickLaTeX.com" height="19" width="90" style="vertical-align: -3px;" /> is the residual from the linear model) because any line that intersects <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-77a28069b1a11f13239d3ae4ee0fa387_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#88;&#125;&#44;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#89;&#125;&#41;" title="Rendered by QuickLaTeX.com" height="20" width="50" style="vertical-align: -5px;" /> on the coordinate plane will result in an average residual equal to 0.</p> <p class="import-Normal">To solve this problem, we take the same approach used in the computation of the variance—we find the values of <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-13037b1ccc713463df34f375292ca76e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#98;&#95;&#48;" title="Rendered by 
QuickLaTeX.com" height="15" width="15" style="vertical-align: -3px;" /> and <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-09babc59bd325328ae8045c729d0cd72_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#98;&#95;&#49;" title="Rendered by QuickLaTeX.com" height="15" width="14" style="vertical-align: -3px;" /> that minimize the squared residuals. This solution is called the (ordinary) least-squares solution (i.e., OLS regression).</p> <p class="import-Normal">Fortunately, the least-squares solution is simple to find, given statistics that you already know how to compute.</p> <p>&nbsp;</p> </div> <p class="ql-center-displayed-equation" style="line-height: 18px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-8d4780384564db092a42e0f5434b7e91_l3.svg" height="18" width="107" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#98;&#95;&#48;&#32;&#61;&#32;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#89;&#125;&#32;&#45;&#32;&#123;&#98;&#95;&#49;&#125;&#123;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#88;&#125;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p>&nbsp;</p> <p class="ql-center-displayed-equation" style="line-height: 43px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-cedbf1076715240f1c049bedfdadafeb_l3.svg" height="43" width="161" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#98;&#95;&#49;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#83;&#95;&#123;&#88;&#89;&#125;&#125;&#123;&#123;&#83;&#94;&#50;&#95;&#88;&#125;&#125;&#32;&#61;&#32;&#114;&#95;&#123;&#88;&#89;&#125;&#123;&#92;&#102;&#114;&#97;&#99;&#123;&#83;&#95;&#89;&#125;&#123;&#83;&#95;&#88;&#125;&#125; &#92;&#93;" title="Rendered by 
QuickLaTeX.com" /></p> <p>&nbsp;</p> <div class="contents"><p class="import-Normal">These values minimize <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-4a77ab8158803d67cd38a3a8e61e2cbe_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#123;&#123;&#92;&#83;&#105;&#103;&#109;&#97;&#94;&#110;&#95;&#123;&#105;&#61;&#49;&#125;&#125;&#40;&#89;&#95;&#105;&#32;&#45;&#32;&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;&#41;&#94;&#50;" title="Rendered by QuickLaTeX.com" height="21" width="110" style="vertical-align: -5px;" />, the sum of the squared residuals.</p> <hr /> <p class="import-Normal" style="text-align: center"><span style="background-color: #ccffff;color: #000000"><strong>[Exercise 1]</strong></span></p> <p class="import-Normal">As an exercise, consider the following example. We are interested in determining whether wages <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-cebb98455ccb5ef2ace8cecd767c3151_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#88;&#41;" title="Rendered by QuickLaTeX.com" height="19" width="28" style="vertical-align: -5px;" /> would be useful in predicting first-quarter productivity <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-47e4149c2b90f22c1425f1ee9b7657cb_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#89;&#41;" title="Rendered by QuickLaTeX.com" height="19" width="26" style="vertical-align: -5px;" /> for factory workers. To find out, we determine the wages for a group of workers, allow all of them to work, and then obtain each worker’s productivity after one quarter of work. 
We get the following descriptive statistics.</p> <p class="ql-center-displayed-equation" style="line-height: 16px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-bdd7faba7f0da92c22903245fe8237c5_l3.svg" height="16" width="66" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#88;&#125;&#32;&#61;&#32;&#53;&#48;&#48; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p class="ql-center-displayed-equation" style="line-height: 16px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-4f95e5127aa0ad8726ab444aed4e96db_l3.svg" height="16" width="59" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#89;&#125;&#32;&#61;&#32;&#50;&#46;&#53; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p class="ql-center-displayed-equation" style="line-height: 14px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-74a7772a0b30c7bb6e3b599cbfd04305_l3.svg" height="14" width="72" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#115;&#95;&#88;&#32;&#61;&#32;&#49;&#48;&#48; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p class="ql-center-displayed-equation" style="line-height: 15px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-3bb046f8f808397e431eedb3d533d95b_l3.svg" height="15" width="66" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#115;&#95;&#89;&#32;&#61;&#32;&#46;&#55;&#48; &#92;&#93;" title="Rendered by 
QuickLaTeX.com" /></p> <p class="ql-center-displayed-equation" style="line-height: 18px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-44fd9685cb6e77a04effc3ec0b38b9f7_l3.svg" height="18" width="69" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#114;&#95;&#123;&#120;&#121;&#125;&#32;&#61;&#32;&#46;&#54;&#53; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <p class="import-Normal">Based on the given statistics, provide the estimated regression equation.</p> <hr /> <p class="import-Normal">Let’s plot that regression line (a statistics software program will plot this line for you if you have raw data). The line will always pass through the point (<img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-ec9c56b79176336d410254cc3ce6072a_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#88;&#125;&#44;&#32;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#89;&#125;" title="Rendered by QuickLaTeX.com" height="19" width="39" style="vertical-align: -4px;" />), which is (500, 2.5) for our data.</p> <p><img class="alignnone wp-image-113 size-full" src="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-2a.png" alt="Scatterplot graph of the regression line for exercise 1" width="913" height="540" srcset="https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-2a.png 913w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-2a-300x177.png 300w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-2a-768x454.png 768w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-2a-65x38.png 65w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-2a-225x133.png 225w, https://ubalt.pressbooks.pub/app/uploads/sites/9/2020/11/Stat-2a-350x207.png 350w" /></p> <p 
class="import-Normal">We may compute <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-7b4968740fce1409caf0a25034aafd3e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="19" width="15" style="vertical-align: -3px;" /> for another <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" /> value (say, 700) to get a second point on the line:</p> <p class="ql-center-displayed-equation" style="line-height: 22px;"><span class="ql-right-eqno">&nbsp; </span> <span class="ql-left-eqno">&nbsp; </span><img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-4a1a2c29f9ba5a58d0bf7d5eedb88565_l3.svg" height="22" width="237" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;&#32;&#61;&#32;&#46;&#48;&#48;&#52;&#53;&#53;&#40;&#55;&#48;&#48;&#41;&#32;&#43;&#32;&#46;&#50;&#50;&#53;&#32;&#61;&#32;&#51;&#46;&#52;&#49; &#92;&#93;" title="Rendered by QuickLaTeX.com" /></p> <div class="contents"><p class="import-Normal">So, what do this regression line and its parameters tell us?</p> <p class="import-Normal">The intercept tells us that the best guess at productivity when wages = 0 equals .225. This situation is conceptually impossible because wages cannot be as low as zero. 
This illustrates an important point: sometimes the model will predict impossible values.</p> <p class="import-Normal">What is <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-7b4968740fce1409caf0a25034aafd3e_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#104;&#97;&#116;&#123;&#89;&#125;&#95;&#105;" title="Rendered by QuickLaTeX.com" height="19" width="15" style="vertical-align: -3px;" /> for <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-ccf4664e7a683a37325c7adf550aa4ca_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;&#32;&#61;&#32;&#57;&#48;&#48;" title="Rendered by QuickLaTeX.com" height="12" width="67" style="vertical-align: 0px;" />?</p> <p class="import-Normal">The slope tells us that, for every one-unit increase in wages, predicted productivity increases by .00455. The covariance and correlation (as well as the slope) tell us that the relationship between wages and productivity is positive. That is, productivity tends to increase when wages increase.</p> <p class="import-Normal">Note, however, that it would be incorrect to infer a causal relationship between wages and productivity from these results alone. Several other conditions need to be met before we can confidently state that interventions that change wages will also change productivity. Do you know what those are?</p> <p class="import-Normal">We will now spend the next two months or so learning all of the steps in regression analysis. The outline below shows where we are headed, but there are many pieces of this process to learn. 
Today we saw how to carry out the “Estimate model” step for one <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;" /> (one independent variable).</p> <p class="import-Normal" style="padding-left: 40px">(1) Preliminary analyses</p> <ul><li style="list-style-type: none"><ul><li style="list-style-type: none"><ul><li class="import-Normal">Inspect scatterplots.</li> <li class="import-Normal">Conduct case analysis.</li> <li class="import-Normal">If no problems are found, continue with the regression analyses.</li> </ul> </li> </ul> </li> </ul> <p class="import-Normal" style="padding-left: 40px">(2) Regression analyses</p> <ul><li style="list-style-type: none"><ul><li style="list-style-type: none"><ul><li class="import-Normal">Estimate the model.</li> <li class="import-Normal">Check for possible violations of the assumptions for this model.</li> <li class="import-Normal">Test the overall relationship.</li> <li class="import-Normal">If the overall relationship is significant, continue with the description of the effects of the independent variables (IVs); if not, try other models.</li> <li class="import-Normal">For each interval and dichotomous IV, test the coefficient, compute a confidence interval, assess its importance, and compute its unique contribution to <img src="https://ubalt.pressbooks.pub/app/uploads/quicklatex/quicklatex.com-3300c1f40e1dcdc79baadc068577395c_l3.svg" class="ql-img-inline-formula quicklatex-auto-format" alt="&#82;&#94;&#50;" title="Rendered by QuickLaTeX.com" height="15" width="21" style="vertical-align: 0px;" />.</li> <li class="import-Normal">For each categorical IV, test the global effect and, if it is significant, follow up with a test, an interval, and an assessment of importance for each comparison.</li> <li class="import-Normal">If the equation will be used for prediction, assess the precision of 
prediction.</li> </ul> </li> </ul> </li> </ul> </div> <p>&nbsp;</p> <p>Sources: Modified from the class notes of Salih Binich (2011) and Russell G. Almond (2011).</p> </div> </div> </div></div>
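<div class="contents"><p class="import-Normal">As a check on Exercise 1, the sketch below applies the least-squares formulas for <em>b</em><sub>1</sub> and <em>b</em><sub>0</sub> from Section III to the five descriptive statistics given in the exercise. It is a worked illustration under those stated values only, and the predictions use the rounded coefficients that appear in the text.</p>
<p class="import-Normal" style="padding-left: 40px"><em>b</em><sub>1</sub> = <em>r</em><sub><em>XY</em></sub>(<em>S</em><sub><em>Y</em></sub> / <em>S</em><sub><em>X</em></sub>) = .65(.70 / 100) = .00455</p>
<p class="import-Normal" style="padding-left: 40px"><em>b</em><sub>0</sub> = <span style="text-decoration: overline"><em>Y</em></span> &#8722; <em>b</em><sub>1</sub><span style="text-decoration: overline"><em>X</em></span> = 2.5 &#8722; .00455(500) = 2.5 &#8722; 2.275 = .225</p>
<p class="import-Normal" style="padding-left: 40px">Estimated regression line: <em>Y&#770;</em><sub><em>i</em></sub> = .225 + .00455<em>X</em><sub><em>i</em></sub></p>
<p class="import-Normal" style="padding-left: 40px">For <em>X</em> = 700: <em>Y&#770;</em> = .225 + .00455(700) = .225 + 3.185 = 3.41</p>
<p class="import-Normal" style="padding-left: 40px">For <em>X</em> = 900: <em>Y&#770;</em> = .225 + .00455(900) = .225 + 4.095 = 4.32</p>
<p class="import-Normal">Substituting <em>X</em> = 500 gives .225 + 2.275 = 2.5, which confirms that the fitted line passes through the point of means (500, 2.5).</p>
</div>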

</body>
</html>