Scatter plot correlation unknown

3/15/2024

One of the most powerful tools statistics gives us is the ability to explore relationships between two datasets containing quantitative values, and then use that relationship to make predictions. For example, a student who wants to know how well they can expect to score on an upcoming final exam may consider reviewing the data on midterm and final exam scores for students who have previously taken the class. It seems reasonable to expect that there is a relationship between those two datasets: if a student did well on the midterm, they were probably more likely to do well on the final than the average student. Similarly, if a student did poorly on the midterm, they probably also did poorly on the final exam.

Of course, that relationship isn’t set in stone; a student’s performance on a midterm exam doesn’t cement their performance on the final! A student might use a poor result on the midterm as motivation to study more for the final. A student with a really good grade on the midterm might be overconfident going into the final, and as a result not prepare adequately.

The statistical method of regression can find a formula that does the best job of predicting a score on the final exam based on the student’s score on the midterm, as well as give a measure of the confidence of that prediction! In this section, we’ll discover how to use regression to make these predictions. First, though, we need to lay some graphical groundwork.

Relationships Between Quantitative Datasets

Before we can evaluate a relationship between two datasets, we must first decide if we feel that one might depend on the other. In our exam example, it is appropriate to say that the score on the final depends on the score on the midterm, rather than the other way around: if the midterm depended on the final, then we’d need to know the final score first, which doesn’t make sense.

Here’s another example: if we collected data on home purchases in a certain area, and noted both the sale price of the house and the annual household income of the purchaser, we might expect a relationship between those two. Which depends on the other? In this case, sale price depends on income: people who have a higher income can afford a more expensive house. If it were the other way around, people could buy a new, more expensive house and then expect a raise! (This is very bad advice.)

When exploring the relationship between two datasets, if one set seems to depend on the other, we’ll say that dataset contains values of the response variable (or dependent variable). The dataset that the response variable depends on contains values of what we call the explanatory variable (or independent variable).

It’s worth noting that not every pair of related datasets has clear dependence. For example, consider the percent of a country’s budget devoted to the military and the percent earmarked for public health. These datasets are generally related: as one goes up, the other goes down. However, in this case, there’s not a preferred choice for dependence, as each could be seen as depending on the other.
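Returning to the midterm and final exam example, the idea of a regression line can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the score pairs are invented, and the helper name `least_squares` is hypothetical. It computes the least-squares slope and intercept along with the correlation coefficient r, then uses the line to predict a final score from a midterm score.

```python
from math import sqrt

# Hypothetical (midterm, final) score pairs, invented for illustration only.
midterm = [62, 70, 75, 81, 88, 93]  # explanatory variable
final = [65, 68, 79, 80, 85, 95]    # response variable


def least_squares(x, y):
    """Return (slope, intercept, r) for the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products of deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # correlation coefficient, between -1 and 1
    return slope, intercept, r


slope, intercept, r = least_squares(midterm, final)

# Predict the final exam score for a student who scored 80 on the midterm.
predicted_final = slope * 80 + intercept
print(f"final = {slope:.3f} * midterm + {intercept:.3f}   (r = {r:.3f})")
print(f"predicted final for a midterm score of 80: {predicted_final:.1f}")
```

A positive r goes with an upward-sloping line, a negative r with a downward-sloping one, and an r near 0 suggests no linear relationship. The sketch assumes the x-values are not all identical (and likewise the y-values), since otherwise the divisions are undefined.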

By the end of this section, you will be able to:

Construct a scatter plot for a dataset.
Distinguish among positive, negative and no correlation.
Estimate and interpret regression lines.

There is one more point we haven't stressed yet in our discussion about the correlation coefficient r and the coefficient of determination \(R^2\): both summarize the strength of a linear relationship in a sample only, so a hypothesis test is needed before drawing a conclusion about the population.

Once the test statistic t* has been computed, we use it to calculate the P-value. As always, the P-value is the answer to the question "how likely is it that we’d get a test statistic t* as extreme as we did if the null hypothesis were true?" The P-value is determined by referring to a t-distribution with n-2 degrees of freedom.

If the P-value is smaller than the significance level \(\alpha\), we reject the null hypothesis in favor of the alternative. We conclude that "there is sufficient evidence at the \(\alpha\) level to conclude that there is a linear relationship in the population between the predictor x and response y."

If the P-value is larger than the significance level \(\alpha\), we fail to reject the null hypothesis. We conclude that "there is not enough evidence at the \(\alpha\) level to conclude that there is a linear relationship in the population between the predictor x and response y."
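The P-value step for this test can be sketched in Python using only the standard library. The text does not say how t* is computed, so this sketch assumes the usual form for testing a linear relationship, t* = r·sqrt(n-2)/sqrt(1-r²), and approximates the tail area of the t-distribution with n-2 degrees of freedom by Simpson's rule. The function names are hypothetical, and a statistics package would normally be used instead of hand-rolled integration.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_star, df, steps=2000):
    """Approximate P(|T| >= |t_star|) for T ~ t(df) with Simpson's rule on [0, |t_star|]."""
    upper = abs(t_star)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area between 0 and |t_star|
    return max(0.0, 1 - 2 * central_half)


def linear_relationship_test(r, n, alpha=0.05):
    """Return (t_star, p_value, reject_null) for sample correlation r and sample size n.

    Assumes -1 < r < 1 and n > 2; t_star uses the assumed formula from the lead-in.
    """
    df = n - 2
    t_star = r * sqrt(df) / sqrt(1 - r * r)
    p_value = two_sided_p_value(t_star, df)
    return t_star, p_value, p_value < alpha


# Hypothetical inputs: r = 0.6 measured on a sample of n = 20 pairs.
t_star, p_value, reject = linear_relationship_test(0.6, 20)
print(f"t* = {t_star:.3f}, P-value = {p_value:.4f}, reject null: {reject}")
```

When `reject` is true, the P-value is smaller than \(\alpha\) and the "sufficient evidence" conclusion applies; otherwise we fail to reject the null hypothesis. The `gamma` call overflows for very large samples, which is one reason to prefer a library routine in practice.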
