An Accident in R – Plot

While going over some data for an unrelated research project, I accidentally discovered a very useful feature of R.

 

First, I load a CSV file:

> heat <- read.csv("MiamiHeat.csv")

For this example, I’m using a data set of game statistics from the Miami Heat NBA team. The Wikipedia article on basketball statistics describes the columns in this sheet.
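Before plotting anything, it can help to check which columns came in as numbers. Here is a minimal sketch of that step. Since the real file isn't reproduced here, it writes a tiny made-up CSV to a temporary file first; the column names and values are hypothetical stand-ins, not the actual contents of MiamiHeat.csv.

```r
# Hypothetical stand-in for MiamiHeat.csv: a few invented rows in a temp file
tmp <- tempfile(fileext = ".csv")
toy <- data.frame(
  Opponent = c("BOS", "NYK", "CHI"),
  Points   = c(98, 105, 91),
  Rebounds = c(41, 38, 45),
  Steals   = c(7, 9, 6)
)
write.csv(toy, tmp, row.names = FALSE)

# Load it the same way as in the post
heat_toy <- read.csv(tmp)

names(heat_toy)  # column names
str(heat_toy)    # type of each column

# Names of the columns that are numeric and therefore meaningful to plot
numeric_cols <- names(heat_toy)[sapply(heat_toy, is.numeric)]
numeric_cols
```

With the real file, the same `names()` and `str()` calls on `heat` should show which columns are worth plotting against each other.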

Normally, I would pick a pair of columns I find interesting and plot them. Let’s plot rebounds vs. steals:

> plot(heat$Rebounds,heat$Steals)

 

That’s kind of interesting, but this CSV file has many more columns that might provide some insight about the NBA. The plot looks pretty muddled, but we might as well check for a correlation:

> cor.test(heat$Rebounds,heat$Steals)

    Pearson's product-moment correlation

data:  heat$Rebounds and heat$Steals
t = -0.0943, df = 80, p-value = 0.925
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.2270342  0.2069341
sample estimates:
cor
-0.01054669

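We can’t show a correlation here: the estimate is close to zero (about -0.01), the p-value is 0.925, and the confidence interval straddles zero. Is there anything interesting in this data? Well, we can actually plot each column against every other column with one command!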
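As an aside, the result of cor.test() doesn't have to be read off the printed output. It returns a list-like object whose pieces can be pulled out by name. The sketch below uses random synthetic vectors of length 82 (chosen only because df = 80 in the output above implies 82 rows); the numbers it produces have nothing to do with the Heat data.

```r
set.seed(1)

# Synthetic stand-ins for two columns; NOT the Miami Heat data
x <- rnorm(82)
y <- rnorm(82)

res <- cor.test(x, y)

res$estimate   # sample correlation, named "cor"
res$p.value    # p-value of the test
res$conf.int   # 95 percent confidence interval (lower, upper)
res$parameter  # degrees of freedom, n - 2
```

Having these as values makes it easy to compare several column pairs without copying numbers from the console by hand.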

> plot(heat)

The image may not look useful at the size shown in this blog post, but it is quite useful on a decent-sized screen. Here’s a closer look at a subsection:

It looks like FGA (field goals attempted) and Points are positively correlated.
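For a data frame with more than two columns, plot() hands the work to pairs(), so a closer look at a few columns can be had by calling pairs() on a subset. The following is a rough sketch using a synthetic data frame; the column names mirror the ones mentioned in this post, but the values are randomly generated and the relationship between FGA and Points is built in by construction.

```r
set.seed(42)

# Synthetic data with made-up values; not the real data set
n <- 82
fga <- round(rnorm(n, mean = 80, sd = 6))
demo <- data.frame(
  FGA      = fga,
  Points   = round(0.5 * fga + rnorm(n, mean = 58, sd = 8)),
  Rebounds = round(rnorm(n, mean = 42, sd = 5)),
  Steals   = round(rnorm(n, mean = 8, sd = 2)),
  Opponent = sample(c("BOS", "NYK", "CHI"), n, replace = TRUE)
)

# Keep only numeric columns so text columns don't show up as integer codes
num <- demo[sapply(demo, is.numeric)]

# Full scatterplot matrix of the numeric columns
pairs(num)

# Closer look at a few columns, with a smoothed trend line in each panel
pairs(num[, c("FGA", "Points", "Rebounds")], panel = panel.smooth)
```

Dropping the non-numeric columns first is a choice on my part; plot(heat) will happily convert them to numbers, which tends to add panels that are hard to interpret.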

> plot(heat$FGA,heat$Points)

The correlation test also shows a modest positive correlation (about 0.35) with a small p-value:

> cor.test(heat$FGA,heat$Points)

    Pearson's product-moment correlation

data:  heat$FGA and heat$Points
t = 3.394, df = 80, p-value = 0.001074
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.1492569 0.5309065
sample estimates:
cor
0.3547730

As you can see, calling R’s plot on ALL of your data at once is a quick way to explore possible relationships among your columns.
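A numeric companion to the big plot is a correlation matrix, which can be sorted to list the strongest pairs first. This is only a sketch on synthetic data (the column names are borrowed from this post, the values are random, and the FGA/Points relationship is built in on purpose), but the same steps should apply to the numeric columns of any data frame.

```r
set.seed(7)

# Synthetic data; values are invented for illustration only
n <- 82
fga <- rnorm(n, mean = 80, sd = 6)
demo <- data.frame(
  FGA      = fga,
  Points   = 0.5 * fga + rnorm(n, mean = 58, sd = 4),
  Rebounds = rnorm(n, mean = 42, sd = 5),
  Steals   = rnorm(n, mean = 8, sd = 2)
)

# Pairwise Pearson correlations between all columns
cm <- cor(demo, use = "pairwise.complete.obs")

# Take each pair once (upper triangle) and rank by absolute correlation
idx <- which(upper.tri(cm), arr.ind = TRUE)
ranked <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)
ranked <- ranked[order(-abs(ranked$r)), ]

head(ranked, 3)  # the three strongest pairs
```

The ranking only points at candidates. Any pair that stands out still deserves its own plot and cor.test(), as above, since checking many pairs at once makes chance findings more likely.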

 

 
