Brendan's R Notes Page

R is a free, open source environment for statistical computing and graphics. For more info, program download, and manuals, see: www.r-project.org

For a great reference for a variety of graphing tasks see:
R Graph Gallery : Enhance your data visualisation with R.

This page is where I keep a list of some common and less common R tricks that I can copy/paste from rather than trying to remember how I did it last time. It is loosely organized into the following categories.

General R Syntax Statistics Graphics Database

General R Syntax

Use C-style string formatting for labels

label.dates <- sprintf("%d%02d", result.df$YYYYMM, result.df$DDAY)
label.dates[1:5]
 [1] "20060101" "20060102" "20060103" "20060104" "20060105" 

String concatenation with the paste command

# paste0 joins the pieces with no separator; plain paste would insert spaces into the file name
myFileName <- paste0("1yr_", myAP, "_chart.jpeg")
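
The paste command separates its arguments with a single space unless told otherwise, which is rarely wanted in a file name. A small sketch of the difference, assuming a made-up airport code for myAP (it is not defined elsewhere on this page):

```r
# hypothetical value, for illustration only
myAP <- "EWR"

# default separator is a single space
with.spaces <- paste("1yr_", myAP, "_chart.jpeg")

# sep="" (or the paste0 shortcut) joins the pieces with nothing in between
no.spaces <- paste("1yr_", myAP, "_chart.jpeg", sep="")
same.thing <- paste0("1yr_", myAP, "_chart.jpeg")

print(with.spaces)
print(no.spaces)
```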

Statistics

Work with the quartiles of a vector

# check the quartiles of a vector
> quantile(na.omit(v2eff))
     0%     25%     50%     75%    100% 
 90.200  98.115  98.850  99.395 100.000 

# get the inter-quartile distance of a vector/distribution
> quantile(residuals(v2lm), probs=.75) - quantile(residuals(v2lm), probs=.25)
      75% 
0.8533763 
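
As a cross-check, base R also has an IQR function that should give the same number without the leftover "75%" label. A minimal sketch, using random numbers as a stand-in for residuals(v2lm):

```r
# made-up sample data standing in for residuals(v2lm)
set.seed(1)
x <- rnorm(100)

# difference of the 75% and 25% quantiles; unname() drops the "75%" label
by.hand <- unname(quantile(x, probs=.75) - quantile(x, probs=.25))

# built-in equivalent
built.in <- IQR(x)

all.equal(by.hand, built.in)
```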

Plot the (kernel) density estimate of a vector

# the na.omit command drops any NA values from the "diff" vector, which would otherwise cause an error in density()
plot(density(na.omit(diff)),
  xlab="Vector 1 - Vector 2",
  main="Distribution of Differences between two vectors")

# the rug command adds tick marks along the axis to show where the data points are
rug(na.omit(diff))
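
To see why the na.omit step matters, here is a small sketch with made-up vectors (v1, v2 and d are illustrative names, not the data used above). It only computes the estimate and does not plot it:

```r
# made-up vectors with one missing value
v1 <- c(98.1, 97.4, NA, 99.0, 96.2)
v2 <- c(97.0, 97.9, 98.5, 98.2, 95.0)
d <- v1 - v2

# density() stops with an error if any NA is present
failed <- inherits(try(density(d), silent=TRUE), "try-error")

# na.omit drops the NA first; density(d, na.rm=TRUE) is an alternative
est <- density(na.omit(d))

# number of points actually used in the estimate
est$n
```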

Get the argument max (or min) of a vector

Use the which.max (or which.min) command, which returns the index of the largest (or smallest) value.
inputDates[which.max(v2eff - v1eff)]
 [1] "20070530"
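
A self-contained sketch of the same idea, with made-up numbers in place of the real vectors. It also shows that which.max and which.min skip NA values, and that plain which with a condition returns every matching index:

```r
# made-up data; the names mirror the vectors used above
inputDates <- c("20070528", "20070529", "20070530", "20070531")
v1eff <- c(95.0, 96.5, 90.2, NA)
v2eff <- c(96.0, 97.0, 98.9, 99.1)

gap <- v2eff - v1eff

# index of the first maximum / minimum, ignoring NAs
inputDates[which.max(gap)]
inputDates[which.min(gap)]

# which() with a condition returns all matching indices and drops NAs
inputDates[which(gap > 0.9)]
```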

Graphics

Use custom tick labels for an axis of a plot

myBinLabels <- c("0-10","11-20","21-30","31-40","41-50","51-60")
plot(mean.v2eff, ylim=c(80,100), xaxt="n")
axis(side=1, at=c(1:6), labels=myBinLabels)

Generate a sequence of dates and use that vector as axis labels on a plot

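myaxis.days <- seq(as.Date("2006/1/1"), as.Date("2007/12/31"), "days")

# to convert it to a certain format (this gives a character vector, not dates)
myaxis.labels <- format(myaxis.days, "%b %y")

# axis.Date needs Date values for "at", not formatted strings, so use one date per month
myaxis.months <- seq(as.Date("2006/1/1"), as.Date("2007/12/31"), "months")

# in the axis command, las=2 turns labels perpendicular to the axis
# (the plot itself must have been drawn with dates on the x axis and xaxt="n")
axis.Date(1, at=myaxis.months, format="%b %y", las=2)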
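
Putting the pieces together, a minimal end-to-end sketch might look like the following. The series is random and the names days, values and month.ticks are made up for illustration:

```r
# made-up daily series
days <- seq(as.Date("2006-01-01"), as.Date("2007-12-31"), by="day")
set.seed(1)
values <- cumsum(rnorm(length(days)))

# draw the plot without the default x axis
plot(days, values, type="l", xaxt="n", xlab="")

# one tick per month; "at" takes Date values and "format" controls the label text
month.ticks <- seq(as.Date("2006-01-01"), as.Date("2007-12-01"), by="month")
axis.Date(1, at=month.ticks, format="%b %y", las=2)
```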

Dynamically interact with a plot to label outlier dates

First set up a plot.
plot(result.df$APO_ARREFF, result.df$V2EFF)

Call the identify function on the plot with a vector of labels. Clicking on a data point adds its label, and when you finish the function returns the indices of the points selected.
identify(result.df$APO_ARREFF, result.df$V2EFF, labels=label.dates)

Save a chart straight from script rather than manually through GUI

jpeg(file="myChart.jpeg", width=800, height=400, quality=100)
plot(xVariable, yVariable, main="My Chart")
dev.off()
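
The same open-device, plot, dev.off() pattern works for the other file devices. A minimal sketch using pdf(), whose width and height are given in inches rather than pixels; the temporary file name out.file is just for illustration:

```r
# write to a temporary file so the sketch does not clutter the working directory
out.file <- tempfile(fileext=".pdf")

# open the device, draw, then close the device to flush the file to disk
pdf(file=out.file, width=8, height=4)
plot(1:10, (1:10)^2, main="My Chart")
dev.off()

# confirm the file was written
file.exists(out.file)
```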

Database

Query an Oracle database directly from R

This uses the R library RODBC. If you get the error "there is no package called 'RODBC'", you need to install the package first: from the Packages menu, select 'Install package', and select RODBC, or run install.packages("RODBC") at the R prompt.
# load the library
library(RODBC)

# open the connection; this assumes an ODBC data source is already set up, and "crs" is its DSN
# this will prompt you for user id and password, and you can switch to a different TNS service name as well
mychannel <- odbcConnect("crs")

# enter the query and save the result in a data frame in R for further processing later
result.df <- sqlQuery(mychannel, "select * from bhogan.npiasairports where state='NY'")

# close the ODBC connection 
close(mychannel)

If you have a more complex query to enter, use the paste command to improve readability
result.df <- sqlQuery(mychannel,
paste("select *",
"from bhogan.derived_aspm_slots A",
"where", 
"A.slice_start_loc>=to_date('20070817 08','yyyymmdd hh24') and",
"A.slice_start_loc<to_date('20070818 01','yyyymmdd hh24') and",
"A.locid='EWR'",
"and A.latest='T'",
"order by A.yyyymm, A.dday, A.hhour, A.qtr"))
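
When the dates or airport change from run to run, one option is to combine paste with sprintf and to print the assembled SQL before sending it. This is only a sketch: the parameter names airport, start.hour and end.hour are made up, the table and column names are copied from the query above, and pasting values into SQL like this is only reasonable for values you control yourself.

```r
# hypothetical parameters for illustration
airport <- "EWR"
start.hour <- "20070817 08"
end.hour <- "20070818 01"

# build the query one clause per line; paste joins the pieces with spaces
myquery <- paste("select *",
  "from bhogan.derived_aspm_slots A",
  "where",
  sprintf("A.slice_start_loc>=to_date('%s','yyyymmdd hh24') and", start.hour),
  sprintf("A.slice_start_loc<to_date('%s','yyyymmdd hh24') and", end.hour),
  sprintf("A.locid='%s'", airport),
  "and A.latest='T'",
  "order by A.yyyymm, A.dday, A.hhour, A.qtr")

# print the assembled SQL to check it before sending it to the database
cat(myquery, "\n")

# then, with an open connection: result.df <- sqlQuery(mychannel, myquery)
```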



Maintained by Brendan Hogan, comments or questions: bhogan@virginia.edu
Last Modified: Tuesday, 05-Aug-2008 18:57:05 EDT