Cheat Sheet for Data Exploration in R

Most Commonly Used R Libraries
Code to install a package in R:
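A minimal sketch of package installation, assuming a CRAN mirror is reachable. `install.packages()` is the base R call; the helper `install_if_missing()` is a hypothetical convenience wrapper, not something from the original sheet.

```r
# Install a single package from CRAN (needs network access, so it is
# left commented out here):
# install.packages("ggplot2")

# Hypothetical helper: install only the packages that are not yet available.
install_if_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  to_install <- pkgs[!installed]
  if (length(to_install) > 0) install.packages(to_install)
  invisible(to_install)
}

# "stats" ships with R, so nothing is downloaded in this call.
install_if_missing("stats")
```

After installation, a package is attached for the session with `library(ggplot2)`.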


Outlier Detection: outliers, evir
Feature Selection: Features, RRF
Data Transformation: plyr, data.table
Data Visualization: ggplot2, googleVis
Dimension Reduction: FactoMineR, CCP
Missing Value Imputation: missForest, missMDA


1. Steps to Load Data File(s)

Read CSV file into R

Read a tab-separated file
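Both reads can be sketched with base R as follows. The temporary files and the names `sample_df`, `from_csv`, and `from_tsv` are assumptions made so the example runs on its own; in practice the paths would point at real files.

```r
# Write two small temporary files so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".txt")
sample_df <- data.frame(id = 1:3, name = c("a", "b", "c"))
write.csv(sample_df, csv_path, row.names = FALSE)
write.table(sample_df, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read a CSV file; header = TRUE treats the first row as column names.
from_csv <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

# Read a tab-separated file by setting the separator explicitly.
from_tsv <- read.table(tsv_path, header = TRUE, sep = "\t",
                       stringsAsFactors = FALSE)

str(from_csv)
```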

2. Steps to convert a variable to a different data type

Use is.xyz() to test for data type xyz (for example, is.numeric()). Returns TRUE or FALSE.
Use as.xyz() to explicitly convert to that type (for example, as.numeric()).

From vector:
  to one long vector: c(x,y)
  to matrix: cbind(x,y) or rbind(x,y)
  to data frame: data.frame(x,y)
From matrix:
  to one long vector: as.vector(mymatrix)
From data frame:
  to matrix: as.matrix(myframe)
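The tests and conversions above can be tried on a few small objects. This is only an illustrative sketch; the variable names (`x_chr`, `a`, `b`, `mat`, `frame`) are made up for the example.

```r
# Test and convert a data type.
x_chr <- c("1", "2", "3")
is.numeric(x_chr)            # FALSE: the values are character strings
x_num <- as.numeric(x_chr)   # explicit conversion
is.numeric(x_num)            # TRUE

# Convert between vectors, matrices, and data frames.
a <- 1:3
b <- 4:6
long_vec <- c(a, b)          # one long vector
mat <- cbind(a, b)           # 3 x 2 matrix, one column per vector
frame <- data.frame(a, b)    # data frame with columns a and b

mat_as_vec <- as.vector(mat)      # matrix flattened column by column
frame_as_mat <- as.matrix(frame)  # data frame as a matrix
```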


3. Steps to Transpose a Dataset
Example of the melt function
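A hedged sketch of transposing and reshaping. The data frame `scores` is invented for illustration. Base R's `t()` and `reshape()` are used so the example runs without extra packages; the `melt()` call assumes the reshape2 package and runs only if it is installed.

```r
scores <- data.frame(id = 1:2, math = c(90, 80), art = c(70, 60))

# Plain transpose: rows become columns (the result is a matrix).
transposed <- t(scores)

# Wide-to-long reshape with base R, similar in spirit to melt().
long <- reshape(scores, direction = "long",
                varying = c("math", "art"), v.names = "value",
                timevar = "variable", times = c("math", "art"),
                idvar = "id")

# The same idea with melt(), assuming reshape2 is available.
if (requireNamespace("reshape2", quietly = TRUE)) {
  long_melt <- reshape2::melt(scores, id.vars = "id")
}
```

In the long form, each row holds one id, one variable name, and one value, which is often the shape that plotting and aggregation functions expect.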

4. Steps to Sort DataFrame
Sort by var1

Sort by var1 and var2 (descending)
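Both sorts can be sketched with `order()`. The data frame `df` is invented for the example, and the second sort assumes var1 ascending with var2 descending, which is one reading of the heading above.

```r
df <- data.frame(var1 = c(2, 1, 2, 1), var2 = c(10, 20, 30, 40))

# Sort by var1 (ascending).
by_var1 <- df[order(df$var1), ]

# Sort by var1 (ascending), then var2 (descending).
# Negating a numeric column reverses its sort direction.
by_var1_var2_desc <- df[order(df$var1, -df$var2), ]

by_var1_var2_desc$var2   # 40 20 30 10
```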

5. Steps to Create plots (Histogram)
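A minimal histogram sketch with base graphics, using the built-in `mtcars` dataset as a stand-in for your own data. The title, axis label, and suggested number of breaks are arbitrary choices.

```r
# Histogram of miles per gallon from the built-in mtcars dataset.
# breaks is only a suggestion; R picks "pretty" bin edges near it.
h <- hist(mtcars$mpg,
          breaks = 10,
          main = "Distribution of mpg",
          xlab = "Miles per gallon")

# hist() also returns the bin edges and counts for later inspection.
h$counts
```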

6. Steps to Generate frequency tables with R
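Frequency tables can be sketched with `table()` and `prop.table()`. The built-in `mtcars` dataset is used here only as an example input.

```r
# One-way frequency table: number of cars per cylinder count.
freq <- table(mtcars$cyl)
freq                      # 4: 11, 6: 7, 8: 14

# Relative frequencies (proportions that sum to 1).
rel_freq <- prop.table(freq)

# Two-way table: cylinders (rows) against gears (columns).
cross <- table(cyl = mtcars$cyl, gear = mtcars$gear)
```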

7. Sample Dataset in R
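One way to draw a random sample of rows is `sample()` on the row indices. The sketch below assumes sampling without replacement; `mydata`, `n_sample`, and the seed value are illustrative.

```r
set.seed(42)   # arbitrary seed so the draw is reproducible

mydata <- mtcars
n_sample <- 10

# Pick row numbers at random, without replacement, then subset.
row_ids <- sample(seq_len(nrow(mydata)), size = n_sample, replace = FALSE)
mysample <- mydata[row_ids, ]

nrow(mysample)
```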

8. Remove duplicate values of a variable
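Duplicates can be dropped with `unique()` and `duplicated()`. The data frame `df` is invented for illustration.

```r
df <- data.frame(id = c(1, 2, 2, 3, 3), score = c(5, 6, 7, 8, 8))

# Distinct values of a single variable.
unique_ids <- unique(df$id)

# Keep only the first row for each id.
dedup_by_id <- df[!duplicated(df$id), ]

# Drop rows that are identical across all columns.
dedup_rows <- unique(df)
```

Note the difference: `dedup_by_id` keeps one row per id, while `dedup_rows` removes only rows that match in every column.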

9. Find class-level count, average, and sum in R
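Group-level summaries can be sketched with `table()`, `tapply()`, and `aggregate()` from base R. The `sales` data frame and its column names are made up for the example.

```r
sales <- data.frame(class = c("a", "b", "a", "b", "a"),
                    amount = c(10, 20, 30, 40, 50))

# Count of rows per class.
counts <- table(sales$class)

# Average and sum of amount per class.
means <- tapply(sales$amount, sales$class, mean)
sums <- tapply(sales$amount, sales$class, sum)

# The same sum as a data frame, using the formula interface.
sum_df <- aggregate(amount ~ class, data = sales, FUN = sum)
```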


10. Recognize and treat missing values and outliers
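A hedged sketch of one common approach: flag missing values with `is.na()`, fill them with the median, then flag outliers with the 1.5 x IQR rule and cap them. Median imputation and capping are assumptions chosen for illustration; the packages listed earlier offer other methods.

```r
x <- c(10, 12, NA, 11, 13, 100, 12)

# Recognize missing values.
n_missing <- sum(is.na(x))

# Treat them: replace NA with the median of the observed values.
x_filled <- ifelse(is.na(x), median(x, na.rm = TRUE), x)

# Recognize outliers with the 1.5 * IQR rule.
q <- quantile(x_filled, c(0.25, 0.75))
lower <- q[[1]] - 1.5 * IQR(x_filled)
upper <- q[[2]] + 1.5 * IQR(x_filled)
is_outlier <- x_filled < lower | x_filled > upper

# Treat them: cap values at the lower and upper bounds.
x_capped <- pmin(pmax(x_filled, lower), upper)
```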

11. Merge / Join data sets
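Joins can be sketched with base R's `merge()`. The `customers` and `orders` data frames are invented for illustration; the `all`, `all.x`, and `all.y` arguments control which unmatched rows are kept.

```r
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders <- data.frame(id = c(2, 3, 4), total = c(20, 30, 40))

# Inner join: only ids present in both data sets.
inner <- merge(customers, orders, by = "id")

# Left join: every customer, with NA where no order matches.
left <- merge(customers, orders, by = "id", all.x = TRUE)

# Full outer join: every id from either data set.
full <- merge(customers, orders, by = "id", all = TRUE)
```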

