If you can’t use the tools, you can’t analyze the data
R is a statistical programming language developed by scientists that has open source libraries for statistics, machine learning, and data science. R lends itself well to business because of its depth of topic-specific packages and its communication infrastructure. R has packages covering a wide range of topics such as econometrics, finance, and time series. R has best-in-class tools for visualization, reporting, and interactivity, which are as important to business as they are to science. Because of this, R is well-suited for scientists, engineers and business professionals.
- Business Capability (1 = Low, 10 = High)
- Ease of Learning (1 = Difficult, 10 = Easy)
- Cost (Free/Minimal, Low, High)
- Trend (0 = Fast Decline, 5 = Stable, 10 = Fast Growth)
11 Reasons You Should Learn R
REASON 01: R IS AN OPEN-SOURCE AND FREELY AVAILABLE TOOL.
Unlike SAS and Matlab, one can freely install, use, update, clone, modify, redistribute and resell R. This saves lots of money, and it also allows for easy upgrades, which is useful for a statistical programming language.
REASON 02: R IS A CROSS-PLATFORM, OS-COMPATIBLE TOOL.
R runs on Windows, macOS and Linux. It can also import data from Microsoft Excel, Microsoft Access, MySQL, SQLite, Oracle and many other programs.
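As a rough illustration of the database side of this, the sketch below reads a table from SQLite into an R data frame. It assumes the DBI and RSQLite packages are installed; the "orders" table and its contents are made up for the example. Other databases are usually reached the same way by swapping in a different DBI driver, with connection details that will differ from what is shown here.

```r
# Assumes the DBI and RSQLite packages are installed:
# install.packages(c("DBI", "RSQLite"))
library(DBI)

# Open a throwaway in-memory SQLite database (no file or server needed).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical sample data, written to a table named "orders".
orders <- data.frame(
  region = c("East", "West", "East", "North"),
  amount = c(120.5, 80.0, 45.25, 300.0)
)
dbWriteTable(con, "orders", orders)

# Run a SQL query and get the result back as an ordinary R data frame.
east_orders <- dbGetQuery(
  con,
  "SELECT region, amount FROM orders WHERE region = 'East'"
)
print(east_orders)

# Always close the connection when finished.
dbDisconnect(con)
```

Once the query result is a data frame, it can be used with the rest of R like any other data set.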
REASON 03: R IS A POWERFUL SCRIPTING LANGUAGE.
As such, R can handle large, complex data sets. R is also well suited to heavy, resource-intensive simulations, and it can be used on high-performance computing clusters.
REASON 04: R HAS WIDESPREAD ACCLAIM.
With an estimated 2 million users, R is one of the top programming languages of 2017.
REASON 05: R IS HIGHLY FLEXIBLE AND EVOLVED.
Many new developments in statistics first appear as R packages.
REASON 06: PUBLISHERS LOVE R
R integrates easily with document preparation systems like LaTeX. That means statistical output and graphics from R can be embedded directly into publication-ready documents.
REASON 07: R HAS A HUGE, VIBRANT COMMUNITY AND RESOURCE BANK
R has a global community of passionate users who regularly interact on discussion forums and attend conferences. In addition, about 2000 free libraries are available for your unlimited use, covering statistical areas such as finance, cluster analysis, high performance computing and more.
REASON 08: LEARNING R IS EASY WITH THE TIDYVERSE
Learning R used to be a major challenge. Base R was a complex and inconsistent programming language. Structure and formality were not the top priority, as they are in other programming languages. This all changed with the “tidyverse”, a set of packages and tools that have a consistently structured programming interface.
REASON 09: R COMMUNITY SUPPORT
Being a powerful language alone is not enough. To be successful, a language needs community support. We’ll hit on two ways that R excels in this respect: CRAN and the R Community.
REASON 10: R HAS HEART
We already talked about the infrastructure, the tidyverse, that enables the ecosystem of applications to be built using a consistent approach. It’s this infrastructure that brings life into your data analysis. The tidyverse enables:
- Data manipulation (dplyr, tidyr)
- Working with data types (stringr for strings, lubridate for date/datetime, forcats for categorical/factors)
- Visualization (ggplot2)
- Programming (purrr, tidyeval)
- Communication (rmarkdown, shiny)
When tools such as dplyr and ggplot2 came to fruition, they made the learning curve much easier by providing a consistent and structured approach to working with data. As Hadley Wickham and many others continued to evolve R, the tidyverse came to be, which includes a series of commonly used packages for data manipulation, visualization, iteration, modeling, and communication. The end result is that R is now much easier to learn (we’ll show you in our next article!).
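To give a feel for that consistent approach, here is a minimal sketch of a dplyr pipeline. It assumes the dplyr package is installed and uses the mtcars data set that ships with R; the column name kpl and the choice of summary are just for illustration.

```r
# Assumes the dplyr package is installed: install.packages("dplyr")
library(dplyr)

# mtcars ships with R; each row is one car model.
mpg_by_cyl <- mtcars %>%
  mutate(kpl = mpg * 0.425144) %>%              # miles per gallon -> km per litre
  group_by(cyl) %>%                             # one group per cylinder count
  summarise(cars = n(), mean_kpl = mean(kpl)) %>%
  arrange(desc(mean_kpl))                       # most fuel-efficient group first

print(mpg_by_cyl)
```

Each step is a verb that takes a data frame and returns a data frame, which is what lets the steps chain together with the pipe.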
REASON 10: R for Business
R Markdown is a framework for creating reproducible reports that has since been extended to building blogs, presentations, websites, books, journals, and more. It’s the technology behind this blog, and it allows us to include the code with the text so that anyone can follow the analysis and see the output right alongside the explanation. What’s really cool is how much the technology has evolved. Here are a few examples of its capability:
- rmarkdown for generating HTML, Word and PDF reports
- rmarkdown for generating presentations
- flexdashboard for creating web apps via the user-friendly R Markdown format
- blogdown for building blogs and websites
- bookdown for creating online books
- Interactive documents
- Parameterized reports for generating custom reports (e.g. reports for a specific geographic segment, department, or segment of time)
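As a sketch of how parameterized reports can work, the example below writes a tiny report template and renders one HTML file per region. The template text, the file names and the region parameter are all hypothetical. The rendering step assumes the rmarkdown package and pandoc are installed and is skipped otherwise.

```r
# Hypothetical report template with a single parameter, "region".
rmd_lines <- c(
  "---",
  "title: \"Sales report\"",
  "output: html_document",
  "params:",
  "  region: \"East\"",
  "---",
  "",
  "This report covers the `r params$region` region."
)
rmd_path <- file.path(tempdir(), "sales_report.Rmd")
writeLines(rmd_lines, rmd_path)

# Render one report per region. The guard lets the script run even when
# rmarkdown or pandoc is not available; in that case nothing is rendered.
regions <- c("East", "West")
outputs <- character(0)
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  for (region in regions) {
    out <- rmarkdown::render(
      rmd_path,
      params = list(region = region),   # overrides the default in the YAML header
      output_file = paste0("sales_report_", region, ".html"),
      envir = new.env(),                # keep each render isolated
      quiet = TRUE
    )
    outputs <- c(outputs, out)
  }
}
print(outputs)
```

The same template produces a different report for each value passed in params, which is the idea behind reports per department, geography or time period.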
REASON 11: The R community is awesome!
How companies are using R
- Ford uses R to improve the design of its vehicles.
- Twitter uses R to monitor user experience.
- The US National Weather Service uses R to predict severe flooding.
- The Human Rights Data Analysis Group uses R to quantify the impact of war.
- The New York Times uses R to create infographics.
- Google uses R to calculate the ROI of advertising campaigns.
- Facebook uses R to analyze status updates and its social network graph.
WHAT SHOULD YOU DO?
Don’t make the decision tougher than it is. Think about where you are coming from:
Are you a computer scientist or software engineer? If yes, choose Python.
Are you an analytics professional or mechanical/industrial/chemical engineer looking to get into data science? If yes, choose R.
Think about what you are trying to do:
Are you trying to build a self-driving car? If yes, choose Python.
Are you trying to communicate business analytics throughout your organization? If yes, choose R.
R can also be used in a big data context. You often hear that Scala and Python are great, and that is true, but you could also consider R when you’re working on visualization or data exploration. For more information, see the question and answers under “Is R considered unsuitable for Big Data when compared to Python?”
Of course, tools like Mahout will also be worth your time, and for the professional goals that you’re talking about, it’s an “and-and” story. My advice would be to check some companies and/or industries that you would like to work for, and then see how much Mahout is actually used versus R so that you can prioritize your learning.