R Tidyverse Reporting And Analytics For Excel Users

From Excel To Big Data and Interactive Dashboard Visualizations in 5 Hours

Last updated 2022-01-10 | 4.5

- How R implements common Excel functions and how it can actually be faster
- easier and more flexible than the familiar Excel methods. By the end of this course you will have the knowledge to work with large sets of data faster and easier than you ever thought possible. This course uses the tidyverse libraries in R which provide an elegant solution for solving 99% of our reporting requirements. Tidyverse is incredibly well supported and by focusing solely on this one set of libraries we can massively shortcut the amount of time that is required to get up and running with R. Many beginner courses in R will start teaching what is known as base R which covers the original methods for working with R that have been enhanced and simplified through the tidyverse. Although it maybe useful to have some understanding of base R it is neither a requirement of this course or a requirement to becoming really productive in the R language. As we go through the exercises we are going to be comparing to the most common way that processes are typically carried out in Excel.

What you'll learn

How R implements common Excel functions and how it can actually be faster
easier and more flexible than the familiar Excel methods. By the end of this course you will have the knowledge to work with large sets of data faster and easier than you ever thought possible. This course uses the tidyverse libraries in R which provide an elegant solution for solving 99% of our reporting requirements. Tidyverse is incredibly well supported and by focusing solely on this one set of libraries we can massively shortcut the amount of time that is required to get up and running with R. Many beginner courses in R will start teaching what is known as base R which covers the original methods for working with R that have been enhanced and simplified through the tidyverse. Although it maybe useful to have some understanding of base R it is neither a requirement of this course or a requirement to becoming really productive in the R language. As we go through the exercises we are going to be comparing to the most common way that processes are typically carried out in Excel.

* Requirements

* You should be comfortable with vlookups
* if statements and pivot tables in Excel

Description

If you use Excel for any type of reporting or analytics then this course is for you. There are a lot of great courses teaching R for statistical analysis and data science that can sometimes make R seem a bit too advanced for every day use. Also since there are many different ways of using R that can often add to the confusion. The reality is that R can be used to make your every day reporting analytics that you do in Excel much faster and easier without requiring any complex statistical techniques while at the same time giving you a solid foundation to expand into those areas if you so wish. This course uses the Tidyverse standards for using R which provides a single, comprehensive and easy to understand method for using R without complicating things via multiple methods. It's designed to build upon the the skills you are already familiar with in Excel to shortcut your learning journey. 

When I first started using R I thought that it could be a good replacement for the automation type processes I used to write in VBA. This can be quite off putting for a lot of Excel users as VBA often adds an extra layer of complexity to your work and is often only something which is done to automate a process which has already been established in Excel. One of the key benefits of Excel is that you are working directly with the data without having to go through the complexity and overhead of using a programming language. 

Programming languages such as VBA are actually very difficult for working with data as there isn't even any concise way of referencing common data elements such as named table columns. To carry out an operation on every row would take several lines of code which runs slow and ends up hiding your formula which actually contains your business logic. 

Despite all of this people use VBA anyway as once you invest the time to setup your processes you can run the exact same steps thousands of times with a click of a button. 

What if there were a way to work directly with your data as simply as Excel but also have more programming power than VBA? That's what R can do for you. 

Since I've started using R people have asked me when it would be beneficial to use R instead of Excel. Here are some examples

1. vlookups and sumifs on large datasets can run very slowly in Excel. I've helped people to replace multiple lines of vlookups that take 80 minutes to run in Excel with a single function in R that takes less than 1/10th of a second. 

2. Exploring and analysing your data in R can be Viewed in a simple table like Excel but also has a wide range of other methods which can be more effective.

3. Dashboards and visualisations are much richer and easier to construct than in Excel

4. Distributing your work in Excel can be beneficial since almost everyone has Excel installed. The problems with this are that not everyone always has the same version of Excel or addins installed which means your work might not be compatible. Also files are usually emailed around which can very quickly lead to hundreds of untracked copies of your Excel files with slight variations in them. The outputs from R can be simple Excel or csv files however your output can also be a web app that can be centrally stored and tracked on a server compatible with any web browser on your computer or smart phone. 

5. Team collaboration and version control in Excel is done via shared workbooks and track changes. Turning on these features in Excel disables some of Excels best features and still results in file locking. Team collaboration in R is done on github which allows you to easily work across teams without file locking issues and full audit histories of your work. 

The beauty of R is that once you start using it you will no longer have to make a special investment of time to automate your processes after your analysis is done. Practically anything that you can do in Excel you'll be able to do faster and better for even your first round of analysis and will leave you with an script which means your work is reproducible and automated from the very beginning. 

Even though your existing Excel skills will help you to pick up R one of the hardest things is that you're so familiar with Excel that it's too easy to keep on using it. I used Excel for years and spent thousands of hours studying how to use it more efficiently, I even taught advanced courses in it. It seemed obvious to me at the time that it was one of the most efficient ways to work with data. Even though working with a programming language might be more powerful it often had too much over head and was too removed from the actual data analysis. R is the programming language I wish I learnt 20 years ago. Perhaps somewhat counter intuitively you'll end up spending less time thinking about how to put a piece  of work together than Excel and more time looking at your data in new ways that you've probably never even thought of. 

Who this course is for:

  • If you are currently using Excel to deliver a regular reporting requirement this course will be a good fit for you. This course is designed to introduce R to non programmer Excel users who are already familiar with Pivot Tables, vlookups ,if statements and charts.

Course content

6 sections • 42 lectures

Intro Preview 00:40

R programming for beginners and people wanting to learn the Tidyverse methodology for learning R. Tidyverse is a widely used modern standard for using R which makes it very efficient for working with data. If you want to learn R programming then you want to learn Tidyverse. This course has been designed specifically with Excel and database users in mind. Excel terminology and references are used throughout the course to help link new r programming skills to your existing Excel skill set. 

About Your Instructor & How This Course Came to Be Preview 02:59

Why Use R? Preview 06:37

R programming with tidyverse can simplify, speed up, scale up and automate many of the tasks normally done in Excel. For years I worked with Excel, VBA and SQL databases. I'd never even heard of R programming until a good friend mentioned it to me. 

How to Get the Most out of this Course Preview 03:08

R Programming provides an interactive way of working with data which shortens the time to see input and results. This makes R programming a great language to learn quickly. To get the most out of this course I recommend completing all of the lectures first to get a good overview and solutions to common problems. After completing the lectures take advantage of the interactive nature of R programming to tweak and experiment with code to rapidly learn the language.

Accelerated Learning Techniques for Using The Example Files Preview 06:36

How to use the example R code that comes with this course to more rapidly learn R programming. 

Setting up R Preview 09:00

What you need to install to get up and running with R programming and tidyverse. 

Setting up R Extra Points Preview 01:11

Microsoft R vs the Cran distribution of R. What's the difference and what should you use to setup your R programming environment. 

Foundations Preview 02:33

R programming for beginners? Learn and get setup with Tidyverse. Learn the Tidyverse syntax and skipping over Base R programming can shortcut your journey to learning R programming.

Loading Data Preview 13:30

Learn to load data using Tidyverse R programming. Note the difference between Tidyverse read_csv and Base R read.csv

R Projects and Working Directories Preview 05:16

R Projects makes your r programming more portable when you send it to different users. Many r programming courses will teach the use of set working directory. R projects solve this requirement more elegantly. 

Loading Data Extra Points Preview 01:25

R Projects makes your r programming more portable when you send it to different users. Many r programming courses will teach the use of set working directory. R projects solve this requirement more elegantly. 

Calculated Columns Preview 09:02

Using the Tidyverse methodology for R programming we can add and update calculated columns in our datasets using the mutate function. 

Filtering Preview 02:15

In this lecture we will review how to filter our data using the Tidyverse methodology for R programming

Pivoting with R for data science vs Pivoting with Excel Preview 16:38

Excel users tend to think about pivoting data very differently to data scientists.


In this lecture, we're going to go over how to pivot data in R and perhaps even more importantly how to unpivot it.


In Excel, you're probably familiar with the 4 sections of a pivot table

Filter, Rows, Columns and values


pivoting is how we aggregate data.


This aggregated data is useful for


1. Compacting our data when there is too much of it to deal with

2. Getting an overall sense of our data

3. Setting up charts

4. Reporting


Data is grouped by either rows which creates a longer vertical summary or columns which creates a wider horizontal summary.


In Excel it's common to use pivot tables to create 2 dimensional summary reports using both rows and columns.


For reporting this can be useful as we can more efficiently utilise both horizontal and vertical space to visualize our data.


When it comes to analysis and data science work though it's more useful to have data in a long or tidy format.


If you'ved worked with Excel for a while you've probably recieved the output from a pivot table that doesn't contain the underlying data and is thus difficult to use for any further analysis.


Long or tidy format is how the computer needs to the data to be formatted for almost all processing.


Because of this it's likely that an analyst will spend far more time reversing the results of data that has been transposed horizontally.


When we go through the code I will show you how to use all of these functions.


As you get more into data science you'll probably notice that it tends to be more useful to perform all aggregations through the group_by function without also using the pivot_wider function to horizontally transpose the results.


This leaves the data in a format that can subsequently pipped into other functions like data visualizations more easily.

Vlookups to Joins Preview 09:07

Getting Data Into Excel with Power Query Preview 07:31

Here we will go over how to export your data from an R program and import it into Excel using Power Query. 

Data Visualisation Libraries to Install for this section Preview 00:56

R programming libraries to install for data visualization. 

Data Visualisation Options Preview 04:51

R programming data visualisation options

GGPlot Intro and Setting Aesthetic Properties Preview 06:23

ggplot is the tidyverse  r programming library for creating static visualisations. It's very versatile, works well with plotly interactive visualisations and is a good way for understanding other libraries which use a similar methodology such as high charter. 

GGPlot Geoms (aka Chart types) Preview 06:07

Tidyverse R programming ggplot geoms allow you to specif chart types

GGPlot Layering Multiple Charts and Text Preview 06:06

Tidyverse r programming ggplot layering multiple charts and text

GGPlot Creating Multiple Charts with Facets Preview 01:56

Tidyverse r programming ggplot layering multiple charts with Facets

Plotly Preview 07:19

Tidyverse r programming upgrading your ggplots to interactive HTML widgets using Plotly

Highcharter Preview 03:52

Tidyverse r programming HighCharter is a wrapper library for high charts which is a beautifully implemented interactive visualisation library.

Data Tables Preview 08:53

Tidyverse r programming DataTables JS is a wrapper to a popular java script control for displaying and working with interactive tables suitable for HTML dashboards

RMarkdown Preview 04:33

Tidyverse r programming RMarkdown allows you to combine text, layout and code into a single document which can be output to a number of formats. RMarkdown is a prerequisite for Flexdashboards

Flexdashboards Preview 04:41

Tidyverse r programming Flexdashboards are a special type of rmarkdown document which enables to very efficiently produce dashboards with very little code. I highly recommend flexdashboards for either distributing your work as stand alone HTML files or for managing the layout of Shiny applications as the code is much less than a standard Shiny UI. 

Fixing Errors Preview 07:17

Tidyverse r programming identifying common errors and their solutions

FAQ Intro Preview 01:03

R vs Python Preview 18:26

R & Python are the top 2 programming languages for data analytics and data science right now. There is a lot of debate over which one is better and the information tends to get a bit skewed. R & Python are each stronger at different things. This video covers my opinion on the strengths of each language so that people can make a more informed decision about which one to use for their circumstances.

Fill Missing Values Preview 04:51

Using the zoo timeseries functions to fill down missing values.

NA Values and Bulk Column Operations Preview 05:34

Want to efficiently deal with NA values and bulk column operations?

Data cleaning and prep is one of the most time-consuming tasks in data science. Learn some powerful tips to streamline those processes here.

Email Marketing Analysis with R Preview 19:32

Want to learn how to access and analyse email with R?

How about no code data visualisation with R?

R Script into Dashboards Preview 32:58

In this video I'm going to go over a number of questions from one of students. You'll learn how you can integrate existing R scripts into dashboards, how to clean up your code, functional programming, converting from wide to long data formats to make ggplot visualisation code more efficient and more.

Q&A R Script to R Shiny Flexdashboard, ggplot, reproducibility, functional programming, rmarkdown

Load R Shiny Flexdashboards Faster Preview 15:49

Why is it taking so long to render my dashboards after I've created them?

After watching this video this student was able to cut a long load time by over 50% which is a great result.

Load 1000 CSV Files in 3 Seconds to Dataframe Preview 05:19

In this video we will look at how we can use PURRR Map or more specifically MAP_DF from the PURRR library along with read_csv from the readr library to load 1000 csv files into a dataframe within 3 seconds.

PURRR and READR are both part of the R Tidyverse core library and ecosystem.

Map is a powerful alternative to for loop and lapply. Map is more concise and functionally focused when compared to for loops.

Map when compared to lapply allows for us to return values as explicit data types other than lists. For example map_df will return your results in a dataframe / tibble eliminating the need to do a row_bind after calling lapply. Map_dbl or map_int will return numeric vectors which can easily be passed into addtional numeric functions. Map_chr will return a character vector which you can pass into stringr operations such as str_c, paste or paste0.

Combine Excel Files Fast Preview 17:20

Demonstrates how you can use R Tidyverse to combine Excel files

R Shiny Flexdashboard Export Data Preview 13:14

In this video I show you how to build in an export function into your R Shiny Flexdashboards.

Dynamic Dashboard Filters Preview 06:33

Learn how to filter and present tabular data through the dashboard and make it more interactive and dynamic.

Dynamic Column Selector Preview 05:06

Learn how and why you might want to implement a dynamic column selector for your interactive dashboards.

A lot of the time when you're doing data analysis you end up with many more columns than you can see on the screen.

Now you could basically subset the columns so you're just seeing less columns but the problem with that is that maybe you just want to see a different column at different points in time or you have different clients with different requirements.

So instead of hard-coding these values in what we can do is we can create a dynamic selection instead so that anybody can have whatever they want or whatever you let them have whenever they want to actually see it.

Pivoting Preview 09:12

In this lecture we will review how pivoting data in R programming is different to Excel. 

Bonus Lecture: More Resources to Reach Your Goals Faster Preview 00:54