Unholy Trinity of R, Excel VBA and Powerpoint - Part 2 of 2

Using R to trigger Excel VBA macros to create PowerPoint presentations

R
Excel
VBA
PowerPoint
openxlsx
RDCOMClient
Author

Vishal Katti

Published

December 29, 2021

Abstract
This post demonstrates how to run VBA macros in Excel which in turn creates Presentations based off PowerPoint Templates.

This is part 2 of 2. Read part 1 here.

Quick Recap

In the previous post, we create the .potx template from the .pptx file we wanted to automate and the Excel template with the macro .xlsm that uses the PowerPoint template to create a new .pptx file with given data using VBA.

The report we want to automate is…

Figure 1: The Gapminder Report : The PowerPoint presentation we want to automate

…and the Excel and PowerPoint template we created are shown in Figure 2.

Figure 2: Excel Template with VBA macro

In this post, we will write the R script that will first massage the data into desired format and then load the data for one region into the Excel template and execute the VBA macro that will create the PowerPoint file with that data.

Strategy

Before we dive into code, we need to check a few things. We wish to create a presentation for each continent in the Gapminder data. A closer look at the Presentation will tell you what kind of data we need for each slide/graph/table while the Excel template will reveal what should the structure of each dataset should be. While looking into this structure, some questions will pop-up. The idea here is to create the datasets in such a way that they can be easily filtered for each continent and the resultant table can be written to the Excel template without any or very little modification. Let us proceed slide-by-slide.

Creating the datasets

Slide 1

Slide 1 is the title page and needs 2 strings; one for Title, one for Subtitle. The Title for the base presentation is “World Population”. For each continent, it could be “<continent_name> Population”. The subtitle is a combination of Author Name and Created Date. So we need a string like “<author_name> | <created_date>” where created_date is the formatted system date.

These strings can be created while writing the data to the Excel template.

Slide 2

The chart on slide 2 needs raw data structured as below. You will notice that at a continent-level, this table needs a minimum of 5 countries. Do we have any continents in the Gapminder data with less than 5 countries? Yes, we have Oceania with only Australia and New Zealand. For ease of use, let us include these countries along with Asian countries in a new Region variable.

Figure 3: 02_chart

We will create the region variable in the gapminder data. But first, let us load some relevant packages.

Code
options(tidyverse.quiet = TRUE)
library(tidyverse) # duh!
library(rmarkdown) # to display the tables interactively in this post. Not really needed for the final solution.
library(openxlsx) # to write the data to the Excel Template.
# library(RDCOMClient) # to load and run the Excel macro post data load.
Code
# Read in Gapminder data
gp <- gapminder::gapminder

# Create new region variable
gp <- gp %>%
  mutate(region = if_else(as.character(continent) %in% c("Asia","Oceania"),
                          "Asia-Pacific", 
                          as.character(continent)),
         country = as.character(country))

# Keep only relevant columns
gp <- gp %>% select(region, country, year, pop)

# View details
glimpse(gp)
Rows: 1,704
Columns: 4
$ region  <chr> "Asia-Pacific", "Asia-Pacific", "Asia-Pacific", "Asia-Pacific"~
$ country <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "A~
$ year    <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 20~
$ pop     <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372, 1288~

Now that we have the source data available, we must now create the datasets we need that we can write to the Excel template for each region.

The required table in Figure 3 shows the top 4 countries (based on 2007 population) and all other countries clubbed into ‘others’ in a given region and then the total population of the region on a yearly basis. This table has to be created for all 4 regions.

Code
pop_trend <- gp %>%
  group_by(region, country, year) %>% 
  summarise(pop = sum(pop, na.rm = TRUE),
            .groups = 'drop') %>%
  mutate(pop = round(pop/1E6, 0)) %>% # population in millions
  pivot_wider(names_from = year, values_from = pop, names_sort = TRUE) %>% 
  arrange(desc(`2007`)) # sort by max pop to min pop in latest year i.e. 2007

paged_table(pop_trend)

Now that we have the required columns, let’s plan the row order. We notice that, for each region, we have the top 4 countries (as per 2007) , followed by ‘Others’. Let’s create the top-4 dataset.

Code
top4 <- pop_trend %>% 
  group_by(region) %>% 
  slice_max(`2007`, n = 4, with_ties = FALSE) %>% 
  ungroup()

paged_table(top4)

To create the others dataset, we exclude all countries that are present in the top-4.

Code
others <- pop_trend %>% 
  filter(!country %in% top4$country) %>% 
  group_by(region) %>% 
  summarise(across(.cols = -country, .fns = sum),
            .groups = 'drop') %>% 
  mutate(country = "Others") %>% 
  select(region, country, everything())

paged_table(others)

While we create the top-4 and others datasets separately, we will combine them later at the very last moment before writing them to the Excel template.

Now that we have the datasets needed for 02_chart, let’s proceed to the create 02_table . This table gives you the count of countries that fall under various population ranges.

Figure 4: 02_table on Slide 2

Let’s create 02_table. To create this table, we first create a new variable called pop_range.

Code
pop_levels <- c('Less than 500K','500K - 1 Million',
                '1M - 10 Million', '10M - 100 Million',
                '100M - 1 Billion', 'More than 1 Billion')

gp2007 <- gp %>% 
  filter(year == 2007) %>% 
  mutate(pop_range = case_when(pop < 5E5 ~ pop_levels[1],
                               pop < 1E6 ~ pop_levels[2],
                               pop < 1E7 ~ pop_levels[3],
                               pop < 1E8 ~ pop_levels[4],
                               pop < 1E9 ~ pop_levels[5],
                               TRUE      ~ pop_levels[6]),
         pop_range = factor(pop_range, levels = pop_levels))

pop_groups <- gp2007 %>% 
  group_by(region, pop_range, .drop = FALSE) %>% 
  summarise(`# of Countries` = n(),
            .groups = 'drop') %>% 
  arrange(region, pop_range) %>% 
  rename(`Population Category` = pop_range)

paged_table(pop_groups)

Slide 3

Slide 3 contains 2 strings and one chart. The data for the chart looks as shown below.

Figure 5: 03_chart table for Slide 3

The data for 03_chart is the list of top 10 countries in each region as per latest record i.e. 2007. Let’s create the top10 table.

Code
top10 <- gp %>% 
  filter(year == 2007) %>% 
  group_by(region) %>% 
  slice_max(pop, n = 10, with_ties = FALSE) %>% 
  ungroup() %>% 
  select(-year) %>% 
  mutate(pop = round(pop/1E6, 4)) %>% # population in millions
  set_names(c("region","country","population"))

paged_table(top10)

The for loop!

We now have to load the Excel template with the data at appropriate cell locations for one region at a time. Since we have about 4 regions, we will create a vector of unique regions to iterate over.

Code
unique_regions <- gp %>% distinct(region) %>% pull()
cat(unique_regions, sep = "\n")
Asia-Pacific
Europe
Africa
Americas

As our last step, we will create the for loop that will iterate over unique_regions , filter the datasets for each region, write them to the Excel Template, save the template with temporary name. We save the file with different name to prevent unintentionally corrupting the Excel macro template. Finally, we run the macro in the renamed file.

The code will look something like this

Code
for (region in unique_regions) {
  
  # Step 1: filter the data sets
  # Step 2: write the data sets
  # Step 3: save the excel template with different name
  # Step 4: load the renamed Excel file
  # Step 5: run macro
}

Let’s populate the above for loop with the code we need.

Code
for (curr_region in unique_regions) {
  
  # Step 1: filter the data sets
  
  # Slide 1
  S1_title <- paste(curr_region, "Population")
  S1_subtitle <- paste("Vishal Katti","|",format(Sys.Date(),"%b %d, %Y"), sep = "   ")
  
  # Slide 2
  S2_title <- paste(curr_region, "Population since 1952")
  
  S2_top4 <- top4        %>% filter(region == all_of(curr_region)) %>% select(-region) %>% arrange(desc(`2007`))
  S2_others <- others    %>% filter(region == all_of(curr_region)) %>% select(-region)
  S2_top5 <- bind_rows(S2_top4, S2_others)
  
  S2_table <- pop_groups %>% filter(region == all_of(curr_region)) %>% select(-region)
  
  # Slide 3
  S3_title <- paste("Top 10 most populated countries in", curr_region)
  
  S3_chart <- top10      %>% filter(region == all_of(curr_region)) %>% select(-region)
  
  S3_factoid <- paste("The population of", S3_chart$country[1], "is approx.",
                      round(S3_chart$population[1]/S3_chart$population[10], 0),
                      "times that of", S3_chart$country[10])
  
  # Step 2: write the data sets
  
  # Load the template
  wb <- loadWorkbook("path/to/template/XL2PPT.xlsm") # relative to this R script
  sht <- "Sheet1"
  
  # write data to coordinate (col, row)
  writeData(wb, sht, S1_title,    xy = c(3, 3),  colNames = FALSE)
  writeData(wb, sht, S1_subtitle, xy = c(3, 4),  colNames = FALSE)
  writeData(wb, sht, S2_title,    xy = c(3, 7),  colNames = FALSE)
  writeData(wb, sht, S2_top5,     xy = c(3, 9),  colNames = TRUE)
  writeData(wb, sht, S2_table,    xy = c(18, 9), colNames = TRUE)
  writeData(wb, sht, S3_title,    xy = c(3, 18), colNames = FALSE)
  writeData(wb, sht, S3_factoid,  xy = c(3, 19), colNames = FALSE)
  writeData(wb, sht, S3_chart,    xy = c(3, 21), colNames = TRUE)
  
  # Step 3: save the excel template with different name
  saveWorkbook(wb, "path/to/template/XL2PPT_edited.xlsm", overwrite = TRUE)
  gc(verbose = TRUE)
  Sys.sleep(2)
  
  # Step 4: load the renamed Excel file
  # Create Excel Application
  xlApp <- COMCreate("Excel.Application")

  # Open the Macro Excel book
  xlWbk <- xlApp$Workbooks()$Open(normalizePath("path/to/template/XL2PPT_edited.xlsm", winslash = "/")) # Change to your directory
  # its ok to run macro without visible excel application
  # If you want to see your workbook, please set it to TRUE
  xlApp[["Visible"]] <- FALSE
  
  # Step 5: run macro
  xlApp$Run("Create_Continental_Deck") # Name of Macro to run

  xlWbk$Close(TRUE) # save and close excel book
  xlApp$Quit()
  gc(verbose = TRUE)
  Sys.sleep(2)
}

Once the code runs completely, you will see 4 new PowerPoint Presentations in your working folder.

Figure 6: Output Files

You can download the full R script from here.

References & Citations

Reuse

Citation

BibTeX citation:
@online{katti2021,
  author = {Katti, Vishal},
  title = {Unholy {Trinity} of {R,} {Excel} {VBA} and {Powerpoint} -
    {Part} 2 of 2},
  date = {2021-12-29},
  url = {https://vishalkatti.com/posts/R2VBA2PPT2},
  langid = {en},
  abstract = {This post demonstrates how to run VBA macros in Excel
    which in turn creates Presentations based off PowerPoint Templates.}
}
For attribution, please cite this work as:
Katti, Vishal. 2021. “Unholy Trinity of R, Excel VBA and Powerpoint - Part 2 of 2.” December 29, 2021. https://vishalkatti.com/posts/R2VBA2PPT2.