Chapter 22: Automation (RPA) and AI Agents
Learning outcomes
At the end of this chapter, you should be able to
- Explain the structure of AI agents
- Describe accounting applications of AI agents
- Use an AI agent for data entry
- Use an AI agent for report creation
Chapter content
Note: include practice and knowledge checks throughout chapter
# Automation
A primary benefit of modeling data and of using and developing data-related tools is to reduce tedious, mechanical work. By automating these processes, individuals can accomplish more in less time and can spend more time on value-adding work. Many of the methods and tools discussed in this course can be used to automate business functions. When a process is well prepared, automation can be simple and efficient. This chapter discusses additional topics related to automating tasks such as those covered in this class.
## Annotating code
To make automation successful, keeping clear and accurate records is usually necessary. Careful annotation of and within code helps users understand and repeat the automation. Most of the time, more annotation and explanation is better than less.
In R, code can be annotated with comments:
```r
# These are comments to explain the code below
```
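As a small illustration of annotation in practice, the sketch below comments each step of a short calculation. The figures and variable names are hypothetical and are only meant to show the style.

```r
# Purpose: compute gross profit for one month
# The figures below are hypothetical and used only for illustration
revenue <- 12000   # total sales for the month
cogs    <- 7500    # cost of goods sold

# Gross profit = revenue minus cost of goods sold
gross_profit <- revenue - cogs
gross_profit
```

A reader who has never seen this script can tell from the comments what each number represents and why the subtraction is done.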
Code can also be stored and explained in notebook software. In R, this is “Quarto”, and the document is stored in a “.qmd” file. Similar software, such as Jupyter notebooks, is also available for R, Python, and other languages.
## Functions
Specific, repeated tasks can be simplified by writing functions for them. Later calls to a function then automate those steps. We have used functions throughout the book. Here we review the parts of building a function.
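```r
FunctionName <- function(Inputs){
  # FunctionOfInputs() is a placeholder for whatever steps act on the inputs
  result <- FunctionOfInputs(Inputs)
  # The last expression is the value the function returns
  result
}
```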
Creating a function requires a name that can later be used to call it. Here the function is called “FunctionName”. The keyword “function” creates the function. “Inputs” are the arguments passed into the function. An input can be a single variable, a data frame, or any other R object, and multiple inputs are separated by “,”. Within the function, R operations are applied to the inputs to create a result. A function returns a single R object, so multiple objects must be combined into a list, data frame, or other container before being returned. The last expression evaluated in the function is the value returned when the function finishes. The steps inside the function may be simple or complicated; they only need to convert the inputs into an output.
Default values can be created for inputs, for example `function(x, a = 1, b = 0)`. In this case, a and b have default values, while x is required and has no default.
The function is called with the FunctionName and inputs:
```r
FunctionName(Inputs)
```
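To make the template concrete, here is a minimal sketch of a function with two required inputs and one input with a default value. The function name `profit_margin` and its arguments are hypothetical and chosen only for illustration.

```r
# Hypothetical helper: profit margin as a percentage.
# revenue and cost are required; digits has a default value of 1.
profit_margin <- function(revenue, cost, digits = 1) {
  margin <- 100 * (revenue - cost) / revenue
  round(margin, digits)   # the last expression is the returned value
}

profit_margin(1500, 1000)              # uses the default digits = 1
profit_margin(1500, 1000, digits = 3)  # overrides the default
```

Because `digits` has a default, the first call can omit it, while the second call overrides it by name.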
## Scheduling
In addition to running steps automatically within a function, we may want to run functions or entire programs automatically. This can be done with schedulers, which are available in most operating systems. We can also schedule an automatic task from R. Suppose we have a program that we want to run to generate output for a regular meeting. Here we will create a simple program.
```r
# Store the one-line program as text, then write it to a script file
dailymtg <- "print('This output is for our daily meeting.')"
writeLines(dailymtg, "dailymtgscript.R")
```
We can then schedule the task. Here it will run on three days of the business week, using the taskscheduleR package, which works through the Windows Task Scheduler:
```r
library(taskscheduleR)
# taskscheduleR expects the full path to the script
taskscheduler_create(taskname = "Daily_Mtg", rscript = normalizePath("dailymtgscript.R"),
                     schedule = "WEEKLY", starttime = "07:55", days = c("MON", "WED", "FRI"))
```
Other functions list what we have scheduled and stop or delete a scheduled task.
```r
taskscheduler_ls()
taskscheduler_stop("Daily_Mtg")
taskscheduler_delete("Daily_Mtg")
```
Suppose that you wanted to update the daily report after updating a data set. You could schedule a data analysis update script first and then schedule the report script for a later time.
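The two-step idea above can be sketched as follows. The script names, task names, and start times are hypothetical, and the scheduling calls are wrapped in a function that is defined but not run, since taskscheduleR only works on Windows.

```r
# Hypothetical placeholder scripts; real ones would hold the update and report code
writeLines("print('Updating the data set.')", "updatedata.R")
writeLines("print('Building the daily report.')", "dailyreport.R")

# Define, but do not run, a helper that schedules both tasks.
# The data update starts first so it can finish before the report runs.
schedule_report_tasks <- function() {
  taskscheduleR::taskscheduler_create(
    taskname = "Update_Data",
    rscript = normalizePath("updatedata.R"),
    schedule = "DAILY", starttime = "07:30")
  taskscheduleR::taskscheduler_create(
    taskname = "Daily_Report",
    rscript = normalizePath("dailyreport.R"),
    schedule = "DAILY", starttime = "07:55")
}
```

Calling `schedule_report_tasks()` on a Windows machine with taskscheduleR installed would register both tasks. The gap between the start times is a guess and should be long enough for the data update to finish.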
## Report building
Accounting tasks can involve generating similar reports on a regular basis. Accounting software often can generate standardized output for these reports. Other times reports are ad hoc but repetitive or expand on automated output.
Most autogenerated reports are best produced in whatever software the company uses or in the software you use for data analysis. Here we will walk through the steps to create an automated report from data tables that are updated regularly, so that the report updates automatically.
Quarto is a publishing system built on Markdown (the successor to R Markdown). It works like word processing software that generates formatted text, but it also allows data and code to be incorporated into the document. We will only introduce enough of Quarto to build a basic presentation. For more complicated features and a broader introduction [see here](https://quarto.org/docs/get-started/hello/rstudio.html).
The files for the quarto document created here can be found in [the file folder here](https://www.dropbox.com/scl/fo/qitjsh99ow2okfh36rkdh/ANy9XI_TAUeqWml9ie3fpXE?rlkey=a4kj78uo1kvh8qvv39xow097m&st=r9wfb6p6&dl=0).
### Create new quarto document
To create a new quarto document:
1. In RStudio, go to File–>New File–>Quarto Presentation (or Document if not a presentation)
2. Select the type of presentation, here Reveal JS, and name the presentation. A template document is shown. The default editing mode is “Visual” rather than “Source”, so standard word processing options appear (font formats, insert, etc.). New items can be added to the file with “Insert”.
3. Select “Render”. You will be required to save the file, and then the presentation will be “knit” together and opened. I am saving the document as “ExPresentation”, so it will be saved as “ExPresentation.qmd”.
4. The presentation opens in the default browser and can be navigated as a presentation.
### Creating standard output
The presentation can be created in much the same way that a document is created in word processing software. Here I will update some of the content. For a presentation, sections indicate new slides, and the content within a section is what is shown on that slide. A new section is created by adding text formatted as a level-2 header (“Header 2”). Lower-level headers create sections within a slide.
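In source mode, the slide structure described above might look like the sketch below. The file name `ExSlidesSketch.qmd` and the slide text are hypothetical, and the file is written from R only so that the example is self-contained.

```r
# Build the lines of a minimal Reveal JS presentation (hypothetical content)
slides <- c(
  "---",
  "title: \"Example slides\"",
  "format: revealjs",
  "---",
  "",
  "## First slide",            # a level-2 header starts a new slide
  "",
  "Text shown on the first slide.",
  "",
  "## Second slide",           # another level-2 header, another slide
  "",
  "### A section within the slide",  # lower-level header stays on the same slide
  "",
  "- A bullet point"
)

# Write the lines to a .qmd file that can be opened and rendered in RStudio
writeLines(slides, "ExSlidesSketch.qmd")
```

Opening the file in RStudio and selecting “Render” should produce two slides. If the quarto R package and the Quarto command-line tool are installed, `quarto::quarto_render("ExSlidesSketch.qmd")` is another way to render it.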
### Including R output
R code can be included in the document in a code block. An executable code block can be inserted from the Insert menu as an “executable cell”, with R selected as the language. The code can then be run within the document.
In the presentation I will read in some data, summarize it, include a graph, and a formatted table. Note that one of the tables is too large for the presentation. I change the options in the YAML header at the top to make the table scrollable. I also include a few other options to show how to change some presentation options. Other options [can be found here](https://quarto.org/docs/reference/formats/presentations/revealjs.html).
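The contents of an executable cell for such a slide could resemble the following sketch. The `sales` data frame is hypothetical and stands in for the chapter's data files; the table step uses `knitr::kable()` only when knitr is installed.

```r
# Hypothetical data standing in for the chapter's data tables
sales <- data.frame(
  product = c("A", "B", "C"),
  revenue = c(1200, 1500, 1800),
  cost    = c(900, 1000, 1300)
)
sales$profit <- sales$revenue - sales$cost

# Quick numeric summary of profit
summary(sales$profit)

# Simple graph of profit by product
barplot(sales$profit, names.arg = sales$product,
        main = "Profit by product", ylab = "Profit")

# Formatted table when knitr is available; plain printout otherwise
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::kable(sales, caption = "Sales by product")
} else {
  print(sales)
}
```

When the presentation is rendered, the summary, graph, and table appear on the slide that contains the cell, and they change whenever the underlying data change.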
### Making auto updated components
Inline code can reference information in the data and R code, which can then be formatted and included in the text. Inline code is included in the text by using “`{r} Statements`”, where the statements are the items to evaluate and include in the text output. Here I include profit margin information in the document that updates based on the data.
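As a minimal sketch of how this could work, the value is computed in a code cell and then referenced in the slide text. The data frame and the name `profit_margin` are hypothetical.

```r
# Hypothetical figures for illustration
sales <- data.frame(revenue = c(1200, 1500, 1800),
                    cost    = c(900, 1000, 1300))

# Overall profit margin as a percentage, rounded to one decimal place
profit_margin <- round(
  100 * (sum(sales$revenue) - sum(sales$cost)) / sum(sales$revenue), 1)

# In the slide text, the value can then be referenced inline, for example:
#   The profit margin this period was `{r} profit_margin` percent.
```

If the figures in `sales` change, the sentence on the slide changes with them the next time the presentation is rendered.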
## Review
In this chapter we have discussed some of the ways to automate tasks. We have discussed annotating code, creating functions, scheduling tasks, and creating reports.
### Conceptual questions
1. Why is it important to annotate code?
2. What are some of the ways that functions can be used to automate tasks?
3. Describe some circumstances in which scheduling tasks might be useful.
4. What are some of the ways that reports can be automated?
### Practice questions
1. Create a function that takes a data frame and returns the mean of the data frame.
2. Save the function from 1 to a file and prepare the scheduler to schedule running the function once a week on Sunday at 10:00 pm.
3. Create a quarto presentation from the automatic template and alter some content and format of the presentation.
4. Add a code block to the presentation that reads in a data set and summarizes it.
5. Add in-line code on a slide that adds the mean of two columns from the data set in 4.
## Solutions to practice questions
1. The following code is one way to create a function that takes a data frame and returns the mean of the data frame:
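```r
meanfunction <- function(df){
  # mean(df) returns NA for a data frame, so convert the numeric columns first
  numeric_cols <- df[sapply(df, is.numeric)]
  mean(as.matrix(numeric_cols))
}
```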
2. The following code is one way to save the function from 1 to a file and prepare the scheduler to schedule running the function once a week on Sunday at 10:00 pm:
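```r
# Save the function definition to a script file
writeLines("meanfunction <- function(df){
  numeric_cols <- df[sapply(df, is.numeric)]
  mean(as.matrix(numeric_cols))
}", "meanfunction.R")

library(taskscheduleR)
# taskscheduleR expects the full path to the script
taskscheduler_create(taskname = "MeanFunction", rscript = normalizePath("meanfunction.R"),
                     schedule = "WEEKLY", starttime = "22:00", days = "SUN")
```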
3. See [here for customizing a presentation](https://quarto.org/docs/reference/formats/presentations/revealjs.html).
4. The following code is one way to add a code block to the presentation that reads in a data set and summarizes it. First choose the executable cell option from the Insert menu and select R:
```r
data <- read.csv("data.csv")
summary(data)
```
5. The following code is one way to add in-line code on a slide that adds the mean of two columns from the data set in 4:
```markdown
The mean of the two columns is `r mean(data$column1 + data$column2)`.
```
Tutorial video
Note: include practice and knowledge checks
Mini-case video
Note: include practice and knowledge checks