Wednesday, October 16, 2013

VBA front end for R

If you work in the analytics industry, I am sure you would have had this debate sometime or the other – pros and cons of R. While everyone agrees that R is quite powerful and has great graphics, most of us, especially those who have worked on GUI based tools like SASEG, etc agree that the text output of R can be pretty verbose. A colleague of mine ran a linear model and immediately exclaimed ‘it looks so bland!’

This set me exploring ways to beautify R – I found some interesting packages which would help in formatting the output – you can check out prettyR and html converter packages which can do wonders to the plain text output in R. However, my requirements were a little customized. We used excel in most of our day to day activities and VBA is quite powerful in parsing/formatting the results. So, why not use Excel and VBA to create a beautiful front end to run R? It could be a macro enabled tool which will read input from an excel sheet, run the regression code using RScript and display the formatted output on excel. Well, turns out that I was able to do all that and even more – this post explains the findings of my endeavor:

The following questions will be answered during the course of this article:
- How to run an RScript through MS Excel using VBA?
- How to run an RScript through command prompt? [in WINDOWS]
- How to pass arguments to an RScript through command line/external code? [in WINDOWS]
- How to read plain text files in MS Excel using VBA? [obviously WINDOWS :) ]

Just so that we know that our commands are executed correctly, let us write the following simple R code and save it in our directory ‘C:\R_code’ as ‘hello.R’

Contents of ‘C:\R_code\hello.R’
# Prints output to console
cat('Hello World')
var1<-5^3
var2<-7^3
cat('\nThe result of adding',var1,'to',var2,'is',var1+var2)

Running RScript through command prompt:

To be able to run R code through command prompt/other applications, you need to have the path of ‘R.exe’ and ‘RScript.exe’ in your system variable PATH. You can do this easily if you have admin rights to your system. Check this link to know how to do it on WINDOWS7. However, if you don’t have admin rights and want to add something to the PATH variable, don’t worry – you can easily add this to the USER variable PATH. Here are the steps on how to do this:

1. Suppose you have the ‘R.exe’ and ‘RScript.exe’ installed in the following directory: ‘C:\Program Files\R\R-2.15.3\bin\x64’. Copy this path to your clipboard.
2. Go to ‘Computer’ -- > Properties
3. On the left pane, click on ‘Advanced system settings’
4. On the ‘System properties’ dialog that opens up, navigate to the ‘Advanced’ tab and click on ‘Environment variables…’


5. On the ‘User variables’ click on ‘New…’


6. In the field ‘Variable Name:’, type PATH
7. In the field ‘Variable Value:’, paste the clipboard value, ie ‘C:\Program Files\R\R-2.15.3\bin\x64’. Add a semicolon ‘;’ after that.
8. Click on ‘Ok’ as many times to dismiss all dialog boxes.
9. Open command prompt and type ‘Rscript’ and hit ENTER. You will see the following:


Now that you have Rscript on your path, you can run R code from any directory on your system, including applications like MS-Excel through VBA. Just repeat step 9 by passing any *.R file as argument with the full path and it will execute as expected:


Running RScript through VBA:

To run this code in MS-Excel using VBA, you need to open a macro enabled worksheet (*.xlsm). To create a new macro enabled sheet, just create a new workbook and click on ‘Save As..’ and save as ‘Excel Macro-Enabled Workbook (*.xlsm)’. Once you have a macro-enabled workbook open, press the shortcut key combination ‘ALT + F11’ to open up the VBA editor. Once that is done, right click on the ‘Project Explorer’ to create a new module (which will be Module1 by default) and then type the following VBA code:

Sub RunRscript()
'runs an external R code through Shell
'The location of the RScript is 'C:\R_code'
'The script name is 'hello.R'

Dim shell As Object
Set shell = VBA.CreateObject("WScript.Shell")
Dim waitTillComplete As Boolean: waitTillComplete = True
Dim style As Integer: style = 1
Dim errorCode As Integer
Dim path As String
path = "RScript C:\R_code\hello.R"
errorCode = shell.Run(path, style, waitTillComplete)
End Sub

If you look at the VBA code carefully, it creates a Windows Shell object and invokes the R command through the shell. Also, the advantage of using Wscript.shell is that you can get VBA to wait till the execution is finished. To get more information on how to run a macro or use the VBA editor, you can refer to a lot of online tutorials that are easily available. A good place to start would be the MSDN tutorial which you can find here.

When you run this macro, you can see that a command window opens up, executes something and closes. But how do you know if the code has actually executed? A good way to redirect the console output on the R code to a file. You can use this by the sink function in R. Here is the modified R code which accomplishes the same:

# Re-directs the console output to a file 'hello.txt'
# The file is created in the directory 'C:\R_code'

sink('C:/R_code/hello.txt',append=F,type="output")
cat('Hello World')
var1<-5^3
var2<-7^3
cat('\nThe result of adding',var1,'to',var2,'is',var1+var2)
sink(NULL)

Once you’ve run the VBA macro, browse to C:\R_code and check if the ‘hello.txt’ has been created or not. If you can find the file there, then congratulations! You have run successfully used VBA to execute an R script.

Passing arguments to an RScript through command line/VBA:

Most of the work which we do requires us to pass inputs/parameters to a tool at runtime. In the code above, let’s say, we wanted ‘var1’ and ‘var2’ to be passed during runtime instead of being hardcoded the way they are right now. Let us create a simple excel tool which accepts two numbers and adds them, the front end would look like this:


Now, the only change in the VBA code would be to read inputs from cells D5 and F5 and pass it on to the RScript. The modified code would look like:

Sub RunRscript()
'runs an external R code through Shell
'The location of the RScript is 'C:\R_code'
'The script name is 'hello.R'
Dim shell As Object
Set shell = VBA.CreateObject("WScript.Shell")
Dim waitTillComplete As Boolean: waitTillComplete = True
Dim style As Integer: style = 1
Dim errorCode As Integer
Dim var1, var2 As Double
var1 = Sheet1.Range("D5").Value
var2 = Sheet1.Range("F5").Value

Dim path As String
path = "RScript C:\R_code\hello.R " & var1 & " " & var2

errorCode = shell.Run(path, style, waitTillComplete)

End Sub

Now, the VBA code is ready to pass two extra parameters to the Rscript and get it executed. But, the change on the input side means we will also have to change the R code to accept the input parameters and process them. This can be accomplished very well using the commandArgs function in R which will read the arguments and store it as a vector. The code changes as below:

# Accepts two numbers and adds them
# Re-directs the console output to a file 'hello.txt'
# The file is created in the directory 'C:\R_code'

args<-commandArgs(trailingOnly=T)
# cat(paste(args,collapse="\n"))
sink('C:/R_code/hello.txt',append=F,type="output")
cat('Hello World')
var1<-as.numeric(args[1])
var2<-as.numeric(args[2])
cat('\nThe result of adding',var1,'to',var2,'is',var1+var2)
sink(NULL)

Note the use of the ‘trailingOnly=T’ option in the commandArgs function. This would make the args vector store only those arguments which are passed by the USER. In addition to the USER arguments, RScript passes some system arguments by default. If you are interests in modifying those (like the directory of the R file, etc), then you would probably keep the trailingOnly argument to FALSE.

You now learnt how to invoke R from excel and how to pass data between R and excel. You can build on these two functionalities to develop some cool stuff which use Excel as front end and R as the backend. By the use of packages like ‘xlsx’ which can create data frames from excel sheets, you can go on to build so many applications like these:



Reading text/picture files in MS Excel using VBA:

Although this part does not contain any R codes, I am posting this for the sake of completeness. This way, you will have one complete tool to play with. Once you have the output of R in a text file/picture file, you can read it back into Excel using VBA and display the nicely formatted result in excel. This part will be particularly useful if you want to create a tool that reads data from excel, does some statistical analysis using R in the backend and then displays the summary of the analysis. Here is the VBA code you can use to parse through a text file:

this code will read from a file hello.txt and store the result Sheet2 starting from range A1 in consecutive rows
Dim sFile As String
sFile = "C:\R_code\hello.txt"

Dim rowNum As Integer
rowNum = 1
Set dest = Sheet2.Cells(rowNum, 1)

Open sFile For Input As #1
Do Until EOF(1)
    Input #1, ReadData
    If Not IsEmpty(ReadData) Then
        dest.Cells = ReadData
        rowNum = rowNum + 1
        Set dest = Sheet1.Cells(rowNum, 1)
    End If
Loop
Close #1 'close the opened file

And the below code can be used to copy pictures into VBA:

Inserts a picture located in R_code into Sheet2 at position A1 onwards
Sheet2.Range("$A$1").Select
Dim sFile As String
sFile = C:\R_code\mypicture1.jpg"

ActiveSheet.Pictures.Insert(sFile) _
        .Select
Selection.ShapeRange.Height = 324
Selection.ShapeRange.Width = 396
    With Selection.ShapeRange.Line
        .Visible = msoTrue
        .ForeColor.ObjectThemeColor = msoThemeColorText1
        .ForeColor.TintAndShade = 0
        .ForeColor.Brightness = 0
        .Transparency = 0
    End With

So, with just little bit of code to format your results, you can get nicely formatted results in the way that you want. Below is the sample of the output from a linear regression model showing model accuracy, beta coefficients (from text file) and residual plots (from picture):


This is just the beginning. Once you have integrated R with VBA and vice versa, there is no limit to what you can achieve. R can be a powerful backend for computations where excel fails and I am sure we all agree that excel is still the de-facto standard for sharing and displaying summary reports. By using the interface techniques mentioned in this post, you can make the two of these complement each other very well. I would encourage you to try this out and let me know your thoughts in the comments below. If you like this post, then please follow this blog for more interesting posts, and tell your friends too :)




49 comments:

  1. Hi Shashia,

    Really nice blog. This is exactly what i need. The cmd part works. However, I am still struggeling with the VBA code to run the R script. I copied your code and changed te directory and script name. So i get this:

    Sub RunRscript()
    'runs an external R code through Shell
    'The location of the RScript is 'C:\R_code'
    'The script name is 'hello.R'

    Dim shell As Object
    Set shell = VBA.CreateObject("WScript.Shell")
    Dim waitTillComplete As Boolean: waitTillComplete = True
    Dim style As Integer: style = 1
    Dim errorCode As Integer
    Dim path As String
    path = "RScript C:\Users\310132724\Desktop\R Programming\Bart.R"
    errorCode = shell.Run(path, style, waitTillComplete)
    End Sub

    However i get the following error: Methor "run" of object "IWshShell3" failed.

    Could you please give me some advice about what I am doing wrong.

    Cheers

    ReplyDelete
    Replies
    1. Hi Bart,

      Thanks for stopping by. Since I have not used vba for sometime, I setup a small code again on my system and followed the exact same steps. It seems to work for me.

      Can you just open a command window and try to run the line

      "RScript C:\Users\310132724\Desktop\R Programming\Bart.R"

      outside of excel? (remove the quotes in the above line)
      See if the output is printed on the console. I suspect it is because of the path issue. Once you get it to work from command prompt, running it from VBA would be smooth. Let me know if you can do that.

      Thanks
      Shashia

      Delete
    2. This comment has been removed by the author.

      Delete
    3. seems you have space in path name, which could be causing the issue.
      Possible solution:
      "RScript ""C:\Users\310132724\Desktop\R Programming\Bart.R"""

      Delete
  2. Hi Shashia,

    Thanks for your fast reply. The code you propose to use in cmd does not work. However, this gives the solution of the error. Cmd is not able to read the space within my folder name "R Programming". I changed this folder name by "RProgramming". Now it works in cmd and in the VBA code!

    Thanks again for your nice blog, I think it is really helpful for people who are using Excel for their daily business and need R for some more complex calculations!

    Cheers
    Bart

    ReplyDelete
    Replies
    1. Oh! the space issue in CMD. I forgot to note that yesterday. Yes, you are right. CMD does not understand spaces in a path. However, if you are bound to have more folders with spaces in the names, you can use quotes to give the full path something like this:

      RScript "C:\Users\310132724\Desktop\R Programming\Bart.R"

      But the simple solution would be to use underscore( _ ) or get rid of the spaces completeley as you did. Glad you were able to figure that you.

      And thanks for your kind words... I am glad you liked it :)

      Delete
    2. Hi Shashia,
      first of all, thanks for the amazing tutorial. I just started working in an insurance company and I'm guessing this will save me a lot of time once I can set it up. I'm having the same issue as Bart, but it's not due to the spacing in my case. By manually putting "Rscript C:\R_code\hello.R" in the command prompt it works and gives the desired output. But the exact VBA-code (first one) u provided gives the "IWshShell3" failed- error. Do you have any ideas?

      Delete
    3. Hi Thibault,

      Thanks for your comment and I am glad you liked the post!

      As for your issue, my guess would be that it could have happened due to excel syntax not getting the right path and R script. My suggestion to narrow down the issue would be the following:
      1. In the VBA code change the path to execute a batch file instead of running a Rscript

      path = "RScript C:\R_code\batchfile.bat"
      errorCode = shell.Run(path, style, waitTillComplete)

      2. Create a batch file in the path, name it 'batchfile.bat' and put the following lines in there:
      %echo off
      echo This is a command line app to run R script
      Rscript.exe "RScript C:\R_code\hello.R"
      echo Command executed
      pause

      3. The pause in the end, keeps control at the command window till you press a key. This way, you can see what has run and what has not. Now run the VBA code and see if it works.

      It is just a suggestion based on my understanding of your issue.
      Let me know how that goes.

      Delete
    4. Hi Shashia,

      Thanks for the quick reply! I tried the code, but the batch file gave an error. I adjusted it a bit, (just deleted Rscript.exe and the quotation marks and then the batch file runs perfectly). The VBA code, however, gives the same error. The command prompt doesn't even open so I don't understand what the problem is.. Do you have any idea?

      Delete
    5. I have the same problem. I haven't found anything that will let me run Rscript from VBA, although I can run it from the cmd. I do get the cmd to open up with Shashia's suggestion, but it says that Rscript is not recognized...

      Delete
    6. I found a way to do it using a ShellAndWait function that can be found by just googling it.

      Delete
    7. Hi HD,

      Thanks for your suggestion on an alternative solution. Please go ahead, test it and let me know in case it works. I shall put it up as an edit on this post so that the others facing this issue might also try it.

      Delete
    8. Hello,

      Can someone post the solution to this IWshShell3 error? I don't think it's the path issue but could be wrong.

      Delete
    9. Hi,
      I struggled with the IWshShell3 error, too. After adding the reference to "Windows scripting host object model" in the VBA Editor options menu it finally worked.
      Cheers
      Christian


      Delete
  3. Shashia,

    Thank you for your blog post. I was able to reproduce it easily. When I started expanding on it I ran into a problem.

    How can I pass a range like Sheet1.Range("I9:I13").Value to R?

    Here is what I did:
    Dim var1 As Variant
    var1 = Sheet1.Range("I9:I13").Value

    Dim var2 As Variant
    var2 = Sheet1.Range("J9:J13").Value

    Dim path As String
    path = "RScript C:\R_code\myfunc.R " & var1 & " " & var2

    But I get a "Type Mismatch" error on the path. Do you have any code that does this or can it even be done?

    Thanks for any help you can give.
    Michael

    ReplyDelete
  4. Hi Michael,

    Thanks for your comment and apologies for the late reply. It was a long vacation for me :)

    I have not tried to pass ranges as parameters before and have run into issues myself when doing that on VBA. However, can you please try an alternative:
    If you are sure that the values you are looking for always exist on the same sheet, same range in Excel, you can use package xlsx in R and ask R to read/modify those values directly, instead of passing the parameters.

    Let me know if that solves your problem. I shall, in the meanwhile, try out other workarounds for the same.

    Thanks,
    Shashi

    ReplyDelete
  5. Hi Shashia,

    It's a very good article for an introduction, but i got errors on some others' machines, "files not found". I created the tool and it's perfect running on my machine. Also in cmd of others' machines, it works fine, but not working for vba..

    May i know why?

    Best

    James

    ReplyDelete
  6. Hi

    I have below macro which is running on button code of the excel and executing R file, the intention here is to update/get result in the same excel where the button/code is running, but when I am executing it says filenotfoundexception and file is in use by another process, can you please help me in this..



    Sub ExceMacro()
    Dim shell As Object
    Set shell = VBA.CreateObject("WScript.Shell")
    Dim waitTillComplete As Boolean: waitTillComplete = True
    Dim style As Integer: style = 1
    Dim errorCode As Integer
    Dim path As String
    path = "RScript C:\R_code\hello.R"
    errorCode = shell.Run(path, style, waitTillComplete)
    End Sub

    ReplyDelete
  7. Hey Shashia

    Thanks alot, this is very helpful blog

    I had just one question how to return values back into excel from R?

    ReplyDelete
  8. Hi Shashia,

    The blog is really very helpful.

    I am able to run the code using the CMD prompt but when i execute the same in macro it gives me the file not found error. I believe the space in path = "Rscript c:\R_code\hello.R" could be a problem. I have tried removing the space in VBA by using quotations.... but no success.

    Any ideas?

    Thanks.

    Atmajit

    ReplyDelete
  9. One possible solution for the filenotfound-error might be to just restart your PC :)

    I found, that the PATH-variable probably is not set for every folder:
    1. Go to the folder where you saved the .xlsm
    2. Shift+Right-Click in the folder (not on a file): Now select 'Open command window here'
    3. Type: Set
    4. Now check, if you can find your path to RScript under Path= ...
    5. If it's not there, restart your PC.

    This solved the problem for me.

    Regards
    Daniel

    ReplyDelete
  10. Hi All,

    Sorry for not taking out time to reply here. It has been quite a while since I visited.
    And I agree with Daniel's response that restarting might be one of the solutions. Please try this and hope it solves your queries.

    @Ramesh, I have explained how to return the values back into excel from R (using the text import via VBA) in the section
    Reading text/picture files in MS Excel using VBA
    Kindly go through it and let me know if you have further questions.

    Thanks,
    Shashi

    ReplyDelete
  11. To get rid of "IWshShell3 error", I have just replaced the default parameter values in the code. It worked for me.


    VBA CODE:
    --------

    Sub RunRscript1()
    Dim shell As Object
    Set shell = VBA.CreateObject("WScript.Shell")
    errorCode = shell.Run("C:\Users\457266\Desktop\R_World\hello.R", 1, True)
    End Sub

    RCODE:
    ------

    sink('C:/Users/457266/Desktop/R_World/hello.txt',append=F,type="output")
    cat('Hello R- World')
    var1<-100
    var2<-200
    cat('\nThe result of adding',var1,'to',var2,'is',var1+var2)
    sink(NULL)

    ReplyDelete
  12. Hi, thanks for you blog it's really useful. I want to know how i can catch the result of an PCA made by R thnaks to VBA. I have succedded in having a variance covariance matrix but when you carry out a PCA in R another window is opened. Could you help me?

    Thanks.

    ReplyDelete
  13. Hi, Thanks for so helpful article.
    I have 1 query. When I have install.packages in my code in R. This doesn't work.
    for Example if my code is something like:

    sink('E:/test.txt',append=F,type="output")
    cat('Hello World')

    install.packages("devtools")

    var1<-5^3
    var2<-7^3
    cat('\nThe result of adding',var1,'to',var2,'is',var1+var2)
    sink(NULL)


    Then it doesn't print even "The result of...." in the text file.
    Could you guide how to run the complete code using the Macro.

    Thanks!

    ReplyDelete
  14. Hello everyone! Has the error message been resolved? :) I am stuck with the same problem too...

    ReplyDelete
  15. SHOWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW!!!!!!!!!!!!!!!!!!
    CONGRATS for your job! Amazing!
    Thank you!
    o/-<]:

    ReplyDelete
  16. Hello,
    I am trying the cmd suggestion, but even after creating PATH= C:\Program Files\R\R-3.2.3\bin\x64;
    Getting an error 'Rscript' is not recognized as an internal or expernal command, operable program or batch file

    Please advice!

    ReplyDelete
  17. This is super helpful. I have been looking for some sources for the link between Excel and R.

    Thank you so much!

    ReplyDelete
  18. This comment has been removed by a blog administrator.

    ReplyDelete
  19. Hello,
    I am trying to do hierarchial clustering using excel as frontend and R as backend. I need to select type of method as Single or complete or average in excel. The type of distances as Eucledian, ward D or manhattan in excel. WIth the click of a button it should display the results by running an algorithm from R. Can I know the VBA code for it ?

    ReplyDelete
    Replies
    1. Hi Anonymous,

      Sure, your requirements seem doable with the code provided here.
      - For creating the dropdowns on VBA, use record macro
      - For passing the distance and linkage types as parameters to your hclust function, refer to 'Passing arguments to an RScript through command line/VBA' section
      - The hierarchical clustering on R can be done using the hclust function. Please search on the internet and you'll get the code for it.
      - For displaying the results / pictures back on excel via VBA, refer to the section 'Reading text/picture files in MS Excel using VBA'
      By breaking down your problem into sub-parts like these and referring relevant sections of the post above, you'll be able to do what is required.

      All the best!

      Delete
  20. This comment has been removed by a blog administrator.

    ReplyDelete
  21. This comment has been removed by a blog administrator.

    ReplyDelete
  22. change "RScript C:\R_code\hello.R" to "RScript C:/R_code/hello.R". then everything will be fine. Use forward slash other than back slash.

    ReplyDelete
  23. This comment has been removed by a blog administrator.

    ReplyDelete
  24. This comment has been removed by a blog administrator.

    ReplyDelete
  25. This comment has been removed by a blog administrator.

    ReplyDelete
  26. This comment has been removed by a blog administrator.

    ReplyDelete
  27. This comment has been removed by a blog administrator.

    ReplyDelete
  28. Thanks for the sharing information about VBA development. It was very useful for me. Keep up the good work!

    ReplyDelete
  29. This comment has been removed by a blog administrator.

    ReplyDelete
  30. This comment has been removed by a blog administrator.

    ReplyDelete
  31. This comment has been removed by a blog administrator.

    ReplyDelete
  32. This comment has been removed by a blog administrator.

    ReplyDelete
  33. Hi

    I tried all methods suggested here. I am still getting the 'IWSHShell3' failed error. My code is as below. The R script runs perfectly when opened via RStudio.
    Can anyone please help?

    Sub RunRscript1()
    Dim shell As Object
    Set shell = VBA.CreateObject("WScript.Shell")
    errorCode = shell.Run("I:\ActModTemp\MehakMittal\ContinuousTesting\IJ012_\Revised\Rscripts\Input File Creation Code.R", 1, True)
    End Sub

    ReplyDelete
  34. Hello,

    Thank you for the details!

    For me, VBA is not being able to call the terminal window to run R script - the Rscript.exe and R code path is correct. I'm able to run it on my system, but can't run it on a friend's laptop. The error is 'Error 70: Permission denied'. Please help!

    ReplyDelete
  35. Hi Shashia,

    Thanks for the wonderful sharing!
    May I know how to display the output of R script in excel. The output is in the table form, how should I modified the code? I had successfully run the script and get the table in command prompt but how should I make it to the excel? I am very new in VBA coding, please help me. Below are my code based on the code above:

    Sub RunRscript()
    'runs an external R code through Shell
    'The location of the RScript is 'C:\R_code'
    'The script name is 'hello.R'

    Dim shell As Object
    Set shell = VBA.CreateObject("WScript.Shell")
    Dim waitTillComplete As Boolean: waitTillComplete = True
    Dim style As Integer: style = 1
    Dim errorCode As Integer
    Dim path As String
    path = "RScript C:\Users\User\Desktop\Intern\FinviztableScrap.R"
    errorCode = shell.Run(path, style, waitTillComplete)

    Dim sFile As String
    sFile = "C:\Users\User\Desktop\Intern\FinviztableScrap.txt"

    Dim rowNum As Integer
    rowNum = 1
    Set dest = Sheet2.Cells(rowNum, 1)

    Open sFile For Input As #1
    Do Until EOF(1)
    Input #1, ReadData
    If Not IsEmpty(ReadData) Then
    dest.Cells = ReadData
    rowNum = rowNum + 1
    Set dest = Sheet1.Cells(rowNum, 1)
    End If
    Loop
    Close #1 'close the opened file

    End Sub


    After I run this code, at the end it shows that "run time error" Object required

    please help me. Thanks in advance

    ReplyDelete
  36. Hey, when I run this code it creates an R file, not a txt file, how can I fix that? Also, how can I write the output back into excel?

    ReplyDelete