Profiling and Benchmarking (1)

  • In general, the usual advice is to worry about execution speed later

  • This is not entirely true with R

  • If you use vectorization (no or few loops) and the packages listed here, you will have a good start

  • After that, you can use several great tools:

Profiling and Benchmarking (2)

## Unit: microseconds
##        expr    min      lq     mean  median      uq    max neval cld
##        loop 4332.8 4380.55 4939.908 4423.45 5038.85 9496.6   100   b
##  vectorized   46.2   47.85   50.357   49.00   50.25  111.7   100  a
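Output like the table above comes from the microbenchmark package. A minimal sketch of how to produce it, assuming the slides compared a loop against a vectorized equivalent (the exact expressions benchmarked are not shown, so `loopFun` and `vecFun` below are illustrative):

```r
library(microbenchmark)

x <- runif(1e4)

# Loop version: squares each element one at a time
loopFun <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- x[i]^2
  out
}

# Vectorized version: one call, no explicit loop
vecFun <- function(x) x^2

# Run each expression 100 times and summarize the timings
microbenchmark(
  loop = loopFun(x),
  vectorized = vecFun(x),
  times = 100
)
```

Each expression is run `times` times; the median column is usually the most robust summary, since occasional garbage-collection pauses inflate the mean and max.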

Profiling and Benchmarking (3)

If you have RStudio version >= 0.99.1208, profiling is available as a menu item.

  • Alternatively, we can wrap any block of code in profvis::profvis()

  • This can be a spades() call, so it will profile the entire model:

Profiling the spades call

Try it:

mySim <- simInit(
   times = list(start = 0.0, end = 2.0, timeunit = "year"),
   params = list(
     .globals = list(stackName = "landscape", burnStats = "nPixelsBurned")
   ),
   modules = list("randomLandscapes", "fireSpread", "caribouMovement"),
   paths = list(modulePath = system.file("sampleModules", package = "SpaDES"))
)
profvis::profvis({spades(mySim)})

When to profile

  • First, you should have built your code with the packages we have discussed
  • If your code is full of loops by the time you are ready to profile, it may be too late to improve it without a rewrite

If you have used these tools, then:

  • Profile when you have mostly finished whatever you are coding
  • Don’t start making code more efficient until you have profiled
  • It is almost impossible to tell which bits are slow without profiling or benchmarking

Strategies for profiling

  • Can do an entire SpaDES model call
  • Can pinpoint specific functions
  • Can test alternative ways of implementing the same thing
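For the last point, microbenchmark makes head-to-head comparisons easy. A minimal sketch, assuming two illustrative implementations of the same computation (neither comes from the slides):

```r
library(microbenchmark)

v <- runif(1e5)

# Two ways to compute the same running (cumulative) mean
cumMeanLoop <- function(v) {
  out <- numeric(length(v))
  s <- 0
  for (i in seq_along(v)) {
    s <- s + v[i]
    out[i] <- s / i
  }
  out
}
cumMeanVec <- function(v) cumsum(v) / seq_along(v)

# Always confirm the alternatives agree before comparing speed
stopifnot(all.equal(cumMeanLoop(v), cumMeanVec(v)))

microbenchmark(loop = cumMeanLoop(v), vectorized = cumMeanVec(v), times = 50)
```

The stopifnot() check matters: a benchmark of two implementations is only meaningful if they return the same answer.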