## Need for speed

R is widely seen as being ‘slow’ (see the Julia web page)

But with a few specific tools from various R packages, this becomes largely irrelevant

## An aside

Pure R, when the most efficient vectorized code is used, appears to run at about half the speed of the most efficient C++.

(See Hadley Wickham’s page on Rcpp; scroll down to “Vector input, vector output”…) Note that if it took 10 minutes to write the C++ code, you would have to run it about 150,000 times to make it worth it.

## Need for speed

Spatial simulation means doing the same thing over and over and over … so we need speed

We will show how to profile your code at the end of this section.

## “Vectorization”

• This is at the core of making R fast. If you don’t do this, then it is probably not useful to use R as a simulation engine.
# Instead of
a <- vector()
for (i in 1:1000) {
  a[i] <- rnorm(1)
}

# use vectorized version, which is built into the functions
a <- rnorm(1000)
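
To see the difference on your own machine, the two versions can be timed with `system.time()`. This is a minimal sketch: the vector length `n` and the helper name `loop_rnorm` are chosen here for illustration, and the timings will vary between machines.

```r
# Compare an explicit loop with the vectorized call.
n <- 1e5

loop_rnorm <- function(n) {
  a <- numeric(n)            # preallocate instead of growing the vector
  for (i in seq_len(n)) {
    a[i] <- rnorm(1)         # one function call per element
  }
  a
}

time_loop <- system.time(a_loop <- loop_rnorm(n))
time_vec  <- system.time(a_vec <- rnorm(n))   # a single vectorized call

print(time_loop["elapsed"])
print(time_vec["elapsed"])
```

Even with preallocation, the loop pays the cost of one function call per element, which is why the vectorized call is usually much faster.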

## Vectors and Matrices

• These are as fast as you can get in R
• Fast numerical operations
• Faster than data.frame
• Anything that is in pure vectors or matrices is ‘fast enough’
• It is always a challenge to keep all code in vectors and matrices
• Thus the following packages…
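
A rough way to check the "faster than data.frame" point is to run the same element-by-element access on a matrix and on a data.frame holding identical values. This is only a sketch: the sizes and the helper `sum_by_element` are assumptions made for illustration, and the size of the gap depends on your R version and machine.

```r
n  <- 1000
m  <- matrix(runif(n * 10), nrow = n, ncol = 10)
df <- as.data.frame(m)       # same values, stored as a data.frame

# Read one element at a time from the first column and add it up
sum_by_element <- function(x) {
  total <- 0
  for (i in seq_len(nrow(x))) {
    total <- total + x[i, 1]
  }
  total
}

time_matrix <- system.time(s_m  <- sum_by_element(m))
time_df     <- system.time(s_df <- sum_by_element(df))

print(time_matrix["elapsed"])
print(time_df["elapsed"])
```

Both calls return the same sum; only the cost of indexing differs.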

## Spatial simulation

• Spatial simulation (e.g., over time and space) requires more than just spatial data manipulation
• Sometimes it is just base R stuff
• Need to learn how to make functions (allows reusability)
• Need to learn a few key packages that are critical for speed

## Key packages for spatial simulation

• base package – everything matrix or vector is ‘fast’
• raster - for spatially referenced matrices

• not always fast enough; sometimes we copy the data into a matrix, manipulate it, then return the data to the raster object
• sp - equivalent of vector shapefiles in a GIS

• Polygons, Points, Lines
• Not always fast, but essential to have
• see also sf
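
The copy-to-matrix pattern mentioned for raster might look like the sketch below. The raster dimensions, extent, and the doubling operation are arbitrary choices for illustration; the point is only the round trip from raster to matrix and back.

```r
library(raster)

# A small example raster with known values
r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
r <- setValues(r, seq_len(ncell(r)))

# Copy the values into a plain matrix (rows and columns match the raster)
m <- as.matrix(r)

# Manipulate with fast base R matrix operations
m2 <- m * 2

# Return the data to a raster object; raster stores values row by row,
# so the matrix is transposed before being flattened
r2 <- setValues(r, as.vector(t(m2)))
```

The new object `r2` keeps the extent and resolution of `r`; only the cell values change.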

## Key packages for spatial simulation

• data.table

• For data.frame type data (i.e., columns of data)
• Very fast when the object gets large, but can actually be slower than a data.frame if the table is small (<100,000 rows)
• SpaDES – many functions; will be moved into a separate package soon

• Rcpp

• R interface to C++. When you need something fast and you can’t get it fast enough with existing tools/packages, you can create your own (we will not go further into this here)

## What we will do here

• We will go through SpaDES functions quickly, because there are fewer tutorials online for these
• We will show links to various tutorials for raster, sp, data.table, Rcpp
• Each person should decide which tool is the most useful to them
• Put something into practice

## SpaDES functions

• These are all potentially useful for building spatio-temporal models
?"spades-package" # section 2 shows many functions

# e.g.,
?move
?cir
?distanceFromEachPoint

## sp

• Quite an old and mature package
• Tutorials

## sf

• Relatively new
• Implements latest GIS data standards
• Very fast, especially reading/writing large data
• CRAN
• GitHub

## The data.table package

From every data.table user ever:

WOW that’s fast!

install.packages('data.table')

(at least for large tables!)
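
For a first feel of the syntax, here is a small sketch of a grouped summary and an update by reference. The table, its column names (`id`, `value`, `scaled`), and its size are made up for illustration.

```r
library(data.table)

set.seed(1)
n  <- 1e6
dt <- data.table(id = sample(1000L, n, replace = TRUE),
                 value = runif(n))

# Grouped summary: mean of value for each id
means <- dt[, .(meanValue = mean(value)), by = id]

# Add a column by reference (no copy of the table is made)
dt[, scaled := value / max(value), by = id]
```

The `:=` operator modifies `dt` in place, which is a large part of why data.table stays fast as tables grow.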

## raster and data.table together

• The current implementation of LANDIS-SpaDES uses a “reduced” data structure throughout

• Instead of keeping rasters of everything (which is redundant, because 2 pixels next to each other may be identical), we make one raster of “id” and one data.table with a column called “id”

• Then we can have as many columns as we want of information about each of these places

• Like “polygons”, but for rasters, and dynamic… can change over time

• This may be useful for your own module
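
The idea can be sketched in a few lines. This is a conceptual illustration only, not the LANDIS-SpaDES implementation: a plain matrix stands in for the “id” raster, and the column names `biomass` and `age` are hypothetical.

```r
library(data.table)

# A 4 x 4 "map" of ids; a plain matrix stands in for the id raster
idMap <- matrix(c(1, 1, 2, 2,
                  1, 1, 2, 2,
                  3, 3, 3, 2,
                  3, 3, 3, 3), nrow = 4, byrow = TRUE)

# One row per id, with as many attribute columns as we want
reduced <- data.table(id = 1:3,
                      biomass = c(10, 25, 40),
                      age = c(5, 50, 120))

# Expand one column back to a full map by looking up each pixel's id
biomassMap <- matrix(reduced$biomass[match(idMap, reduced$id)],
                     nrow = nrow(idMap))

# Dynamic: update the table over time without touching the id map
reduced[id == 2, biomass := biomass + 5]
```

Sixteen pixels are described by three table rows, and a change to one row applies to every pixel that shares that id.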

## raster and data.table together

• There is a key helper function:
?rasterizeReduced

What does this do?

## The Rcpp package

From every Rcpp user ever:

WOW! Just wow.

install.packages('Rcpp')
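
For a flavour only, a C++ function can be compiled and called from R with `Rcpp::cppFunction()`. This sketch assumes a working C++ compiler is installed; the function name `sumC` is chosen here for illustration.

```r
library(Rcpp)

# Compile a small C++ function and make it available in R
cppFunction('
double sumC(NumericVector x) {
  double total = 0;
  for (int i = 0; i < x.size(); ++i) {
    total += x[i];
  }
  return total;
}')

x <- runif(1e6)
sumC(x)    # call it like any other R function
```

For something as simple as a sum, the built-in `sum()` is already compiled code, so the gain comes only when no vectorized R function does the job.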