# Chapter 12 Local Spatial Autocorrelation 1

## Introduction

This notebook covers the functionality of the Local Spatial Autocorrelation section of the GeoDa workbook. We refer to that document for details on the methodology, references, etc. The goal of these notes is to approximate as closely as possible the operations carried out using GeoDa by means of a range of R packages.

The notes are written with R beginners in mind; more seasoned R users can probably skip most of the comments on data structures and other R particulars. Also, as always in R, there are typically several ways to achieve a specific objective, so what is shown here is just one way that works. There are often others that may be more elegant, work faster, or scale better.

For this notebook, we use the `guerry` data set from the **geodaData** package. Our goal in this lab is to show how to compute, map, and interpret local spatial autocorrelation statistics.

### 12.0.1 Objectives

After completing the notebook, you should know how to carry out the following tasks:

- Identify clusters with the Local Moran cluster map and significance map
- Identify clusters with the Local Geary cluster map and significance map
- Identify clusters with the Getis-Ord Gi and Gi* statistics
- Identify clusters with the Local Join Count statistic
- Interpret the spatial footprint of spatial clusters
- Assess potential interaction effects by means of conditional cluster maps
- Assess the significance by means of a randomization approach
- Assess the sensitivity of different significance cut-off values
- Interpret significance by means of Bonferroni bounds

#### 12.0.1.1 R Packages used

- **spatmap**: To construct significance and cluster maps for a variety of local statistics
- **geodaData**: To load the data for this notebook
- **tmap**: To format the maps made
- **rgeoda**: To run local spatial autocorrelation analysis

#### 12.0.1.2 R Commands used

Below follows a list of the commands used in this notebook. For further details and a comprehensive list of options, please consult the R documentation.

- **Base R**: `install.packages`, `library`, `setwd`, `set.seed`, `cut`, `rep`
- **tmap**: `tm_shape`, `tm_borders`, `tm_fill`, `tm_layout`, `tm_facets`

## 12.1 Preliminaries

Before starting, make sure to have the latest version of R and of packages that are compiled for the matching version of R (this document was created using R 3.5.1 of 2018-07-02). Also, optionally, set a working directory, even though we will not actually be saving any files.

### 12.1.1 Load packages

First, we load all the required packages using the `library` command. If you don't have some of these in your system, make sure to install them first as well as their dependencies. You will get an error message if something is missing. If needed, just install the missing piece and everything will work after that.

```
library(sf)
library(tmap)
library(rgeoda)
library(geodaData)
library(RColorBrewer)
```

### 12.1.2 spatmap

The main package used throughout this notebook is **rgeoda**, which provides the statistical computations for the local spatial statistics; **tmap** handles the mapping component. All of the visualizations are built with a similar style to GeoDa. The visualizations include cluster maps and their associated significance maps. The mapping functions are built on top of **tmap** and can have additional layers added to them, such as `tm_borders` or `tm_layout`.

### 12.1.3 geodaData

All of the data for the R notebooks is available in the **geodaData** package. We loaded the library earlier; now, to access the individual data sets, we use the double colon notation. This works similarly to accessing a variable with `$`, in that a drop-down menu will appear with a list of the datasets included in the package. For this notebook, we use `guerry`.

`guerry <- geodaData::guerry`

### 12.1.4 Univariate analysis

Throughout the notebook, we will focus on the variable **Donatns**, which is charitable donations per capita. Before proceeding with the local spatial statistics and visualizations, we will take a preliminary look at the spatial distribution of this variable. This is done with **tmap** functions. We will not go into too much detail on these because there is a lot to cover on local spatial statistics and this functionality was covered in a previous notebook. Please see the Basic Mapping notebook for more information on basic **tmap** functionality.

For the univariate map, we use the natural breaks or jenks style to get a general sense of the spatial distribution for our variable.

```
tm_shape(guerry) +
tm_fill("Donatns", style = "jenks", n = 6) +
tm_borders() +
tm_layout(legend.outside = TRUE, legend.outside.position = "left")
```

## 12.2 Local Moran

### 12.2.1 Principle

The local Moran statistic was suggested in Anselin (1995) as a way to identify local clusters and local spatial outliers. Most global spatial autocorrelation statistics can be expressed as a double sum over the i and j indices, such as \(\sum_i \sum_j g_{ij}\). The local form of such a statistic would then be, for each observation (location) i, the sum of the relevant expression over the j index, \(\sum_j g_{ij}\).

Specifically, the local Moran statistic takes the form \(c \cdot z_i \sum_j w_{ij} z_j\), with z in deviations from the mean. The scalar c is the same for all locations and therefore does not play a role in the assessment of significance. The latter is obtained by means of a conditional permutation method, where, in turn, each \(z_i\) is held fixed, and the remaining z-values are randomly permuted to yield a reference distribution for the statistic. This operates in the same fashion as for the global Moran's I, except that the permutation is carried out for each observation in turn. The result is a pseudo p-value for each location, which can then be used to assess significance. Note that this notion of significance is not the standard one, and should not be interpreted that way (see the discussion of multiple comparisons below).
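To make the formula and the conditional permutation concrete, here is a minimal base R sketch on a toy data set. Everything in it is an assumption made for illustration: the five values, the line-shaped neighbor structure, row-standardized weights, the choice \(c = n / \sum_i z_i^2\), and the one-sided way of counting extreme values. It is not the **rgeoda** implementation.

```r
# Toy data (hypothetical): five locations on a line; neighbors are adjacent locations.
x <- c(2, 4, 6, 20, 22)
n <- length(x)

# Binary contiguity weights, then row-standardized so each row sums to 1.
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
W <- W / rowSums(W)

# Deviations from the mean and the scalar c (assumed here to be n / sum(z^2)).
z <- x - mean(x)
c_scalar <- n / sum(z^2)

# Local Moran for every location: c * z_i * sum_j w_ij * z_j.
local_i <- c_scalar * z * as.vector(W %*% z)

# Conditional permutation: hold z_i fixed and shuffle the remaining values.
set.seed(123)
permutations <- 999
pseudo_p <- numeric(n)
for (i in seq_len(n)) {
  w_i <- W[i, -i]      # weights from location i to all other locations
  others <- z[-i]      # values that get permuted
  ref <- replicate(permutations, c_scalar * z[i] * sum(w_i * sample(others)))
  # Count reference values at least as extreme as the observed one (one-sided).
  extreme <- if (local_i[i] >= 0) sum(ref >= local_i[i]) else sum(ref <= local_i[i])
  pseudo_p[i] <- (extreme + 1) / (permutations + 1)
}

round(local_i, 3)
pseudo_p
```

With these assumptions the average of the local statistics equals the global Moran's I, which is one way to check the computation. Note that the smallest pseudo p-value that can be obtained is `1 / (permutations + 1)`.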

Assessing significance in and of itself is not that useful for the Local Moran. However, when an indication of significance is combined with the location of each observation in the Moran Scatterplot, a very powerful interpretation becomes possible. The combined information allows for a classification of the significant locations as high-high and low-low spatial clusters, and high-low and low-high spatial outliers. It is important to keep in mind that the reference to high and low is relative to the mean of the variable, and should not be interpreted in an absolute sense.
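The classification into clusters and outliers can be sketched in base R as follows. The data, the line-shaped neighbor structure, and in particular the pseudo p-values are made up for illustration; in practice the p-values come from the conditional permutation procedure.

```r
# Toy data (hypothetical): five locations on a line; neighbors are adjacent locations.
x <- c(2, 4, 6, 20, 22)
n <- length(x)
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
W <- W / rowSums(W)   # row-standardize

# Position in the Moran scatterplot: own deviation vs. average neighbor deviation.
z <- x - mean(x)
lag_z <- as.vector(W %*% z)

# Quadrant of each location, relative to the mean (not in an absolute sense).
# The final branch also catches exact zeros, which do not occur in this toy data.
quadrant <- ifelse(z > 0 & lag_z > 0, "High-High",
            ifelse(z < 0 & lag_z < 0, "Low-Low",
            ifelse(z > 0 & lag_z < 0, "High-Low", "Low-High")))

# Made-up pseudo p-values, one per location, and a cut-off.
pseudo_p <- c(0.01, 0.03, 0.40, 0.02, 0.01)
alpha <- 0.05

# Only significant locations keep their quadrant label.
cluster <- ifelse(pseudo_p <= alpha, quadrant, "Not significant")
data.frame(x, z, lag_z, quadrant, pseudo_p, cluster)
```

The third location sits below the mean while its neighbors average above it, so it falls in the low-high quadrant, but with the assumed p-value of 0.40 it is not flagged as a spatial outlier.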

### 12.2.2 Implementation

With the function `local_moran` from **rgeoda**, we can compute the statistics needed for a Local Moran cluster map. The parameters needed are a spatial weights object and an **sf** data frame that contains the variable of interest, which is the **Donatns** column of **guerry** in our case.

Some helper functions that create maps based on the statistical results of **rgeoda**:

```
# Keep only the colors whose classification actually occurs in the data
match_palette <- function(patterns, classifications, colors){
  classes_present <- base::unique(patterns)
  mat <- matrix(c(classifications, colors), ncol = 2)
  logi <- classifications %in% classes_present
  pre_col <- matrix(mat[logi], ncol = 2)
  pal <- pre_col[,2]
  return(pal)
}

# Cluster map: one category per location, colored as in GeoDa
lisa_map <- function(df, lisa, alpha = .05) {
  clusters <- lisa_clusters(lisa, cutoff = alpha)
  labels <- lisa_labels(lisa)
  pvalue <- lisa_pvalues(lisa)
  colors <- lisa_colors(lisa)
  # cluster indicators start at 0, so shift by one to index the labels
  lisa_patterns <- labels[clusters + 1]

  pal <- match_palette(lisa_patterns, labels, colors)
  labels <- labels[labels %in% lisa_patterns]

  df["lisa_clusters"] <- clusters
  tm_shape(df) +
    tm_fill("lisa_clusters", labels = labels, palette = pal, style = "cat")
}

# Significance map: pseudo p-values binned into significance levels
significance_map <- function(df, lisa, permutations = 999, alpha = .05) {
  pvalue <- lisa_pvalues(lisa)
  # smallest pseudo p-value attainable with this number of permutations
  target_p <- 1 / (1 + permutations)
  potential_brks <- c(.00001, .0001, .001, .01)
  brks <- potential_brks[which(potential_brks > target_p & potential_brks < alpha)]
  brks2 <- c(target_p, brks, alpha)
  labels <- c(as.character(brks2), "Not Significant")
  brks3 <- c(0, brks2, 1)

  cuts <- cut(pvalue, breaks = brks3, labels = labels)
  df["sig"] <- cuts

  pal <- rev(brewer.pal(length(labels), "Greens"))
  # grey for the "Not Significant" category
  pal[length(pal)] <- "#D3D3D3"

  tm_shape(df) +
    tm_fill("sig", palette = pal)
}
```
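To see how `significance_map` turns pseudo p-values into map categories, the sketch below repeats its break construction in base R, outside of any mapping code. The five p-values are made up for illustration; the remaining steps follow the helper function above with its default settings.

```r
# Made-up pseudo p-values for five locations (illustration only).
pvalue <- c(0.001, 0.004, 0.03, 0.2, 0.009)

permutations <- 999
alpha <- 0.05

# Smallest attainable pseudo p-value for this number of permutations.
target_p <- 1 / (1 + permutations)

# Candidate breaks that lie strictly between target_p and alpha.
potential_brks <- c(.00001, .0001, .001, .01)
brks <- potential_brks[potential_brks > target_p & potential_brks < alpha]

# Upper bounds of the significant categories, plus a catch-all category.
brks2 <- c(target_p, brks, alpha)
labels <- c(as.character(brks2), "Not Significant")
brks3 <- c(0, brks2, 1)

# cut() assigns each p-value to the interval (lower, upper] it falls in.
cuts <- cut(pvalue, breaks = brks3, labels = labels)
data.frame(pvalue, category = cuts)
```

Each label is the upper bound of its interval, so a location labeled `0.01` has a pseudo p-value above 0.001 and at most 0.01.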

It is important to note the default parameters of `local_moran`. These include `permutations = 999`, `significance_cutoff = .05`, and `weights = NULL`. `permutations` is the number of permutations used in computing the reference distribution of the local statistic for each location. `significance_cutoff`, or alpha, is the cutoff significance level. The `weights` parameter is where we specify the weights used for the computation of the local statistics. In the NULL case, first-order queen contiguity weights are computed.
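The number of permutations matters because it bounds how small a pseudo p-value can get: with `permutations` random draws, the smallest attainable value is `1 / (permutations + 1)`. The short base R sketch below tabulates this bound for a few illustrative settings, chosen here only as examples.

```r
# Illustrative permutation counts (assumed values, not package defaults except 999).
permutations <- c(99, 999, 9999, 99999)

# Smallest pseudo p-value that can be observed for each setting.
min_p <- 1 / (permutations + 1)

data.frame(permutations, min_p)
```

A cut-off below this bound can never be met, so a stricter significance level requires a larger number of permutations.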

```
w <- queen_weights(guerry)
lisa <- local_moran(w, guerry['Donatns'])
lisa_map(guerry, lisa)
```
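Before mapping, it can help to look at what the result object contains. The sketch below assumes that **sf**, **rgeoda**, and **geodaData** are installed, and uses the **rgeoda** accessor functions `lisa_values`, `lisa_pvalues`, `lisa_clusters`, and `lisa_labels`; apart from `lisa_values`, these are the same accessors used in the helper functions above. The exact counts depend on the random permutations, so no output is shown.

```r
library(sf)
library(rgeoda)
library(geodaData)

# Same setup as in the notebook: queen contiguity weights and the Local Moran.
guerry <- geodaData::guerry
w <- queen_weights(guerry)
lisa <- local_moran(w, guerry["Donatns"])

# One value per observation for each of these.
local_i  <- lisa_values(lisa)                    # local Moran statistics
pvalue   <- lisa_pvalues(lisa)                   # pseudo p-values
clusters <- lisa_clusters(lisa, cutoff = 0.05)   # cluster indicators, starting at 0
labels   <- lisa_labels(lisa)                    # category names

# Number of locations in each category at the 0.05 cut-off.
table(labels[clusters + 1])
```

The `+ 1` is needed because the cluster indicators start at 0 while R vectors are indexed from 1.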

To get a significance map for the Local Moran, we use `significance_map`. The default number of permutations is 999 and the default alpha level is .05.

`significance_map(guerry, lisa)`

#### 12.2.2.1 tmap additions

With mapping functions such as `lisa_map`, additional **tmap** layers can be added with the `+` operator. This gives the maps strong formatting options. With `tm_borders`, we can make the borders of the Local Moran map more distinct. With `tm_layout`, we can add a title and move the legend to the outside of the map. There are many more formatting options, including `tmap_arrange`, which we used earlier.

```
lisa_map(guerry, lisa) +
tm_borders() +
tm_layout(title = "Local Moran Cluster Map of Donatns", legend.outside = TRUE)
```

We can set the **tmap** mode to "view" with `tmap_mode` to get an interactive base map.

```
tmap_mode("view")
lisa_map(guerry, lisa) +
tm_borders() +
tm_layout(title = "Local Moran Cluster Map of Donatns",legend.outside = TRUE)
```