Class: SpatialStats::Local::Stat

Inherits:
Object
  • Object
show all
Defined in:
lib/spatial_stats/local/stat.rb

Overview

Stat is the abstract base class for local stats. It defines the methods that are common between all classes and will raise a NotImplementedError on those that are specific for each type of statistic.

Direct Known Subclasses

BivariateMoran, Geary, GetisOrd, Moran, MultivariateGeary

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(scope, field, weights) ⇒ Stat

Base class for local stats



12
13
14
15
16
# File 'lib/spatial_stats/local/stat.rb', line 12

def initialize(scope, field, weights)
  @scope = scope
  @field = field
  @weights = weights.standardize
end

Instance Attribute Details

#fieldObject

Returns the value of attribute field.



17
18
19
# File 'lib/spatial_stats/local/stat.rb', line 17

def field
  @field
end

#scopeObject

Returns the value of attribute scope.



17
18
19
# File 'lib/spatial_stats/local/stat.rb', line 17

def scope
  @scope
end

#weightsObject

Returns the value of attribute weights.



17
18
19
# File 'lib/spatial_stats/local/stat.rb', line 17

def weights
  @weights
end

Instance Method Details

#crand(permutations, rng) ⇒ Numo::Int32

Conditional randomization algorithm used in permutation testing. Returns a matrix with permuted index values that will be used for selecting values from the original data set.

The width of the matrix is the max number of neighbors + 1 which is way less than it would be if the original vector was shuffled in full.

This is super important because most weight matrices are very sparse so the amount of shuffling/multiplication that is done is reduced drastically.

Returns:

  • (Numo::Int32)

    matrix of shape perms x wc_max + 1

See Also:



69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
# File 'lib/spatial_stats/local/stat.rb', line 69

def crand(permutations, rng)
  # basing this off the ESDA method
  # need to get k for max_neighbors
  # and wc for cardinalities of each item
  # this returns an array of length n with
  # (permutations x neighbors) Numo Arrays.
  # This helps reduce computation time because
  # we are only dealing with neighbors for each
  # entry not the entire list of permutations for each entry.
  n_1 = weights.n - 1

  # weight counts
  wc = weights.wc
  k = wc.max + 1
  prange = (0..permutations - 1).to_a

  ids_perm = (0..n_1 - 1).to_a
  Numo::Int32.cast(prange.map { ids_perm.sample(k, random: rng) })
end

#expectationObject

Raises:

  • (NotImplementedError)


23
24
25
# File 'lib/spatial_stats/local/stat.rb', line 23

def expectation
  raise NotImplementedError, 'method expectation not implemented'
end

#mc(permutations = 99, seed = nil) ⇒ Array

Permutation test to determine a pseudo p-values of the #stat method. Shuffles x values, recomputes #stat for each variation, then compares to the computed one. The ratio of more extreme values to permutations is returned for each observation.

Parameters:

  • permutations (Integer) (defaults to: 99)

    to run. Last digit should be 9 to produce round numbers.

  • seed (Integer) (defaults to: nil)

    used in random number generator for shuffles.

Returns:

  • (Array)

    of p-values

See Also:



101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
# File 'lib/spatial_stats/local/stat.rb', line 101

def mc(permutations = 99, seed = nil)
  # For local tests, we need to shuffle the values
  # but for each item, hold its value in place and shuffle
  # its neighbors. Then we will only test for that item instead
  # of the entire set. This will be done for each item.
  rng = gen_rng(seed)
  rids = crand(permutations, rng)

  n_1 = weights.n - 1
  sparse = weights.sparse
  row_index = sparse.row_index
  ws = sparse.values
  wc = weights.wc
  stat_orig = stat

  arr = Numo::DFloat.cast(x)
  ids = (0..n_1).to_a
  observations = Array.new(weights.n)
  (0..n_1).each do |idx|
    idsi = ids.dup
    idsi.delete_at(idx)
    idsi.shuffle!(random: rng)
    idsi = Numo::Int32.cast(idsi)
    sample = arr[idsi[rids[true, 0..wc[idx] - 1]]]

    # account for case where there are no neighbors
    row_range = row_index[idx]..(row_index[idx + 1] - 1)
    if row_range.size.zero?
      observations[idx] = permutations
      next
    end

    wi = Numo::DFloat.cast(ws[row_range])
    stat_i_new = mc_i(wi, sample, idx)
    stat_i_orig = stat_orig[idx]
    observations[idx] = mc_observation_calc(stat_i_orig, stat_i_new,
                                            permutations)
  end

  observations.map do |ri|
    (ri + 1.0) / (permutations + 1.0)
  end
end

#mc_bv(permutations, seed) ⇒ Array

Permutation test to determine a pseudo p-values of the #stat method. Shuffles y values, hold x values, recomputes #stat for each variation, then compares to the computed one. The ratio of more extreme values to permutations is returned for each observation.

Parameters:

  • permutations (Integer)

    to run. Last digit should be 9 to produce round numbers.

  • seed (Integer)

    used in random number generator for shuffles.

Returns:

  • (Array)

    of p-values

See Also:



157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
# File 'lib/spatial_stats/local/stat.rb', line 157

def mc_bv(permutations, seed)
  rng = gen_rng(seed)
  rids = crand(permutations, rng)

  n_1 = weights.n - 1
  sparse = weights.sparse
  row_index = sparse.row_index
  ws = sparse.values
  wc = weights.wc
  stat_orig = stat

  arr = Numo::DFloat.cast(y)
  ids = (0..n_1).to_a
  observations = Array.new(weights.n)
  (0..n_1).each do |idx|
    idsi = ids.dup
    idsi.delete_at(idx)
    idsi.shuffle!(random: rng)
    idsi = Numo::Int32.cast(idsi)
    sample = arr[idsi[rids[true, 0..wc[idx] - 1]]]

    # account for case where there are no neighbors
    row_range = row_index[idx]..(row_index[idx + 1] - 1)
    if row_range.size.zero?
      observations[idx] = permutations
      next
    end

    wi = Numo::DFloat.cast(ws[row_range])
    stat_i_new = mc_i(wi, sample, idx)
    stat_i_orig = stat_orig[idx]
    observations[idx] = mc_observation_calc(stat_i_orig, stat_i_new,
                                            permutations)
  end

  observations.map do |ri|
    (ri + 1.0) / (permutations + 1.0)
  end
end

#quadsArray

Determines what quadrant an observation is in. Based on its value compared to its neighbors. This does not work for all stats, since it requires that values be negative.

In a standardized array of z, high values are values greater than 0 and it's neighbors are determined by the spatial lag and if that is positive then it's neighbors would be high, low otherwise.

Quadrants are:

HH

a high value surrounded by other high values

LH

a low value surrounded by high values

LL

a low value surrounded by low values

HL

a high value surrounded by low values

Returns:

  • (Array)

    of labels



213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
# File 'lib/spatial_stats/local/stat.rb', line 213

def quads
  # https://github.com/pysal/esda/blob/master/esda/moran.py#L925
  z_lag = SpatialStats::Utils::Lag.neighbor_average(weights, z)
  zp = z.map(&:positive?)
  lp = z_lag.map(&:positive?)

  # hh = zp & lp
  # lh = zp ^ true & lp
  # ll = zp ^ true & lp ^ true
  # hl = zp next to lp ^ true
  hh = zp.each_with_index.map { |v, idx| v & lp[idx] }
  lh = zp.each_with_index.map { |v, idx| (v ^ true) & lp[idx] }
  ll = zp.each_with_index.map { |v, idx| (v ^ true) & (lp[idx] ^ true) }
  hl = zp.each_with_index.map { |v, idx| v & (lp[idx] ^ true) }

  # now zip lists and map them to proper terms
  quad_terms = %w[HH LH LL HL]
  hh.zip(lh, ll, hl).map do |feature|
    quad_terms[feature.index(true)]
  end
end

#statObject

Raises:

  • (NotImplementedError)


19
20
21
# File 'lib/spatial_stats/local/stat.rb', line 19

def stat
  raise NotImplementedError, 'method stat not defined'
end

#summary(permutations = 99, seed = nil) ⇒ Array

Summary of the statistic. Computes stat, mc, and groups then returns the values in a hash array.

Parameters:

  • permutations (Integer) (defaults to: 99)

    to run. Last digit should be 9 to produce round numbers.

  • seed (Integer) (defaults to: nil)

    used in random number generator for shuffles.

Returns:

  • (Array)


243
244
245
246
247
248
249
# File 'lib/spatial_stats/local/stat.rb', line 243

def summary(permutations = 99, seed = nil)
  p_vals = mc(permutations, seed)
  data = weights.keys.zip(stat, p_vals, groups)
  data.map do |row|
    { key: row[0], stat: row[1], p: row[2], group: row[3] }
  end
end

#varianceObject

Raises:

  • (NotImplementedError)


27
28
29
# File 'lib/spatial_stats/local/stat.rb', line 27

def variance
  raise NotImplementedError, 'method variance not implemented'
end

#x=(values) ⇒ Object Also known as: z=



31
32
33
# File 'lib/spatial_stats/local/stat.rb', line 31

def x=(values)
  @x = values.standardize
end

#y=(values) ⇒ Object



36
37
38
# File 'lib/spatial_stats/local/stat.rb', line 36

def y=(values)
  @y = values.standardize
end

#z_scoreArray

Z-score for each observation of the statistic.

Returns:

  • (Array)

    of the number of deviations from the mean



44
45
46
47
48
49
50
# File 'lib/spatial_stats/local/stat.rb', line 44

def z_score
  numerators = stat.map { |v| v - expectation }
  denominators = variance.map { |v| Math.sqrt(v) }
  numerators.each_with_index.map do |numerator, idx|
    numerator / denominators[idx]
  end
end