Class: SpatialStats::Local::Stat
- Inherits:
-
Object
- Object
- SpatialStats::Local::Stat
- Defined in:
- lib/spatial_stats/local/stat.rb
Overview
Stat is the abstract base class for local stats. It defines the methods that are common between all classes and will raise a NotImplementedError on those that are specific for each type of statistic.
Direct Known Subclasses
Instance Attribute Summary collapse
-
#field ⇒ Object
Returns the value of attribute field.
-
#scope ⇒ Object
Returns the value of attribute scope.
-
#weights ⇒ Object
Returns the value of attribute weights.
Instance Method Summary collapse
-
#crand(permutations, rng) ⇒ Numo::Int32
Conditional randomization algorithm used in permutation testing.
- #expectation ⇒ Object
-
#initialize(scope, field, weights) ⇒ Stat
constructor
Base class for local stats.
-
#mc(permutations = 99, seed = nil) ⇒ Array
Permutation test to determine a pseudo p-values of the
#stat
method. -
#mc_bv(permutations, seed) ⇒ Array
Permutation test to determine a pseudo p-values of the
#stat
method. -
#quads ⇒ Array
Determines what quadrant an observation is in.
- #stat ⇒ Object
-
#summary(permutations = 99, seed = nil) ⇒ Array
Summary of the statistic.
- #variance ⇒ Object
- #x=(values) ⇒ Object (also: #z=)
- #y=(values) ⇒ Object
-
#z_score ⇒ Array
Z-score for each observation of the statistic.
Constructor Details
#initialize(scope, field, weights) ⇒ Stat
Base class for local stats
12 13 14 15 16 |
# File 'lib/spatial_stats/local/stat.rb', line 12 def initialize(scope, field, weights) @scope = scope @field = field @weights = weights.standardize end |
Instance Attribute Details
#field ⇒ Object
Returns the value of attribute field.
17 18 19 |
# File 'lib/spatial_stats/local/stat.rb', line 17 def field @field end |
#scope ⇒ Object
Returns the value of attribute scope.
17 18 19 |
# File 'lib/spatial_stats/local/stat.rb', line 17 def scope @scope end |
#weights ⇒ Object
Returns the value of attribute weights.
17 18 19 |
# File 'lib/spatial_stats/local/stat.rb', line 17 def weights @weights end |
Instance Method Details
#crand(permutations, rng) ⇒ Numo::Int32
Conditional randomization algorithm used in permutation testing. Returns a matrix with permuted index values that will be used for selecting values from the original data set.
The width of the matrix is the max number of neighbors + 1 which is way less than it would be if the original vector was shuffled in full.
This is super important because most weight matrices are very sparse so the amount of shuffling/multiplication that is done is reduced drastically.
69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 |
# File 'lib/spatial_stats/local/stat.rb', line 69 def crand(permutations, rng) # basing this off the ESDA method # need to get k for max_neighbors # and wc for cardinalities of each item # this returns an array of length n with # (permutations x neighbors) Numo Arrays. # This helps reduce computation time because # we are only dealing with neighbors for each # entry not the entire list of permutations for each entry. n_1 = weights.n - 1 # weight counts wc = weights.wc k = wc.max + 1 prange = (0..permutations - 1).to_a ids_perm = (0..n_1 - 1).to_a Numo::Int32.cast(prange.map { ids_perm.sample(k, random: rng) }) end |
#expectation ⇒ Object
23 24 25 |
# File 'lib/spatial_stats/local/stat.rb', line 23 def expectation raise NotImplementedError, 'method expectation not implemented' end |
#mc(permutations = 99, seed = nil) ⇒ Array
Permutation test to determine a pseudo p-values of the #stat
method. Shuffles x values, recomputes #stat
for each variation, then compares to the computed one. The ratio of more extreme values to permutations is returned for each observation.
101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 |
# File 'lib/spatial_stats/local/stat.rb', line 101 def mc(permutations = 99, seed = nil) # For local tests, we need to shuffle the values # but for each item, hold its value in place and shuffle # its neighbors. Then we will only test for that item instead # of the entire set. This will be done for each item. rng = gen_rng(seed) rids = crand(permutations, rng) n_1 = weights.n - 1 sparse = weights.sparse row_index = sparse.row_index ws = sparse.values wc = weights.wc stat_orig = stat arr = Numo::DFloat.cast(x) ids = (0..n_1).to_a observations = Array.new(weights.n) (0..n_1).each do |idx| idsi = ids.dup idsi.delete_at(idx) idsi.shuffle!(random: rng) idsi = Numo::Int32.cast(idsi) sample = arr[idsi[rids[true, 0..wc[idx] - 1]]] # account for case where there are no neighbors row_range = row_index[idx]..(row_index[idx + 1] - 1) if row_range.size.zero? observations[idx] = permutations next end wi = Numo::DFloat.cast(ws[row_range]) stat_i_new = mc_i(wi, sample, idx) stat_i_orig = stat_orig[idx] observations[idx] = mc_observation_calc(stat_i_orig, stat_i_new, permutations) end observations.map do |ri| (ri + 1.0) / (permutations + 1.0) end end |
#mc_bv(permutations, seed) ⇒ Array
Permutation test to determine a pseudo p-values of the #stat
method. Shuffles y values, hold x values, recomputes #stat
for each variation, then compares to the computed one. The ratio of more extreme values to permutations is returned for each observation.
157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 |
# File 'lib/spatial_stats/local/stat.rb', line 157 def mc_bv(permutations, seed) rng = gen_rng(seed) rids = crand(permutations, rng) n_1 = weights.n - 1 sparse = weights.sparse row_index = sparse.row_index ws = sparse.values wc = weights.wc stat_orig = stat arr = Numo::DFloat.cast(y) ids = (0..n_1).to_a observations = Array.new(weights.n) (0..n_1).each do |idx| idsi = ids.dup idsi.delete_at(idx) idsi.shuffle!(random: rng) idsi = Numo::Int32.cast(idsi) sample = arr[idsi[rids[true, 0..wc[idx] - 1]]] # account for case where there are no neighbors row_range = row_index[idx]..(row_index[idx + 1] - 1) if row_range.size.zero? observations[idx] = permutations next end wi = Numo::DFloat.cast(ws[row_range]) stat_i_new = mc_i(wi, sample, idx) stat_i_orig = stat_orig[idx] observations[idx] = mc_observation_calc(stat_i_orig, stat_i_new, permutations) end observations.map do |ri| (ri + 1.0) / (permutations + 1.0) end end |
#quads ⇒ Array
Determines what quadrant an observation is in. Based on its value compared to its neighbors. This does not work for all stats, since it requires that values be negative.
In a standardized array of z, high values are values greater than 0 and it's neighbors are determined by the spatial lag and if that is positive then it's neighbors would be high, low otherwise.
Quadrants are:
- HH
-
a high value surrounded by other high values
- LH
-
a low value surrounded by high values
- LL
-
a low value surrounded by low values
- HL
-
a high value surrounded by low values
213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 |
# File 'lib/spatial_stats/local/stat.rb', line 213 def quads # https://github.com/pysal/esda/blob/master/esda/moran.py#L925 z_lag = SpatialStats::Utils::Lag.neighbor_average(weights, z) zp = z.map(&:positive?) lp = z_lag.map(&:positive?) # hh = zp & lp # lh = zp ^ true & lp # ll = zp ^ true & lp ^ true # hl = zp next to lp ^ true hh = zp.each_with_index.map { |v, idx| v & lp[idx] } lh = zp.each_with_index.map { |v, idx| (v ^ true) & lp[idx] } ll = zp.each_with_index.map { |v, idx| (v ^ true) & (lp[idx] ^ true) } hl = zp.each_with_index.map { |v, idx| v & (lp[idx] ^ true) } # now zip lists and map them to proper terms quad_terms = %w[HH LH LL HL] hh.zip(lh, ll, hl).map do |feature| quad_terms[feature.index(true)] end end |
#stat ⇒ Object
19 20 21 |
# File 'lib/spatial_stats/local/stat.rb', line 19 def stat raise NotImplementedError, 'method stat not defined' end |
#summary(permutations = 99, seed = nil) ⇒ Array
Summary of the statistic. Computes stat
, mc
, and groups
then returns the values in a hash array.
243 244 245 246 247 248 249 |
# File 'lib/spatial_stats/local/stat.rb', line 243 def summary(permutations = 99, seed = nil) p_vals = mc(permutations, seed) data = weights.keys.zip(stat, p_vals, groups) data.map do |row| { key: row[0], stat: row[1], p: row[2], group: row[3] } end end |
#variance ⇒ Object
27 28 29 |
# File 'lib/spatial_stats/local/stat.rb', line 27 def variance raise NotImplementedError, 'method variance not implemented' end |
#x=(values) ⇒ Object Also known as: z=
31 32 33 |
# File 'lib/spatial_stats/local/stat.rb', line 31 def x=(values) @x = values.standardize end |
#y=(values) ⇒ Object
36 37 38 |
# File 'lib/spatial_stats/local/stat.rb', line 36 def y=(values) @y = values.standardize end |
#z_score ⇒ Array
Z-score for each observation of the statistic.
44 45 46 47 48 49 50 |
# File 'lib/spatial_stats/local/stat.rb', line 44 def z_score numerators = stat.map { |v| v - expectation } denominators = variance.map { |v| Math.sqrt(v) } numerators.each_with_index.map do |numerator, idx| numerator / denominators[idx] end end |