@@ -225,26 +225,6 @@ graph embedding capacities. In the next section (and @fig:embedding), we show
how the amount of dimensionality reduction can affect the quality of the
embedding.

- ## Graph embedding has been under-used in the prediction of species interactions
-
- One prominent family of approaches we do not discuss in the present manuscript
- is Graph Neural Networks [GNN; @Zhou2020Graph]. GNN are, in a sense, a method to
- embed a graph into a dense subspace, but belong to the family of deep learning
- methods, which has its own set of practices [see *e.g.* @Goodfellow2016Deep]. An
- important issue with methods based on deep learning is that, because their
- parameter space is immense, the sample size of the data fed into them must be
- similarly large (typically thousands of instances). This is a requirement for
- the model to converge correctly during training, but this assumption is unlikely
- to be met given the size of datasets currently available for metawebs (or single
- time/location species interaction networks). This data volume requirement is
- mostly absent from the techniques we list below. Furthermore, GNN still have
- some challenges related to their shallow structure, and concerns related to
- scalability [see @Gupta2021Graph for a review], which are mostly absent from the
- methods listed in @tbl:methods. Assuming that the uptake of next-generation
- biomonitoring techniques does indeed deliver larger datasets on species
- interactions [@Bohan2017Nextgeneration], there is nevertheless the potential for
- GNN to become an applicable embedding/predictive technique in the coming years.
-
| Method | Object | Technique | Reference | Application |
| ------------- | --------------- | -------------------------------- | ------------------------ | ----------------------------------------------------------------------------------------------------------------- |
| tSNE | nodes | statistical divergence | @Hinton2002Stochastic | [@Cieslak2020Tdistributed, species-environment responses $^a$] [@Gibb2021Data, host-virus network representation] |
@@ -467,35 +447,6 @@ target and destination network. This proposal can specifically be evaluated by
adding nodes to the network to embed, and assessing the performance of
predictive models [see *e.g.* @Llewelyn2022Predicting].

- ## Minding legacies shaping ecological datasets
-
- In large parts of the world, boundaries that delineate geographic regions are
- merely a reflection of the legacy of settler colonialism, which drives global
- disparity in capacity to collect and publish ecological data. Applying any
- embedding to biased data does not debias them, but rather embeds these biases,
- propagating them to the models using embeddings to make predictions.
- Furthermore, the use of ecological data itself is not an apolitical act
- [@Nost2021Political]: data infrastructures tend to be designed to answer
- questions within national boundaries (therefore placing contingencies on what is
- available to be embedded), their use often drawing upon, and reinforcing,
- territorial statecraft [see *e.g.* @Barrett2005Environment]. As per
- @Machen2021Thinking, these biases are particularly important to consider when
- knowledge generated algorithmically is used to supplement or replace human
- decision-making, especially for governance (*e.g.* enacting conservation
- decisions on the basis of model prediction). As information on networks is
- increasingly leveraged for conservation actions [see *e.g.* @Eero2021Use;
- @Naman2022Food; @Stier2017Integrating], the need to appraise and correct biases
- that are unwittingly propagated to algorithms when embedded from the original
- data is immense. These considerations are even more urgent in the specific
- context of biodiversity data. Long-term colonial legacies still shape taxonomic
- composition to this day [@Lenzner2022Naturalized; @Raja2022Colonialism], and
- much shorter-term changes in taxonomic and genetic richness of wildlife emerged
- through environmental racism [@Schmidt2022Systemic]. Thus, the set of species
- found at a specific location is not only the result of a response to
- ecological processes separate from human influence, but also the result of
- human-environment interaction as well as the result of legislative/political
- histories.
-
# Conclusion: metawebs, predictions, and people

Predictive approaches in ecology, regardless of the scale at which they are
@@ -553,4 +504,55 @@ manuscript. All authors contributed to writing and editing the manuscript.

**Data availability:** There is no data associated with this manuscript.

+ > Box
+
+ ## Graph Neural Networks
+
+ One prominent family of approaches we do not discuss in the present manuscript
+ is Graph Neural Networks [GNN; @Zhou2020Graph]. GNN are, in a sense, a method to
+ embed a graph into a dense subspace, but belong to the family of deep learning
+ methods, which has its own set of practices [see *e.g.* @Goodfellow2016Deep]. An
+ important issue with methods based on deep learning is that, because their
+ parameter space is immense, the sample size of the data fed into them must be
+ similarly large (typically thousands of instances). This is a requirement for
+ the model to converge correctly during training, but this assumption is unlikely
+ to be met given the size of datasets currently available for metawebs (or single
+ time/location species interaction networks). This data volume requirement is
+ mostly absent from the techniques we list below. Furthermore, GNN still have
+ some challenges related to their shallow structure, and concerns related to
+ scalability [see @Gupta2021Graph for a review], which are mostly absent from the
+ methods listed in @tbl:methods. Assuming that the uptake of next-generation
+ biomonitoring techniques does indeed deliver larger datasets on species
+ interactions [@Bohan2017Nextgeneration], there is nevertheless the potential for
+ GNN to become an applicable embedding/predictive technique in the coming years.
+
+ ## Minding legacies shaping ecological datasets
+
+ In large parts of the world, boundaries that delineate geographic regions are
+ merely a reflection of the legacy of settler colonialism, which drives global
+ disparity in capacity to collect and publish ecological data. Applying any
+ embedding to biased data does not debias them, but rather embeds these biases,
+ propagating them to the models using embeddings to make predictions.
+ Furthermore, the use of ecological data itself is not an apolitical act
+ [@Nost2021Political]: data infrastructures tend to be designed to answer
+ questions within national boundaries (therefore placing contingencies on what is
+ available to be embedded), their use often drawing upon, and reinforcing,
+ territorial statecraft [see *e.g.* @Barrett2005Environment]. As per
+ @Machen2021Thinking, these biases are particularly important to consider when
+ knowledge generated algorithmically is used to supplement or replace human
+ decision-making, especially for governance (*e.g.* enacting conservation
+ decisions on the basis of model prediction). As information on networks is
+ increasingly leveraged for conservation actions [see *e.g.* @Eero2021Use;
+ @Naman2022Food; @Stier2017Integrating], the need to appraise and correct biases
+ that are unwittingly propagated to algorithms when embedded from the original
+ data is immense. These considerations are even more urgent in the specific
+ context of biodiversity data. Long-term colonial legacies still shape taxonomic
+ composition to this day [@Lenzner2022Naturalized; @Raja2022Colonialism], and
+ much shorter-term changes in taxonomic and genetic richness of wildlife emerged
+ through environmental racism [@Schmidt2022Systemic]. Thus, the set of species
+ found at a specific location is not only the result of a response to
+ ecological processes separate from human influence, but also the result of
+ human-environment interaction as well as the result of legislative/political
+ histories.
+
# References