Commit adc43e5

Update paper
1 parent 76a245d commit adc43e5

2 files changed

+87
-16
lines changed

paper/paper.bib

Lines changed: 26 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -144,3 +144,29 @@ @article{li2022two
144144
doi = {10.3390/app122110979},
145145
publisher = {Multidisciplinary Digital Publishing Institute}
146146
}
147+
148+
@article{shenk2021traja,
149+
title = {Traja: A Python toolbox for animal trajectory analysis},
150+
author = {Shenk, Justin and Byttner, Wolf and Nambusubramaniyan, Saranraj and Zoeller, Alexander},
151+
journal = {Journal of Open Source Software},
152+
volume = {6},
153+
number = {63},
154+
pages = {3202},
155+
year = {2021},
156+
doi = {10.21105/joss.03202},
157+
url = {https://doi.org/10.21105/joss.03202},
158+
publisher = {The Open Journal}
159+
}
160+
161+
@article{joo2020navigating,
162+
title={Navigating through the R packages for movement},
163+
author={Joo, Rocio and Boone, Matthew E and Clay, Thomas A and Patrick, Samantha C and Clusella-Trullas, Susana and Basille, Mathieu},
164+
journal={Journal of Animal Ecology},
165+
volume={89},
166+
number={1},
167+
pages={248--267},
168+
year={2020},
169+
doi = {10.1111/1365-2656.13116},
170+
url = {https://doi.org/10.1111/1365-2656.13116},
171+
publisher={Wiley Online Library}
172+
}

paper/paper.md

Lines changed: 61 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -19,7 +19,7 @@ authors:
1919
orcid: 0000-0001-7305-4710
2020
equal-contrib: true
2121
affiliation: 3
22-
- name: E. Altshuler
22+
- name: E.~Altshuler
2323
orcid: 0000-0003-4192-5635
2424
equal-contrib: false
2525
affiliation: 3
@@ -55,7 +55,8 @@ been studied in many different fields, including robotics, behavior analysis,
5555
mobility pattern mining, and user activity recognition [@da2019survey]. This task
5656
presents multiple challenges for conventional classification models, such as the
5757
indeterminate length of trajectories [@li2022two], the range of entities that generate
58-
trajectories [@janczura2020classification; @xiao2017identifying; @bae2022transformer], and the absence of established standards in trajectory datasets
58+
trajectories [@janczura2020classification; @xiao2017identifying; @bae2022transformer],
59+
and the absence of established standards in trajectory datasets
5960
[@xiao2017identifying; @bae2022transformer].
6061

6162
Our study endeavors to lay the foundation for the assessment of innovative
@@ -64,10 +65,26 @@ a new framework, referred to as `pactus`, which addresses the challenges of
6465
trajectory classification by providing direct access to a carefully chosen
6566
collection of datasets and several trajectory classifiers. `pactus`
6667
facilitates researchers' ability to experiment with various approaches
67-
and assess their performance on different types of data.
68+
and assess their performance on different types of data. Comprehensive software
69+
documentation is available at
70+
[pactus.readthedocs.io](https://pactus.readthedocs.io/en/latest/).
6871

6972
# Statement of need
7073

74+
In recent years, several software libraries have emerged, aiming to automate trajectory
75+
data analysis. Within the R community, there are various available tools
76+
[@joo2020navigating]. Recognizing the popularity and extensive usage of Python, the `traja`
77+
software [@shenk2021traja] was developed to integrate different analysis techniques for
78+
two-dimensional trajectories, primarily focusing on animal behavioral analysis. Additionally,
79+
the `yupi` library [@reyes2023yupi] was created to handle trajectory analysis for applications
80+
involving an arbitrary number of dimensions.
81+
82+
Although these libraries offer valuable tools for trajectory classification, such as
83+
classification models and feature extraction from trajectories, they were not specifically
84+
designed for this task. Consequently, contemporary research on trajectory classification
85+
faces limitations in terms of evaluation, often considering only a limited number of datasets
86+
or reporting only a reduced set of metrics [@bae2022transformer].
87+
7188
The lack of standardization in trajectory datasets, coupled with the difficulty
7289
of obtaining these datasets for evaluation, poses a significant challenge to
7390
researchers working in fields related to trajectory classification. Moreover,
@@ -88,38 +105,66 @@ researchers to distribute their findings as simple Python scripts, relying on `p
88105
for all tasks related to data acquisition, processing, and model evaluation.
89106

90107

91-
# Software Overview
92-
93-
The functionalities of `pactus` can be divided into four different categories as shown in
94-
\autoref{fig:overview}.
108+
# Pactus Software Library
95109

96-
![Overview of the resources available in `pactus` coupled with an usage example.\label{fig:overview}](1.pdf)
110+
The functionalities of `pactus` can be divided into modules: Data handling, Feature extraction,
111+
Classification models and Evaluation.
97112

113+
## Data handling
98114

99-
The selection of datasets was conducted with meticulous care to encompass a broad
115+
The library provides direct access to some of the most commonly used datasets for trajectory
116+
classification. The selection of datasets was conducted with meticulous care to encompass a broad
100117
range of trajectories and classification objectives. Our initial selection includes
101118
GeoLife [@zheng2009mining; @zheng2008understanding; @zheng2010geolife], The Starkey
102-
Project dataset, also known as `Animals' in the trajectory classification
119+
Project dataset, also known as `Animals` in the trajectory classification
103120
community [@rapp2009elk], four different datasets from the UCI repository
104121
[@Dua:2019] and two different hurricane datasets, provided by the National Hurricane
105122
Center [@landsea2013atlantic] and the China Meteorological Administration
106123
[@ying2014overview; @lu2021western] respectively. To ensure consistency, all
107124
datasets were transformed into a standardized format utilizing the trajectory
108125
data structures proposed in [@reyes2023yupi]. Datasets are not bundled with the
109126
software package, but rather will be downloaded and cached automatically upon each
110-
individual access through the library.
127+
individual access through the library. A complete guide on how to use custom datasets or
128+
request the inclusion of new datasets in `pactus` can be found in the documentation.
129+
130+
## Feature extraction
111131

112132
To handle the varying trajectory lengths found in some datasets, `pactus`
113133
is able to extract statistical features from any trajectory and convert an arbitrary
114134
length trajectory into a fixed size vector whose components are engineered features
115135
typically used in the literature [@xiao2017identifying; @zheng2008understanding].
116136

117-
Finally, several classification algorithms can be evaluated on the vectorized
118-
versions of the trajectories (e.g., Random Forest, SVM, KNN) or, alternatively,
119-
classifiers able to handle variable-size inputs (e.g., LSTM or Transformers [@bae2022transformer]) can be evaluated directly on the trajectory data.
120-
In both cases, typical evaluation metrics for classification are computed
121-
automatically for the model being evaluated.
137+
Users can implement their own method to perform this conversion, and an example of how
138+
to do so can be found in the documentation. By default, `pactus` uses a method that relies on
139+
all the features computed by the `yupi` library.
140+
141+
## Classification models and Evaluation
142+
143+
Several classification algorithms are included in `pactus`. Some of them can be evaluated
144+
on the vectorized versions of the trajectories (e.g., Random Forest, SVM, KNN). In other cases
145+
the classifiers are able to handle variable-size inputs (e.g., LSTM or Transformers
146+
[@bae2022transformer]) and can be evaluated directly on the trajectory data. In both cases,
147+
typical evaluation metrics for classification are computed automatically for the model being evaluated.
148+
149+
## Overview
150+
151+
All the functionalities of the library can be integrated in a single script. \autoref{fig:overview}
152+
shows an example of how to use `pactus` to train and evaluate a Random Forest model using the
153+
Starkey Project dataset, also known as `Animals`.
154+
155+
![Overview of the resources available in `pactus` coupled with an usage example.\label{fig:overview}](1.pdf)
156+
122157

158+
# Conclusions
123159

160+
The software presented with this work, `pactus`, addresses typical challenges faced in trajectory
161+
classification research. By providing researchers with direct access to curated datasets and trajectory
162+
classifiers, `pactus` enhances the availability of resources for evaluation. It is conceived with extensibility
163+
in mind, encouraging researchers to contribute their own datasets and methods. The evaluation methodology ensures
164+
reproducibility and comparability of results, facilitating the identification of effective trajectory classification
165+
methods for specific scenarios. Additionally, `pactus` promotes reproducible research by enabling researchers to
166+
distribute their findings as Python scripts, relying on `pactus` for data acquisition, processing, and model
167+
evaluation. Overall, `pactus` offers a valuable tool for researchers in the field of trajectory classification,
168+
addressing key challenges and facilitating future advancements in the field.
124169

125170
# References
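
Note on the "Feature extraction" section added in this commit: the text describes converting a trajectory of arbitrary length into a fixed-size vector of engineered features. The following is a minimal sketch of that idea only. It does not use the `pactus` or `yupi` API; the function name `featurize`, the uniform sampling interval `dt`, and the particular five features are assumptions chosen for illustration.

```python
import math
from statistics import mean, pstdev


def featurize(points, dt=1.0):
    """Map a 2D trajectory of any length to a fixed-size feature vector.

    Hypothetical helper, not part of pactus or yupi.
    points: list of (x, y) tuples, assumed to be sampled every `dt` time units.
    Returns [mean speed, speed std, max speed, path length, net displacement].
    """
    if len(points) < 2:
        raise ValueError("a trajectory needs at least two points")

    # Distance covered between each pair of consecutive samples.
    steps = [math.dist(p, q) for p, q in zip(points, points[1:])]
    speeds = [step / dt for step in steps]

    return [
        mean(speeds),                      # average speed
        pstdev(speeds),                    # population std of the speed
        max(speeds),                       # peak speed
        sum(steps),                        # total path length
        math.dist(points[0], points[-1]),  # start-to-end displacement
    ]


# Two trajectories of different lengths yield vectors of the same size.
short = [(0, 0), (3, 4)]
longer = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1)]
print(len(featurize(short)), len(featurize(longer)))
```

Because every trajectory maps to a vector of the same length, the result can be passed to classifiers that expect fixed-size inputs, such as the Random Forest, SVM, and KNN models mentioned in the paper.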
