Commit 5306587

docs: add gida wiki
1 parent ab6ec5e commit 5306587

7 files changed

+174
-0
lines changed

ditec/docs/gida/installation.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# Manually download the data
2+
TODO: Update later
3+
4+
# Install the Data Interface
5+
Currently, GiDA is available for Python >= 3.10. Please set up a virtual environment before installation.
6+
7+
Because some libraries depend on your OS and CUDA version, you should install them separately as follows:
8+
9+
1. Install [PyTorch >= 2.3](https://pytorch.org/get-started/locally/)
10+
2. Install [PyG >= 2.3](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html)
11+
12+
At this time, GiDA works best on PyTorch and PyG 2.3.
13+
14+
Afterwards, you can clone GiDA or install it via pip:
15+
16+
```bash
pip install git+https://github.com/DiTEC-project/DiTEC_WDN_dataset.git
```
19+
20+
Tada! The GiDA data interface has been installed!
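
To double-check the environment after installation, a minimal sketch such as the following can report the Python version and the installed versions of PyTorch and PyG. It uses only the standard library. The helper name `report_environment` is hypothetical (it is not part of GiDA), and `torch` and `torch-geometric` are assumed to be the distribution names under which the two libraries were installed.

```python
import sys
from importlib import metadata


def report_environment(packages=("torch", "torch-geometric")):
    """Return the Python version and the installed version of each package.

    Packages that are not installed are reported as None.
    """
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in report_environment().items():
        print(f"{name}: {version or 'not installed'}")
    if sys.version_info < (3, 10):
        print("Warning: GiDA requires Python >= 3.10")
```

If either library is reported as not installed, revisit the two installation steps above before installing GiDA.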
21+

ditec/docs/gida/overview.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
# GiDA - The Gigantic Dataset
2+
3+
This work includes a collection of synthetic scenarios derived from 36 **Water Distribution Networks (WDNs)**.
4+
5+
For clarity, let us first introduce a few key concepts:
6+
7+
* **Scenario** denotes a sequence of snapshots.
8+
9+
* **Snapshot** represents a measured steady state of a particular WDN and is often modelled as an undirected graph.
10+
11+
* **Input parameters** include simulation inputs, such as demands, pipe diameters, and so on.
12+
13+
* **Output parameters** include the simulation outcomes that researchers are interested in (e.g., pressure, flow rate, head, etc.).
14+
15+
Both kinds of parameters are described as node or edge features in the snapshot graph. Their values vary but are temporally correlated with those of other snapshots in the **same** scenario.
16+
However, in GiDA, two scenarios are treated as completely different WDNs even if they originate from the same network.
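
The concepts above can be pictured with a small data-structure sketch. The `Snapshot` and `Scenario` classes below are hypothetical and only illustrate the definitions; they are not GiDA's actual classes, and all numbers are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """One steady state of a WDN, modelled as an undirected graph."""

    edges: list[tuple[int, int]]  # each pair (u, v) is an undirected link
    node_features: dict[str, list[float]] = field(default_factory=dict)
    edge_features: dict[str, list[float]] = field(default_factory=dict)


@dataclass
class Scenario:
    """A sequence of snapshots; in GiDA each scenario is treated as its own WDN."""

    snapshots: list[Snapshot] = field(default_factory=list)


# Toy network: 3 nodes, 2 pipes. All values are made up for illustration.
edges = [(0, 1), (1, 2)]
scenario = Scenario(
    snapshots=[
        Snapshot(edges, {"demand": [0.0, 1.5, 2.0]}, {"pipe_diameter": [0.3, 0.2]}),
        Snapshot(edges, {"demand": [0.0, 1.7, 1.9]}, {"pipe_diameter": [0.3, 0.2]}),
    ]
)
print(len(scenario.snapshots))  # number of snapshots in this scenario
```

Here the demand changes from one snapshot to the next while the topology stays fixed, which mirrors the temporal correlation within a single scenario.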
17+
18+
19+
20+
# Acknowledgement
21+
This work is funded by the project DiTEC: Digital Twin for Evolutionary Changes in Water Networks (NWO 19454).
22+
23+
# Citing GiDA
24+
25+
* For the up-to-date dataset and interface, please use this:
26+
```
TODO: UPDATE LATER
```
29+
30+
* For the older dataset versions, please use this:
31+
```tex
@article{tello2024largescale,
    AUTHOR = {Tello, Andrés and Truong, Huy and Lazovik, Alexander and Degeler, Victoria},
    TITLE = {Large-Scale Multipurpose Benchmark Datasets for Assessing Data-Driven Deep Learning Approaches for Water Distribution Networks},
    JOURNAL = {Engineering Proceedings},
    VOLUME = {69},
    YEAR = {2024},
    NUMBER = {1},
    ARTICLE-NUMBER = {50},
    URL = {https://www.mdpi.com/2673-4591/69/1/50},
    ISSN = {2673-4591},
    DOI = {10.3390/engproc2024069050}
}
```

ditec/docs/gida/parameters.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
# Parameters
2+
An input attribute is named `<component>_<attribute>`. Node components include `reservoir`, `junction`, and `tank`, while edge components include `pipe`, `headpump`, `powerpump`, etc.
3+
4+
Tip: Open the `.zip` file to see the available attributes, which appear as file names (csv) or folder names (zarr).
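
As an illustration of the naming convention, the following minimal sketch splits an attribute name into its component and attribute parts. The function `split_attribute` is hypothetical and not part of GiDA, and the component lists contain only the names mentioned above, so they are likely incomplete.

```python
NODE_COMPONENTS = ("reservoir", "junction", "tank")
EDGE_COMPONENTS = ("pipe", "headpump", "powerpump")  # the list above ends with "etc."


def split_attribute(name):
    """Split '<component>_<attribute>' into (kind, component, attribute).

    Names without a known component prefix are returned with
    kind and component set to None.
    """
    for kind, components in (("node", NODE_COMPONENTS), ("edge", EDGE_COMPONENTS)):
        for component in components:
            prefix = component + "_"
            if name.startswith(prefix):
                return kind, component, name[len(prefix):]
    return None, None, name


print(split_attribute("junction_base_demand"))  # ('node', 'junction', 'base_demand')
print(split_attribute("pipe_diameter"))         # ('edge', 'pipe', 'diameter')
print(split_attribute("pressure"))              # (None, None, 'pressure')
```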
5+
6+
On the other hand, simulation outputs are attributes that have no component prefix (e.g., velocity, pressure, etc.). They concatenate the features of all components of the same type (node or link). Therefore, we might encounter a size mismatch when trying to stack parameters that cover different sets of components. Consider this example:
7+
```python
from gigantic_dataset.core.datasets import GidaV6

# This should raise an error
GidaV6(
    zip_file_paths=[
        r"./Dataset/simgen_Anytown_20240524_1202_csvdir_20240527_1205.zip",  # Anytown dataset
    ],
    node_attrs=[
        "junction_base_demand",  # load junction base_demand (#junctions)
        ("reservoir_base_head", "junction_elevation", "tank_elevation"),  # load node elevation (#reservoirs + #tanks + #junctions)
    ],
    num_records=100,  # take only 100 records
)
```
20+
Intuitively, the sizes of `junction_base_demand` and the tuple of elevation-related parameters are inconsistent. However, we sometimes want to define `node_attrs` in this way.\
21+
To solve this, GiDA offers the `*` prefix, which marks a parameter whose size is smaller than that of the others. Let's fix the above example:
22+
```python
from gigantic_dataset.core.datasets import GidaV6

GidaV6(
    zip_file_paths=[
        r"./Dataset/simgen_Anytown_20240524_1202_csvdir_20240527_1205.zip",  # Anytown dataset
    ],
    node_attrs=[
        "*junction_base_demand",  # load junction base_demand (#junctions), marked with an asterisk
        ("reservoir_base_head", "junction_elevation", "tank_elevation"),  # load node elevation (#reservoirs + #tanks + #junctions)
    ],
    num_records=100,  # take only 100 records
)
```
34+
In this way, GiDA pads the asterisk-marked (incomplete) parameters to match the size of the tuple or of the non-asterisk parameters.
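
The padding value and the node ordering used by GiDA are not specified here, so the following is only a conceptual sketch of what padding a junction-only parameter to the full node set could look like. It assumes zero padding and that the caller knows the position of each junction in the node ordering; `pad_to_nodes` is a hypothetical helper, not a GiDA function.

```python
def pad_to_nodes(values, positions, num_nodes, fill=0.0):
    """Scatter per-component values into a full-length node vector.

    values    -- one value per component instance (e.g. per junction)
    positions -- node index of each instance in the full node ordering
    fill      -- assumed padding value for nodes without this attribute
    """
    if len(values) != len(positions):
        raise ValueError("values and positions must have the same length")
    padded = [fill] * num_nodes
    for value, index in zip(values, positions):
        padded[index] = value
    return padded


# Toy network with 4 nodes: node 0 is a reservoir, nodes 1-3 are junctions.
junction_base_demand = [1.5, 2.0, 0.8]
print(pad_to_nodes(junction_base_demand, positions=[1, 2, 3], num_nodes=4))
# [0.0, 1.5, 2.0, 0.8]
```

After padding, the junction-only vector has one entry per node and can be stacked with the elevation tuple, which already covers every node.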

ditec/docs/gida/quickstart.md

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
2+
# Download datasets
3+
TODO: UPDATE LATER.
4+
## GiDA-V1
5+
Please go to [GiDA-V1](https://zenodo.org/records/11353195), download the dataset, and place it in a folder, say `./Dataset`.
6+
7+
8+
# Tutorial
9+
First-time users should refer to the `datasets.py` script and review the `GidaV6.__init__` function. A minimal example is also provided at the end of the script.
10+
11+
The data interface `GidaV6` takes node (and edge) attributes and outputs a set of records. Each record is a `Data` instance (see [here](https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.data.Data.html#torch_geometric.data.Data) for more information). This `Data` object contains a snapshot graph described by the (sparse) adjacency matrix A, node features X, and edge features E. If labels are available, the label Y corresponds to either nodes or edges. When both the node and edge sets have their own labels, Y holds the node labels, while E_Y holds the edge labels.
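
In PyG, a sparse adjacency matrix is stored as `edge_index`, a 2 x E array in COO format, and an undirected graph lists each edge in both directions. The sketch below builds such a structure with plain Python lists so that it runs without PyTorch. The helper `to_edge_index` is hypothetical, and whether GiDA stores both directions of each link is an assumption.

```python
def to_edge_index(edges):
    """Convert undirected (u, v) pairs into COO form [sources, targets].

    Each undirected edge is emitted in both directions.
    """
    sources, targets = [], []
    for u, v in edges:
        sources += [u, v]
        targets += [v, u]
    return [sources, targets]


# Toy network: 3 nodes connected in a line by 2 pipes.
print(to_edge_index([(0, 1), (1, 2)]))
# [[0, 1, 1, 2], [1, 0, 2, 1]]
```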
12+
13+
Suppose you want to load the training set of the Anytown network. A very simple interface can be declared as follows:
14+
```python
from gigantic_dataset.core.datasets import GidaV6

gida = GidaV6(
    zip_file_paths=[
        r"./Dataset/simgen_Anytown_20240524_1202_csvdir_20240527_1205.zip",  # Anytown dataset
    ],
    node_attrs=[
        "demand",  # load nodal demand
    ],
    edge_attrs=["pipe_diameter", "pipe_length"],  # load some edge properties
    label_attrs=["pressure"],  # expect labels Y to be pressure
    edge_label_attrs=["flowrate"],  # expect edge labels E_Y to be flow rate
    split_set="train",  # take the train set only
    num_records=100,  # take only 100 records
    selected_snapshots=None,  # take all snapshots
)
# You can access a record directly
print(gida[0])  # Data instance
# Or via a data loader
from torch_geometric.loader import DataLoader

loader = DataLoader(gida, batch_size=1)
print(next(iter(loader)))  # Batch instance
```

ditec/docs/index.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# Tutorial for writers
2+
3+
This is the default view and should not be overridden.
4+
5+
To create a wiki for your project, say `Project A`, follow these steps:
6+
7+
1. Clone the project from [here](https://github.com/DiTEC-project/DiTEC-project.github.io).
8+
9+
2. Run `pip install mkdocs` to start working on the wiki with MkDocs.
10+
11+
3. Create a new directory whose name matches your project name.
12+
13+
4. Add Markdown files to the directory. Note that each Markdown file represents a page of the wiki.
14+
15+
5. Check `mkdocs.yml` and lay out your wiki structure.
16+
17+
6. For development, run `mkdocs serve` to preview the wiki on localhost.
18+
19+
7. For deployment to GitHub Pages, follow [this tutorial](https://www.mkdocs.org/user-guide/deploying-your-docs/).
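
As a rough illustration of steps 3 to 5, a `nav` entry in `mkdocs.yml` for the new project might look like the fragment below. The directory name `project_a` and the page `overview.md` are hypothetical placeholders; use your own directory and file names.

```yaml
nav:
  - Project A:
      - Overview: project_a/overview.md
```

Paths in `nav` are relative to the `docs` directory, so `project_a/overview.md` here refers to `docs/project_a/overview.md`.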
20+
21+

ditec/mkdocs.yml

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
1+
site_name: DiTEC
theme: readthedocs
nav:
  - GiDA - The Gigantic Dataset:
      - Introduction:
          - Overview: gida/overview.md
          - Install GiDA: gida/installation.md
          - Quickstart: gida/quickstart.md
      - Tutorials:
          - Datasets: gida/datasets.md
          - Parameter stacking: gida/parameters.md
          - Simulation Configuration: gida/simconfig_tut.md
          - Scenario generation: gida/scene_gen.md
      - Advanced Topics:
          - Hydraulic Parameter Optimization: gida/hpo.md
          - Naive/Manual HPO: gida/naive_manual_hpo.md
          - PSO: gida/pso.md

requirements.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
mkdocs
