|
62 | 62 | "In DeepSensor, a `Task` is a `dict`-like data structure that contains context sets, target sets, and other metadata.\n",
|
63 | 63 | "Before diving into the [](./task_loader) class which generates `Task` objects from `xarray` and `pandas` objects,\n",
|
64 | 64 | "we will first introduce the `Task` class itself.\n",
|
65 |
| - "In the code cell below, `task` is a `Task` object.\n", |
66 |
| - "Printing a `Task` will print each of its entries and replace numerical arrays with their shape for convenience." |
| 65 | + "\n", |
| 66 | + "First, we will generate a `Task` using DeepSensor. These code cells are kept hidden because they include\n", |
| 67 | + "features that are only covered later in the User Guide. Only expand them if you are curious!" |
67 | 68 | ]
|
68 | 69 | },
|
69 | 70 | {
|
70 | 71 | "cell_type": "code",
|
71 | 72 | "execution_count": 1,
|
| 73 | + "metadata": { |
| 74 | + "ExecuteTime": { |
| 75 | + "start_time": "2023-11-01T14:28:15.732009455Z" |
| 76 | + }, |
| 77 | + "collapsed": false, |
| 78 | + "tags": [ |
| 79 | + "hide-cell" |
| 80 | + ] |
| 81 | + }, |
72 | 82 | "outputs": [
|
73 | 83 | {
|
74 | 84 | "name": "stderr",
|
|
106 | 116 | "era5_ds = data_processor(era5_raw_ds)\n",
|
107 | 117 | "aux_ds, land_mask_ds = data_processor([auxiliary_raw_ds, land_mask_raw_ds], method=\"min_max\")\n",
|
108 | 118 | "station_df = data_processor(station_raw_df)"
|
109 |
| - ], |
110 |
| - "metadata": { |
111 |
| - "collapsed": false, |
112 |
| - "tags": [ |
113 |
| - "hide-cell" |
114 |
| - ], |
115 |
| - "ExecuteTime": { |
116 |
| - "start_time": "2023-11-01T14:28:15.732009455Z" |
117 |
| - } |
118 |
| - } |
| 119 | + ] |
119 | 120 | },
|
120 | 121 | {
|
121 | 122 | "cell_type": "code",
|
122 | 123 | "execution_count": 2,
|
123 | 124 | "metadata": {
|
124 |
| - "tags": [ |
125 |
| - "remove-cell" |
126 |
| - ], |
127 | 125 | "ExecuteTime": {
|
128 | 126 | "end_time": "2023-11-01T14:32:15.553656830Z",
|
129 | 127 | "start_time": "2023-11-01T14:32:15.548454739Z"
|
130 |
| - } |
| 128 | + }, |
| 129 | + "tags": [ |
| 130 | + "hide-cell" |
| 131 | + ] |
131 | 132 | },
|
132 | 133 | "outputs": [],
|
133 | 134 | "source": [
|
|
136 | 137 | "task = task_loader(\"2016-06-25\", context_sampling=[52, 112], target_sampling=245)"
|
137 | 138 | ]
|
138 | 139 | },
|
| 140 | + { |
| 141 | + "cell_type": "markdown", |
| 142 | + "metadata": {}, |
| 143 | + "source": [ |
| 144 | + "In the code cell below, `task` is a `Task` object.\n", |
| 145 | + "Printing a `Task` will print each of its entries and replace numerical arrays with their shape for convenience." |
| 146 | + ] |
| 147 | + }, |
139 | 148 | {
|
140 | 149 | "cell_type": "code",
|
141 | 150 | "execution_count": 3,
|
|
178 | 187 | },
|
179 | 188 | {
|
180 | 189 | "cell_type": "markdown",
|
| 190 | + "metadata": { |
| 191 | + "collapsed": false |
| 192 | + }, |
181 | 193 | "source": [
|
182 | 194 | "**Exercise:**\n",
|
183 | 195 | "\n",
|
|
188 | 200 | "- The number of target sets\n",
|
189 | 201 | "- The number of observations in each target set\n",
|
190 | 202 | "- The dimensionality of each target set\n"
|
191 |
| - ], |
192 |
| - "metadata": { |
193 |
| - "collapsed": false |
194 |
| - } |
| 203 | + ] |
195 | 204 | },
|
196 | 205 | {
|
197 | 206 | "cell_type": "markdown",
|
|
206 | 215 | },
|
207 | 216 | {
|
208 | 217 | "cell_type": "markdown",
|
| 218 | + "metadata": { |
| 219 | + "collapsed": false |
| 220 | + }, |
209 | 221 | "source": [
|
210 | 222 | "### Gridded data in Tasks\n",
|
211 | 223 | "\n",
|
212 | 224 | "For convenience, data that lies on a regular grid is given a compact tuple representation for the `\"X\"` entries:"
|
213 |
| - ], |
214 |
| - "metadata": { |
215 |
| - "collapsed": false |
216 |
| - } |
| 225 | + ] |
217 | 226 | },
|
218 | 227 | {
|
219 | 228 | "cell_type": "code",
|
220 | 229 | "execution_count": 4,
|
221 |
| - "outputs": [], |
222 |
| - "source": [ |
223 |
| - "task_with_gridded_data = task_loader(\"2016-06-25\", context_sampling=[\"all\", \"all\"], target_sampling=245)" |
224 |
| - ], |
225 | 230 | "metadata": {
|
226 |
| - "collapsed": false, |
227 | 231 | "ExecuteTime": {
|
228 | 232 | "end_time": "2023-11-01T14:32:15.620494504Z",
|
229 | 233 | "start_time": "2023-11-01T14:32:15.570462444Z"
|
230 |
| - } |
231 |
| - } |
| 234 | + }, |
| 235 | + "collapsed": false |
| 236 | + }, |
| 237 | + "outputs": [], |
| 238 | + "source": [ |
| 239 | + "task_with_gridded_data = task_loader(\"2016-06-25\", context_sampling=[\"all\", \"all\"], target_sampling=245)" |
| 240 | + ] |
232 | 241 | },
|
233 | 242 | {
|
234 | 243 | "cell_type": "code",
|
235 | 244 | "execution_count": 5,
|
| 245 | + "metadata": { |
| 246 | + "ExecuteTime": { |
| 247 | + "end_time": "2023-11-01T14:32:15.628949091Z", |
| 248 | + "start_time": "2023-11-01T14:32:15.611675646Z" |
| 249 | + }, |
| 250 | + "collapsed": false |
| 251 | + }, |
236 | 252 | "outputs": [
|
237 | 253 | {
|
238 | 254 | "name": "stdout",
|
|
249 | 265 | ],
|
250 | 266 | "source": [
|
251 | 267 | "print(task_with_gridded_data)"
|
252 |
| - ], |
253 |
| - "metadata": { |
254 |
| - "collapsed": false, |
255 |
| - "ExecuteTime": { |
256 |
| - "end_time": "2023-11-01T14:32:15.628949091Z", |
257 |
| - "start_time": "2023-11-01T14:32:15.611675646Z" |
258 |
| - } |
259 |
| - } |
| 268 | + ] |
260 | 269 | },
|
261 | 270 | {
|
262 | 271 | "cell_type": "markdown",
|
263 |
| - "source": [ |
264 |
| - "In the above example, the first context set lies on a 141 x 221 grid, and the second context set lies on a 140 x 220 grid." |
265 |
| - ], |
266 | 272 | "metadata": {
|
267 | 273 | "collapsed": false
|
268 |
| - } |
| 274 | + }, |
| 275 | + "source": [ |
| 276 | + "In the above example, the first context set lies on a 141 x 221 grid, and the second context set lies on a 140 x 220 grid." |
| 277 | + ] |
269 | 278 | },
|
270 | 279 | {
|
271 | 280 | "cell_type": "markdown",
|
|
306 | 315 | },
|
307 | 316 | {
|
308 | 317 | "cell_type": "markdown",
|
| 318 | + "metadata": { |
| 319 | + "collapsed": false |
| 320 | + }, |
309 | 321 | "source": [
|
310 | 322 | "Gridded data in a `Task` can be flattened using the `.flatten_gridded_data` method.\n",
|
311 | 323 | "Notice how the `\"X\"` entries are now 2D arrays of shape `(2, M)` rather than tuples of two 1D arrays of shape `(M,)`."
|
312 |
| - ], |
313 |
| - "metadata": { |
314 |
| - "collapsed": false |
315 |
| - } |
| 324 | + ] |
316 | 325 | },
|
317 | 326 | {
|
318 | 327 | "cell_type": "code",
|
319 | 328 | "execution_count": 7,
|
| 329 | + "metadata": { |
| 330 | + "ExecuteTime": { |
| 331 | + "end_time": "2023-11-01T14:32:15.970618528Z", |
| 332 | + "start_time": "2023-11-01T14:32:15.909066194Z" |
| 333 | + }, |
| 334 | + "collapsed": false |
| 335 | + }, |
320 | 336 | "outputs": [
|
321 | 337 | {
|
322 | 338 | "name": "stdout",
|
|
333 | 349 | ],
|
334 | 350 | "source": [
|
335 | 351 | "print(task_with_gridded_data.flatten_gridded_data())"
|
336 |
| - ], |
337 |
| - "metadata": { |
338 |
| - "collapsed": false, |
339 |
| - "ExecuteTime": { |
340 |
| - "end_time": "2023-11-01T14:32:15.970618528Z", |
341 |
| - "start_time": "2023-11-01T14:32:15.909066194Z" |
342 |
| - } |
343 |
| - } |
| 352 | + ] |
344 | 353 | }
|
345 | 354 | ],
|
346 | 355 | "metadata": {
|
|
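The notebook text in this diff says that flattening turns each gridded `"X"` entry from a tuple of two 1D coordinate arrays into a single 2D array of shape `(2, M)`. The shape change can be illustrated with plain NumPy. This is only a sketch under stated assumptions: `flatten_grid` is a hypothetical helper, not DeepSensor's `.flatten_gridded_data` implementation, and the coordinate values are made up. Only the 141 x 221 grid size is taken from the notebook.

```python
import numpy as np

# Made-up coordinates for a 141 x 221 grid (only the grid size comes from the notebook).
x1 = np.linspace(0.0, 1.0, 141)
x2 = np.linspace(0.0, 1.0, 221)
gridded_X = (x1, x2)  # compact tuple representation: one 1D array per axis


def flatten_grid(X):
    """Hypothetical helper: expand a (x1, x2) tuple into a (2, M) array of points."""
    a, b = X
    # Build the full grid of coordinate pairs, then unroll each axis into one row.
    A, B = np.meshgrid(a, b, indexing="ij")
    return np.stack([A.ravel(), B.ravel()], axis=0)


flat_X = flatten_grid(gridded_X)
print(flat_X.shape)  # (2, 31161), since 141 * 221 = 31161
```

The tuple form stores 141 + 221 coordinate values, while the flattened form stores 2 x 31161. That difference is one plausible reason for keeping gridded data compact until a flat list of points is needed.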