htsget.md
+31-1
@@ -340,6 +340,22 @@ For file formats whose specification describes a header and a body, the class in
340
340
Either all or none of the URLs in the response MUST have a class attribute.
341
341
If `class` fields are not supplied, no assumptions can be made about which data blocks contain headers, body records, or parts of both.
342
342
</td></tr>
343
+
<tr markdown="block"><td>
344
+
345
+
`ETag`
346
+
_optional string_
347
+
</td><td>
348
+
349
+
The _entity-tag_ that would be returned to a request for the URL.
350
+
</td></tr>
351
+
<tr markdown="block"><td>
352
+
353
+
`Last-Modified`
354
+
_optional string_
355
+
</td><td>
356
+
357
+
The last modification _HTTP-date_ that would be returned to a request for the URL.
358
+
</td></tr>
343
359
</table>
344
360
345
361
</td></tr>
@@ -404,7 +420,7 @@ An example of a JSON response is:
404
420
Although the blocks must ultimately be concatenated in the given order, the client may fetch them in parallel and/or reuse cached data from URLs that it has previously downloaded.
405
421
406
422
When making a series of requests to fetch reads or variants within different regions of the same `<id>` resource, clients may wish to avoid re-fetching the SAM/CRAM/VCF headers each time, especially if they are large.
407
-
If the ticket contains `class` fields, the client may reuse previously downloaded and parsed headers rather than re-fetching the `header`-class URLs.
423
+
If the ticket contains `class` fields and/or cache control fields, the client may reuse previously downloaded and parsed headers rather than re-fetching the `header`-class URLs, as described below.
408
424
409
425
### HTTPS data block URLs
410
426
@@ -429,6 +445,20 @@ The client obtains the data block by decoding the embedded base64 payload.
429
445
430
446
Note: the base64 text should not be additionally percent encoded.
431
447
448
+
### Avoiding re-fetching ticket array URLs
449
+
450
+
Clients may use `class` fields and the usual HTTP cache control mechanisms to avoid re-fetching URLs in the ticket array whose contents the client has already downloaded.
451
+
For example, when making multiple requests to fetch reads (or variants) within different regions of the same `<id>` resource, usually the SAM/CRAM (or VCF) headers will not change between requests.
452
+
When the headers are large and the requested regions are small, the headers will constitute most of the downloaded data and it will be advantageous to avoid re-fetching this unchanged data.
453
+
454
+
If classes are specified in the ticket, zero or more of the entries at the start of the `urls` array will have class `header`.
455
+
When the client has previously downloaded the resource's SAM/CRAM/VCF headers, it may reuse these known headers rather than re-fetching the `header`-class URLs.
456
+
(The boundary between the contents of the final `header` URL and the first `body` URL must be at the start of the first data record, as described in FIXME FOOTNOTE FOR BAM/CRAM/VCF/BCF.
457
+
If the resource is BGZF-compressed, the end of the contents of the final `header` URL must be the end of a BGZF block.)
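
A minimal sketch of how a client might apply this rule, in Python. The function name `split_header_urls` and the sample ticket entries are illustrative assumptions, not part of the specification. The sketch relies only on the statement above that `header`-class entries, if any, sit at the start of the `urls` array.

```python
from typing import Any, Dict, List, Tuple

UrlEntry = Dict[str, Any]


def split_header_urls(urls: List[UrlEntry]) -> Tuple[List[UrlEntry], List[UrlEntry]]:
    """Split a ticket's `urls` array into (leading header entries, remaining entries).

    If no entry carries a `class` field, nothing can be assumed about where
    the headers end, so every entry is returned as "remaining".
    """
    if not any("class" in entry for entry in urls):
        return [], list(urls)
    count = 0
    for entry in urls:
        if entry.get("class") != "header":
            break
        count += 1
    return urls[:count], urls[count:]


# Hypothetical ticket contents, for illustration only.
urls = [
    {"url": "https://example.org/data/header", "class": "header"},
    {"url": "https://example.org/data/block1", "class": "body"},
    {"url": "https://example.org/data/block2", "class": "body"},
]
headers, rest = split_header_urls(urls)
# A client that already holds the parsed headers would fetch only `rest`.
```
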
458
+
459
+
Clients SHOULD use the usual HTTP caching facilities (`Cache-Control`; `ETag`/`If-None-Match` and/or `Last-Modified`/`If-Modified-Since`) to ensure that reused cached data is still valid.
460
+
If the server has provided `ETag` or `Last-Modified` ticket fields for a particular URL, the client can compare them with the validators stored alongside its cached copy and, if they match, avoid making even a request/304 round trip for that URL.
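
The validation logic might be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, the cached validators are assumed to be stored as a plain dictionary keyed by header name, entity-tags are compared as exact strings, and `Cache-Control` freshness is not modelled.

```python
from typing import Dict


def cached_copy_is_current(ticket_entry: Dict[str, str], cached: Dict[str, str]) -> bool:
    """Return True if the ticket's validators match those of the cached copy.

    `ticket_entry` is one element of the ticket's `urls` array; `cached` holds
    the ETag / Last-Modified values saved when the URL was last downloaded
    (an assumed storage layout).
    """
    etag = ticket_entry.get("ETag")
    if etag is not None:
        # Prefer the entity-tag when the ticket supplies one.
        return cached.get("ETag") == etag
    last_modified = ticket_entry.get("Last-Modified")
    if last_modified is not None:
        return cached.get("Last-Modified") == last_modified
    # No validators in the ticket: fall back to a conditional request.
    return False


def conditional_headers(cached: Dict[str, str]) -> Dict[str, str]:
    """Build request headers for revalidating a cached copy with the server."""
    headers = {}
    if "ETag" in cached:
        headers["If-None-Match"] = cached["ETag"]
    if "Last-Modified" in cached:
        headers["If-Modified-Since"] = cached["Last-Modified"]
    return headers
```

When `cached_copy_is_current` returns `True`, the client can reuse its cached data without contacting the data server. Otherwise it can send the headers from `conditional_headers` and reuse the cached data on a 304 response.
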
461
+
432
462
### Reliability & performance considerations
433
463
434
464
To provide robustness to sporadic transfer failures, servers should divide large payloads into multiple data blocks in the `urls` array. Then if the transfer of any one block fails, the client can retry that block and carry on, instead of starting all over. Clients may also fetch blocks in parallel, which can improve throughput.
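
A rough client-side sketch of per-block retries combined with parallel fetching, in Python. The names `fetch_with_retry` and `fetch_all` are hypothetical, and `fetch` stands for whatever function the client uses to download one URL as bytes; it is assumed to raise `OSError` (the base class of `urllib.error.URLError`) on a transfer failure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def fetch_with_retry(fetch: Callable[[str], bytes], url: str, attempts: int = 3) -> bytes:
    """Fetch one data block, retrying only that block on failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for remaining in range(attempts - 1, -1, -1):
        try:
            return fetch(url)
        except OSError:
            if remaining == 0:
                raise  # out of attempts: propagate the last failure
    raise AssertionError("unreachable")


def fetch_all(fetch: Callable[[str], bytes], urls: List[str],
              attempts: int = 3, workers: int = 4) -> bytes:
    """Fetch blocks in parallel and concatenate them in ticket order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, whatever order the
        # downloads actually complete in.
        blocks = pool.map(lambda u: fetch_with_retry(fetch, u, attempts), urls)
        return b"".join(blocks)
```

Because each block is retried independently, one failed transfer costs only that block's bytes rather than the whole payload.
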