Logs do not show up for the steps that ran on GCP.
I've traced it to gs_tail.py. It reuses the same blob object (self._blob_client) to fetch the latest logs. However, after this object is initialised, its generation field is set to the file's latest generation value at that time. When the file is updated, that generation value becomes invalid and the code raises a NotFound error.
404 GET https://storage.googleapis.com/download/storage/v1/b/testbucket/o/tf-full-stack-sysroot%2FSimpleTestFlow%2F18%2Frun_on_cpu_remote%2F274258%2F0.task_stdout.log?alt=media&generation=1709809091417399: No such object: testbucket/tf-full-stack-sysroot/SimpleTestFlow/18/run_on_cpu_remote/274258/0.task_stdout.log: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
It can be reproduced as follows:
>>> from google.cloud import storage
>>> b = storage.Client().bucket("testbucket").blob("hello.json")
>>> b.download_as_bytes()
b'{\n "text": "Hello from the file in the bucket"\n}'
>>> # the same file was re-uploaded (overwritten) here
>>> b.download_as_bytes()
Traceback (most recent call last):
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/cloud/storage/client.py", line 1151, in download_blob_to_file
    blob_or_uri._do_download(
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/cloud/storage/blob.py", line 989, in _do_download
    response = download.consume(transport, timeout=timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/resumable_media/requests/download.py", line 237, in consume
    return _request_helpers.wait_and_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/resumable_media/requests/_request_helpers.py", line 155, in wait_and_retry
    response = func()
               ^^^^^^
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/resumable_media/requests/download.py", line 219, in retriable_request
    self._process_response(result)
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/resumable_media/_download.py", line 188, in _process_response
    _helpers.require_status_code(
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/resumable_media/_helpers.py", line 108, in require_status_code
    raise common.InvalidResponse(
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/cloud/storage/blob.py", line 1401, in download_as_bytes
    client.download_blob_to_file(
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/cloud/storage/client.py", line 1164, in download_blob_to_file
    _raise_from_invalid_response(exc)
  File "/Users/erdememekligil/miniconda3/envs/gcp-metaflow/lib/python3.11/site-packages/google/cloud/storage/blob.py", line 4457, in _raise_from_invalid_response
    raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.NotFound: 404 GET https://storage.googleapis.com/download/storage/v1/b/testbucket/o/hello.json?alt=media&generation=1709894045845102: No such object: testbucket/hello.json: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
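One possible workaround, sketched here without touching a real bucket, is to request a fresh blob handle for every read instead of reusing one that has a generation pinned. Everything in this sketch is hypothetical: `FakeBucket`, `FakeBlob`, the local `NotFound` class and `read_latest` are illustrative stand-ins, and the assumption that a handle gets pinned on its first successful download is taken from the behaviour observed above, not from the library's source.

```python
class NotFound(Exception):
    """Local stand-in for google.api_core.exceptions.NotFound."""


class FakeBucket:
    """In-memory stand-in for a bucket; every upload bumps the generation."""

    def __init__(self):
        self._objects = {}  # object name -> (generation, data)

    def upload(self, name, data):
        previous_generation = self._objects.get(name, (0, b""))[0]
        self._objects[name] = (previous_generation + 1, data)

    def blob(self, name):
        # Returns a new handle with no generation pinned yet.
        return FakeBlob(self, name)


class FakeBlob:
    """Handle that pins the generation it saw on its first download (assumed)."""

    def __init__(self, bucket, name):
        self._bucket = bucket
        self.name = name
        self.generation = None

    def download_as_bytes(self):
        if self.name not in self._bucket._objects:
            raise NotFound(self.name)
        live_generation, data = self._bucket._objects[self.name]
        # A pinned generation that no longer matches the live object fails.
        if self.generation is not None and self.generation != live_generation:
            raise NotFound(f"{self.name} (generation={self.generation})")
        self.generation = live_generation
        return data


def read_latest(bucket, name):
    """Hypothetical helper: read through a fresh handle on every call."""
    return bucket.blob(name).download_as_bytes()


bucket = FakeBucket()
bucket.upload("hello.json", b"v1")

stale = bucket.blob("hello.json")
first = stale.download_as_bytes()   # pins the handle to generation 1

bucket.upload("hello.json", b"v2")  # overwrite: the object is now generation 2

try:
    stale.download_as_bytes()
    stale_failed = False
except NotFound:
    stale_failed = True             # the reused handle fails, as in the issue

latest = read_latest(bucket, "hello.json")  # a fresh handle sees the new data
```

With the real client, the object returned by `storage.Client().bucket(...)` also exposes `blob(name)`, and blobs expose `download_as_bytes()`, so a helper shaped like `read_latest` could in principle be applied there. Whether that is the right fix for gs_tail.py has not been verified here.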
I'm not sure why this error hasn't happened until now. I've also tested it with older versions of the Google components.
metaflow/metaflow/plugins/gcp/gs_tail.py
Line 49 in cbf9b7f