Get Paths ContainerNotFound Error #35572

Closed
ivanthewebber opened this issue May 10, 2024 · 6 comments
Labels
  • Client: This issue points to a problem in the data-plane of the library.
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • needs-team-attention: This issue needs attention from Azure service team or SDK team
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
  • Service Attention: This issue is responsible by Azure service team.
  • Storage: Storage Service (Queues, Blobs, Files)

Comments

@ivanthewebber

  • Package Name: azure-datalake-store
  • Package Version: 0.0.53
  • Operating System: Windows 11
  • Python Version: 3.12

Describe the bug
Despite having a valid container/file system, the SDK's get_paths call does not work: it fails with a ContainerNotFound error.

To Reproduce
Run this code snippet:

from azure.storage.filedatalake import DataLakeServiceClient
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

account_name = "mystorageaccount"
container = "mycontainer"
account_url = f"https://{container}@{account_name}.blob.core.windows.net" # f"https://{account_name}.blob.core.windows.net" # f"https://{account_name}.dfs.core.windows.net"
token_credential = InteractiveBrowserCredential() # DefaultAzureCredential(exclude_interactive_browser_credential=False)

service_client = DataLakeServiceClient(account_url, credential=token_credential)
fs_client = service_client.get_file_system_client(file_system=container)
list(fs_client.get_paths("top_folder")) # error

list(service_client.list_file_systems()) # this works
fsc = service_client.get_file_system_client(list(service_client.list_file_systems())[-1])
list(fsc.get_paths("top_folder")) # error

Expected behavior
Get paths would return the contents of the top folder.

Screenshots

---------------------------------------------------------------------------
ResourceNotFoundError                     Traceback (most recent call last)
File c:\Users\u\Workspace\SplitRuns\temp.py:12
     10 service_client = DataLakeServiceClient(account_url, credential=token_credential)
     11 fs_client = service_client.get_file_system_client(file_system=container)
---> 12 list(fs_client.get_paths())
     14 list(service_client.list_file_systems()) # this works
     15 fsc = service_client.get_file_system_client(list(service_client.list_file_systems())[-1])

File c:\Users\u\AppData\Local\Programs\Python\Python312\Lib\site-packages\azure\core\paging.py:123, in ItemPaged.__next__(self)
    121 if self._page_iterator is None:
    122     self._page_iterator = itertools.chain.from_iterable(self.by_page())
--> 123 return next(self._page_iterator)

File c:\Users\u\AppData\Local\Programs\Python\Python312\Lib\site-packages\azure\core\paging.py:75, in PageIterator.__next__(self)
     73     raise StopIteration("End of paging")
     74 try:
---> 75     self._response = self._get_next(self.continuation_token)
     76 except AzureError as error:
     77     if not error.continuation_token:

File c:\Users\u\AppData\Local\Programs\Python\Python312\Lib\site-packages\azure\storage\filedatalake\_list_paths_helper.py:158, in PathPropertiesPaged._get_next_cb(self, continuation_token)
    150     return self._command(
    151         self.recursive,
    152         continuation=continuation_token or None,
   (...)
    155         upn=self.upn,
    156         cls=return_headers_and_deserialized_path_list)
    157 except HttpResponseError as error:
--> 158     process_storage_error(error)

File c:\Users\u\AppData\Local\Programs\Python\Python312\Lib\site-packages\azure\storage\filedatalake\_deserialize.py:224, in process_storage_error(storage_error)
    220 error.args = (error.message,)
    222 try:
    223     # `from None` prevents us from double printing the exception (suppresses generated layer error context)
--> 224     exec("raise error from None")   # pylint: disable=exec-used # nosec
    225 except SyntaxError as exc:
    226     raise error from exc

File <string>:1

File c:\Users\u\AppData\Local\Programs\Python\Python312\Lib\site-packages\azure\storage\filedatalake\_list_paths_helper.py:150, in PathPropertiesPaged._get_next_cb(self, continuation_token)
    148 def _get_next_cb(self, continuation_token):
    149     try:
--> 150         return self._command(
    151             self.recursive,
    152             continuation=continuation_token or None,
    153             path=self.path,
    154             max_results=self.results_per_page,
    155             upn=self.upn,
    156             cls=return_headers_and_deserialized_path_list)
    157     except HttpResponseError as error:
    158         process_storage_error(error)

File c:\Users\u\AppData\Local\Programs\Python\Python312\Lib\site-packages\azure\core\tracing\decorator.py:78, in distributed_trace.<locals>.decorator.<locals>.wrapper_use_tracer(*args, **kwargs)
     76 span_impl_type = settings.tracing_implementation()
     77 if span_impl_type is None:
---> 78     return func(*args, **kwargs)
     80 # Merge span is parameter is set, but only if no explicit parent are passed
     81 if merge_span and not passed_in_parent:

File c:\Users\u\AppData\Local\Programs\Python\Python312\Lib\site-packages\azure\storage\filedatalake\_generated\operations\_file_system_operations.py:750, in FileSystemOperations.list_paths(self, recursive, request_id_parameter, timeout, continuation, path, max_results, upn, **kwargs)
    747 response = pipeline_response.http_response
    749 if response.status_code not in [200]:
--> 750     map_error(status_code=response.status_code, response=response, error_map=error_map)
    751     error = self._deserialize.failsafe_deserialize(_models.StorageError, pipeline_response)
    752     raise HttpResponseError(response=response, model=error)

File c:\Users\u\AppData\Local\Programs\Python\Python312\Lib\site-packages\azure\core\exceptions.py:164, in map_error(status_code, response, error_map)
    162     return
    163 error = error_type(response=response)
--> 164 raise error

ResourceNotFoundError: The specified container does not exist.
RequestId:REDACTED
Time:2024-05-10T05:58:27.4100535Z
ErrorCode:ContainerNotFound
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>ContainerNotFound</Code><Message>The specified container does not exist.
RequestId:REDACTED
Time:2024-05-10T05:58:27.4100535Z</Message></Error>

Additional context
I've tried a lot of variations because the documentation isn't great, but I'm pretty sure this should work.

@github-actions github-actions bot added Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention This issue is responsible by Azure service team. Storage Storage Service (Queues, Blobs, Files) labels May 10, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jalauzon-msft @vincenttran-msft.

@jalauzon-msft
Member

Hi @ivanthewebber Ivan, a couple of things to check/confirm here:

  • Firstly, you mentioned using azure-datalake-store, but that is not the correct library for ADLS Gen 2 (I believe that's for ADLS Gen 1, or an old, deprecated library). The library you are looking for is azure-storage-file-datalake. That being said, your code looks to be from the correct library, so that's likely not the issue. If you have azure-datalake-store installed though, I would recommend removing it from your environment.
    • Further on that point, also make sure you are using the latest version of azure-storage-file-datalake.
  • The current account_url you have is not correct, but I do see you have some others commented out, which I assume you also tried. To confirm, f"https://{account_name}.dfs.core.windows.net" is the one you want to use here. The others likely will not work.
  • Double-check that you have the correct permissions to see the container. You mention that you can successfully list file systems, so you probably do, but just to be sure: you would need at least the "Storage Blob Data Reader" RBAC role.
  • Lastly, please confirm your Storage Account has "Hierarchical Namespace" enabled. That is a requirement to use the datalake SDK. If not, consider using azure-storage-blob instead.
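
For reference, a minimal sketch of the repro rewritten against the dfs endpoint suggested above. The helper names dfs_account_url and list_directory are hypothetical, the account, file system, and folder names are the placeholders from the original snippet, and DefaultAzureCredential is just one credential choice. The Azure imports are kept inside the function so the URL helper can be checked without the SDK installed.

```python
def dfs_account_url(account_name: str) -> str:
    """Build the Data Lake (dfs) endpoint URL for a storage account.

    The file system name is not part of the URL; it is passed to
    get_file_system_client instead.
    """
    return f"https://{account_name}.dfs.core.windows.net"


def list_directory(account_name: str, file_system: str, directory: str) -> list:
    """Return the names of the paths directly under `directory`.

    Requires azure-storage-file-datalake and azure-identity, plus a
    Storage data-plane role on the account for the signed-in identity.
    """
    # Imported lazily so that dfs_account_url works without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service_client = DataLakeServiceClient(
        dfs_account_url(account_name),
        credential=DefaultAzureCredential(),
    )
    fs_client = service_client.get_file_system_client(file_system=file_system)
    # recursive=False lists only the immediate children of the directory.
    return [p.name for p in fs_client.get_paths(path=directory, recursive=False)]


if __name__ == "__main__":
    # Placeholder names taken from the original repro.
    print(dfs_account_url("mystorageaccount"))
```

Calling list_directory("mystorageaccount", "mycontainer", "top_folder") would then issue the same get_paths request as the original snippet, but against the dfs endpoint.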

@swathipil swathipil added needs-author-feedback More information is needed from author to address the issue. and removed needs-team-attention This issue needs attention from Azure service team or SDK team labels May 10, 2024

Hi @ivanthewebber. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

@ivanthewebber
Author

Thanks, I guess I got mixed up on which of the packages the exception was coming from. Yes, I tried all the commented variations.

I have azure-storage-file-datalake version 12.15.0.

I have Contributor role.

I verified that hierarchical namespace is enabled.

I tried using azure-storage-blob but I am having the same inconsistency (able to list at the container level but can't list blobs). I think it may have something to do with the auth. I've tried using the various credential objects but haven't found one that works.

azure.core.exceptions.HttpResponseError: This request is not authorized to perform this operation using this permission.
RequestId:0bf6e4a0-901e-004b-6129-a3ab51000000
Time:2024-05-10T22:30:00.8573295Z
ErrorCode:AuthorizationPermissionMismatch
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>AuthorizationPermissionMismatch</Code><Message>This request is not authorized to perform this operation using this permission.
RequestId:0bf6e4a0-901e-004b-6129-a3ab51000000
Time:2024-05-10T22:30:00.8573295Z</Message></Error>

@github-actions github-actions bot added needs-team-attention This issue needs attention from Azure service team or SDK team and removed needs-author-feedback More information is needed from author to address the issue. labels May 10, 2024
@jalauzon-msft
Member

Hi @ivanthewebber Ivan,

I have Contributor role.

When you say you have the Contributor role, do you mean the built-in role called just "Contributor" or the Storage-specific "Storage Blob Data Contributor" role? You need a Storage-specific role to interact with most data inside the Storage account (there are also "Storage Blob Data Reader" and "Storage Blob Data Owner"). I don't believe just Contributor is enough. This may explain the issue.
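
As a rough illustration, the two error codes seen in this thread could be turned into troubleshooting hints along these lines. The function names and the hint table are hypothetical and only restate the suggestions made above. Reading the code via an error_code attribute on the exception is an assumption about how the Storage SDK decorates its errors, so it is read defensively with getattr.

```python
from typing import Optional

# Hints restating the advice from this thread; the table itself is illustrative.
_HINTS = {
    "ContainerNotFound": (
        "Check the file system name and the account URL; the dfs endpoint "
        "(https://<account>.dfs.core.windows.net) is the one to use."
    ),
    "AuthorizationPermissionMismatch": (
        "Check that the identity has a Storage-specific data role such as "
        "Storage Blob Data Reader, Storage Blob Data Contributor, or "
        "Storage Blob Data Owner; the built-in Contributor role may not be enough."
    ),
}


def hint_for_error_code(error_code: Optional[str]) -> str:
    """Map a Storage error code to a troubleshooting hint."""
    return _HINTS.get(error_code or "", "No hint available; inspect the full error message.")


def list_paths_with_hint(fs_client, directory: str) -> list:
    """List path names under `directory`, re-raising failures with a hint attached."""
    # Imported lazily so that hint_for_error_code works without the SDK installed.
    from azure.core.exceptions import HttpResponseError

    try:
        return [p.name for p in fs_client.get_paths(path=directory)]
    except HttpResponseError as error:
        # Assumption: the Storage SDK sets `error_code` on the exception.
        code = getattr(error, "error_code", None)
        raise RuntimeError(hint_for_error_code(code)) from error
```

Here fs_client is expected to be a FileSystemClient such as the one returned by get_file_system_client in the original snippet.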

@ivanthewebber
Author

Thanks; that was it.
