Roadmap #8

Open
3 of 9 tasks
shink opened this issue Sep 19, 2024 · 3 comments
shink self-assigned this on Sep 30, 2024

shink commented Sep 30, 2024

pip install torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu

1. pytorch/examples

BACKEND_DEVICE=npu ./run_python_examples.sh

2. pytorch/benchmark

python run_benchmark.py test_bench --accuracy --device npu --test eval --output ascend_npu_benchmark.json

3. huggingface/timm

4. huggingface/transformers

apt install libsndfile1

pip install -r examples/pytorch/_tests_requirements.txt

pytest examples/pytorch/test_pytorch_examples.py -v

from transformers.testing_utils import torch_device

Result: 16 failed, 3 passed, 2 skipped
=============================================== short test summary info ================================================
FAILED test_pytorch_examples.py::ExamplesTests::test_run_audio_classification - TypeError: 'NoneType' object is not callable
FAILED test_pytorch_examples.py::ExamplesTests::test_run_clm - FileNotFoundError: Unable to find '/root/transformers/examples/pytorch/./tests/fixtures/sample_text.txt'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_clm_config_overrides - FileNotFoundError: Unable to find '/root/transformers/examples/pytorch/./tests/fixtures/sample_text.txt'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_glue - FileNotFoundError: Unable to find '/root/transformers/examples/pytorch/./tests/fixtures/tests_samples/MRPC/train.csv'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_image_classification - UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
FAILED test_pytorch_examples.py::ExamplesTests::test_run_instance_segmentation - RuntimeError: Expected all tensors to be on the same device. Expected NPU tensor, please check whether the input te...
FAILED test_pytorch_examples.py::ExamplesTests::test_run_mlm - FileNotFoundError: Unable to find '/root/transformers/examples/pytorch/./tests/fixtures/sample_text.txt'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_ner - FileNotFoundError: Unable to find '/root/transformers/examples/pytorch/tests/fixtures/tests_samples/conll/sample.json'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_semantic_segmentation - AssertionError: 0.0 not greater than or equal to 0.1
FAILED test_pytorch_examples.py::ExamplesTests::test_run_speech_recognition_ctc - AssertionError: nan not less than 73.33543701171875
FAILED test_pytorch_examples.py::ExamplesTests::test_run_speech_recognition_ctc_adapter - AssertionError: nan not less than 75.0298583984375
FAILED test_pytorch_examples.py::ExamplesTests::test_run_speech_recognition_seq2seq - AssertionError: nan not less than 3.872235107421875
FAILED test_pytorch_examples.py::ExamplesTests::test_run_squad - FileNotFoundError: Unable to find '/root/transformers/examples/pytorch/tests/fixtures/tests_samples/SQUAD/sample.json'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_squad_seq2seq - FileNotFoundError: Unable to find '/root/transformers/examples/pytorch/tests/fixtures/tests_samples/SQUAD/sample.json'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_swag - FileNotFoundError: Unable to find '/root/transformers/examples/pytorch/tests/fixtures/tests_samples/swag/sample.json'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_vit_mae_pretraining - UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
========================== 16 failed, 3 passed, 2 skipped, 318 warnings in 466.30s (0:07:46) ===========================
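Grouping the summary above by exception type gives 8 FileNotFoundError, 4 AssertionError, 2 UnicodeDecodeError, 1 TypeError and 1 RuntimeError. A minimal sketch of that tally, assuming the usual `FAILED <test id> - <ExceptionType>: <message>` layout of the short test summary; `group_failures` is a hypothetical helper and the sample lines are abbreviated from the log:

```python
import re
from collections import Counter

# Matches lines such as:
#   FAILED test_file.py::Class::test_name - ExceptionType: message
FAILED_LINE = re.compile(r"^FAILED\s+(?P<test>\S+)\s+-\s+(?P<error>\w+):")


def group_failures(summary: str) -> Counter:
    """Count FAILED lines in a pytest short summary by exception type."""
    counts = Counter()
    for line in summary.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            counts[match.group("error")] += 1
    return counts


# Abbreviated sample; the messages are shortened for illustration.
sample = """\
FAILED test_pytorch_examples.py::ExamplesTests::test_run_clm - FileNotFoundError: Unable to find 'sample_text.txt'
FAILED test_pytorch_examples.py::ExamplesTests::test_run_semantic_segmentation - AssertionError: 0.0 not greater than or equal to 0.1
FAILED test_pytorch_examples.py::ExamplesTests::test_run_mlm - FileNotFoundError: Unable to find 'sample_text.txt'
"""

for error, count in group_failures(sample).most_common():
    print(f"{error}: {count}")
```

The FileNotFoundError paths all resolve under `examples/pytorch/./tests/fixtures`, which suggests (but does not prove) that this run was started from `examples/pytorch` rather than the repository root.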
Traceback (most recent call last):
  File "/root/transformers/examples/pytorch/instance-segmentation/run_instance_segmentation.py", line 480, in <module>
    main()
  File "/root/transformers/examples/pytorch/instance-segmentation/run_instance_segmentation.py", line 455, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/usr/local/python3.9/lib/python3.9/site-packages/transformers/trainer.py", line 2052, in train
    return inner_training_loop(
  File "/usr/local/python3.9/lib/python3.9/site-packages/transformers/trainer.py", line 2388, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/transformers/trainer.py", line 3485, in training_step
    loss = self.compute_loss(model, inputs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/transformers/trainer.py", line 3532, in compute_loss
    outputs = model(**inputs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/transformers/models/mask2former/modeling_mask2former.py", line 2517, in forward
    loss_dict = self.get_loss_dict(
  File "/usr/local/python3.9/lib/python3.9/site-packages/transformers/models/mask2former/modeling_mask2former.py", line 2338, in get_loss_dict
    loss_dict: Dict[str, Tensor] = self.criterion(
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/transformers/models/mask2former/modeling_mask2former.py", line 768, in forward
    indices = self.matcher(masks_queries_logits, class_queries_logits, mask_labels, class_labels)
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/python3.9/lib/python3.9/site-packages/transformers/models/mask2former/modeling_mask2former.py", line 475, in forward
    cost_matrix = torch.minimum(cost_matrix, torch.tensor(1e10))
RuntimeError: Expected all tensors to be on the same device. Expected NPU tensor, please check whether the input tensor device is correct.
[ERROR] 2024-10-10-12:59:59 (PID:1154599, Device:0, RankID:-1) ERR01002 OPS invalid type

issue: https://gitee.com/ascend/pytorch/issues/IAWAZ1?from=project-issue
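The last frame of the traceback calls `torch.minimum(cost_matrix, torch.tensor(1e10))`, where `torch.tensor(1e10)` is created on the default device while `cost_matrix` lives on the NPU. A minimal sketch of one possible local workaround, creating the bound on the same device as the cost matrix; `cap_cost_matrix` is a hypothetical helper, not the upstream or torch_npu fix, and the example runs on CPU:

```python
import torch


def cap_cost_matrix(cost_matrix: torch.Tensor, upper: float = 1e10) -> torch.Tensor:
    """Cap cost_matrix at `upper` without mixing devices."""
    # Assumption: building the bound on cost_matrix's own device and dtype
    # avoids the "Expected all tensors to be on the same device" error.
    bound = torch.tensor(upper, device=cost_matrix.device, dtype=cost_matrix.dtype)
    return torch.minimum(cost_matrix, bound)


# CPU-only demonstration; on an NPU machine the input would be on "npu".
cost = torch.tensor([[1.0, 5e10], [float("inf"), 2.0]])
capped = cap_cost_matrix(cost)
print(capped)
```

`cost_matrix.clamp(max=1e10)` should behave the same way and avoids creating the second tensor altogether.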


shink commented Oct 9, 2024

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

https://www.cnblogs.com/mrneojeep/p/16252044.html
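A quick way to check whether the dynamic linker can resolve libGL before running the tests, sketched with the standard library only. `has_shared_library` is a hypothetical helper; the package name in the message is an assumption about Debian/Ubuntu images and may differ on other distributions:

```python
import ctypes.util


def has_shared_library(name: str) -> bool:
    """Return True if the system linker can locate lib<name>."""
    return ctypes.util.find_library(name) is not None


if has_shared_library("GL"):
    print("libGL found")
else:
    # Assumption: on Debian/Ubuntu the library is commonly provided by the
    # `libgl1` package (e.g. `apt install libgl1`).
    print("libGL not found; install the system OpenGL runtime package")
```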


shink commented Oct 12, 2024

test_models.py::test_model_backward[2-swinv2_cr_large_224] Fatal Python error: Segmentation fault

Thread 0x0000fff99f87f120 (most recent call first):
<no Python frame>

Thread 0x0000fffe99d3f120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 312 in wait
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/queues.py", line 231 in _feed
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 917 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffe9c54f120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 312 in wait
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/queues.py", line 231 in _feed
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 917 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffe9ed5f120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 312 in wait
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/queues.py", line 231 in _feed
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 917 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffea156f120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 312 in wait
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/queues.py", line 231 in _feed
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 917 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffe9752f120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 312 in wait
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/queues.py", line 231 in _feed
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 917 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffe94d1f120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 312 in wait
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/queues.py", line 231 in _feed
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 917 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffe9250f120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 312 in wait
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/queues.py", line 231 in _feed
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 917 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffe8fcff120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 312 in wait
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/queues.py", line 231 in _feed
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 917 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffe579bf120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/connection.py", line 379 in _recv
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/connection.py", line 414 in _recv_bytes
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/connection.py", line 250 in recv
  File "/usr/local/python3.9/lib/python3.9/multiprocessing/managers.py", line 810 in _callmethod
  File "<string>", line 2 in get
  File "/usr/local/Ascend/ascend-toolkit/latest/python/site-packages/tbe/common/repository_manager/utils/multiprocess_util.py", line 91 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Thread 0x0000fffdf02bf120 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 316 in wait
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 581 in wait
  File "/usr/local/python3.9/lib/python3.9/site-packages/tqdm/_monitor.py", line 60 in run
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 980 in _bootstrap_inner
  File "/usr/local/python3.9/lib/python3.9/threading.py", line 937 in _bootstrap

Current thread 0x0000ffffb6ca8640 (most recent call first):
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/_tensor_str.py", line 146 in __init__
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/_tensor_str.py", line 357 in _tensor_str
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/_tensor_str.py", line 625 in _str_intern
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/_tensor_str.py", line 708 in _str
  File "/usr/local/python3.9/lib/python3.9/site-packages/torch/_tensor.py", line 464 in __repr__
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_io/saferepr.py", line 73 in repr_instance
  File "/usr/local/python3.9/lib/python3.9/reprlib.py", line 62 in repr1
  File "/usr/local/python3.9/lib/python3.9/reprlib.py", line 52 in repr
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_io/saferepr.py", line 61 in repr
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_io/saferepr.py", line 112 in saferepr
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_code/code.py", line 831 in repr_args
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_code/code.py", line 927 in repr_traceback_entry
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_code/code.py", line 982 in <listcomp>
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_code/code.py", line 981 in repr_traceback
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_code/code.py", line 1057 in repr_excinfo
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/_code/code.py", line 698 in getrepr
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/nodes.py", line 497 in _repr_failure_py
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/python.py", line 1877 in repr_failure
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/reports.py", line 364 in from_item_and_call
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/runner.py", line 372 in pytest_runtest_makereport
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/runner.py", line 228 in call_and_report
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/runner.py", line 133 in runtestprotocol
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/runner.py", line 114 in pytest_runtest_protocol
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/main.py", line 351 in pytest_runtestloop
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/main.py", line 326 in _main
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/main.py", line 272 in wrap_session
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/main.py", line 319 in pytest_cmdline_main
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/usr/local/python3.9/lib/python3.9/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/config/__init__.py", line 174 in main
  File "/usr/local/python3.9/lib/python3.9/site-packages/_pytest/config/__init__.py", line 197 in console_main
  File "/usr/local/python3.9/bin/pytest", line 8 in <module>
Segmentation fault (core dumped)
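The current-thread stack shows the crash inside `torch/_tensor_str.py` while pytest's `saferepr` is formatting the arguments of an already failing test (`repr_args`), so the segfault happens during reporting and hides the original failure. A possible way to get at that failure is to rerun the single test with `--tb=native`, which uses the standard library traceback format and does not repr function arguments. A minimal sketch, assuming it is run from the same directory as the log above; both helpers are hypothetical:

```python
import subprocess
import sys


def build_pytest_command(test_id: str, tb_style: str = "native") -> list:
    """Build a pytest invocation that selects one test and a traceback style."""
    return [sys.executable, "-m", "pytest", test_id, f"--tb={tb_style}"]


def rerun_with_native_traceback(test_id: str) -> int:
    """Run the test in a child process and return its exit code."""
    # A child process keeps a possible segfault from killing the caller.
    return subprocess.run(build_pytest_command(test_id)).returncode


# Test id copied from the log above.
TEST_ID = "test_models.py::test_model_backward[2-swinv2_cr_large_224]"
print(" ".join(build_pytest_command(TEST_ID)))
```

Whether this avoids the crash depends on where the tensor repr is actually triggered, so treat it as something to try rather than a confirmed fix.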

shink changed the title from "Reuse PyTorch's test suite" to "Roadmap" on Nov 30, 2024