RuntimeError: output 1: meta disagrees with real impl #1437

Open
daisyden opened this issue Mar 6, 2025 · 1 comment

daisyden commented Mar 6, 2025

🐛 Describe the bug

The following test cases failed on PVC CI:
test_meta_xpu.py::TestMetaXPU::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_bfloat16
test_meta_xpu.py::TestMetaXPU::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float16
test_meta_xpu.py::TestMetaXPU::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float32
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool3d_grad_xpu_float32
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool3d_xpu_float32
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_scaled_dot_product_attention_xpu_float32
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_bfloat16
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float16
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float32

See https://github.com/intel/torch-xpu-ops/actions/runs/13645060798/job/38146067831
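
As a minimal sketch (not part of the original report), the failure can be approximated outside the test harness by calling the fused op directly; the op name, argument order, and tensor shapes below are copied from the traceback in the Details section, and running it assumes an XPU-enabled PyTorch build:

import torch

def fused_sdpa_logsumexp_shape(device):
    # Shapes from the failing sample input (index 8) in the log below.
    q = torch.randn(4, 4, 3, 8, device=device)  # query: (batch, heads, seq_q, head_dim)
    k = torch.randn(4, 4, 6, 8, device=device)  # key:   (batch, heads, seq_kv, head_dim)
    v = torch.randn(4, 4, 6, 8, device=device)  # value: same shape as key
    # Remaining args per the trace: attn_bias=None, dropout_p=0.0, is_causal=True.
    outs = torch.ops.aten._scaled_dot_product_fused_attention_overrideable(
        q, k, v, None, 0.0, True)
    return outs[1].shape  # output 1 is the logsumexp tensor

print(fused_sdpa_logsumexp_shape("meta"))  # meta kernel: torch.Size([4, 4, 3])
print(fused_sdpa_logsumexp_shape("xpu"))   # real impl per the log: torch.Size([])

The test fails because these two shapes must agree.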

Details:

2025-03-04T08:39:15.3170529Z _ TestMetaXPU.test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float32 _
2025-03-04T08:39:15.3292007Z Traceback (most recent call last):
2025-03-04T08:39:15.3292677Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1159, in test_wrapper
2025-03-04T08:39:15.3293222Z     return test(*args, **kwargs)
2025-03-04T08:39:15.3293719Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 1534, in wrapper
2025-03-04T08:39:15.3294324Z     fn(*args, **kwargs)
2025-03-04T08:39:15.3294837Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2288, in wrapper
2025-03-04T08:39:15.3295259Z     fn(*args, **kwargs)
2025-03-04T08:39:15.3295776Z   File "/home/sdp/actions-runner-1/_work/torch-xpu-ops/pytorch/third_party/torch-xpu-ops/test/xpu/../../../../test/test_meta.py", line 1279, in test_dispatch_symbolic_meta_outplace
2025-03-04T08:39:15.3296388Z     self._run_dispatch_meta_test(device, dtype, op, symbolic_meta=True, inplace=False)
2025-03-04T08:39:15.3297020Z   File "/home/sdp/actions-runner-1/_work/torch-xpu-ops/pytorch/third_party/torch-xpu-ops/test/xpu/../../../../test/test_meta.py", line 1257, in _run_dispatch_meta_test
2025-03-04T08:39:15.3297525Z     expected = func(*args, **kwargs)
2025-03-04T08:39:15.3298057Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_internal/common_methods_invocations.py", line 16237, in <lambda>
2025-03-04T08:39:15.3298641Z     wrapper_set_seed(torch.nn.functional.scaled_dot_product_attention, *args, **kwargs),
2025-03-04T08:39:15.3299147Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_utils.py", line 18, in wrapper_set_seed
2025-03-04T08:39:15.3299589Z     output = op(*args, **kwargs)
2025-03-04T08:39:15.3300050Z   File "/home/sdp/actions-runner-1/_work/torch-xpu-ops/pytorch/third_party/torch-xpu-ops/test/xpu/../../../../test/test_meta.py", line 1084, in __torch_dispatch__
2025-03-04T08:39:15.3300540Z     expected = run_meta_crossref(
2025-03-04T08:39:15.3301035Z   File "/home/sdp/actions-runner-1/_work/torch-xpu-ops/pytorch/third_party/torch-xpu-ops/test/xpu/../../../../test/test_meta.py", line 632, in run_meta_crossref
2025-03-04T08:39:15.3301587Z     assert_ref_meta_equal(test_case, func, meta_rs, rs, lambda msg: f"""\
2025-03-04T08:39:15.3302151Z   File "/home/sdp/actions-runner-1/_work/torch-xpu-ops/pytorch/third_party/torch-xpu-ops/test/xpu/../../../../test/test_meta.py", line 433, in assert_ref_meta_equal
2025-03-04T08:39:15.3302742Z     test_assert(meta_r.shape == r.shape, f"for element {i}, was {meta_r.shape} but real shape was {r.shape}")
2025-03-04T08:39:15.3303344Z   File "/home/sdp/actions-runner-1/_work/torch-xpu-ops/pytorch/third_party/torch-xpu-ops/test/xpu/../../../../test/test_meta.py", line 428, in test_assert
2025-03-04T08:39:15.3303872Z     raise RuntimeError(f"output {i}: {msg_callable(msg)}")
2025-03-04T08:39:15.3304196Z RuntimeError: output 1: meta disagrees with real impl:
2025-03-04T08:39:15.3304538Z aten._scaled_dot_product_fused_attention_overrideable.default(
2025-03-04T08:39:15.3304890Z   tensor(..., device='meta', size=(4, 4, 3, 8)) stride=(96, 24, 8, 1),
2025-03-04T08:39:15.3305206Z   tensor(..., device='meta', size=(4, 4, 6, 8)) stride=(192, 48, 8, 1),
2025-03-04T08:39:15.3305545Z   tensor(..., device='meta', size=(4, 4, 6, 8)) stride=(192, 48, 8, 1),
2025-03-04T08:39:15.3305836Z   None,
2025-03-04T08:39:15.3306038Z   0.0,
2025-03-04T08:39:15.3306269Z   True,
2025-03-04T08:39:15.3306444Z   
2025-03-04T08:39:15.3306640Z ) = (
2025-03-04T08:39:15.3307184Z   (tensor(..., device='meta', size=(4, 4, 3, 8)) stride=(96, 24, 8, 1), tensor(..., device='meta', size=(4, 4, 3)) stride=(12, 3, 1), None, None, 3, 6, tensor(..., device='meta', size=(), dtype=torch.int64) stride=(), tensor(..., device='meta', size=(), dtype=torch.int64) stride=(), None)
2025-03-04T08:39:15.3307746Z )
2025-03-04T08:39:15.3308035Z for element 1, was torch.Size([4, 4, 3]) but real shape was torch.Size([])
2025-03-04T08:39:15.3308236Z 
2025-03-04T08:39:15.3308240Z 
2025-03-04T08:39:15.3308421Z The above exception was the direct cause of the following exception:
2025-03-04T08:39:15.3308624Z 
2025-03-04T08:39:15.3308752Z Traceback (most recent call last):
2025-03-04T08:39:15.3309189Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3150, in wrapper
2025-03-04T08:39:15.3309648Z     method(*args, **kwargs)
2025-03-04T08:39:15.3310152Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 454, in instantiated_test
2025-03-04T08:39:15.3310647Z     result = test(self, **param_kwargs)
2025-03-04T08:39:15.3311106Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 1612, in wrapper
2025-03-04T08:39:15.3311554Z     fn(*args, **kwargs)
2025-03-04T08:39:15.3311974Z   File "/home/sdp/miniforge3/envs/xpu_op_1/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1171, in test_wrapper
2025-03-04T08:39:15.3312484Z     raise e_tracked from e
2025-03-04T08:39:15.3313267Z Exception: Caused by sample input at index 8: SampleInput(input=Tensor[size=(4, 4, 3, 8), device="xpu:0", dtype=torch.float32], args=TensorList[Tensor[size=(4, 4, 6, 8), device="xpu:0", dtype=torch.float32], Tensor[size=(4, 4, 6, 8), device="xpu:0", dtype=torch.float32]], kwargs={'is_causal': 'True', 'dropout_p': '0.0'}, broadcasts_input=False, name='')
2025-03-04T08:39:15.3313932Z 
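For context (an illustration added here, not part of the report): the meta shape in the error matches the usual definition of attention logsumexp, which reduces over the key axis only and therefore keeps one value per (batch, head, query position); the real XPU kernel instead returned a 0-d tensor:

import torch

q = torch.randn(4, 4, 3, 8)  # (batch, heads, seq_q, head_dim)
k = torch.randn(4, 4, 6, 8)  # (batch, heads, seq_kv, head_dim)

# Reducing the attention scores over the key axis keeps shape
# (batch, heads, seq_q) = (4, 4, 3), the shape the meta kernel reports.
# Causal masking changes the values, not the shape.
scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
print(torch.logsumexp(scores, dim=-1).shape)  # torch.Size([4, 4, 3])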

Versions

Stock PyTorch: pytorch/pytorch@c21dc11
torch-xpu-ops: b4701a1

@DDEle
Copy link
Contributor

DDEle commented Mar 6, 2025

This should be fixed soon by pytorch/pytorch#148652.

Stonepia added the module: OP Impl and bug labels on Mar 7, 2025