PyTorch stack conversion op doesn't cast like PyTorch does. #1778

siegelaaron94 · 2023-02-20T21:44:43Z

🐞Describing the bug

Exporting a PyTorch model that contains a torch.stack operation (https://pytorch.org/docs/stable/generated/torch.stack.html) with float32 and int32 inputs fails during conversion. In native PyTorch, the two different types are promoted to float32.
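For reference, the promotion that native PyTorch applies can be checked directly in eager mode. This is a minimal sketch; the small tensor shapes are illustrative and not taken from the model in this issue.

```python
import torch

# Small illustrative tensors; only the dtypes matter here.
x = torch.zeros((2, 3), dtype=torch.float32)
y = torch.zeros((2, 3), dtype=torch.int32)

# Ask PyTorch which dtype a float32/int32 pair promotes to.
promoted = torch.promote_types(x.dtype, y.dtype)
print(promoted)

# Eager-mode stack accepts the mixed dtypes and returns the promoted dtype.
stacked = torch.stack((x, y))
print(stacked.dtype, stacked.shape)
```

The converter would need to reproduce this promotion for the stacked inputs to share one data type.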

Stack Trace

Traceback (most recent call last):
  File "/dev/opensource/detectron2/export_test.py", line 20, in <module>
    result = ct.convert(model_traced, inputs=[ct.TensorType(name="x", shape=x.shape)])
  File "/dev/opensource/coremltools/coremltools/converters/_converters_entry.py", line 444, in convert
    mlmodel = mil_convert(
  File "/dev/opensource/coremltools/coremltools/converters/mil/converter.py", line 187, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
  File "/dev/opensource/coremltools/coremltools/converters/mil/converter.py", line 211, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
  File "/dev/opensource/coremltools/coremltools/converters/mil/converter.py", line 281, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
  File "/dev/opensource/coremltools/coremltools/converters/mil/converter.py", line 109, in __call__
    return load(*args, **kwargs)
  File "/dev/opensource/coremltools/coremltools/converters/mil/frontend/torch/load.py", line 57, in load
    return _perform_torch_convert(converter, debug)
  File "/dev/opensource/coremltools/coremltools/converters/mil/frontend/torch/load.py", line 96, in _perform_torch_convert
    prog = converter.convert()
  File "/dev/opensource/coremltools/coremltools/converters/mil/frontend/torch/converter.py", line 281, in convert
    convert_nodes(self.context, self.graph)
  File "/dev/opensource/coremltools/coremltools/converters/mil/frontend/torch/ops.py", line 89, in convert_nodes
    add_op(context, node)
  File "/dev/opensource/coremltools/coremltools/converters/mil/frontend/torch/ops.py", line 1871, in stack
    res = mb.stack(values=values, axis=axis, name=node.name)
  File "/dev/opensource/coremltools/coremltools/converters/mil/mil/ops/registry.py", line 176, in add_op
    return cls._add_op(op_cls_to_add, **kwargs)
  File "/dev/opensource/coremltools/coremltools/converters/mil/mil/builder.py", line 182, in _add_op
    new_op.type_value_inference()
  File "/dev/opensource/coremltools/coremltools/converters/mil/mil/operation.py", line 253, in type_value_inference
    output_types = self.type_inference()
  File "/dev/opensource/coremltools/coremltools/converters/mil/mil/ops/defs/iOS15/tensor_operation.py", line 1262, in type_inference
    raise ValueError(msg)
ValueError: Tensors in 'values' of the stack op (14) should share the same data type. Got [<class 'coremltools.converters.mil.mil.types.type_double.make_float.<locals>.double'>, <class 'coremltools.converters.mil.mil.types.type_int.make_int.<locals>.int'>].

To Reproduce

import torch
import torch.nn as nn
import coremltools as ct

class Broken(nn.Module):
    def __init__(self) -> None:
        super().__init__()

    def forward(self, x):
        y = torch.zeros((1, 3, 512, 512), dtype=torch.int32)
        return torch.stack((x, y))

if __name__ == '__main__':
    model = Broken()
    x = torch.zeros((1, 3, 512, 512), dtype=torch.float32)
    res = model(x)
    print(res.dtype, res.shape)

    model_traced = torch.jit.trace(model, (x, ))
    result = ct.convert(model_traced, inputs=[ct.TensorType(name="x", shape=x.shape)])

System environment (please complete the following information):

coremltools version: main sha1: 51b0003
OS (e.g. MacOS version or Linux type): macOS 13.2.1
Any other relevant version information (e.g. PyTorch or TensorFlow version):

  torch==1.13.1
  torchaudio==0.13.1
  torchvision==0.14.1

Additional context

Ran into this while trying to convert detectron2 models with this pull request. The same issue happens if you replace stack with cat, by the way; I assume one is implemented in terms of the other.


siegelaaron94 · 2023-02-20T22:49:29Z

After digging deeper, it looks like cat and stack do not share a code path: cat calls promote_input_dtypes but stack doesn't. Changing

values = inputs[0]

to

values = promote_input_dtypes(inputs[0])

in stack fixes the bug.
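Until the converter is patched, one possible model-side workaround is to cast explicitly before stacking, so that the traced graph only ever passes a single dtype to stack. This is a sketch under assumptions: the class name StackWithExplicitCast is hypothetical, the shapes are reduced for brevity, and whether this actually avoids the converter error has not been verified against coremltools here.

```python
import torch
import torch.nn as nn


class StackWithExplicitCast(nn.Module):  # hypothetical name, not from the issue
    def forward(self, x):
        y = torch.zeros((1, 3, 8, 8), dtype=torch.int32)
        # Cast y to x's dtype so stack receives inputs of a single dtype.
        return torch.stack((x, y.to(x.dtype)))


model = StackWithExplicitCast()
x = torch.zeros((1, 3, 8, 8), dtype=torch.float32)

# Trace the model the same way as in the reproduction script.
traced = torch.jit.trace(model, (x,))
res = traced(x)
print(res.dtype, res.shape)
```

The result has the same dtype that eager PyTorch would produce through implicit promotion, but the cast is now an explicit operation in the graph rather than something the stack conversion has to infer.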

healthmatrice · 2023-02-28T23:42:18Z

@siegelaaron94 I have the same problem when converting detectron2. Can you share your fix in a PR?

healthmatrice · 2023-02-28T23:55:06Z

I modified coremltools/coremltools/converters/mil/frontend/torch/ops.py line 1861 accordingly, but it does not seem to work. I still get the same error:

Tensors in 'values' of the stack op (stack_0) should share the same data type.

siegelaaron94 added the bug Unexpected behaviour that should be corrected (type) label Feb 20, 2023

aseemw added the triaged Reviewed and examined, release as been assigned if applicable (status) label Feb 23, 2023
