[Feature] Distributed model decompression #671

## Background ##
#624 added support for distributed weight compression, which parallelizes weight compression for distributed workflows. However, weight decompression has not been parallelized, so decompressing a model can take a long time, both during transformers inference and in any other use case where a user wants to decompress a model.

## Requested Changes ##
Implement distributed decompression (`ModelCompressor.decompress_model`). This requires every subclass of `BaseCompressor` to support `BaseCompressor.decompress_module(module)` on modules whose parameters are on the meta device.

Please add tests to verify that `BaseCompressor.decompress_module` and `BaseCompressor.compress_module` work for meta modules for all subclasses.
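
To illustrate the kind of work splitting that distributed decompression implies, here is a minimal sketch in which each rank takes a disjoint, size-balanced subset of modules. It is only an assumption about one possible approach: `partition_modules` and the size map are hypothetical names and not part of the library's API, and this issue does not describe how #624 actually divides work.

```python
from typing import Dict, List


def partition_modules(module_sizes: Dict[str, int], world_size: int) -> List[List[str]]:
    """Hypothetical helper: greedily assign modules to ranks by size.

    Returns one list of module names per rank. Every rank can call this with
    the same inputs and obtain the same partition, so no communication is
    needed to agree on who decompresses what.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")

    assignments: List[List[str]] = [[] for _ in range(world_size)]
    loads = [0] * world_size

    # Largest modules first; ties are broken by name so the order is
    # deterministic on every rank.
    for name in sorted(module_sizes, key=lambda n: (-module_sizes[n], n)):
        rank = loads.index(min(loads))  # least-loaded rank so far
        assignments[rank].append(name)
        loads[rank] += module_sizes[name]

    return assignments


# Illustrative module names and parameter counts (made up for this sketch).
sizes = {
    "layers.0.q_proj": 4096,
    "layers.0.k_proj": 1024,
    "layers.0.v_proj": 1024,
    "layers.0.o_proj": 4096,
}
parts = partition_modules(sizes, world_size=2)
```

In a real implementation, each rank would presumably call `decompress_module` only on the modules assigned to it; modules owned by other ranks are where meta-device parameters come into play.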
