Description
Approaches like LoRA aim to reduce memory usage during the training phase, but in dllib this does not work as expected. Freezing 99% of the model consumes the same amount of memory as training the model without freezing any node. It seems that the only thing freeze does is skip the weight updates; moreover, using more threads increases memory usage, which it shouldn't, given that the weights are frozen. Any suggestion for changing this behaviour? A sketch of the behaviour I would expect is below.
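This is not dllib code, just an illustration of the expected behaviour using PyTorch: when parameters are frozen, no gradient buffers and no optimizer state are allocated for them, so training memory should drop roughly in proportion to the frozen fraction instead of staying the same.

```python
# Illustrative sketch (PyTorch, not dllib): frozen parameters get no .grad
# buffers and no optimizer state, so only the small trainable fraction adds
# per-step memory.
import torch
import torch.nn as nn

model = nn.Sequential(*[nn.Linear(1024, 1024) for _ in range(10)])

# Freeze most of the model: only the last layer stays trainable.
for layer in model[:-1]:
    for p in layer.parameters():
        p.requires_grad = False  # no gradient buffer will be allocated

# The optimizer only tracks trainable params, so no Adam moments are
# kept for the frozen ones.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

x = torch.randn(32, 1024)
loss = model(x).sum()
loss.backward()   # gradients exist only for the unfrozen layer
optimizer.step()

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable}/{total}")
```

In dllib I would expect freeze to give a comparable saving, and I would not expect the frozen weights to cost additional memory per thread.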