This issue was moved to a discussion.

Question: How to finetune on custom loss function? #5

Closed
gopal86 opened this issue Mar 26, 2024 · 3 comments
Labels
question Further information is requested

Comments

@gopal86
gopal86 commented Mar 26, 2024

Hi, I want to fine-tune the model on my own dataset, but with my own custom loss. Could you give an example of how to do that? It would be very helpful for my research, as I am unclear on how to do this with your model.

Thanks
Gopal

@gorold
Contributor

gorold commented Mar 27, 2024

First, check out the fine-tuning section in the README to get familiar with how to run a fine-tuning job. Then, to change the loss function, override it in the config using model.loss_func._target_=... as shown below:

python -m cli.finetune \
  run_name=example_run \
  model=moirai_1.0_R_small \
  data=etth1 \
  val_data=etth1 \
  model.loss_func._target_=uni2ts.loss.packed.PackedMSELoss

We have already implemented some common loss functions used in time series forecasting, but if you want your own custom loss, you'll have to implement it by subclassing PackedLoss.
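
A minimal sketch of the subclassing pattern, written in plain Python so it runs without uni2ts installed. `PackedLossSketch` is a hypothetical stand-in for `PackedLoss`: the method name `_loss_func`, its arguments, and the masked-mean reduction are assumptions made for illustration, so check the real base class in `uni2ts.loss.packed` before copying the signature. Lists of floats stand in for tensors, and the Huber-style loss is only an example of a custom loss.

```python
from abc import ABC, abstractmethod


class PackedLossSketch(ABC):
    """Hypothetical stand-in for uni2ts' PackedLoss (signature assumed)."""

    def __call__(self, pred, target, prediction_mask, observed_mask):
        # One loss value per time step, computed by the subclass.
        losses = self._loss_func(pred, target)
        # Score only steps that are in the prediction window and observed.
        keep = [p and o for p, o in zip(prediction_mask, observed_mask)]
        kept = [loss for loss, k in zip(losses, keep) if k]
        # Mean over the scored steps; 0.0 if nothing is scored.
        return sum(kept) / len(kept) if kept else 0.0

    @abstractmethod
    def _loss_func(self, pred, target):
        """Return one loss value per element."""


class HuberLossSketch(PackedLossSketch):
    """Example custom loss: quadratic near zero, linear for large errors."""

    def __init__(self, delta=1.0):
        self.delta = delta

    def _loss_func(self, pred, target):
        out = []
        for p, t in zip(pred, target):
            err = abs(p - t)
            if err <= self.delta:
                out.append(0.5 * err**2)
            else:
                out.append(self.delta * (err - 0.5 * self.delta))
        return out


loss = HuberLossSketch(delta=1.0)
value = loss(
    pred=[1.0, 2.0, 10.0, 4.0],
    target=[1.5, 2.0, 7.0, 0.0],
    prediction_mask=[True, True, True, False],  # last step is not scored
    observed_mask=[True, True, True, True],
)
print(value)
```

Once a real subclass is importable under some dotted path, for example `my_pkg.losses.MyLoss` (a hypothetical name), the same command-line override would point at it: `model.loss_func._target_=my_pkg.losses.MyLoss`.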

gorold closed this as completed Apr 11, 2024
gorold reopened this Apr 12, 2024
gorold changed the title from "How to finetune on custom loss function?" to "Question: How to finetune on custom loss function?" Apr 12, 2024
gorold added the question (Further information is requested) label Apr 12, 2024
@jmoffatt32

I'm trying to run a fine-tuning job with the PackedMSELoss function. However, when I run the fine-tuning script, I get an error:

  File "/home/ubuntu/verb-workspace/linkt-uni2ts/.venv/lib/python3.11/site-packages/torch/nn/functional.py", line 3328, in mse_loss
    if not (target.size() == input.size()):
                             ^^^^^^^^^^
AttributeError: 'AffineTransformed' object has no attribute 'size'

From the debugging I have done, the issue seems to be that the model is passing a distribution (defined by the distr_output kwarg) to its forward method, while PackedMSELoss expects a tensor rather than a distribution. Is this understanding correct? How can I fix this so that I can use point-based loss functions instead of distribution-based ones?

I have attached my model config below for additional reference:

_target_: uni2ts.model.moirai.MoiraiFinetune
_args_:
  module_kwargs:
    _target_: builtins.dict
    distr_output:
      _target_: uni2ts.distribution.MixtureOutput
      components:
        - _target_: uni2ts.distribution.StudentTOutput
        - _target_: uni2ts.distribution.NormalFixedScaleOutput
        - _target_: uni2ts.distribution.NegativeBinomialOutput
        - _target_: uni2ts.distribution.LogNormalOutput
    d_model: 384
    num_layers: 6
    patch_sizes: ${as_tuple:[8, 16, 32, 64, 128]}
    max_seq_len: 512
    attn_dropout_p: 0.0
    dropout_p: 0.0
    scaling: true
  min_patches: 2
  min_mask_ratio: 0.15
  max_mask_ratio: 0.5
  max_dim: 128
  loss_func:
    _target_: uni2ts.loss.packed.PackedMSELoss
  lr: 1e-3
  weight_decay: 1e-1
  beta1: 0.9
  beta2: 0.98
  num_training_steps: ${mul:${trainer.max_epochs},${train_dataloader.num_batches_per_epoch}}
  num_warmup_steps: 0
  checkpoint_path:
    _target_: huggingface_hub.hf_hub_download
    repo_id: Salesforce/moirai-1.0-R-small
    filename: model.ckpt

@gorold
Contributor

gorold commented Apr 25, 2024

You could call distr.mean and feed the result to the loss function, though I'm not sure how well that will work out. Some alternatives:

  1. Fine-tune with the NLL; you can still get a point prediction from a distribution.
  2. Replace the output head with a new one for point forecasts.
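
The distr.mean suggestion and the NLL alternative can be sketched in plain Python. Everything here is illustrative: `NormalSketch` is a hypothetical stand-in for the distribution object the model returns, `mse` stands in for a point loss such as PackedMSELoss, and `PointLossOnMean` is an invented adapter name. Where the real training step hands the distribution to the loss is not shown in this thread, so the wiring into uni2ts is left as an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class NormalSketch:
    """Hypothetical stand-in for the model's predictive distribution."""

    loc: list
    scale: list

    @property
    def mean(self):
        # For a normal distribution the mean is the location parameter.
        return self.loc

    def log_prob(self, value):
        # Elementwise log density of a normal distribution.
        return [
            -0.5 * ((v - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
            for v, m, s in zip(value, self.loc, self.scale)
        ]


def mse(pred, target):
    """Point loss: expects plain values, not a distribution object."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


class PointLossOnMean:
    """Adapter: collapse the distribution to its mean, then apply a point loss."""

    def __init__(self, point_loss):
        self.point_loss = point_loss

    def __call__(self, distr, target):
        return self.point_loss(distr.mean, target)


def nll(distr, target):
    """Negative log-likelihood, averaged over elements."""
    return -sum(distr.log_prob(target)) / len(target)


distr = NormalSketch(loc=[1.0, 2.0, 3.0], scale=[1.0, 1.0, 1.0])
target = [1.0, 2.0, 5.0]

mse_on_mean = PointLossOnMean(mse)(distr, target)  # point loss on the mean
nll_value = nll(distr, target)  # distribution-based loss
point_forecast = distr.mean  # point prediction from a distribution
print(mse_on_mean, nll_value, point_forecast)
```

The adapter avoids the AttributeError pattern reported above because the point loss only ever sees values, never the distribution object. One caveat with this design: a point loss on the mean gives no training signal to the scale parameters, which is one reason the NLL route may be the safer starting point.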

SalesforceAIResearch locked and limited conversation to collaborators May 29, 2024
gorold converted this issue into discussion #57 May 29, 2024
