
How to use whisper as frontend? #5759

Open
mukherjeesougata opened this issue Apr 23, 2024 · 0 comments
Labels
Question

Comments

mukherjeesougata commented Apr 23, 2024

I am trying to use Whisper as a frontend for an ASR task, with the following config file:

batch_type: numel
#batch_bins: 12000000
batch_bins: 4000000
accum_grad: 4
#max_epoch: 10
max_epoch: 2000
patience: none
# The initialization method for model parameters
init: xavier_uniform
best_model_criterion:
-   - valid
    - acc
    - max
#keep_nbest_models: 3
keep_nbest_models: 8

encoder: conformer
encoder_conf:
    output_size: 512
    attention_heads: 8
    linear_units: 2048
    num_blocks: 12
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.1
    input_layer: conv2d2
    normalize_before: true
    macaron_style: true
    pos_enc_layer_type: "rel_pos"
    selfattention_layer_type: "rel_selfattn"
    activation_type: "swish"
    use_cnn_module:  true
    cnn_module_kernel: 31
    interctc_layer_idx: [3, 6, 9,]
    interctc_use_conditioning: true

decoder: transformer
decoder_conf:
    attention_heads: 8
    linear_units: 2048
    num_blocks: 6
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    self_attention_dropout_rate: 0.1
    src_attention_dropout_rate: 0.1

model_conf:
    ctc_weight: 0.3
    lsm_weight: 0.1
    interctc_weight: 0.5
    length_normalized_loss: false
    extract_feats_in_collect_stats: false

optim: adam
optim_conf:
    lr: 0.002
    #lr: 0.0002
scheduler: warmuplr
scheduler_conf:
    warmup_steps: 25000

specaug: specaug
specaug_conf:
    apply_time_warp: true
    time_warp_window: 5
    time_warp_mode: bicubic
    apply_freq_mask: true
    freq_mask_width_range:
    - 0
    - 30
    num_freq_mask: 2
    apply_time_mask: true
    time_mask_width_range:
    - 0
    - 40
    num_time_mask: 2

freeze_param: [
"frontend.upstream"
]

frontend: whisper
frontend_conf: 
    whisper_model: large-v2
    freeze_weights: True
    download_dir: ./hub

preencoder: linear
preencoder_conf:
    input_size: 1280  # Note: If the upstream is changed, please change this value accordingly.
    output_size: 80

But training fails with the following error:

  File "/DATA/anaconda3/envs/espnet/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/DATA/anaconda3/envs/espnet/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/DATA/Sougata/espnet/espnet2/bin/asr_train.py", line 23, in <module>
    main()
  File "/DATA/Sougata/espnet/espnet2/bin/asr_train.py", line 19, in main
    ASRTask.main(cmd=cmd)
  File "/DATA/Sougata/espnet/espnet2/tasks/abs_task.py", line 1154, in main
    cls.main_worker(args)
  File "/DATA/Sougata/espnet/espnet2/tasks/abs_task.py", line 1264, in main_worker
    model = cls.build_model(args=args)
  File "/DATA/Sougata/espnet/espnet2/tasks/asr.py", line 529, in build_model
    frontend = frontend_class(**args.frontend_conf)
TypeError: __init__() got an unexpected keyword argument 'fs'
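For reference, the failure mode can be reproduced in isolation: `build_model` calls `frontend_class(**args.frontend_conf)`, and Python raises `TypeError` as soon as the dict contains a key that the class's `__init__` does not accept. The class below is a hypothetical stand-in, not ESPnet's actual `WhisperFrontend`:

```python
# Hypothetical stand-in for a frontend class whose __init__ takes a fixed
# set of keyword arguments (NOT ESPnet's actual WhisperFrontend).
class FakeWhisperFrontend:
    def __init__(self, whisper_model="large-v2", freeze_weights=True, download_dir="./hub"):
        self.whisper_model = whisper_model
        self.freeze_weights = freeze_weights
        self.download_dir = download_dir

# A conf dict with a stray key, mimicking frontend_conf plus an injected 'fs'.
frontend_conf = {"whisper_model": "large-v2", "freeze_weights": True, "fs": 16000}

try:
    # Same call shape as frontend_class(**args.frontend_conf) in asr.py
    FakeWhisperFrontend(**frontend_conf)
except TypeError as e:
    print(e)  # ... got an unexpected keyword argument 'fs'
```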

Could anybody please help me resolve this error?
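As a diagnostic, plain Python introspection can show which keys in a conf dict a given class's `__init__` would reject. This is only a generic sketch (the helper below is hypothetical, not an ESPnet API):

```python
import inspect

def rejected_kwargs(cls, conf):
    """Return the keys in conf that cls.__init__ would reject.

    Hypothetical diagnostic helper, not part of ESPnet. Classes whose
    __init__ takes **kwargs accept everything, so nothing is rejected.
    """
    params = inspect.signature(cls.__init__).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = {name for name in params if name != "self"}
    return [k for k in conf if k not in accepted]

# Example with a stand-in class that mirrors the frontend_conf keys above.
class Example:
    def __init__(self, whisper_model, freeze_weights=True, download_dir="./hub"):
        pass

print(rejected_kwargs(Example, {"whisper_model": "large-v2", "fs": 16000}))  # ['fs']
```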

@mukherjeesougata mukherjeesougata added the Question label Apr 23, 2024
@mukherjeesougata mukherjeesougata changed the title How to use whisper How to use whisper as frontend? Apr 24, 2024