Training with custom CTC topology (with no blanks) #1222

Open
desh2608 opened this issue Jul 12, 2023 · 1 comment

@desh2608
Contributor

I am trying to train an icefall model with phones as output units, using a custom topology that resembles the "modified" CTC topology in k2 but without the blank symbol; let's call this no-blank CTC. The idea is that, instead of the "peaky" behavior that CTC shows, removing the blank would force the phones to be better aligned with the acoustic frames.

I created the no-blank CTC topology, converted my texts into phone IDs, and then obtained the graph as follows:

transcript_fsa = k2.linear_fsa(token_ids, self.device)
transcript_fsa_with_self_loops = k2.arc_sort(
    k2.add_epsilon_self_loops(transcript_fsa)
)

res = k2.compose(
    self.ctc_topo,
    transcript_fsa_with_self_loops,
    treat_epsilons_specially=False,
)
res = k2.arc_sort(res)

Since I don't have a blank symbol, I created the nnet with only as many outputs as I have phone tokens. However, when I train this with k2.ctc_loss(), I get an infinite loss. This is not a training issue, because it happens right at the start, when I compute the initial validation loss. That suggests the problem most likely lies with the arc scores in the composition. Looking around in k2, I found the following:

# The first column of b_fsas.scores is -inf,

Why is the first column of the dense FSA always negative infinity? Also, if I want to train with such a topology, are there other changes which may be needed?
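
One sanity check that may be worth running for the setup described above is whether every phone ID in the transcripts has a matching network output column. The sketch below is plain Python and does not call k2; the helper name `labels_without_output_column`, the example IDs, and the assumption that label `c` is scored from output column `c` are illustrative, not taken from the k2 or icefall code. If that assumption holds, a label with no column cannot be scored, which would be one possible source of an infinite loss.

```python
def labels_without_output_column(token_ids, num_outputs):
    """Return the sorted label IDs that fall outside [0, num_outputs).

    Assumption (not confirmed by the thread): label c is scored from
    network output column c, so a label outside this range has no score.
    """
    return sorted(
        {t for seq in token_ids for t in seq if not 0 <= t < num_outputs}
    )


# Hypothetical example: four phones whose IDs start at 1, with a network
# that has exactly four outputs (columns 0..3).
token_ids = [[1, 2, 3], [3, 4]]
missing = labels_without_output_column(token_ids, num_outputs=4)
print(missing)  # phone ID 4 has no column in this example

# With one more output column, every ID is covered.
covered = labels_without_output_column(token_ids, num_outputs=5)
print(covered)
```

Whether phone IDs in a given lexicon start at 0 or 1 depends on how the token table was built, so this check only flags a mismatch; it does not say which side to change.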

@pkufool
Collaborator

pkufool commented Jul 12, 2023

> Why is the first column of the dense FSA always negative infinity?

The first column is reserved for the final arc (label = -1) in a k2 FSA; see the comment in DenseFsaVec:

class DenseFsaVec(object):

    # Note: you can access members self.scores and self.dense_fsa_vec.
    # self.scores is a torch.Tensor containing the scores; it will
    # contain rows of the `log_probs` arg given to __init__ interspersed
    # with rows representing final-arcs. The structure is something like:
    #
    #  [ [ -inf x x x x x x ]
    #    [ -inf x x x x x x ]
    #    [ -inf x x x x x x ]
    #    [ 0 -inf -inf -inf.. ]
    #    [ -inf x x x x x x ]
    #    ...
    #  ]
    # where the x's come from the `log_probs` arg, and the 0's and
    # -inf's are added by this class (those special rows with no x's
    # correspond to the final-arcs in the FSAs; the 0 corresponds to
    # symbol -1.)
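
The layout in that comment can be mimicked for a single sequence in plain Python. This is only an illustrative sketch of the structure the comment describes, not k2's implementation; the names `dense_scores_layout` and `column_for_symbol` and the example numbers are made up for this example.

```python
import math

NEG_INF = -math.inf


def dense_scores_layout(log_probs):
    """Build the score rows for one sequence as the comment describes.

    log_probs: list of T frames, each a list of C log-probabilities.
    Returns T + 1 rows with C + 1 columns each.
    """
    num_classes = len(log_probs[0])
    # Regular frames: column 0 (symbol -1, the final arc) is -inf,
    # followed by the network's log-probs for symbols 0..C-1.
    rows = [[NEG_INF] + list(frame) for frame in log_probs]
    # Extra final-arc row: only symbol -1 is allowed, with score 0.
    rows.append([0.0] + [NEG_INF] * num_classes)
    return rows


def column_for_symbol(symbol):
    """Column index for a symbol in this layout: symbol -1 is column 0."""
    return symbol + 1


# Hypothetical log-probs: 2 frames, 3 output classes.
log_probs = [[-0.1, -2.3, -3.0], [-1.2, -0.4, -2.5]]
scores = dense_scores_layout(log_probs)
for row in scores:
    print(row)
```

Read this way, the -inf in the first column means that the final arc cannot be taken on a regular frame. It says nothing about the scores of the phone symbols themselves, which start at column 1.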
