problem with best_match_for_query #25

Open
lilila opened this issue Jul 2, 2012 · 7 comments

Comments

@lilila

lilila commented Jul 2, 2012

Dear all,
Can someone explain what I am doing wrong?

I am trying to build my own database of fingerprints, on which I want to evaluate the audio identification performance under some specific degradations of the signal.

I have an audio file named audiofile.
I compute the fingerprint of this file using the entire song:

           res = song.util.codegen(audiofile, start = -1, duration = -1)

Using the function parse_json_block from dedup (by lamere), I extract trid, raw_code and ingest_data:

            trid, raw_code, ingest_data = dedup.parse_json_block(res) 

then I ingest the new data:

            fp.ingest([ingest_data], do_commit = True, local = True)

and I add the new song to my dictionary named done:

            done[audiofile] = trid

Now I want to run a test and, as a first step, identify the same song. What I really want is to retrieve the song given a short excerpt of it, but I have not managed to do that.

When I call:

            querykey = song.util.codegen(audioexpert, start = -1, duration = -1)
            trid, raw_code, ingest_data = dedup.parse_json_block(querykey)
            response = fp.best_match_for_query(raw_code, local = True)

I have

response.match() = False :(

What is wrong with what I am doing? I am not even trying to recognize a modified version of the song.

One last question: if I run codegen on the entire song, is the system supposed to identify the song when I query with a short excerpt of it?

Thank you for your help,
Lila

@alnesbit
Contributor

alnesbit commented Jul 2, 2012

Hi Lila,

How long is the query audio? Can you please send me the files audiofile and audioexpert? I will then try to reproduce your issue and see what the problem is.

Thanks,

Andrew

@lilila
Author

lilila commented Jul 10, 2012

Dear Andrew,
Sorry for the delay. I guess my error was similar to the error mentioned in issue #22.

I still have a few questions for you:
When I compute the fingerprint of a file with duration > 15 sec, the file is segmented into multiple parts. According to the description of split_codes(fp) (from fp.py), the codes are supposed to overlap every 30 sec. From my experiments, the hop size is 15 sec rather than 30. Is this right?

Also, still in fp.py, can you explain the purpose of the variable "slop" in the function "actual_matches"?

I have also performed some evaluations on the matching of two fingerprints computed on the same audio file with different "start" values. First, I have noticed that the matching is very poor when the fingerprints are computed with different "start" values. Is there a way to improve this? I understand that the values can hardly coincide at the beginning and the end of the file, but I find it strange that the fingerprints do not match better in the middle of the file. The only way I have found to increase the recognition rate is to store in the database many fingerprints of the same audio file starting at different instants :(

Also, I have tried shifting the start value by a very small amount (2 ms) to see whether it was possible to get a better match. I thought the poor matching could come from the fact that the moving windows used during fingerprinting of the reference and the query audio files were not aligned. I then found that there is no difference for " i < start < i+1 ", where i is an integer value. In other words, float values for "start" are not recognized. Is this normal? Am I doing something silly again?

Thanks for your help,
Best regards,
Lila

@alnesbit
Contributor

Hi Lila,

> When I compute the fingerprint of a file with duration > 15 sec, the file is segmented into multiple parts. According to the description of split_codes(fp) (from fp.py), the codes are supposed to overlap every 30 sec. From my experiments, the hop size is 15 sec rather than 30. Is this right?

I fixed a bug in split_codes() a few days ago which addresses a related issue, so hopefully this should be fixed now. Can you please try again? The segments should be 60 seconds in length, with an overlap of 30 seconds.
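As a rough illustration of the segmenting described here (60-second segments starting every 30 seconds), a minimal sketch follows. It is not the actual split_codes() from fp.py: the function name split_by_time and its parameters are hypothetical, and the 23.2 ms time unit is an assumption taken from the segmentlength expression quoted elsewhere in this thread.

```python
MS_PER_UNIT = 23.2  # assumed duration of one time unit in ms (value quoted in this thread)


def split_by_time(pairs, segment_s=60.0, overlap_s=30.0, ms_per_unit=MS_PER_UNIT):
    """Split (code, time) pairs into fixed-length, overlapping segments.

    pairs: list of (code, time) tuples, with time in codegen time units.
    Returns a list of segments; each segment holds the pairs whose time
    falls in [start, start + segment length).
    """
    if overlap_s >= segment_s:
        raise ValueError("overlap must be shorter than the segment length")
    if not pairs:
        return []

    seg_len = segment_s * 1000.0 / ms_per_unit            # segment length in time units
    hop = (segment_s - overlap_s) * 1000.0 / ms_per_unit  # distance between segment starts
    last_time = max(t for _, t in pairs)

    segments = []
    start = 0.0
    while start <= last_time:
        segment = [(c, t) for c, t in pairs if start <= t < start + seg_len]
        if segment:
            segments.append(segment)
        start += hop
    return segments
```

With these defaults the hop between segment starts is 30 seconds, so a 15-second hop would correspond to a different segment length or overlap than the one stated above.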

> Also, still in fp.py, can you explain the purpose of the variable "slop" in the function "actual_matches"?

This reduces the resolution of the time codes to reduce the sensitivity to timing jitter when time-aligning the query and the fingerprint. It is a trade-off between sensitivity and timing jitter. (Also, see below.)

> I have also performed some evaluations on the matching of two fingerprints computed on the same audio file with different "start" values. First, I have noticed that the matching is very poor when the fingerprints are computed with different "start" values. Is there a way to improve this? I understand that the values can hardly coincide at the beginning and the end of the file, but I find it strange that the fingerprints do not match better in the middle of the file. The only way I have found to increase the recognition rate is to store in the database many fingerprints of the same audio file starting at different instants :(

One reason might be that the codegen takes a little while to warm up, so you need to let it run for long enough - right now this is at least 20 seconds, but we're working on getting this down. What sorts of accuracy rates are you seeing in the various cases? And what length of fingerprints are you using?

> Also, I have tried shifting the start value by a very small amount (2 ms) to see whether it was possible to get a better match. I thought the poor matching could come from the fact that the moving windows used during fingerprinting of the reference and the query audio files were not aligned.

How are you shifting the start values? Are you doing this to the audio file at the signal level, or are you adjusting the time codes after the fingerprint has been generated? The absolute values of the time codes shouldn't really matter too much. The important thing is that the shifts between the query and database fingerprint time codes are consistent, i.e., it's the relative time shifts which are important to get right. This is where the slop factor above comes in; it makes the relative distances between time codes more "sloppy" so that the differences between time codes in query and database fingerprints match to a greater extent.
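The idea of consistent relative shifts and of a slop factor can be sketched as a simple offset vote. This is only an illustration under stated assumptions, not the code of actual_matches in fp.py: the function best_offset, its arguments and the default slop value are hypothetical. Each time code is coarsened by integer division with slop, and every code shared by query and reference votes for the difference between the two coarsened times.

```python
from collections import Counter


def best_offset(query, reference, slop=2):
    """Vote on the time offset between a query and a reference fingerprint.

    query, reference: lists of (code, time) pairs, times in the same units.
    slop: coarsening factor; each time is integer-divided by it so that
          small timing jitter tends to fall into the same bin.
    Returns (offset_bin, votes) for the most popular offset, or (None, 0).
    """
    # Index the reference: code -> list of coarsened times
    ref_times = {}
    for code, t in reference:
        ref_times.setdefault(code, []).append(t // slop)

    # Every shared code votes for a relative shift between reference and query
    votes = Counter()
    for code, qt in query:
        for rt in ref_times.get(code, ()):
            votes[rt - (qt // slop)] += 1

    if not votes:
        return None, 0
    return votes.most_common(1)[0]
```

A true match shows up as many codes agreeing on one offset bin, whatever the absolute time codes are. Multiplying the winning bin by slop gives an approximate delay in time units, which is one way to estimate where an excerpt starts in the reference.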

> I then found that there is no difference for " i < start < i+1 ", where i is an integer value. In other words, float values for "start" are not recognized. Is this normal? Am I doing something silly again?

I'm not sure what you mean here. What is start?

Best,

Andrew

@lilila
Author

lilila commented Jul 25, 2012

Thank you for your answers

> I fixed a bug in split_codes() a few days ago which addresses a related issue, so hopefully this should be fixed now. Can you please try again? The segments should be 60 seconds in length, with an overlap of 30 seconds.

--> So you changed the denominator in segmentlength = 60 * 1000.0 / 23.2.

> What sorts of accuracy rates are you seeing in the various cases? And what length of fingerprints are you using?

--> I was trying to reduce the length of the query as much as possible (from 30 sec down to 5 sec). The results are as follows (query duration | accuracy, i.e. percentage of audio excerpts correctly identified):
 5 sec | 8%
10 sec | 65%
15 sec | 82%
20 sec | 85%
25 sec | 87%
30 sec | 87%
I should mention that these results were obtained on a small data set of radio broadcast recordings.
For information, what I am trying to do is identify the radio channel someone is listening to. I am also trying to use fingerprinting to align some audio recordings. To do so, I use the values computed in actual_matches to determine the delay between the query and the reference.

Concerning the time alignment, I noticed something weird: there is a time lag that increases linearly as the start value increases. (Just to be clear, what I call the "start value" is the second argument of song.util.codegen(audioseg, start = 0, duration = 30).)
Sorry if this is not very clear; I will try my best to explain the problem. Suppose I have an audio file 30 min long; this file is fingerprinted and ingested into the data set as a reference. Now I have a few queries that are basically excerpts of this long file.
If the query has a small "start value" (i.e. from 0 to 5 sec approximately), the time alignment is perfect. When the start value increases (5 < start < 10), there is a time lag of about 2 sec between the true start and the estimated start, and so on. At the end of the file I have a time lag of almost 30 sec! The values given here are of course not exact (the time lag increases linearly!); I give them just to make the explanation a little clearer (I guess it is still confusing? :-/).
I found a way to avoid this problem (using short segments of the long audio file instead of the whole file), but it is not a real solution. I was wondering whether the problem could come from the fact that you use x = 23.2 instead of 256/11025*1000. Anyway, I just wanted to let you know that this problem exists.
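The size of the effect of rounding 256/11025*1000 to 23.2 can be checked with a little arithmetic. The sketch below assumes that one time unit corresponds to exactly 256 samples at 11025 Hz, as the expression above suggests; the helper drift_seconds is hypothetical and is not part of fp.py.

```python
SAMPLE_RATE_HZ = 11025   # sample rate taken from the expression in this thread
SAMPLES_PER_UNIT = 256   # samples per time unit, same source (an assumption)
ROUNDED_MS = 23.2        # rounded constant quoted in this thread

# Exact duration of one time unit in ms, about 23.21995
exact_ms = SAMPLES_PER_UNIT / SAMPLE_RATE_HZ * 1000.0


def drift_seconds(true_position_s, rounded_ms=ROUNDED_MS):
    """Error made when time units are converted to seconds with the rounded constant."""
    units = true_position_s * 1000.0 / exact_ms   # time units elapsed at this position
    estimated_s = units * rounded_ms / 1000.0     # position computed with 23.2 ms per unit
    return true_position_s - estimated_s


for minutes in (1, 10, 30):
    print(minutes, "min:", round(drift_seconds(minutes * 60), 3), "s")
```

Under these assumptions the rounding error does grow linearly with position, but it reaches only about 1.5 sec after 30 min, so on its own it may not account for a lag of almost 30 sec.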

> I'm not sure what you mean here. What is start?

--> As I said above, "start" is the second parameter of the song.util.codegen function.

Thank you for your time
lila

@abuharsky

Hi Lila,
How successful have you been with echoprint so far?

@lilila
Author

lilila commented Jan 29, 2015

I haven't used it in a while, but it used to work perfectly.

@picozone

Hi Lila,
How did you deal with the double-ingestion problem? How can I start over with a brand-new database?
Thank you for your time,

4 participants