I want to build a speaker recognition system. So far, I have finished the
MFCC feature extraction and vector quantization (VQ) parts. My question is
about pattern matching.
I have tried to do pattern matching by computing the distance between the
clustered input vectors and the nearest vectors in the database. The
problem is: how do I decide the threshold for accepting a speaker? I found
that the distance changes from one trial to the next, although the change
is not significant. Can anyone tell me how the pattern matching should be
done?
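
To make the question concrete, here is a minimal sketch of the kind of
matching I mean. It assumes each enrolled speaker has a VQ codebook stored
as a NumPy array and that the score is the average Euclidean distance from
each input MFCC frame to its nearest codeword. The names
average_distortion, match_speaker, and threshold are hypothetical and only
for illustration; averaging over frames is my assumption, not necessarily
the right way to do it.

```python
import numpy as np

def average_distortion(features, codebook):
    """Mean Euclidean distance from each frame to its nearest codeword.

    features: array of shape (n_frames, n_coeffs), e.g. MFCC vectors
    codebook: array of shape (n_codewords, n_coeffs) for one speaker
    """
    # Pairwise differences via broadcasting: (n_frames, n_codewords, n_coeffs)
    diffs = features[:, None, :] - codebook[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)  # (n_frames, n_codewords)
    # Nearest codeword per frame, then average over all frames
    return dists.min(axis=1).mean()

def match_speaker(features, codebooks, threshold):
    """Pick the enrolled speaker with the lowest average distortion.

    codebooks: dict mapping speaker name -> codebook array
    threshold: hypothetical acceptance limit; how to choose it is the
               open question
    Returns (speaker name or None if rejected, dict of all scores).
    """
    scores = {name: average_distortion(features, cb)
              for name, cb in codebooks.items()}
    best = min(scores, key=scores.get)
    if scores[best] > threshold:
        return None, scores  # nobody is close enough: reject
    return best, scores
```

Averaging over many frames tends to make the score fluctuate less than a
single nearest-vector distance, but the threshold value still has to be
chosen somehow, which is what I am unsure about.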
Thank you very much.