Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metrify RawNet3/Resemblyzer as Keywords & Update READMEs #85

Merged
merged 6 commits into from
Jan 6, 2024

Conversation

Merakist
Copy link
Collaborator

@Merakist Merakist commented Jan 1, 2024

Description

This pull request includes modifications to further integrate Resemblyzer into Amphion's evaluation process, which includes:

  • Modifying the cosine similarity calculation method: Previously, the resemblyzer_similarity.py used the scipy.spatial.distance.cosine method for calculating cosine similarity. This update replaces it with PyTorch's torch.nn.functional.cosine_similarity method, which is already used by other scripts to simplify the codebase and ensure uniformity.

  • Metrifying RawNet3/Resemblyzer as keywords: Altered the --metrics argument handling from speaker_similarity to rawnet3_similarity and resemblyzer_similarity as distinct options to streamline the evaluation process by removing the need for user input during script execution.

Testing

  • The definitions of the scipy and torch method for calculating cosine similarity are intrinsically similar (See Documentations for SciPy and Torch). The results using both methods are calculated, and the difference between the two values is accurate to 7th decimal place, see Scipy - Torch Result Comparison. Therefore, the two methods are interchangeable with accuracies ensured.

Changes

  • Amphion/evaluation/metrics/similarity/resemblyzer_similarity.py:

    • Changed scipy.spatial.distance.cosine to torch.nn.functional.cosine_similarity method.
  • Amphion/bins/calc_metrics.py:

    • Altered speaker_similarity to two distinct options: rawnet3_similarity and resemblyzer_similarity.
    • Fixed bug of missing FAD.
  • Amphion/egs/metrics/README.md:

    • Updated Amphion's Evaluation Recipe to include Resemblyzer, replaced the speaker_similarity keyword with rawnet3_similarity and resemblyzer_similarity, and fixed misspelling.
  • Amphion/README.md:

    • Listing Resemblyzer as an available method for calculating speaker similarity results.

Usage

When calculating speaker similarity with Amphion/egs/metrics/run.sh using the command --metrics, the user shall select the desired model (RawNet3/Resemblyzer) for calculation with the corresponding keyword (rawnet3_similarity / resemblyzer_similarity).

@Merakist Merakist requested a review from lmxue January 1, 2024 14:15
Copy link
Collaborator

@RMSnow RMSnow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice PR description

deg_dir, ref_dir, dump_dir
)
result[metric] = str(similarity_score)
if metric in ["fad", "rawnet3_similarity"]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is "fad" here? It seems that the original code does not contain the conditional judegment about fad. Is there a bug in the old code?

Copy link
Collaborator Author

@Merakist Merakist Jan 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. When modifying the input-selection part in the old code, the FAD part was mis-deleted. It should be here, as shown in the original commit, at line 64: 9682d0c#diff-4fa833e1c8dd8d05d182f8262a2cc5f727dc72a364db06f8acc5536eff3e6506
Screenshot 2024-01-01 at 22 55 28

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Owing to the firewalled status of huggingface.co on the Aliyun server, it went undetected because prior to testing calc_metrics.py, all parts concerning FAD had to be commented out, or else the script couldn't correctly initialize.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VocodexElysium Please review this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VocodexElysium Please review this.

Basically, if your internet environment does not support getting access to Hugging Face in the terminal, importing FAD will cause errors since it is trying to connect with Hugging Face. You can avoid this by setting the correct VPN environment or just downloading the necessary things yourself and then adjusting the FAD code to import the model on your computer rather than downloading from Hugging Face (I think MingXuan did this successfully). So I don't think removing FAD-related code is necessary since it only involves the internet environment and is solvable.

@Merakist Merakist requested a review from RMSnow January 1, 2024 14:59
@Merakist
Copy link
Collaborator Author

Merakist commented Jan 1, 2024

New commit: fixed a found typo in Amphion/egs/metrics/README.md: Aduio -> Audio

Copy link
Collaborator

@RMSnow RMSnow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@VocodexElysium
Copy link
Collaborator

@Merakist Calculating FAD needs a connection to Hugging Face. I think you need to specify that and also attach the methodology of downloading and specifying the folder to the pretrained model for calculating FAD when the internet cannot guarantee a connection to Hugging Face in the README.md.

Copy link
Collaborator

@RMSnow RMSnow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Merakist @VocodexElysium Please cooperate to refine the FAD-related doc to make it friendly and easy-to-understand for users

@Merakist
Copy link
Collaborator Author

Merakist commented Jan 2, 2024

Updated README. Proposed a solution for FAD to load local models when huggingface.co is unreachable in the command lines.

@VocodexElysium
Copy link
Collaborator

Updated README. Proposed a solution for FAD to load local models when huggingface.co is unreachable in the command lines.

Here are some advice:

  1. Use another name instead of “Additional Troublesome Information”. It’s weird to use such a term especially when that connection error bug is not caused by us but by some “internet issues”.
  2. Attach Hugging Face links for the pretrained models, people needs to know where to obtain them.
  3. Do not show your absolute path in the script since it will cause leakage on your server’s info.

@Merakist
Copy link
Collaborator Author

Merakist commented Jan 3, 2024

@VocodexElysium Roger. The new commit condensed the title from Additional Troubleshooting Info to Troubleshooting (I believe you have misread), which should be more explicit. The links to the models are included and paths are modified.

@VocodexElysium
Copy link
Collaborator

@VocodexElysium Roger. The new commit condensed the title from Additional Troubleshooting Info to Troubleshooting (I believe you have misread), which should be more explicit. The links to the models are included and paths are modified.

I think you forgot to create folders for bert, roberta, and bart in the Amphion/pretrained directory. Please add them up.

@Merakist
Copy link
Collaborator Author

Merakist commented Jan 4, 2024

@VocodexElysium Roger. The new commit condensed the title from Additional Troubleshooting Info to Troubleshooting (I believe you have misread), which should be more explicit. The links to the models are included and paths are modified.

I think you forgot to create folders for bert, roberta, and bart in the Amphion/pretrained directory. Please add them up.

@VocodexElysium I have updated the README.md under Amphion/pretrained to reflect the model dependencies for the evaluation pipeline with file structure trees, and created folders for bert-base-uncased, facebook/bart-base and roberta-base under Amphion/pretrained.

Copy link
Collaborator

@VocodexElysium VocodexElysium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Able to merge

@RMSnow RMSnow merged commit 9b287c8 into open-mmlab:main Jan 6, 2024
1 check passed
@Merakist Merakist deleted the metrify-resemblyzer branch January 25, 2024 08:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants