Regression Test Pipeline #120

Abhinay1997 · 2024-04-20T18:14:50Z

Add English text normalization.
Add WER calculations.
Compare normalization outputs against the Python implementation.
Add WER to the regression tests once Memory and Latency Regression Tests #99 is merged
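
For readers unfamiliar with the metric, here is a minimal sketch of how WER is commonly computed over normalized text. It is illustrative only and is not the code in this PR: `normalize` is a hypothetical, heavily simplified stand-in for the English normalizer (it only lowercases and strips punctuation, with no number or spelling handling), and the edit distance uses a plain two-row Levenshtein table rather than the Hirschberg-based approach mentioned in the commits.

```python
import re


def normalize(text: str) -> list[str]:
    """Hypothetical, simplified normalizer: lowercase, drop punctuation, split.

    A full English normalizer would also handle numbers, contractions and
    spelling variants; none of that is attempted here.
    """
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)  # keep word characters and apostrophes
    return text.split()


def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Word-level Levenshtein distance using two rows of the DP table."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference is empty after normalization")
    return edit_distance(ref, hyp) / len(ref)
```

With this sketch, `wer("Hello, world!", "hello world")` evaluates to `0.0` because the two strings are identical after normalization, which is why normalization has to be in place before WER numbers are meaningful.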

Abhinay1997 · 2024-05-02T16:03:03Z

Code is still messy. Needs cleanup once the normalization starts working.

Abhinay1997 · 2024-05-04T16:29:29Z

Running tests locally. Adding more unit tests for the new normalization code.

atiorh

@Abhinay1997 As discussed offline, here are the last few items before we can merge this one:

Removing example test result JSONs
Removing the Fraction implementation altogether
Adding AudioEncoder latency measurements to LatencyStats

We discussed the following as nice-to-haves (could defer to a future PR):

Writing the following static attributes to the resulting JSON:

"static_attributes": 
- whisperkit_version (string)
- os (string)
- encoder_compute_units (string)
- decoder_compute_units (string)
- enable_word_timestamps (boolean)
- enable_eager_decoding (boolean)
- enable_vad (boolean)
- silence_threshold 
- is_low_power_mode (boolean)
- is_stream_simulated (boolean)
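
To make the proposed shape concrete, the attributes above could be serialized roughly as follows. This is a sketch under assumptions, not the format the PR emits: the `StaticAttributes` class and `to_result_json` helper are hypothetical, the values in the usage note are placeholders, and `silence_threshold` is typed as an optional float only because the list gives no type for it.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class StaticAttributes:
    """Hypothetical container mirroring the attribute list above."""
    whisperkit_version: str
    os: str
    encoder_compute_units: str
    decoder_compute_units: str
    enable_word_timestamps: bool
    enable_eager_decoding: bool
    enable_vad: bool
    silence_threshold: Optional[float]  # assumption: the list does not state a type
    is_low_power_mode: bool
    is_stream_simulated: bool


def to_result_json(attrs: StaticAttributes) -> str:
    """Wrap the attributes under the "static_attributes" key and serialize."""
    return json.dumps({"static_attributes": asdict(attrs)}, indent=2)
```

Calling `to_result_json(StaticAttributes("<version>", "<os>", "<units>", "<units>", False, False, False, None, False, False))` yields a JSON object with a single `static_attributes` key; every value in that call is a placeholder.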

Periodically (e.g., every minute) measuring the following stats in a separate thread and writing the time-series results to the final JSON:

"system_measurements":
- thermal_state (string)
- device_temperature (int)
- memory_total_available_gb (float)
- memory_total_used_gb (float)
- memory_app_allocated_gb (float)
- memory_app_used_gb (float)
- memory_swap_used_gb (float)
- battery_level (float)
- disk_total_space_gb (float)
- disk_free_space_gb (float)
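
The periodic measurement idea can be sketched as a background sampler that collects a time series until it is stopped. This is illustrative only: `PeriodicSampler` and `sample_disk` are hypothetical names, only the two disk fields are filled in because they are available from the Python standard library, and thermal state, battery and memory readings would need platform-specific APIs that are not shown.

```python
import shutil
import threading
import time

GB = 1024 ** 3  # assumption: "gb" is treated as binary gigabytes here


def sample_disk(path: str = "/") -> dict:
    """Collect the two disk fields from the list above, plus a timestamp."""
    usage = shutil.disk_usage(path)
    return {
        "timestamp": time.time(),
        "disk_total_space_gb": usage.total / GB,
        "disk_free_space_gb": usage.free / GB,
    }


class PeriodicSampler:
    """Runs sample_fn on a background thread every interval_s seconds."""

    def __init__(self, sample_fn, interval_s: float = 60.0):
        self._sample_fn = sample_fn
        self._interval_s = interval_s
        self._stop_event = threading.Event()
        self._lock = threading.Lock()
        self._samples = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Take one sample immediately, then one per interval until stopped.
        while True:
            sample = self._sample_fn()
            with self._lock:
                self._samples.append(sample)
            # wait() returns True as soon as stop() sets the event.
            if self._stop_event.wait(self._interval_s):
                break

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> list:
        """Stop sampling and return the collected time series."""
        self._stop_event.set()
        self._thread.join()
        with self._lock:
            return list(self._samples)
```

A test run would call `start()` before transcription begins and write the list returned by `stop()` under `"system_measurements"` in the final JSON. Waiting on an event rather than sleeping lets the sampler stop promptly instead of blocking for up to a full interval.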

Abhinay1997 · 2024-08-12T16:47:52Z

@atiorh I made the changes except for the AudioEncoder latency stats, which need a callback to be added. I'm discussing this with Zack.

Abhinay1997 added 5 commits April 20, 2024 21:29

daa68c8

Merge branch 'main' into wer_utils

e2e5632

Add basic Fraction type to handle Number normalization

6af9f5a

Add EnglishNumberNormalizer

87c230a

Merge branch 'main' into wer_utils

d8cda9f

Abhinay1997 added 2 commits May 4, 2024 21:55

Adds Basic Fraction type for WER

b8c30fe

Refactor + Add english normalizers

06e66e4

ZachNagengast linked an issue May 7, 2024 that may be closed by this pull request

English text normalization utilization for Eager Streaming Mode #111

Open

ZachNagengast removed a link to an issue May 7, 2024

English text normalization utilization for Eager Streaming Mode #111

Open

ZachNagengast mentioned this pull request May 7, 2024

English text normalization utilization for Eager Streaming Mode #111

Open

Abhinay1997 added 8 commits May 9, 2024 00:20

Bug fixes in number normalization. regex, multiplier processing.

3334d44

wer evaluate function + string optimization

da3a719

Add wer test on long audio

acb80ff

Remove Wagner-Fischer, fix normalization bugs.

dbbf9bf

Hirschberg's LCS Algorithm for edit operations

16a5525

Remove warnings in Fraction implementation

70456b3

Add tests

a3c94cc

Merge branch 'main' into wer_utils

b7e52fa

Abhinay1997 marked this pull request as ready for review May 28, 2024 03:22

Abhinay1997 added 7 commits May 29, 2024 07:46

Refactoring

60f8956

Refactor regression tests

89df136

Add WER to regression test results, fix overflow

ad13284

clean up files

47be844

Merge branch 'main' into wer_utils

bf46309

patch overflow for now.

6296506

Re-add file needed for tests

6a28fc1

ZachNagengast changed the title ~~English Normalisation and WER Utils~~ Regression Test Pipeline Jun 25, 2024

ZachNagengast linked an issue Jun 25, 2024 that may be closed by this pull request

Benchmark for WhisperAX & CLI #28

Open

ZachNagengast removed a link to an issue Jun 25, 2024

Benchmark for WhisperAX & CLI #28

Open

ZachNagengast added the enhancement (Improves existing code) label Jun 25, 2024

ZachNagengast and others added 3 commits July 28, 2024 15:10

Fix xcode test attachment

26bb7c6

Fix overflow when using Int.

01baf7b

Add flag to run only on first audio file of the dataset

cca6f50

atiorh requested changes Aug 5, 2024

Abhinay1997 and others added 5 commits August 6, 2024 18:33

3fceef3

PR Cleanup:

ad4c7f5

* Remove Fraction.swift
* Remove commented out redundant code

Merge branch 'main' into wer_utils

74ad9be

Adds system memory, disk space and battery level tracking.

525657b

Remove sample JSON

83ffc3f

Merge branch 'main' into wer_utils

a8d6e27

Abhinay1997 requested a review from atiorh August 12, 2024 16:54

Abhinay1997 added 3 commits August 12, 2024 22:37

Fix compilation on non macOS

2f3be51

Fix battery checks for watchOS

d9bc43b

Fix imports

c99bd94
