feat: Support for userCol and itemCol as string types in SAR model #2283

dciborow · 2024-09-07T02:41:01Z

Add support for userCol and itemCol as string types in the SAR model.

Python Files:
- Add core/src/main/python/synapse/ml/recommendation/SAR.py to handle string userCol and itemCol.
- Modify core/src/main/python/synapse/ml/recommendation/SARModel.py to handle string userCol and itemCol in the recommendForUserSubset function.
Scala Files:
- Modify core/src/main/scala/com/microsoft/azure/synapse/ml/recommendation/SAR.scala to handle string userCol and itemCol in the calculateUserItemAffinities and calculateItemItemSimilarity functions.
- Modify core/src/main/scala/com/microsoft/azure/synapse/ml/recommendation/SARModel.scala to handle string userCol and itemCol.
Tests:
- Update core/src/test/python/synapsemltest/recommendation/test_ranking.py to include tests for string userCol and itemCol.
- Update core/src/test/scala/com/microsoft/azure/synapse/ml/recommendation/SARSpec.scala to include tests for string userCol and itemCol.
Documentation:
- Update docs/Quick Examples/estimators/core/_Recommendation.md to include examples with string userCol and itemCol.
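
The description above does not say how string IDs are handled internally. As a rough illustration only, one common approach is to map each distinct string ID to a contiguous integer index before counting item co-occurrences, which is the kind of quantity an item-item similarity step works from. The sketch below is plain Python, not SynapseML code; `build_index` and `item_cooccurrence` are hypothetical helper names, and the sample interactions are made up.

```python
from collections import defaultdict
from itertools import combinations


def build_index(values):
    """Map each distinct value to a contiguous integer index, in first-seen order."""
    index = {}
    for value in values:
        if value not in index:
            index[value] = len(index)
    return index


def item_cooccurrence(interactions):
    """Count, for each pair of items, how many users interacted with both.

    `interactions` is a list of (user_id, item_id) pairs. IDs may be strings
    or integers; they only need to be hashable. (Hypothetical helper, not the
    SAR.scala implementation.)
    """
    user_index = build_index(user for user, _ in interactions)
    item_index = build_index(item for _, item in interactions)

    # Group the indexed items by indexed user.
    items_by_user = defaultdict(set)
    for user, item in interactions:
        items_by_user[user_index[user]].add(item_index[item])

    # Each unordered item pair seen by the same user adds one to its count.
    counts = defaultdict(int)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
    return item_index, dict(counts)


# Made-up sample data with string user and item IDs.
interactions = [
    ("alice", "book"),
    ("alice", "lamp"),
    ("bob", "book"),
    ("bob", "lamp"),
    ("bob", "desk"),
]
item_index, counts = item_cooccurrence(interactions)
```

Because the indexing step only requires hashable IDs, the same sketch accepts integer IDs unchanged, which is one reason an index-first design keeps the downstream math independent of the column type.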

For more details, open the Copilot Workspace session.

acrolinxatmsft1 · 2024-09-07T02:41:09Z

Acrolinx Scorecards

A minimum total score of 80 is required.

Select the total score link to review all feedback on clarity, consistency, tone, brand, terms, spelling, grammar, readability, and inclusive language. You should fix all spelling errors regardless of your total score. Fixing spelling errors helps maintain customer trust in overall content quality.

| Article | Total score (Required: 80) | Words + phrases (Brand, terms) | Correctness (Spelling, grammar) | Clarity (Readability) |
| --- | --- | --- | --- | --- |
| ✅ docs/Quick Examples/estimators/core/_Recommendation.md | 72 | 100 | 32 | 100 |

More information about Acrolinx

core/src/main/scala/com/microsoft/azure/synapse/ml/recommendation/SAR.scala

core/src/main/scala/com/microsoft/azure/synapse/ml/recommendation/SARModel.scala

…ion/SARModel.scala

…ion/SAR.scala

dciborow · 2024-09-07T02:53:30Z

/azp run

azure-pipelines · 2024-09-07T02:53:40Z

Azure Pipelines successfully started running 1 pipeline(s).

codecov-commenter · 2024-09-07T03:05:53Z

Codecov Report

Attention: Patch coverage is 0% with 21 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (f3953bc) to head (09557ea).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...icrosoft/azure/synapse/ml/recommendation/SAR.scala | 0.00% | 21 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (f3953bc) and HEAD (09557ea). HEAD has 152 fewer uploads than BASE.

| Flag | BASE (f3953bc) | HEAD (09557ea) |
| --- | --- | --- |
| | 157 | 5 |

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #2283       +/-   ##
==========================================
- Coverage   84.53%   0.00%   -84.54%     
==========================================
  Files         327     327               
  Lines       16788   16808       +20     
  Branches     1500    1499        -1     
==========================================
- Hits        14191       0    -14191     
- Misses       2597   16808    +14211

core/src/test/scala/com/microsoft/azure/synapse/ml/recommendation/SARSpec.scala

docs/Quick Examples/estimators/core/_Recommendation.md

core/src/main/python/synapse/ml/recommendation/SARModel.py

* **SAR.scala**
  - Update `calculateUserItemAffinities` method to handle integer types for `userId` and `itemId`
  - Update `calculateItemItemSimilarity` method to handle integer types for `userId` and `itemId`
* **test_ranking.py**
  - Add test cases to verify the functionality of SAR model with integer types for `userId` and `itemId`

dciborow · 2024-09-07T04:29:23Z

/azp run

azure-pipelines · 2024-09-07T04:29:33Z

Azure Pipelines successfully started running 1 pipeline(s).

…/recommendation/SARSpec.scala

…n SARSpec.scala

* Add a test case for handling User Column with Strings
* Add a test case for handling User Column with different datatypes
* Verify the handling of User Column with Strings and other datatypes in SAR.scala
* Ensure the new test cases are concise and focused on the new code
* Place the new test cases in an appropriate location within the file

dciborow requested review from eisber and mhamilton723 as code owners September 7, 2024 02:41

dciborow changed the title ~~Support for userCol and itemCol as string types in SAR model~~ feat: Support for userCol and itemCol as string types in SAR model Sep 7, 2024

feat: add string to sar

d580344

dciborow force-pushed the dciborow/add-string-support branch from 2bbf3cc to d580344 September 7, 2024 02:47

Delete core/src/main/python/synapse/ml/recommendation/SAR.py

302fb7c