Document Intelligence exception when using Prebuilt tax document model tax.us.1099COMBO.2023 #41867

Open
bjamin5 opened this issue Sep 16, 2024 · 2 comments
Labels: Client, customer-reported, Document Intelligence, needs-team-attention, question, Service

bjamin5 commented Sep 16, 2024

When I analyze a consolidated tax statement with the prebuilt model tax.us.1099COMBO.2023, SyncPoller.getFinalResult() throws the following exception.

Exception

java.util.concurrent.FutureTask@6a5880a0[Completed exceptionally: java.io.UncheckedIOException: java.io.IOException: java.time.format.DateTimeParseException: Text 'Various' could not be parsed at index 0]

Stack Trace

getResultWithTimeout:480, ImplUtils (com.azure.core.implementation)
pollingLoop:70, PollingUtil (com.azure.core.util.polling)
MyClass that calls getFinalResult()

Code to Reproduce

// Retry policy: up to 20 retries, exponential backoff from 5 ms up to 20 s.
ExponentialBackoffOptions exponentialBackoffOptions = new ExponentialBackoffOptions()
        .setMaxRetries(20)
        .setBaseDelay(Duration.ofMillis(5))
        .setMaxDelay(Duration.ofSeconds(20));

RetryOptions retryOptions = new RetryOptions(exponentialBackoffOptions);

client = new DocumentIntelligenceClientBuilder()
        .credential(new AzureKeyCredential(apiKey))
        .endpoint(endpoint)
        .serviceVersion(DocumentIntelligenceServiceVersion.V2024_07_31_PREVIEW)
        .retryOptions(retryOptions)
        .buildClient();

// Only the model ID and the document source are set; every optional argument is null.
SyncPoller<AnalyzeResultOperation, AnalyzeResult> analyzeDocumentPoller = client.beginAnalyzeDocument(
        "tax.us.1099COMBO.2023",
        null,
        null,
        null,
        null,
        null,
        null,
        null,
        new AnalyzeDocumentRequest().setBase64Source(fileData));

// The exception is thrown from this call.
AnalyzeResult analyzeDocumentResult = analyzeDocumentPoller.getFinalResult();
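
The exception message above nests the real failure two causes deep (UncheckedIOException, then IOException, then DateTimeParseException). As a rough sketch of how a caller might surface that innermost cause, the standalone example below walks the cause chain. The class name RootCauseDemo and the helper rootCause are hypothetical and not part of the Azure SDK, and the nested exception is built by hand to mimic the reported one; which exception type getFinalResult() actually throws at the call site is not confirmed here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.time.format.DateTimeParseException;

// Hypothetical helper class for illustration only; not part of the Azure SDK.
class RootCauseDemo {

    // Follows getCause() until the innermost throwable is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Hand-built stand-in that mimics the nesting shown in the reported message.
        RuntimeException reported = new UncheckedIOException(
                new IOException(
                        new DateTimeParseException(
                                "Text 'Various' could not be parsed at index 0", "Various", 0)));

        Throwable root = rootCause(reported);
        if (root instanceof DateTimeParseException) {
            DateTimeParseException parseFailure = (DateTimeParseException) root;
            // Report the offending text so the bad field value is easy to spot.
            System.out.println("Unparseable date text: " + parseFailure.getParsedString());
        }
    }
}
```

In real code the same helper could be applied inside a catch block around getFinalResult(), assuming the failure reaches the caller as a RuntimeException.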


Expected/desired behavior

It should poll until the AnalyzeResult object is returned.

Versions

JRE: liberica-21

azure sdk client: com.azure/azure-ai-documentintelligence/1.0.0-beta.4

Other Information

I've tried multiple examples, and the bug seems to affect only consolidated statements analyzed with the 1099COMBO model. 1099Int and 1099div samples did not throw this exception when passed into this combo model. No problem seems to occur when using the Document Intelligence Studio in the browser.

Here is a pdf of a 1099-consolidated statement with fake information that causes this exception:

Standard.Consolidated.pdf

alzimmermsft commented

Thanks for filing this issue @bjamin5!

Taking a rough look at the PDF you've included, I wonder if there is a mix up with those "Date Acquired" fields with values "various".

@samvaity, @mssfang could you take a deeper look into whether this is an SDK bug?


samvaity commented Sep 18, 2024

@bjamin5 I can confirm we are seeing the error in the SDK.
It is caused by an incorrect result from the service for "Box1b": the field's required type is "date", but the value is returned as the string "Various".
So the SDK fails here:
valueDate = reader.getNullable(nonNullReader -> LocalDate.parse(nonNullReader.getString()));
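
To make the failure mode concrete, here is a minimal standalone sketch using only java.time. It reproduces the parse error on the text "Various" and shows one possible lenient alternative that returns an empty Optional instead of throwing. The class LenientDateField and the helper tryParseDate are hypothetical names for illustration; this is not how the SDK is implemented and not a proposed patch.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical illustration; not SDK code.
class LenientDateField {

    // Returns the parsed date, or empty when the text is null or not an ISO-8601 date.
    static Optional<LocalDate> tryParseDate(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(raw));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Strict parsing, as in the line quoted above, rejects non-date text.
        try {
            LocalDate.parse("Various");
        } catch (DateTimeParseException e) {
            // prints: Text 'Various' could not be parsed at index 0
            System.out.println(e.getMessage());
        }

        // Lenient parsing keeps going and lets the caller decide what to do.
        System.out.println(tryParseDate("Various").isPresent());    // prints: false
        System.out.println(tryParseDate("2023-04-15").isPresent()); // prints: true
    }
}
```

The trade-off is that lenient parsing hides a type mismatch coming from the service, which is the concern behind treating the strict failure as correct SDK behavior.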

@bojunehsu: Could you take a look, from the service end, at why the model returns an incorrect type for these fields?

@alzimmermsft: In my opinion, the SDK throwing the parsing error is correct. Do you think we need to add better handling here?
