Gcloud is not working #180

Open
suryacaprice opened this issue Dec 3, 2018 · 16 comments

Comments

@suryacaprice

Waiting for the operation to finish.
Traceback (most recent call last):
  File "/home/caprice/anaconda3/bin/invoice2data", line 11, in <module>
    sys.exit(main())
  File "/home/caprice/anaconda3/lib/python3.6/site-packages/invoice2data/main.py", line 166, in main
    res = extract_data(f.name, templates=templates, input_module=input_module)
  File "/home/caprice/anaconda3/lib/python3.6/site-packages/invoice2data/main.py", line 90, in extract_data
    extracted_str = input_module.to_text(invoicefile).decode('utf-8')
  File "/home/caprice/anaconda3/lib/python3.6/site-packages/invoice2data/input/gvision.py", line 79, in to_text
    json_string = result_blob.download_as_string()
AttributeError: 'NoneType' object has no attribute 'download_as_string'

@m3nu
Collaborator

m3nu commented Dec 3, 2018

Google Vision needs a lot of setup. You need:

  • API key
  • bucket for results

There are no instructions for this as of now, but it should be clear from the source code. Did you do all the setup tasks correctly before encountering this error?
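A quick preflight check for the two prerequisites could look like the sketch below. It assumes authentication goes through a service-account JSON file referenced by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable (the usual mechanism for Google client libraries); `missing_setup` is a hypothetical helper and not part of invoice2data.

```python
import os


def missing_setup(bucket_name, environ=os.environ):
    """Return a list of human-readable setup problems (empty if none found)."""
    problems = []
    # Assumption: credentials are supplied via GOOGLE_APPLICATION_CREDENTIALS.
    cred_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not cred_path:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(cred_path):
        problems.append("credentials file not found: " + cred_path)
    # The OCR results are written to a bucket, so a name must be supplied.
    if not bucket_name:
        problems.append("no bucket name given for the OCR results")
    return problems
```

This only checks local configuration; it does not verify that the bucket exists or that the account has access to it.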

@suryacaprice
Author

Hi, I have done all the configuration: I created the bucket, and the API is mapped to the project with full access.

@suryacaprice
Author

I don't think the gcloud config is the problem here.
    json_string = result_blob.download_as_string()
AttributeError: 'NoneType' object has no attribute 'download_as_string'

This is the line that raises the error.

@m3nu
Collaborator

m3nu commented Dec 3, 2018

Then I'd check if you have a result in your bucket because this line just reads the result.

If your configuration is wrong there won't be a result in the bucket and this specific line will fail.
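The failure mode described here can be made explicit in code. This is a minimal sketch, assuming `bucket` is a `google.cloud.storage` `Bucket`, whose `get_blob` returns `None` when the object does not exist; `download_result` is a hypothetical helper and not part of invoice2data.

```python
def download_result(bucket, blob_name):
    """Fetch the OCR result, failing with a readable message if it is missing."""
    # Bucket.get_blob() returns None (rather than raising) when no object
    # with this name exists in the bucket.
    result_blob = bucket.get_blob(blob_name)
    if result_blob is None:
        raise FileNotFoundError(
            "No result named {!r} in the bucket; check that the Vision job "
            "finished and wrote its output there.".format(blob_name)
        )
    return result_blob.download_as_string()
```

With a guard like this, a missing result surfaces as a clear message instead of an `AttributeError` on `NoneType`.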

@suryacaprice
Author

Let me check the configuration again.

@ananthnagan

Does this PDF have multiple pages? If so, there is a problem in gvision.py: the result file name is hardcoded to output-1-to-1.json, but for a PDF with more than one page gvision creates the JSON file as output-1-to-[number of pages].json. So I have made a code change, and it worked for me. Please find the code below.

[image: screenshot of the code change]
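The naming issue described above can be sketched as follows. This is an illustration only, not the change from the screenshot: the `output-1-to-N.json` pattern is taken from a traceback later in this thread, `result_blob_name` is a hypothetical helper, and it assumes the whole PDF ends up in a single output file.

```python
def result_blob_name(result_blob_basename, num_pages):
    """Build the expected result name for a PDF with `num_pages` pages."""
    if num_pages < 1:
        raise ValueError("num_pages must be at least 1")
    # A hardcoded '/output-1-to-1.json' only matches single-page PDFs,
    # so the last page number is filled in from the page count instead.
    return "{}/output-1-to-{}.json".format(result_blob_basename, num_pages)
```

For example, `result_blob_name("invoice", 3)` gives `invoice/output-1-to-3.json`.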

@m3nu
Collaborator

m3nu commented Jun 17, 2019

> So I have made a code change, and it worked for me. Please find the code below.

You should make a pull request for your fix. Otherwise your improvement will never make it into the official repo, and you will need to maintain the change yourself after every update.

@ananthnagan

> > So I have made a code change, and it worked for me. Please find the code below.
>
> You should make a pull request for your fix. Otherwise your improvement will never make it into the official repo, and you will need to maintain the change yourself after every update.

I have created the pull request.

@EtienneBerube

Hi, I am trying to get Gvision working. I have my Google credentials JSON and am trying to figure out how to connect the bucket properly. I see that it is a default argument, but none of the API calls refer to a bucket. Where can I specify my bucket?
Thanks

@Venerit

Venerit commented Jul 2, 2019

Hey guys,
I was running into the same issue and tried ananthnagan's fix. It works for one multi-page PDF but fails for another one.
I can't really figure out what the issue is.
Any ideas?

Traceback (most recent call last):
  File "/usr/local/bin/invoice2data", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/invoice2data/main.py", line 201, in main
    res = extract_data(f.name, templates=templates, input_module=input_module)
  File "/usr/local/lib/python3.6/dist-packages/invoice2data/main.py", line 82, in extract_data
    extracted_str = input_module.to_text(invoicefile).decode('utf-8')
  File "/usr/local/lib/python3.6/dist-packages/invoice2data/input/gvision.py", line 35, in to_text
    result_blob_name = result_blob_basename + '/output-1-to-'+str(PdfFileReader(open(path, "rb")).getNumPages())+'.json'
  File "/usr/local/lib/python3.6/dist-packages/PyPDF2/pdf.py", line 1084, in __init__
    self.read(stream)
  File "/usr/local/lib/python3.6/dist-packages/PyPDF2/pdf.py", line 1697, in read
    line = self.readNextEndLine(stream)
  File "/usr/local/lib/python3.6/dist-packages/PyPDF2/pdf.py", line 1938, in readNextEndLine
    x = stream.read(1)
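
One way to avoid depending on the page count, and on PyPDF2 being able to parse the file, would be to look up whatever result files exist under the result prefix. This is a rough sketch and not a diagnosis of the failure above: it assumes `bucket` is a `google.cloud.storage` `Bucket`, and `find_result_blobs` is a hypothetical helper that is not part of invoice2data.

```python
def find_result_blobs(bucket, result_blob_basename):
    """Return the JSON result blobs stored under the given prefix, sorted by name."""
    prefix = result_blob_basename + "/"
    # Bucket.list_blobs(prefix=...) iterates over objects whose names
    # start with the prefix; keep only the JSON output files.
    blobs = [b for b in bucket.list_blobs(prefix=prefix) if b.name.endswith(".json")]
    return sorted(blobs, key=lambda b: b.name)
```

The caller would then download and parse each returned blob in turn, whatever page range its name encodes.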

@ananthnagan

> Hi, I am trying to get Gvision working. I have my Google credentials JSON and am trying to figure out how to connect the bucket properly. I see that it is a default argument, but none of the API calls refer to a bucket. Where can I specify my bucket?
> Thanks

It's at the top of gvision.py; you can set your bucket name there.
[image: screenshot of the top of gvision.py]

@m3nu
Collaborator

m3nu commented Jul 3, 2019

Right. When you integrate the lib in your own script, you can pass your bucket as an optional keyword argument, as shown by @ananthnagan above.

@EtienneBerube

This might work for a local setup, but if the code runs in a Docker container that runs pip install, the changes would be overwritten.
@ananthnagan's fix seems to read the bucket from an environment variable, which would be a good alternative. Would a PR for this be justified?
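
Reading the bucket from the environment could look roughly like this. The variable name `GVISION_BUCKET` and the helper `bucket_name_from_env` are hypothetical, chosen only for illustration; they are not taken from the fix or from invoice2data.

```python
import os


def bucket_name_from_env(default=None, var="GVISION_BUCKET"):
    """Return the bucket name from the environment, falling back to `default`."""
    # Treat an unset or blank variable as "not configured".
    name = os.environ.get(var, "").strip()
    if name:
        return name
    if default is None:
        raise RuntimeError("Set {} to the name of your results bucket".format(var))
    return default
```

The returned name could then be passed as the optional bucket keyword argument, so nothing inside the installed package needs editing and a fresh pip install in a container does not lose the setting.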

@EtienneBerube

A PR has been created for @ananthnagan's fix: #241

@bosd
Collaborator

bosd commented Oct 24, 2022

I'm starting to look into gvision.
As there are no instructions, can someone point me to the steps needed to make it work?
@rmilecki, have you looked into the gvision input module?

@rmilecki
Collaborator

@bosd: I have zero experience with OCR inputs
