Is there a way to extract a table from a PDF to CSV? #1033

Open
martynjlewis opened this issue Aug 12, 2024 · 1 comment

Comments

@martynjlewis

Hi all

I have tried using pdfminer.six to extract a table from a PDF into a CSV file for use in Excel, but have been unsuccessful so far. I either get each entry on a separate line, or I get each heading followed by its corresponding cell, with the pairs running vertically rather than horizontally.
I've attached the PDF I created as a test, along with the resulting output.

Can anyone help please?

test3.pdf
test3.csv

@devwasabi2

devwasabi2 commented Aug 24, 2024

I've come across some nice Paddle models that can extract tables from PDFs and save them to a .csv file. Follow this link: paddle. I hope it helps ;)
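
For anyone who would rather stay within pdfminer.six, here is a minimal sketch of one possible approach, not a confirmed fix for test3.pdf. The "one entry per line" output usually appears because text is emitted in reading order with no notion of rows, so the idea is to read each text line's position and group lines that share roughly the same vertical coordinate into one CSV row. The helper names (`fragments_from_pdf`, `group_rows`, `rows_to_csv`) and the `y_tolerance` value are made up for illustration, and the sketch assumes a single-page table with no empty cells.

```python
import csv
import io
from typing import Iterable, List, Tuple

# One positioned piece of text: (left x, vertical centre, text).
Fragment = Tuple[float, float, str]


def fragments_from_pdf(path: str) -> List[Fragment]:
    """Collect position and text for every text line pdfminer.six finds."""
    # Imported here so the grouping helpers below work without pdfminer.six.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer, LTTextLine

    fragments: List[Fragment] = []
    # Assumption: the table sits on a single page, so y values are comparable.
    for page_layout in extract_pages(path):
        for element in page_layout:
            if not isinstance(element, LTTextContainer):
                continue
            for line in element:
                if isinstance(line, LTTextLine):
                    text = line.get_text().strip()
                    if text:
                        y_mid = (line.y0 + line.y1) / 2
                        fragments.append((line.x0, y_mid, text))
    return fragments


def group_rows(fragments: Iterable[Fragment],
               y_tolerance: float = 3.0) -> List[List[str]]:
    """Group fragments whose vertical centres are close into table rows."""
    rows: list = []  # each entry: [reference y, [(x, text), ...]]
    # PDF y grows upwards, so sort descending to go top to bottom.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if rows and abs(rows[-1][0] - y) <= y_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append([y, [(x, text)]])
    # Within each row, order the cells left to right.
    return [[text for _, text in sorted(cells)] for _, cells in rows]


def rows_to_csv(rows: List[List[str]]) -> str:
    """Serialise rows to CSV text that Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv(group_rows(fragments_from_pdf("test3.pdf")))` would then give the CSV text to write to a file. Because cells are only ordered left to right within a row, an empty cell shifts the remaining values one column to the left; tables with gaps would need column positions inferred from the header row instead.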
