templates: pl: add templates for InsERT's software issued invoices
InsERT is one of the most popular Polish accounting software companies.
It has two very common products:
1. Subiekt nexo
2. Subiekt GT

Add two templates to parse invoices generated by the above programs.
These templates are software-specific, so their priority is set to 3.

This allows parsing a lot of invoices issued in Poland.

Signed-off-by: Rafał Miłecki <[email protected]>
Rafał Miłecki authored and bosd committed Feb 25, 2023
1 parent f21e55a commit 7bed841
Showing 3 changed files with 77 additions and 0 deletions.
43 changes: 43 additions & 0 deletions src/invoice2data/extract/templates/pl/pl.insert.subiekt-gt.yml
@@ -0,0 +1,43 @@
# SPDX-License-Identifier: MIT
keywords:
  - 'Miejsce wystawienia:'
  - 'Data wystawienia:'
  - 'Sprzedawca:'
  - 'Nabywca:'
  - 'według stawki VAT'
  - 'Razem do zapłaty:'
  - 'Wystawił\(a\)'
  - 'Odebrał\(a\)'
  - 'Podpis osoby upoważnionej'
fields:
  issuer:
    parser: regex
    regex: Sprzedawca:.*\n(.*?)\s{3,}
  vatin:
    parser: regex
    regex: NIP:\s+(\d{10})
    type: int
    group: first
  date:
    parser: regex
    regex:
      - Data wystawienia:\n.*(\d{2}\.\d{2}\.\d{4})
      - Data wystawienia:\n.*(\d{4}-\d{2}-\d{2})
    type: date
  invoice_number:
    parser: regex
    regex: Faktura VAT\s+(.*?)\s+oryginał
  amount:
    parser: regex
    regex: Razem do zapłaty:\s+([\d\s]+,[\d][\d])
    type: float
  nrb:
    parser: regex
    regex: PLN:\s+([0-9]{2}(?:\s?[0-9]{4}){6})
options:
  currency: PLN
  date_formats:
    - '%d.%m.%Y'
    - '%Y-%m-%d'
  decimal_separator: ','
priority: 3
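As an illustration of what the `amount` and `nrb` patterns in the Subiekt GT template capture, here is a minimal sketch using only Python's standard `re` module. The sample text and the `parse_amount` helper are hypothetical and written for this example; this is not how invoice2data itself converts values, only an approximation of the effect of `decimal_separator: ','`.

```python
import re

# Hypothetical text fragment; real Subiekt GT output may be laid out differently.
sample = (
    "Razem do zapłaty:      1 234,56 PLN\n"
    "PLN: 12 3456 7890 1234 5678 9012 3456\n"
)

# Patterns copied verbatim from the template above.
AMOUNT_RE = re.compile(r"Razem do zapłaty:\s+([\d\s]+,[\d][\d])")
NRB_RE = re.compile(r"PLN:\s+([0-9]{2}(?:\s?[0-9]{4}){6})")


def parse_amount(raw: str, decimal_separator: str = ",") -> float:
    """Hypothetical helper: strip whitespace used as a thousands
    separator, then swap the decimal separator for a dot."""
    cleaned = re.sub(r"\s", "", raw).replace(decimal_separator, ".")
    return float(cleaned)


amount = parse_amount(AMOUNT_RE.search(sample).group(1))
nrb = NRB_RE.search(sample).group(1)

print(amount)
print(nrb)
```

The `nrb` pattern expects two check digits followed by six groups of four digits, which adds up to the 26 digits of a Polish bank account number.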
33 changes: 33 additions & 0 deletions src/invoice2data/extract/templates/pl/pl.insert.subiekt-nexo.yml
@@ -0,0 +1,33 @@
# SPDX-License-Identifier: MIT
keywords:
  - 'InsERT nexo'
fields:
  issuer:
    parser: regex
    regex: Sprzedawca.*\n(.*?)\s{3,}
  vatin:
    parser: regex
    regex: NIP:\s+(\d{10})
    type: int
    group: first
  date:
    parser: regex
    regex: Data wystawienia\s+(\d{2}-\d{2}-\d{4})
    type: date
  invoice_number:
    parser: regex
    regex: Faktura VAT sprzedaży\s+(.*)
    group: first
  amount:
    parser: regex
    regex: Razem do zapłaty:\s+([\d\s]+,[\d][\d])
    type: float
  nrb:
    parser: regex
    regex: PL\s+([0-9]{2}(?:\s?[0-9]{4}){6})
options:
  currency: PLN
  date_formats:
    - '%d-%m-%Y'
  decimal_separator: ','
priority: 3
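The `date` and `invoice_number` patterns of the Subiekt nexo template can be tried out the same way. The sketch below is an approximation only: it uses `datetime.strptime` from the standard library rather than whatever date handling invoice2data applies internally, and the sample text and `parse_date` helper are hypothetical.

```python
import re
from datetime import datetime

# Hypothetical text fragment; real Subiekt nexo output may differ.
sample = (
    "Faktura VAT sprzedaży   FS 12/02/2023\n"
    "Data wystawienia   25-02-2023\n"
)

# Patterns and date format copied verbatim from the template above.
DATE_RE = re.compile(r"Data wystawienia\s+(\d{2}-\d{2}-\d{4})")
NUMBER_RE = re.compile(r"Faktura VAT sprzedaży\s+(.*)")
DATE_FORMATS = ["%d-%m-%Y"]


def parse_date(raw, formats):
    """Hypothetical helper: try each format in turn and return the
    first successful parse, or None if no format fits."""
    for fmt in formats:
        try:
            return datetime.strptime(raw, fmt)
        except ValueError:
            continue
    return None


issued = parse_date(DATE_RE.search(sample).group(1), DATE_FORMATS)
number = NUMBER_RE.search(sample).group(1)

print(issued)
print(number)
```

Because `.` does not match a newline by default, the greedy `(.*)` in the invoice number pattern stops at the end of the line.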
1 change: 1 addition & 0 deletions src/invoice2data/main.py
@@ -107,6 +107,7 @@ def extract_data(invoicefile, templates=None, input_module=None):
optimized_str = t.prepare_input(extracted_str)
return t.extract(optimized_str, invoicefile, input_module)


def create_parser():
"""Returns argument parser """
