Split handling of HTML attributes & style CSS properties (#1211)
Lucas-C committed Jun 28, 2024
1 parent b959923 commit 3b6cd42
Showing 22 changed files with 858 additions and 341 deletions.
11 changes: 7 additions & 4 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,9 +24,9 @@ This can also be enabled programmatically with `warnings.simplefilter('default',
* support for quadratic and cubic Bézier curves with [`FPDF.bezier()`](https://py-pdf.github.io/fpdf2/fpdf/Shapes.html#fpdf.fpdf.FPDF.bezier) - thanks to @awmc000
* feature to identify the Unicode script of the input text and break it into fragments when different scripts are used, improving [text shaping](https://py-pdf.github.io/fpdf2/TextShaping.html) results
* [`FPDF.image()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.image): now handles `keep_aspect_ratio` in combination with an enum value provided to `x`
* file names are mentioned in errors when `fpdf2` fails to parse an SVG image
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): now supports CSS page break properties: [documentation](https://py-pdf.github.io/fpdf2/HTML.html#page-breaks)
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): spacing before lists can now be adjusted via the `HTML2FPDF.list_vertical_margin` attribute - thanks to @lcgeneralprojects
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): spacing before lists can now be adjusted via the `tag_styles` attribute - thanks to @lcgeneralprojects
* file names are mentioned in errors when `fpdf2` fails to parse an SVG image
### Fixed
* [`FPDF.local_context()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.local_context) used to leak styling during page breaks, when rendering `footer()` & `header()`
* [`fpdf.drawing.DeviceCMYK`](https://py-pdf.github.io/fpdf2/fpdf/drawing.html#fpdf.drawing.DeviceCMYK) objects can now be passed to [`FPDF.set_draw_color()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.set_draw_color), [`FPDF.set_fill_color()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.set_fill_color) and [`FPDF.set_text_color()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.set_text_color) without raising a `ValueError`: [documentation](https://py-pdf.github.io/fpdf2/Text.html#text-formatting).
Expand All @@ -38,10 +38,13 @@ This can also be enabled programmatically with `warnings.simplefilter('default',
* default values for `top_margin` and `bottom_margin` in `HTML2FPDF._new_paragraph()` calls are now correctly converted into chosen document units.
* In [text_columns()](https://py-pdf.github.io/fpdf2/extColumns.html), paragraph top/bottom margins didn't correctly trigger column breaks; [issue #1214](https://github.com/py-pdf/fpdf2/issues/1214)
### Removed
* an obscure and undocumented [feature](https://github.com/py-pdf/fpdf2/issues/1198) of [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html), which used to magically pass local variables as arguments.
* an obscure and undocumented [feature](https://github.com/py-pdf/fpdf2/issues/1198) of [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html), which used to magically pass instance attributes as arguments.
### Deprecated
* `fpdf.TitleStyle` has been renamed to `fpdf.TextStyle`
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): the `tag_indents` parameter, introduced in the last version, is deprecated - indentation can now be provided through the `tag_styles` parameter, using the `.l_margin` of `TextStyle` instances
### Changed
* [`FPDF.table()`](https://py-pdf.github.io/fpdf2/Tables.html) now raises an error when a single row is too high to be rendered on a single page
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): `tag_indents` can now be non-integer. Indentation of HTML elements is now independent of font size and bullet strings.
* [`FPDF.write_html()`](https://py-pdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): indentation of HTML elements can now be non-integer (float), and is now independent of font size and bullet strings.
* improved performance of font glyph selection by using functools cache

## [2.7.9] - 2024-05-17
Expand Down
24 changes: 19 additions & 5 deletions docs/HTML.md
Expand Up @@ -97,16 +97,16 @@ pdf.write_html("""
<p>Hello world!</p>
</section>
""", tag_styles={
"h1": FontFace(color=(148, 139, 139), size_pt=32),
"h2": FontFace(color=(148, 139, 139), size_pt=24),
"h1": FontFace(color="#948b8b", size_pt=32),
"h2": FontFace(color="#948b8b", size_pt=24),
})
pdf.output("html_styled.pdf")
```

Similarly, the indentation of several HTML tags (`<blockquote>`, `<dd>`, `<li>`) can be set globally, for the whole HTML document, by passing `tag_indents` to `FPDF.write_html()`:
Similarly, the indentation of several HTML tags (`<blockquote>`, `<dd>`, `<li>`) can be set globally, for the whole HTML document, by passing `tag_styles` to `FPDF.write_html()`:

```python
from fpdf import FPDF
from fpdf import FPDF, TextStyle

pdf = FPDF()
pdf.add_page()
Expand All @@ -115,10 +115,23 @@ pdf.write_html("""
<dt>Term</dt>
<dd>Definition</dd>
</dl>
""", tag_indents={"dd": 5})
<blockquote>
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed non risus.
Suspendisse lectus tortor, dignissim sit amet, adipiscing nec, ultricies sed, dolor.
Cras elementum ultrices diam.
</blockquote>
""", tag_styles={
"dd": TextStyle(l_margin=5),
"blockquote": TextStyle(color="#ccc", font_style="I",
t_margin=5, b_margin=5, l_margin=10),
})
pdf.output("html_dd_indented.pdf")
```

⚠️ Note that this styling is currently only supported for a subset of all HTML tags,
and that some [`FontFace`](https://py-pdf.github.io/fpdf2/fpdf/fonts.html#fpdf.fonts.FontFace) or [`TextStyle`](https://py-pdf.github.io/fpdf2/fpdf/fonts.html#fpdf.fonts.TextStyle) properties may not be honored.
However, **Pull Requests are welcome** to implement missing features!
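
Callers that still build a `tag_indents`-style mapping can convert it mechanically into the shape `tag_styles` expects. The following is only a sketch of that conversion: `Style` is a hypothetical stand-in for `fpdf.TextStyle` (only `color` and `l_margin` are modelled) and `indents_to_styles` is an illustrative helper, not a function provided by fpdf2.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Style:
    """Hypothetical stand-in for fpdf.TextStyle, reduced to two fields."""
    color: Optional[str] = None
    l_margin: float = 0


def indents_to_styles(tag_indents, tag_styles=None):
    """Fold a legacy {tag: indent} mapping into a {tag: Style} mapping.

    Existing styles are preserved; only their left margin is overridden.
    The input mappings are left untouched.
    """
    merged = dict(tag_styles or {})
    for tag, indent in tag_indents.items():
        existing = merged.get(tag, Style())
        merged[tag] = replace(existing, l_margin=indent)
    return merged


legacy_indents = {"dd": 5, "li": 7.5}  # non-integer indents are allowed
current_styles = {"li": Style(color="#ccc")}
styles = indents_to_styles(legacy_indents, current_styles)
```

With the real library, the resulting values would be `TextStyle` instances passed as `tag_styles=...` to `FPDF.write_html()`, as in the example above.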


## Supported HTML features

Expand All @@ -143,6 +156,7 @@ pdf.output("html_dd_indented.pdf")
* `<td>`: cells (with `align`, `bgcolor`, `width`, `rowspan`, `colspan` attributes)

### Page breaks

_New in [:octicons-tag-24: 2.7.10](https://github.com/py-pdf/fpdf2/blob/master/CHANGELOG.md)_

Page breaks can be triggered explicitly using the [break-before](https://developer.mozilla.org/en-US/docs/Web/CSS/break-before) or [break-after](https://developer.mozilla.org/en-US/docs/Web/CSS/break-after) CSS properties.
Expand Down
5 changes: 3 additions & 2 deletions fpdf/__init__.py
Expand Up @@ -9,7 +9,7 @@
* `fpdf.enums.YPos`
* `fpdf.errors.FPDFException`
* `fpdf.fonts.FontFace`
* `fpdf.fpdf.TitleStyle`
* `fpdf.fonts.TextStyle`
* `fpdf.prefs.ViewerPreferences`
* `fpdf.template.Template`
* `fpdf.template.FlexTemplate`
Expand All @@ -25,7 +25,7 @@
FPDF_FONT_DIR as _FPDF_FONT_DIR,
FPDF_VERSION as _FPDF_VERSION,
)
from .fonts import FontFace
from .fonts import FontFace, TextStyle
from .html import HTMLMixin, HTML2FPDF
from .prefs import ViewerPreferences
from .template import Template, FlexTemplate
Expand Down Expand Up @@ -74,6 +74,7 @@
"Template",
"FlexTemplate",
"TitleStyle",
"TextStyle",
"ViewerPreferences",
# Deprecated classes:
"HTMLMixin",
Expand Down
8 changes: 8 additions & 0 deletions fpdf/enums.py
Expand Up @@ -245,6 +245,14 @@ def style(self):
name for name, value in self.__class__.__members__.items() if value & self
)

def add(self, value: "TextEmphasis"):
return self | value

def remove(self, value: "TextEmphasis"):
return TextEmphasis.coerce(
"".join(s for s in self.style if s not in value.style)
)

@classmethod
def coerce(cls, value):
if isinstance(value, str):
Expand Down
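
The `add()` and `remove()` helpers added to `TextEmphasis` in the hunk above can be tried in isolation. The sketch below is a minimal stand-in, assuming three flags `B`, `I` and `U` and a simplified `coerce()` that only understands strings of flag letters or plain integers; the class name `Emphasis` is hypothetical, and the real enum in `fpdf/enums.py` may differ in both respects.

```python
from enum import IntFlag


class Emphasis(IntFlag):
    """Hypothetical stand-in for fpdf.enums.TextEmphasis."""
    B = 1  # bold
    I = 2  # italics
    U = 4  # underline

    @property
    def style(self):
        # Concatenate the names of all flags set on this value, e.g. "BI"
        return "".join(
            name for name, value in self.__class__.__members__.items() if value & self
        )

    def add(self, value):
        # Union of both sets of flags
        return self | value

    def remove(self, value):
        # Rebuild the emphasis from the letters that are not in `value`
        return Emphasis.coerce(
            "".join(s for s in self.style if s not in value.style)
        )

    @classmethod
    def coerce(cls, value):
        # Simplified assumption: accept a string of flag letters or an int
        if isinstance(value, str):
            result = cls(0)
            for letter in value.upper():
                result |= cls[letter]
            return result
        return cls(value)


bold_italic = Emphasis.B.add(Emphasis.I)
only_bold = bold_italic.remove(Emphasis.I)
unchanged = bold_italic.remove(Emphasis.U)  # removing an unset flag is a no-op
```

Both helpers return a new value rather than mutating the receiver, which is consistent with enum members being immutable.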
78 changes: 77 additions & 1 deletion fpdf/fonts.py
Expand Up @@ -7,7 +7,7 @@
in non-backward-compatible ways.
"""

import re
import re, warnings

from bisect import bisect_left
from collections import defaultdict
Expand All @@ -31,6 +31,7 @@ def __deepcopy__(self, _memo):
except ImportError:
hb = None

from .deprecation import get_stack_level
from .drawing import convert_to_device_color, DeviceGray, DeviceRGB
from .enums import FontDescriptorFlags, TextEmphasis
from .syntax import Name, PDFObject
Expand Down Expand Up @@ -109,6 +110,81 @@ def combine(default_style, override_style):
)


class TextStyle(FontFace):
"""
Subclass of `FontFace` that also allows specifying vertical & horizontal spacing
"""

def __init__(
self,
font_family: Optional[str] = None, # None means "no override"
# Whereas "" means "no emphasis"
font_style: Optional[str] = None,
font_size_pt: Optional[int] = None,
color: Union[int, tuple] = None, # grey scale or (red, green, blue),
fill_color: Union[int, tuple] = None, # grey scale or (red, green, blue),
underline: bool = False,
t_margin: Optional[int] = None,
l_margin: Optional[int] = None,
b_margin: Optional[int] = None,
):
super().__init__(
font_family,
((font_style or "") + "U") if underline else font_style,
font_size_pt,
color,
fill_color,
)
self.t_margin = t_margin or 0
self.l_margin = l_margin or 0
self.b_margin = b_margin or 0

def __repr__(self):
return (
super().__repr__()[:-1]
+ f", t_margin={self.t_margin}, l_margin={self.l_margin}, b_margin={self.b_margin})"
)

def replace(
self,
/,
font_family=None,
emphasis=None,
font_size_pt=None,
color=None,
fill_color=None,
t_margin=None,
l_margin=None,
b_margin=None,
):
return TextStyle(
font_family=font_family or self.family,
font_style=self.emphasis if emphasis is None else emphasis.style,
font_size_pt=font_size_pt or self.size_pt,
color=color or self.color,
fill_color=fill_color or self.fill_color,
t_margin=self.t_margin if t_margin is None else t_margin,
l_margin=self.l_margin if l_margin is None else l_margin,
b_margin=self.b_margin if b_margin is None else b_margin,
)
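
The margin handling in `TextStyle` relies on two different fallback idioms: `or 0` in the constructor and `is None` in `replace()`. The sketch below isolates just that behaviour; `MarginStyle` is a hypothetical class that keeps only the three margin fields and is not part of fpdf2.

```python
from typing import Optional


class MarginStyle:
    """Hypothetical stand-in for TextStyle, keeping only the three margins."""

    def __init__(
        self,
        t_margin: Optional[float] = None,
        l_margin: Optional[float] = None,
        b_margin: Optional[float] = None,
    ):
        # As in the diff: an omitted margin is stored as 0, never as None
        self.t_margin = t_margin or 0
        self.l_margin = l_margin or 0
        self.b_margin = b_margin or 0

    def replace(self, t_margin=None, l_margin=None, b_margin=None):
        # "is None" rather than "or": an explicit 0 must be able to
        # override a non-zero margin
        return MarginStyle(
            t_margin=self.t_margin if t_margin is None else t_margin,
            l_margin=self.l_margin if l_margin is None else l_margin,
            b_margin=self.b_margin if b_margin is None else b_margin,
        )


base = MarginStyle(l_margin=10)
flush = base.replace(l_margin=0)   # explicit 0 wins over the inherited 10
padded = base.replace(t_margin=5)  # untouched margins are carried over
```

By contrast, `font_family`, `font_size_pt`, `color` and `fill_color` are combined with `or` in the `replace()` method shown above, so for those fields a falsy override falls back to the current value.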


class TitleStyle(TextStyle):
def __init__(self, *args, **kwargs):
warnings.warn(
(
"fpdf.TitleStyle is deprecated since 2.7.10."
" It has been replaced by fpdf.TextStyle."
),
DeprecationWarning,
stacklevel=get_stack_level(),
)
super().__init__(*args, **kwargs)


__pdoc__ = {"TitleStyle": False} # Replaced by TextStyle
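
The renaming above follows a common deprecation pattern: the old name survives as a thin subclass that warns on instantiation. A self-contained sketch of that pattern follows, with hypothetical class names `NewStyle` and `OldStyle`, and a fixed `stacklevel=2` assumed in place of the `get_stack_level()` helper used in the diff.

```python
import warnings


class NewStyle:
    """Hypothetical replacement class."""

    def __init__(self, l_margin=0):
        self.l_margin = l_margin


class OldStyle(NewStyle):
    """Deprecated alias: behaves like NewStyle but warns when instantiated."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "OldStyle is deprecated. It has been replaced by NewStyle.",
            DeprecationWarning,
            stacklevel=2,  # assumption: attribute the warning to the direct caller
        )
        super().__init__(*args, **kwargs)


# Record warnings instead of printing them, so they can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = OldStyle(l_margin=5)
```

Because the deprecated class subclasses its replacement, an `isinstance(level, TextStyle)` check such as the one in `set_section_title_styles()` later in this commit keeps accepting instances created through the old name.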


class CoreFont:
# RAM usage optimization:
__slots__ = ("i", "type", "name", "up", "ut", "cw", "fontkey", "emphasis")
Expand Down
93 changes: 31 additions & 62 deletions fpdf/fpdf.py
Expand Up @@ -85,7 +85,7 @@ class Image:
YPos,
)
from .errors import FPDFException, FPDFPageFormatException, FPDFUnicodeEncodingException
from .fonts import CoreFont, CORE_FONTS, FontFace, TTFFont
from .fonts import CoreFont, CORE_FONTS, FontFace, TextStyle, TitleStyle, TTFFont
from .graphics_state import GraphicsStateMixin
from .html import HTML2FPDF
from .image_datastructures import (
Expand Down Expand Up @@ -142,36 +142,6 @@ class Image:
}


class TitleStyle(FontFace):
def __init__(
self,
font_family: Optional[str] = None, # None means "no override"
# Whereas "" means "no emphasis"
font_style: Optional[str] = None,
font_size_pt: Optional[int] = None,
color: Union[int, tuple] = None, # grey scale or (red, green, blue),
underline: bool = False,
t_margin: Optional[int] = None,
l_margin: Optional[int] = None,
b_margin: Optional[int] = None,
):
super().__init__(
font_family,
((font_style or "") + "U") if underline else font_style,
font_size_pt,
color,
)
self.t_margin = t_margin
self.l_margin = l_margin
self.b_margin = b_margin

def __repr__(self):
return (
super().__repr__()[:-1]
+ f", t_margin={self.t_margin}, l_margin={self.l_margin}, b_margin={self.b_margin})"
)


class ToCPlaceholder(NamedTuple):
render_function: Callable
start_page: int
Expand Down Expand Up @@ -307,7 +277,7 @@ def __init__(
self._toc_placeholder = None # optional ToCPlaceholder instance
self._outline = [] # list of OutlineSection
self._sign_key = None
self.section_title_styles = {} # level -> TitleStyle
self.section_title_styles = {} # level -> TextStyle

self.core_fonts_encoding = "latin-1"
"Font encoding, Latin-1 by default"
Expand Down Expand Up @@ -413,25 +383,24 @@ def write_html(self, text, *args, **kwargs):
Args:
text (str): HTML content to render
image_map (function): an optional one-argument function that maps <img> "src"
to new image URLs
li_tag_indent (int): [**DEPRECATED since v2.7.8**]
numeric indentation of <li> elements - Set tag_indents instead
dd_tag_indent (int): [**DEPRECATED since v2.7.8**]
numeric indentation of <dd> elements - Set tag_indents instead
table_line_separators (bool): enable horizontal line separators in <table>
ul_bullet_char (str): bullet character preceding <li> items in <ul> lists.
li_prefix_color (tuple | str | drawing.Device* instance):
color for bullets or numbers preceding <li> tags.
This applies to both <ul> & <ol> lists.
heading_sizes (dict): [**DEPRECATED since v2.7.8**]
font size per heading level names ("h1", "h2"...) - Set tag_styles instead
pre_code_font (str): [**DEPRECATED since v2.7.8**]
font to use for <pre> & <code> blocks - Set tag_styles instead
warn_on_tags_not_matching (bool): control warnings production for unmatched HTML tags
tag_indents (dict):
mapping of HTML tag names to numeric values representing their horizontal left indentation
tag_styles (dict): mapping of HTML tag names to colors
image_map (function): an optional one-argument function that maps `<img>` "src" to new image URLs
li_tag_indent (int): [**DEPRECATED since v2.7.9**]
numeric indentation of `<li>` elements - Set `tag_styles` instead
dd_tag_indent (int): [**DEPRECATED since v2.7.9**]
numeric indentation of `<dd>` elements - Set `tag_styles` instead
table_line_separators (bool): enable horizontal line separators in `<table>`. Defaults to `False`.
ul_bullet_char (str): bullet character preceding `<li>` items in `<ul>` lists.
Can also be configured using the HTML `type` attribute of `<ul>` tags.
li_prefix_color (tuple, str, fpdf.drawing.DeviceCMYK, fpdf.drawing.DeviceGray, fpdf.drawing.DeviceRGB): color for bullets
or numbers preceding `<li>` tags. This applies to both `<ul>` & `<ol>` lists.
heading_sizes (dict): [**DEPRECATED since v2.7.9**]
font size per heading level names ("h1", "h2"...) - Set `tag_styles` instead
pre_code_font (str): [**DEPRECATED since v2.7.9**]
font to use for `<pre>` & `<code>` blocks - Set `tag_styles` instead
warn_on_tags_not_matching (bool): control warnings production for unmatched HTML tags. Defaults to `True`.
tag_indents (dict): [**DEPRECATED since v2.7.10**]
mapping of HTML tag names to numeric values representing their horizontal left indentation - Set `tag_styles` instead
tag_styles (dict[str, fpdf.fonts.TextStyle]): mapping of HTML tag names to `fpdf.TextStyle` or `fpdf.FontFace` instances
"""
html2pdf = self.HTML2FPDF_CLASS(self, *args, **kwargs)
with self.local_context():
Expand Down Expand Up @@ -5033,18 +5002,18 @@ def set_section_title_styles(
After calling this method, calls to `FPDF.start_section` will render section names visually.
Args:
level0 (TitleStyle): style for the top level section titles
level1 (TitleStyle): optional style for the level 1 section titles
level2 (TitleStyle): optional style for the level 2 section titles
level3 (TitleStyle): optional style for the level 3 section titles
level4 (TitleStyle): optional style for the level 4 section titles
level5 (TitleStyle): optional style for the level 5 section titles
level6 (TitleStyle): optional style for the level 6 section titles
level0 (TextStyle): style for the top level section titles
level1 (TextStyle): optional style for the level 1 section titles
level2 (TextStyle): optional style for the level 2 section titles
level3 (TextStyle): optional style for the level 3 section titles
level4 (TextStyle): optional style for the level 4 section titles
level5 (TextStyle): optional style for the level 5 section titles
level6 (TextStyle): optional style for the level 6 section titles
"""
for level in (level0, level1, level2, level3, level4, level5, level6):
if level and not isinstance(level, TitleStyle):
if level and not isinstance(level, TextStyle):
raise TypeError(
f"Arguments must all be TitleStyle instances, got: {type(level)}"
f"Arguments must all be TextStyle instances, got: {type(level)}"
)
self.section_title_styles = {
0: level0,
Expand Down Expand Up @@ -5115,7 +5084,7 @@ def start_section(self, name, level=0, strict=True):
)

@contextmanager
def _use_title_style(self, title_style: TitleStyle):
def _use_title_style(self, title_style: TextStyle):
if title_style:
if title_style.t_margin:
self.ln(title_style.t_margin)
Expand Down Expand Up @@ -5177,7 +5146,7 @@ def table(self, *args, **kwargs):
relative to the page, when it's not using the full page width.
borders_layout (str, fpdf.enums.TableBordersLayout): optional, default to ALL. Control what cell
borders are drawn.
cell_fill_color (int, tuple, fpdf.drawing.DeviceGray, fpdf.drawing.DeviceRGB): optional.
cell_fill_color (int, tuple, fpdf.drawing.DeviceCMYK, fpdf.drawing.DeviceGray, fpdf.drawing.DeviceRGB): optional.
Defines the cells background color.
cell_fill_mode (str, fpdf.enums.TableCellFillMode): optional. Defines which cells are filled
with color in the background.
Expand Down