Skip to content

Fix for set_fallback_fonts() when markdown=True #712

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 16, 2023

Conversation

Lucas-C
Copy link
Member

@Lucas-C Lucas-C commented Mar 1, 2023

As a follow-up of #669,
fix the rendering issue visible in this test reference file:
https://github.com/PyFPDF/fpdf2/blob/89c528d/test/fonts/font_fallback.pdf

@Lucas-C Lucas-C mentioned this pull request Mar 1, 2023
5 tasks
@Lucas-C Lucas-C force-pushed the font-fallback-markdown branch 2 times, most recently from 07f4e0b to efc5b67 Compare March 1, 2023 08:01
@codecov
Copy link

codecov bot commented Mar 1, 2023

Codecov Report

Patch coverage: 69.81% and project coverage change: -0.22 ⚠️

Comparison is base (f8de21c) 93.75% compared to head (602fbd9) 93.53%.

❗ Current head 602fbd9 differs from pull request most recent head c5284a9. Consider uploading reports for the commit c5284a9 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #712      +/-   ##
==========================================
- Coverage   93.75%   93.53%   -0.22%     
==========================================
  Files          26       26              
  Lines        7009     7051      +42     
  Branches     1258     1269      +11     
==========================================
+ Hits         6571     6595      +24     
- Misses        260      271      +11     
- Partials      178      185       +7     
Impacted Files Coverage Δ
fpdf/enums.py 95.51% <57.89%> (-4.49%) ⬇️
fpdf/fpdf.py 91.83% <100.00%> (-0.09%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@Lucas-C
Copy link
Member Author

Lucas-C commented Mar 3, 2023

Quoting #669 (comment) :

What about introducing a new ignore_style optional argument to set_font_fallback()?
It would be False by default, and you original behaviour would be implemented, but settng it to True
would allow bold/italics text to be replaced by a non-bold fallback font.

I introduced ignore_style in my latest commit on this PR, that should address both of your commens @andersonhc & @gmischler

@Lucas-C Lucas-C force-pushed the font-fallback-markdown branch from 3ebebea to 485efd2 Compare March 3, 2023 11:08
@gmischler
Copy link

I introduced ignore_style in my latest commit on this PR, that should address both of your commens @andersonhc & @gmischler

What other possible selection criteria to find the best replacement font can you think of?

Let's say someone later wants to add the possibility to match grotesk/serif fonts with each other. Is it better to introduce another match_serifs=False|True|"closest" argument then, or could both be combined in a general "search_strategy" argument?
(I'm not sure if the serif/sans-serif distinction is listed in the font file, but just take this as a placeholder for a characteristic that can actually be found there.)

@andersonhc
Copy link
Collaborator

I introduced ignore_style in my latest commit on this PR, that should address both of your commens @andersonhc & @gmischler

What other possible selection criteria to find the best replacement font can you think of?

Let's say someone later wants to add the possibility to match grotesk/serif fonts with each other. Is it better to introduce another match_serifs=False|True|"closest" argument then, or could both be combined in a general "search_strategy" argument? (I'm not sure if the serif/sans-serif distinction is listed in the font file, but just take this as a placeholder for a characteristic that can actually be found there.)

My idea was preserving the order the user specify the fonts and go with the first match.

@gmischler
Copy link

My idea was preserving the order the user specify the fonts and go with the first match.

So when you have original text in both eg. regular and bold style, then you don't want to take that distinction into account in the replacement font seleced?

@Lucas-C
Copy link
Member Author

Lucas-C commented Mar 3, 2023

Let's say someone later wants to add the possibility to match grotesk/serif fonts with each other. Is it better to introduce another match_serifs=False|True|"closest" argument then, or could both be combined in a general "search_strategy" argument?

Currently fpdf2 does not make use of such font charactertics, whereas it does manage font style (bold/italics) through fonts swapping.
This is why I'm not sure to see the point of a more generic mechanism, as it would only apply hypothetically to future usages.
But it you have a practical alternative to suggest in that case, I'd be curious to know it 😊
Maybe an option could be to make get_fallback_font() public and let users override it in child classes?

Also, as mentioned in #637, for complex "fallback" logic it is probably best to craft your own font files combining characters from several fonts.

@gmischler
Copy link

This is why I'm not sure to see the point of a more generic mechanism, as it would only apply hypothetically to future usages.
But it you have a practical alternative to suggest in that case, I'd be curious to know it 😊

This is not about implementing hypothetical features now.

It's about considering features that might possibly become interesting over time, and trying to avoid API choices tha would make adding things later unnecessarily complicated. There are many aspect to any choice of font, and most of them are probably orthogonal to each other. So I think that using a dedicated argument for style criteria (regular/bold/italic) is ok.

However, I'm not sure if a simple "apply/ignore style" distinction is good enough in the long run. I can think of three possible strategies:

  • exact match: Use a font with matching style, drop character if none available.
  • best match: Search for a font with matching style, otherwise use a default (probably first match in list).
  • first match: Ignore style and use the first match in list.

That would give us an argument like: select_style="EXACT"|"BEST"|"IGNORE"
"BEST" shouldn't be complicated to implement. But if you don't want to do that now, simply leave it away, then it can be added later.

@Lucas-C
Copy link
Member Author

Lucas-C commented Mar 6, 2023

Interesting approach.
Thank you for the practical suggestion @gmischler!
In fact, I started implementing it and then realized that would still be very limited.

On a complex document, you may have a scenario like this :

  • text sections in Roboto, with fallback to DejaVuSans
  • other text sections in DroidSans, with fallback to NotoSansSymbols

The select_style="EXACT"|"BEST"|"IGNORE" approach won't be able to cover this case,
or any other case where the fallback font must depend on the current font family.

@gmischler: you did not mention my suggestion of letting users override get_fallback_font().
What is your take on this idea?

To me it currently seems like the best solution, as it would allow maximum flexibility to end users,
while keeping the fpdf2 API very simple, without having to implement a "best match" strategy (that would in fact be a disappointingly basic rule).
We could still have clear documentation, with an example of how to define get_fallback_font() to have a "best match"

@gmischler
Copy link

To me it currently seems like the best solution, as it would allow maximum flexibility to end users,
while keeping the fpdf2 API very simple, without having to implement a "best match" strategy (that would in fact be a disappointingly basic rule).
We could still have clear documentation, with an example of how to define get_fallback_font() to have a "best match"

The ability to override get_fallback_font() is of course a good idea, I must have overlooked that. It gives the ultimate flexibility to implement whatever advanced typographical rules a user might come up with.

But I don't think that is a good excuse for unnecessarily limiting the built-in functionality. The "best match" strategy I have outlined is trivial to understand and implement. It will cover every situation where the goal is to display as many characters as close to the desired style as possible. Replacing a font with the wrong face type (regular/bold/italic) is a real issue, and those three strategies offer a complete and flexible solution to that. Just because it doesn't also match font families is not a valid argument against solving the problem id does solve.

untested concept:

    def get_fallback_font(self, char, style="", _font_style_pat = re.compile("[BI]*$")):
        first_match = None
        for fallback_id in self._fallback_font_ids:
            if ord(char) in self.fonts[fallback_id]["cmap"]:
                if self._fallback_font_style_strategy == StyleStrategy.IGNORE:
                    return fallback_id  # first match, style ignored
                if _font_style_pat.search(fallback_id).match == style:
                    return fallback_id  # exact match
                if not first_match:
                    first_match = fallback_id
        if self._fallback_font_style_strategy == StyleStrategy.BEST:
            return first_match  # best match (or None)
        return None  # no (good enough) match

The iterative approach is probably more efficient here than juggling with several levels of iterators, and definitively more readable.
Note that your if self.fonts[font]["fontkey"].endswith(style) is actually a bug, because if style is "", it will match every string. Maybe we should add a seperate "style" attribute to our font data, so that we don't need to do pattern matching in cases like this.

@Lucas-C Lucas-C force-pushed the font-fallback-markdown branch from 1660fa4 to 764d7c1 Compare March 7, 2023 11:44
@Lucas-C
Copy link
Member Author

Lucas-C commented Mar 7, 2023

@gmischler I have added a commit to this PR implementing what we discussed and adding 2 extra unit tests.
Could you please review this new version?

In the end I went with a boolean exact_match flag, as I really did not see the point of the IGNORE enum value.
With this value removed, there was really no much benefit in introducing an enum with only 2 possible values,
so I switched to a simple boolean value, with proper documentation.

@Lucas-C Lucas-C force-pushed the font-fallback-markdown branch 6 times, most recently from b5978af to c5284a9 Compare March 8, 2023 21:32
@Lucas-C
Copy link
Member Author

Lucas-C commented Mar 12, 2023

Hi @gmischler, @andersonhc

I'm planning on merging this PR next week.
I would welcome your feedbacks if you any remaining remarks!
If need be, we will still be able to amend the code in later PRs 😊

@Lucas-C Lucas-C force-pushed the font-fallback-markdown branch from c5284a9 to 44ad330 Compare March 16, 2023 14:59
@Lucas-C Lucas-C merged commit 6a73797 into master Mar 16, 2023
@Lucas-C Lucas-C deleted the font-fallback-markdown branch March 16, 2023 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants