TUG 2024 — Ondřej Sojka — Expanding hyphenation patterns across Slavic languages

So far, TeX hyphenation patterns have been developed separately for each language, even for closely related ones, which splits scarce human resources. As languages evolve, and especially as English terms creep into formerly monolingual texts, hyphenation patterns are due for an update. This applies particularly to low-resource languages, which often lack quality generated patterns.

In this article, I explore the possibilities for transfer learning of hyphenation rules between related Slavic languages. I present new hyphenation patterns for multiple Slavic languages, developed using transfer learning from various sources.