Friday, 5 November 2010

XeTeX: Breaking long lines of Japanese (and Chinese) text.

While LaTeX/XeTeX is very good at hyphenating Latin words, I recently ran into overfull boxes when trying to typeset a long paragraph consisting exclusively of Japanese characters.
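A minimal sketch of a document that should reproduce the problem. The font name and the Japanese sentence are assumptions made for illustration (any installed font with Japanese glyphs and any long run of Japanese text without spaces should behave the same way); they are not the original test data.

```latex
% Compile with: xelatex example.tex
\documentclass{article}
\usepackage{fontspec}
% Assumption: IPAMincho is installed; substitute any font with Japanese glyphs.
\setmainfont{IPAMincho}
\begin{document}
% Placeholder text: one long run of Japanese characters with no spaces,
% so XeTeX finds no place to break the line by default.
これは改行を含まない長い日本語の段落の例です。これは改行を含まない長い日本語の段落の例です。これは改行を含まない長い日本語の段落の例です。
\end{document}
```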


When you render this with XeTeX, you get a warning that an hbox is overfull, and the rendered result shows an ugly, truncated Japanese line:

[Figure: Rendering of a non-hyphenable Japanese line]

We clearly need to fix this, because most of the time your data will come as one long paragraph without any line breaks.

In XeTeX-notes.pdf, we learn that we can activate line breaking in XeTeX by setting the \XeTeXlinebreaklocale primitive to a locale.


The locale used doesn't matter much; the important thing is that it activates line breaking for overlong lines where the hyphenation mechanism cannot do its job (a long Japanese or Chinese line, for instance).
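As a hedged sketch, the preamble setting described above could look like the following. The locale value "ja", the font name, and the optional \XeTeXlinebreakskip line are assumptions chosen for illustration, not necessarily what the original example used.

```latex
% Compile with: xelatex example.tex
\documentclass{article}
\usepackage{fontspec}
% Assumption: IPAMincho is installed; substitute any font with Japanese glyphs.
\setmainfont{IPAMincho}

% Enable locale-aware line breaking ("ja" is an assumed value).
\XeTeXlinebreaklocale "ja"
% Optional assumption: allow a little stretch at each break opportunity,
% which helps TeX justify lines made only of CJK characters.
\XeTeXlinebreakskip = 0pt plus 1pt

\begin{document}
% Placeholder text: a long run of Japanese characters with no spaces.
これは改行を含まない長い日本語の段落の例です。これは改行を含まない長い日本語の段落の例です。これは改行を含まない長い日本語の段落の例です。
\end{document}
```

Without the stretchable skip, lines can still break, but they may come out underfull or overfull by a small amount because there is no glue between characters to absorb the difference.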

By adding this to your preamble, here's what you get:

[Figure: A Japanese paragraph with line breaks]

Et voilà, the lines are broken so that all the characters fit within the page width!

PS: Thanks to Google News Japan for providing the example data.


  1. Thanks for this - my very ugly-looking bibliography has been rescued!

    1. Glad it helps! Took me a while to find this one :)