Word-breaking solution with SEO & UX in mind, without JavaScript

There are JavaScript and CSS solutions out there to break words. However, they often don't break words in an organic way. Today, I am sharing a piece of code which I use to break words the way I want, maintaining SEO value at the same time.

Having built my own CMS, I have run into challenges across several disciplines, from page speed to accessibility, but also the user experience of admins. CMS users don't want to be bothered with testing every newly written article on a mobile device to determine which words should be split or hyphenated. At the same time, if the browser window is wide enough, words should just stick together.

I ran into this topic today on LinkedIn and felt like sharing my approach. It is no rocket science, but it is very suitable for CMS solutions and even works retroactively.

Word breaking for your CMS users

To make life easier for CMS users, I created a text field where users can fill in parts of long words, or in other words: their syllables. For example, any part that often appears in longer compound words, such as -agreement and -statement. Users indicate where hyphenation should take place in case a word is too long to fit the screen, and include a regular expression to indicate the position where other syllables might appear:

  • pseudo-(\w)
  • hypo-(\w)
  • para-(\w)
  • (\w)-statement
  • (\w)-agreement
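
How such a text field is read is not shown in this article. As a minimal sketch, assuming the field stores one rule per line and uses the regular-expression class `\w` as the placeholder, the rules could be turned into an array like this. The function name `parseHyphenationRules` and the one-rule-per-line format are assumptions for illustration only.

```php
<?php
// Hypothetical helper: converts the raw text-field value into a list of rules.
// Assumption: one rule per line, with "-" marking the hyphenation point.
function parseHyphenationRules(string $fieldValue): array
{
    $rules = array();
    // \R matches any newline sequence (\n, \r\n or \r).
    foreach (preg_split('/\R/', $fieldValue) as $line) {
        $line = trim($line);
        // Skip empty lines and lines without a hyphenation marker.
        if ($line === '' || strpos($line, '-') === false) {
            continue;
        }
        $rules[] = $line;
    }
    return $rules;
}

// Example input as it might come from the text field.
$words = parseHyphenationRules("pseudo-(\\w)\nhypo-(\\w)\n\n  (\\w)-statement  ");
print_r($words);
```

Skipping lines without a marker keeps a stray entry from turning into a pattern that matches but never inserts a soft hyphen.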

Pseudopseudo­hypo­para­thyroidism

Pseudo-what? This is a genetic medical condition, but that's not really important for this article. The most important part is that this word will be too long for small screens and devices. And its abbreviation equals the language I used to fix undesired word breaking, without having to use additional CSS or JavaScript. Correct, I am talking about PHP.

Above you can see that I already indicated where hyphenation should take place, at the positions I think are best. This way, there is no chance that a word is hyphenated incorrectly.

Word breaking on small screens and mobile devices

Next, I use the following code to loop through all the words and prepare the search and replace keys for a regular expression, to finally replace all occurrences.

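// Each entry in $words looks like 'pseudo-(\w)' or '(\w)-statement'.
$longWords = array();
foreach ( $words as $word ) {
    // Key: the regular expression without the hyphen marker, e.g. '/pseudo(\w)/'.
    // Value: the replacement with a soft hyphen and a backreference, e.g. 'pseudo&shy;$1'.
    $longWords[ '/' . str_replace( '-', '', $word ) . '/' ] = str_replace( array( '-', '(\w)' ), array( '&shy;', '$1' ), $word );
}
$content = preg_replace( array_keys( $longWords ), array_values( $longWords ), $content );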
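
To see the effect on the example word, here is a small self-contained sketch. It assumes the rules are written with `(\w)` as the placeholder and that the soft hyphen is inserted as the `&shy;` entity. The wrapper name `applySoftHyphens` is hypothetical and only used here for illustration.

```php
<?php
// Hypothetical wrapper around the search-and-replace step.
function applySoftHyphens(string $content, array $words): string
{
    $longWords = array();
    foreach ($words as $word) {
        // 'pseudo-(\w)' becomes the pattern '/pseudo(\w)/' ...
        $pattern = '/' . str_replace('-', '', $word) . '/';
        // ... and the replacement 'pseudo&shy;$1'.
        $longWords[$pattern] = str_replace(array('-', '(\w)'), array('&shy;', '$1'), $word);
    }
    // Fall back to the untouched content if the regex engine reports an error.
    return preg_replace(array_keys($longWords), array_values($longWords), $content) ?? $content;
}

$words = array('pseudo-(\w)', 'hypo-(\w)', 'para-(\w)', '(\w)-statement', '(\w)-agreement');
echo applySoftHyphens('<p>Pseudopseudohypoparathyroidism</p>', $words), "\n";
// <p>Pseudopseudo&shy;hypo&shy;para&shy;thyroidism</p>
```

Note that the patterns are case-sensitive in this sketch, so the capitalised "Pseudo" at the start of the word is left alone and only the second "pseudo" gets a soft hyphen.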

Another advantage of this approach: words will only break when they are too long to fit the screen. So, chances are words might not break on desktop devices, for example. Obviously, this also depends on the position of the word within a sentence.

To test this, here is the same long word one more time, so you can check the outcome of hyphenation by resizing your browser's window: Pseudopseudo­hypo­para­thyroidism.

Reverting word breaking within hyperlinks

Obviously, hyperlinks and any other values which aren't displayed on the screen right away don't need word breaking. Worse, a soft hyphen inside a URL would break internal linking when long words are part of the hyperlink. This basically means that long words in attribute values should not get any word breaking.

I just fix this by executing the following PHP code, to roll back any hyphenation which took place within attribute values:

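// Remove soft hyphens again from single-quoted json attributes and from double-quoted attribute values.
$content = preg_replace_callback( array( '/json=\'([^\']*)\'/', '/="([^"]*)"/' ), function ( $matches ) {
    return str_replace( '&shy;', '', $matches[0] );
}, $content );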

This is very specific for my situation where I sometimes use single quotes for JSON strings, but feel free to alter in any way that suits your situation.
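
As a quick illustration of the rollback, the sketch below runs it on a link whose URL and link text both contain soft hyphens. It assumes the soft hyphen was inserted as the `&shy;` entity. The wrapper name `stripSoftHyphensInAttributes` is hypothetical.

```php
<?php
// Hypothetical wrapper around the rollback step.
function stripSoftHyphensInAttributes(string $html): string
{
    // First pattern: single-quoted json attributes; second: double-quoted attribute values.
    $patterns = array('/json=\'([^\']*)\'/', '/="([^"]*)"/');
    $result = preg_replace_callback($patterns, function ($matches) {
        return str_replace('&shy;', '', $matches[0]);
    }, $html);
    // Fall back to the untouched markup if the regex engine reports an error.
    return $result ?? $html;
}

$html = '<a href="/hypo&shy;para&shy;thyroidism">hypo&shy;para&shy;thyroidism</a>';
echo stripSoftHyphensInAttributes($html), "\n";
// <a href="/hypoparathyroidism">hypo&shy;para&shy;thyroidism</a>
```

The URL is restored, while the visible link text keeps its soft hyphens and can still break on small screens.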

SEO value on word breaking

I found an article describing the shy-entity solution as well. That article uses the word "super­califragilistic­expialidocious" to demonstrate it. The best part? It still ranks for the word "Supercalifragilistic­expialidocious" when you google "super­califragilistic­expialidocious zooma", which suggests that you can still rank on long keywords, even when using soft hyphenation as a word-breaking solution.

Obviously, this very article is also an SEO keyword word breaking test. It will take a few days, but if this article is ranking for the following words, then we have additional SEO proof:

  • Pseudopseudo­hypo­para­thyroidism (used as a heading and in a sentence);
  • Supercalifragilistic­expialidocious (as I have now used this word a few times as well).

Do note that I did not research the impact on keyword relevance. Although you can still rank on long words, there is still a chance that the exact SEO relevance is slightly lower.