Optimizing HTML for Safari Reader

I’m looking for information on how Safari Reader parses documents and decides how to lay them out. It does a pretty good job with most of our content, but I’ve seen some other sites where it inserts headers or footers at the top and bottom of each page of the document, and I’m wondering if there’s some additional science here.

I’ve read this article which has some hints, and I’ve looked through the technical documentation for Safari on the Apple developer site, but I haven’t located any official info.

Asked September 3, 2010

1 Answer

Safari Reader appears to be based on Readability, whose source is well commented. In summary:

It favors content wrapped in an element whose id or class name matches this pattern: /article|body|content|entry|hentry|main|page|pagination|post|text|blog|story/

It also takes into account the internal density of the text: paragraphs, number of characters, line breaks, commas, etc.

Negative points are assigned to elements with an id or class name such as "comment" or "sidebar", or with high link density.

If the parser can't identify an element that holds the content, it just uses the body element (cleaned up).
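
To make the heuristics above more concrete, here is a rough Python sketch of that kind of scoring. It is not Readability's actual code: the positive pattern is the one quoted above, but the negative pattern, the weights, the function names, and the candidate tuple layout are all assumptions chosen for illustration.

```python
import re

# Positive pattern as quoted in the answer. The negative pattern and every
# numeric weight below are illustrative assumptions, not Readability's values.
POSITIVE = re.compile(
    r"article|body|content|entry|hentry|main|page|pagination|post|text|blog|story",
    re.I,
)
NEGATIVE = re.compile(r"comment|sidebar", re.I)


def class_weight(id_and_class: str) -> int:
    """Reward or penalize an element based on its id and class names."""
    weight = 0
    if POSITIVE.search(id_and_class):
        weight += 25
    if NEGATIVE.search(id_and_class):
        weight -= 25
    return weight


def link_density(text_length: int, link_text_length: int) -> float:
    """Fraction of an element's text that sits inside links."""
    if text_length == 0:
        return 0.0
    return link_text_length / text_length


def score_candidate(id_and_class: str, text: str, link_text_length: int) -> float:
    """Combine the name weight, text density, and link density into one score."""
    score = class_weight(id_and_class)
    score += text.count(",")            # one point per comma
    score += min(len(text) // 100, 3)   # up to three points for text length
    # Scale the score down as more of the text is link text.
    return score * (1 - link_density(len(text), link_text_length))


def pick_content(candidates, fallback="body"):
    """Return the name of the best-scoring candidate, or the fallback.

    candidates: iterable of (name, id_and_class, text, link_text_length).
    """
    best_name, best_score = fallback, 0.0
    for name, id_and_class, text, link_len in candidates:
        candidate_score = score_candidate(id_and_class, text, link_len)
        if candidate_score > best_score:
            best_name, best_score = name, candidate_score
    return best_name
```

Under these assumptions, wrapping the article in something like an element with class "post" or "entry", and keeping navigation and comments in separately named containers, should push the score toward the intended element.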

  1. Interesting! I thought they had cloned Readability; I didn’t realize they had used the same source or algorithm.
