Optimizing HTML for Safari Reader
I’m looking for information on how Safari Reader parses documents and decides how to lay them out. It does a pretty good job with most of our content, but I’ve seen some other sites where it inserts headers or footers at the top and bottom of each page of the document, and I’m wondering if there’s some additional science here.
I’ve read this article which has some hints, and I’ve looked through the technical documentation for Safari on the Apple developer site, but I haven’t located any official info.
1 Answer
Readability's source is well commented. Summarizing:
It favors content wrapped in an element whose id or class name matches one of these: /article|body|content|entry|hentry|main|page|pagination|post|text|blog|story/
It also takes into account the internal density of the text: paragraphs, number of characters, line breaks, commas, etc.
Negative points are also attributed for elements with id or class name of "comment", "sidebar", etc, or with high link density.
If the parser can't determine which element holds the content, it just uses the body element (cleaned up).
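The heuristics summarized above can be sketched roughly as follows. This is only an illustration, not Readability's or Safari's actual code: the positive pattern is the one quoted above, the negative pattern covers just the two names mentioned ("comment", "sidebar"), and the function names, the weights (25 points, one point per comma, up to 3 points for length) and the "fall back to body when nothing scores above zero" threshold are all assumptions made for the example.

```python
import re

# Positive pattern as quoted in the answer; the negative pattern only covers
# the two names the answer mentions. Both are matched case-insensitively.
POSITIVE = re.compile(
    r"article|body|content|entry|hentry|main|page|pagination|post|text|blog|story",
    re.I,
)
NEGATIVE = re.compile(r"comment|sidebar", re.I)


def link_density(text_len: int, link_text_len: int) -> float:
    """Fraction of an element's text that sits inside links."""
    return link_text_len / text_len if text_len else 0.0


def score_element(id_and_class: str, text: str, link_text_len: int = 0) -> float:
    """Score one candidate element. All weights are assumed, not official."""
    score = 0.0
    if POSITIVE.search(id_and_class):
        score += 25  # bonus for a content-like id or class name
    if NEGATIVE.search(id_and_class):
        score -= 25  # penalty for a comment/sidebar-like name
    score += text.count(",")           # one point per comma
    score += min(len(text) // 100, 3)  # up to 3 points for text length
    # Scale the score down as more of the text sits inside links.
    return score * (1 - link_density(len(text), link_text_len))


def pick_content(candidates):
    """candidates: iterable of (id_and_class, text, link_text_len) tuples.

    Returns the name of the best-scoring candidate, or "body" when no
    candidate scores above zero (the fallback described above).
    """
    best_name, best_score = "body", 0.0
    for name, text, link_len in candidates:
        s = score_element(name, text, link_len)
        if s > best_score:
            best_name, best_score = name, s
    return best_name
```

For example, `pick_content([("sidebar", "x", 0), ("article", "Hello, world", 0)])` picks `"article"`, while a list containing only the sidebar candidate falls back to `"body"`. In practice this suggests wrapping the main text in an element with a content-like id or class and keeping link-heavy navigation outside it.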

We can’t know how much of it is based on Readability, but Readability is credited in Safari’s acknowledgments: http://bit.ly/bMYqAO
Interesting! I thought they had cloned Readability; I didn’t realize they had used the same source or algorithm.