posted on 2016-02-16
When I started work on TBRSS (in 2013) it seemed like better web typography was just around the corner. CSS hyphenation, Knuth and Plass justification, legible text rendering – the future seemed bright.
Things have not gone that way. Support for typography in web browsers has stalled as browsers focus on becoming better platforms for mobile applications, rather than improving as vehicles for presenting documents.
(Not that the browsers are doing anything wrong; it’s just a letdown.)
Speaking of typography: I’ve also decided to stop serving a web font for the display of article text. Web fonts are fine, I think, for static content, or for dynamic content in a single language. But for dynamic content in multiple languages, they are probably too heavy. So from now on we will simply defer to the system’s serif.
posted on 2016-01-20
When browsing feeds, you can now order them by how often they are updated – or, more usefully, order them so the least often updated feeds are shown first. Now you can easily check on feeds that are rarely updated, and might otherwise be overlooked.
This rates an announcement because finding an algorithm that matches our intuitive sense of what it means to be updated more or less frequently is not as easy as it sounds.
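To make the difficulty concrete, here is one plausible heuristic, sketched in Python. It is not necessarily what TBRSS does: it ranks feeds by the median gap between consecutive entries, which (unlike a plain average) is not thrown off by a single long silence or a single burst of posts. The function names and the shape of the `feeds` mapping are assumptions made for the example.

```python
from datetime import datetime
from statistics import median

def median_gap_days(timestamps):
    """Median gap, in days, between consecutive entry timestamps."""
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(ordered, ordered[1:])]
    # Assumption: a feed with fewer than two entries has no measurable
    # gap, so it is treated as the most rarely updated of all.
    return median(gaps) if gaps else float("inf")

def least_updated_first(feeds):
    """Order feed names so the least often updated come first.

    `feeds` maps a feed name to a list of entry datetimes (hypothetical shape).
    """
    return sorted(feeds, key=lambda name: median_gap_days(feeds[name]), reverse=True)
```

Even this small version has to make a judgment call (what to do with a feed that has only one entry), which is the kind of thing that makes the problem harder than it sounds.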
posted on 2015-09-16
When reading a book, you read a page at a time, top to bottom. Accordingly the text can, and should, be a unified block, to balance the appearance of the two opposed pages. No shape is required beyond the frame of the page.
But on the screen there is only one column of text, and you read not by turning pages, but by scrolling. And every time you scroll, you suffer a brief period of disorientation: you have to find the line where you stopped reading. Even if you use page-up and page-down, you are still scrolling by pixels, not lines, and usually have to make small adjustments to keep the bottom or top line within the viewport.
The worst form of scrolling-induced disorientation is when you arrive near the bottom of a page without realizing it and try to scroll down: I, for one, tend to lose my place completely. In fact TBRSS provides for this particular problem by adding extra vertical whitespace to the end of every article so it is always possible to scroll down without bottoming out.
To deal with scrolling, the reader needs more cues than they would need when reading a book. Spacing between paragraphs helps, because it makes the individual paragraphs easier to recognize. Even when using indentation, some spacing helps: use it, Bringhurst be damned. But ragged right is essential, because the line endings form a shape that is easy to recognize when the whole page is scrolled down.
So: in any medium where the reader reads by scrolling – instead of within the frame of a page – prefer ragged right, because it helps the reader keep their place as they scroll.
(Whether reading by scrolling is a good idea is outside the scope of this post; it is what we have.)
posted on 2015-08-07
TBRSS is SSL-only, but many feeds are HTTP-only, and many of those feeds have images. In modern browsers, insecure assets – including images – generate mixed-content warnings, at varying levels of severity.
In the context of a feed reader, such warnings are useless. But as browsers are moving towards more secure defaults, it seems wise to get rid of them before they become a problem.
The basic technique is nothing new. You can read about the details in a GitHub blog post from 2010 – in short, insecure URLs are fetched through a secure proxy.
This is what TBRSS does, with one bandwidth-saving complication: we parse the rules from the HTTPS Everywhere extension and, when possible, directly re-write insecure URLs to their secure equivalents – in which case no proxy is needed.
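The rewrite-or-proxy decision can be sketched roughly as follows, in Python. This is an illustration, not the TBRSS code: the ruleset is a made-up example in the HTTPS Everywhere XML format (`target` and `rule` elements), the proxy address is a placeholder, and wildcard targets, exclusions and URL signing are all left out.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlsplit

# Made-up ruleset in the HTTPS Everywhere XML format; example.com is a placeholder.
RULESET_XML = r"""
<ruleset name="Example">
  <target host="example.com" />
  <target host="www.example.com" />
  <rule from="^http://(www\.)?example\.com/" to="https://$1example.com/" />
</ruleset>
"""

def load_ruleset(xml_text):
    """Return (set of target hosts, list of (compiled pattern, replacement))."""
    root = ET.fromstring(xml_text)
    hosts = {target.get("host") for target in root.iter("target")}
    # Rules use JavaScript-style $1 backreferences; Python's re expects \g<1>.
    rules = [
        (re.compile(rule.get("from")), re.sub(r"\$(\d)", r"\\g<\1>", rule.get("to")))
        for rule in root.iter("rule")
    ]
    return hosts, rules

def rewrite(url, hosts, rules):
    """Return the HTTPS equivalent of url, or None if no rule applies."""
    if urlsplit(url).hostname not in hosts:
        return None
    for pattern, replacement in rules:
        new_url, count = pattern.subn(replacement, url, count=1)
        if count:
            return new_url
    return None

def secure_url(url, hosts, rules, proxy_prefix="https://proxy.invalid/?url="):
    """Prefer a direct rewrite; fall back to the (hypothetical) proxy."""
    if url.startswith("https://"):
        return url
    return rewrite(url, hosts, rules) or proxy_prefix + quote(url, safe="")

hosts, rules = load_ruleset(RULESET_XML)
```

With these definitions, `secure_url("http://www.example.com/a.png", hosts, rules)` is rewritten directly and costs no proxy bandwidth, while a URL on a host with no ruleset falls through to the proxy.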
posted on 2015-08-05
I am pleased to report that, after a long (but not unreasonable) delay, that second pull request has been made and merged, and stock Drakma can be used with SNI-enabled hosts.
posted on 2015-05-30
Quick note: you can now mute feeds. A muted feed will be updated normally, and appear normally in your feed list, but entries from it will never appear in your reading list.
This can be useful if you want to use TBRSS for reading a particular feed (because TBRSS is a pleasant environment for reading!) but aren’t interested enough that you want even its best posts to appear in your reading list.
posted on 2015-01-12
Not every site that publishes articles has a feed. Even when feed readers were at their height, not every site had a feed, and now that feed readers are, if not declining, certainly marginalized, it cannot be safely expected that every interesting new site will have a feed.
And while they usually do have feeds, in many if not most cases it is an unintentional side effect of the use of a platform or CMS that has feeds on by default. We cannot assume that this will be so forever.
On the other hand, social networks are on the ascendant, and search engines are not so much ascendant as enthroned. Semantic markup is increasingly common, to improve presentation in search engines and social media – a motivation that seems unlikely to slacken. (Not to mention the new semantic elements introduced by HTML5 –
Widespread use of semantic markup means that a large subset of the information in a feed can now reliably be gleaned directly from ordinary web pages. And, even if semantic markup is lacking, we have good general algorithms for content extraction.
So TBRSS is introducing a new and highly experimental feature: synthetic feeds. If you try to add a page’s feed, and that page does not contain a link to a feed, you now have the option to create a feed directly from that page.
The semantics are simple: we scan the page for metadata and links; we resolve a certain number of those links in a certain order (this is the most experimental part), fetch the linked pages, and process them into entries.
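The scanning step might look something like this minimal Python sketch. It is an assumption-laden illustration: the class and function names are invented, metadata is read only from `meta` tags, links are kept in document order (the real ordering is the experimental part and is not reproduced here), and the cap of 10 links is arbitrary.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

class PageScanner(HTMLParser):
    """Collect <meta> metadata and same-site links from a root page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.metadata = {}
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # Open Graph uses property="og:..."; older metadata uses name="...".
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content"):
                self.metadata[key] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            url = urljoin(self.base_url, attrs["href"])
            same_site = urlsplit(url).netloc == urlsplit(self.base_url).netloc
            if same_site and url != self.base_url and url not in self.links:
                self.links.append(url)

def scan(html, base_url, limit=10):
    """Return (metadata dict, up to `limit` candidate entry URLs)."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    return scanner.metadata, scanner.links[:limit]
```

Each URL returned by `scan` would then be fetched and run through content extraction to become one entry.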
(This is completely different from something like page2rss. They monitor a single page for changes and report the differences within that page. For TBRSS the root page of the synthetic feed is only of interest as a source of metadata and links. The entries are one-to-one with the pages in the site linked to from the root page.)
This is crawling, and we respect the same rules as any crawler: robots.txt and the ROBOTS meta header, rate-limiting our requests,
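As a sketch of the robots.txt and rate-limiting parts, here is how they could be handled with Python's standard `urllib.robotparser`. The user agent string, the default delay and the function names are hypothetical, and the ROBOTS meta check is not shown.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleFeedBot"  # hypothetical user agent string

def make_parser(robots_txt):
    """Build a parser from the text of an already-fetched robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def polite_urls(parser, urls, default_delay=1.0, sleep=time.sleep):
    """Yield only the URLs robots.txt allows, pausing between them.

    Uses the site's Crawl-delay if it declares one, else `default_delay`
    (an arbitrary value chosen for this sketch).
    """
    delay = parser.crawl_delay(USER_AGENT) or default_delay
    first = True
    for url in urls:
        if not parser.can_fetch(USER_AGENT, url):
            continue
        if not first:
            sleep(delay)  # rate limit: wait before every fetch after the first
        first = False
        yield url
```

The caller fetches each URL as it is yielded, so the pause falls between consecutive requests to the site.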
There is of course also the question of copyright. Sites that provide feeds are implicitly giving us permission to display their content. We respect this for synthetic feeds the same way as for truncated feeds: we do not display the content (only a summary) unless the user explicitly asks to see the full content.
Again, synthetic feeds are experimental; but the experiment is in progress.
posted on 2014-08-06
TBRSS has a new feature: a reading history. That is, when you read an article, or start to read it, it is added to your reading history, and you can look back by date and see what you have been reading.
The reading history is not forever; there is a limit (at present, 1000) to how many entries are stored. And of course you can clear it yourself.
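A capped history of this kind can be sketched in a few lines of Python. The class and method names are invented for the example, and two behaviours are assumptions rather than a description of TBRSS: re-reading an article moves it to the newest position, and the oldest entries are the ones dropped when the limit is exceeded.

```python
from collections import OrderedDict
from datetime import datetime

class ReadingHistory:
    """Bounded reading history, oldest entries evicted first."""

    def __init__(self, limit=1000):
        self.limit = limit
        self._entries = OrderedDict()  # article URL -> time it was last read

    def record(self, url, when):
        # Assumption: re-reading an article moves it to the newest position.
        self._entries.pop(url, None)
        self._entries[url] = when
        while len(self._entries) > self.limit:
            self._entries.popitem(last=False)  # drop the oldest entry

    def on_date(self, day):
        """Articles read on the given date, oldest first."""
        return [url for url, when in self._entries.items() if when.date() == day]

    def clear(self):
        self._entries.clear()
```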
To my mind, this solves, directly and humanely, the essential problem that starring and full-text search only solve obliquely: “what was that interesting article I read the other day?”
posted on 2014-06-23
Some things I may have forgotten to mention: