Recent Content

Hyphens

posted on 2016-02-16

When I started work on TBRSS (in 2013) it seemed like better web typography was just around the corner. CSS hyphenation, Knuth and Plass justification, legible text rendering – the future seemed bright.

Things have not gone that way. Support for typography in web browsers has stalled as browsers focus on becoming better platforms for mobile applications, rather than improving as vehicles for presenting documents.

(Not that the browsers are doing anything wrong; it’s just a letdown.)

Accordingly I’m removing JavaScript hyphenation support from TBRSS. Hyphenating in JavaScript works surprisingly well – using the Hypher library, and with careful use of caching, it can be made very fast. But it was never intended as more than a stopgap, and it represents needless complexity I am not, in the long term, willing to maintain.

Speaking of typography: I’ve also decided to stop serving a web font for the display of article text. Web fonts are fine, I think, for static content, or for dynamic content in a single language. But for dynamic content in multiple languages, they are probably too heavy. So from now on we will simply defer to the system’s serif.

Update frequency

posted on 2016-01-20

When browsing feeds, you can now order them by how often they are updated – or, more usefully, order them so the least often updated feeds are shown first. Now you can easily check on feeds that are rarely updated, and might otherwise be overlooked.

This rates an announcement because finding an algorithm that matches our intuitive sense of what it means to be updated more or less frequently is not as easy as it sounds.
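To illustrate why this is harder than it sounds, here is a minimal sketch of one possible measure. It is not necessarily the algorithm TBRSS uses. It ranks feeds by the median gap between consecutive entries; the function names and the shape of the `feeds` mapping are assumptions made for the example. A median is used because a feed that posts three items in one afternoon and then goes quiet for months would look misleadingly infrequent under a simple average.

```python
from datetime import datetime
from statistics import median


def median_gap_days(timestamps):
    """Median gap, in days, between consecutive entries (None if fewer than two)."""
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None


def least_updated_first(feeds):
    """Order feed names so the least often updated come first.

    `feeds` maps a feed name to a list of entry datetimes (a hypothetical shape).
    Feeds with too little history to measure are placed at the front.
    """
    def key(name):
        gap = median_gap_days(feeds[name])
        return float("inf") if gap is None else gap

    return sorted(feeds, key=key, reverse=True)
```

Even this small example has to make a choice about feeds with one entry or none; here they are simply treated as the least frequently updated.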

Ragged right

posted on 2015-09-16

When reading a book, you read a page at a time, top to bottom. Accordingly the text can, and should, be a unified block, to balance the appearance of the two opposed pages. No shape is required beyond the frame of the page.

But on the screen there is only one column of text, and you read not by turning pages, but by scrolling. And every time you scroll, you suffer a brief period of disorientation: you have to find the line where you stopped reading. Even if you use page-up and page-down, you are still scrolling by pixels, not lines, and usually have to make small adjustments to keep the bottom or top line within the viewport.

The worst form of scrolling-induced disorientation is when you arrive near the bottom of a page without realizing it and try to scroll down: I, for one, tend to lose my place completely. In fact TBRSS provides for this particular problem by adding extra vertical whitespace to the end of every article so it is always possible to scroll down without bottoming out.

To deal with scrolling, the reader needs more cues than they would need when reading a book. Spacing between paragraphs helps, because it makes the individual paragraphs easier to recognize. Even when using indentation, some spacing helps: use it, Bringhurst be damned. But ragged right is essential, because the line endings form a shape that is easy to recognize when the whole page is scrolled down.

So: in any medium where the reader reads by scrolling – instead of within the frame of a page – prefer ragged right, because it helps the reader keep their place as they scroll.

(Whether reading by scrolling is a good idea is outside the scope of this post; it is what we have.)

Mixed content no more

posted on 2015-08-07

TBRSS is SSL-only, but many feeds are HTTP-only, and many of those feeds have images. In modern browsers, insecure assets – including images – generate mixed-content warnings, at varying levels of severity.

In the context of a feed reader, such warnings are useless. But as browsers are moving towards more secure defaults, it seems wise to fix the mixed content before it becomes a problem.

The basic technique is nothing new. You can read about the details in a GitHub blog post from 2010 – in short, insecure URLs are fetched through a secure proxy.

This is what TBRSS does, with one bandwidth-saving complication: we parse the rules from the HTTPS Everywhere extension and, when possible, directly re-write insecure URLs to their secure equivalents – in which case no proxy is needed.
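The rewrite-or-proxy decision can be sketched roughly as follows. This is an illustration under stated assumptions, not the TBRSS code: the rule list is a simplified stand-in for parsed HTTPS Everywhere rules, and the proxy host, the shared key, and the signed URL layout are all hypothetical.

```python
import hashlib
import hmac
import re

# Hypothetical rules in the spirit of HTTPS Everywhere rulesets:
# (pattern matching an insecure URL, replacement giving the secure one).
RULES = [
    (re.compile(r"^http://(www\.)?example\.com/"), "https://www.example.com/"),
]

PROXY_BASE = "https://proxy.example.net"  # hypothetical proxy host
PROXY_KEY = b"change-me"                  # hypothetical shared secret


def secure_url(url):
    """Return an HTTPS URL for an asset: rewritten if a rule matches, proxied otherwise."""
    if not url.startswith("http://"):
        return url  # already secure, or not an HTTP URL at all
    for pattern, replacement in RULES:
        rewritten, count = pattern.subn(replacement, url, count=1)
        if count:
            return rewritten  # a secure equivalent is known; no proxy needed
    # No rule applies: sign the URL so the proxy only fetches URLs we issued.
    digest = hmac.new(PROXY_KEY, url.encode("utf-8"), hashlib.sha1).hexdigest()
    return f"{PROXY_BASE}/{digest}/{url.encode('utf-8').hex()}"
```

For example, `secure_url("http://example.com/a.png")` is rewritten directly, while an image on a host with no matching rule comes back as a signed proxy URL.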

Drakma, now with SNI

posted on 2015-08-05

A while ago, I was alerted by a blog post to the fact that Drakma, the HTTP client that TBRSS uses, did not support SNI.

At that time, I added support in both CL+SSL and Drakma for TBRSS’s own use, and made a pull request for CL+SSL, intending to make another pull request for Drakma once the first had been merged.

I am pleased to report that, after a long (but not unreasonable) delay, that second pull request has been made and merged, and stock Drakma can be used with SNI-enabled hosts.

Blog overhaul

posted on 2015-07-24

Octopress is out, Coleslaw is in.

Muting feeds

posted on 2015-05-30

Quick note: you can now mute feeds. A muted feed will be updated normally, and appear normally in your feed list, but entries from it will never appear in your reading list.

This can be useful if you want to use TBRSS for reading a particular feed (because TBRSS is a pleasant environment for reading!) but aren’t interested enough that you want even its best posts to appear in your reading list.

Synthetic feeds

posted on 2015-01-12

Not every site that publishes articles has a feed. Even when feed readers were at their height, not every site had a feed, and now that feed readers are, if not declining, certainly marginalized, it cannot be safely expected that every interesting new site will have a feed.

And while most sites do still have feeds, in many if not most cases the feed is an unintentional side effect of using a platform or CMS that has feeds on by default. We cannot assume that this will be so forever.

(An inauspicious sign: the Roundtable blog at Lapham’s Quarterly, which uses the feed icon in its branding, does not actually offer a feed.)

On the other hand, social networks are on the ascendant, and search engines are not so much ascendant as enthroned. Semantic markup is increasingly common, to improve presentation in search engines and social media – a motivation that seems unlikely to slacken. (Not to mention the new semantic elements introduced by HTML5 – article, time, &c.)

Widespread use of semantic markup means that a large subset of the information in a feed can now reliably be gleaned directly from ordinary web pages. And, even if semantic markup is lacking, we have good general algorithms for content extraction.

So TBRSS is introducing a new and highly experimental feature: synthetic feeds. If you try to add a page’s feed, and that page does not contain a link to a feed, you now have the option to create a feed directly from that page.

The semantics are simple: we scan the page for metadata and links; we resolve a certain number of those links in a certain order (this is the most experimental part), fetch the linked pages, and process them into entries.
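As a rough sketch of the link-gathering step only: the ordering rule here (document order, same site, no duplicates, a fixed limit) is an assumption chosen for the example, and the names `LinkCollector` and `candidate_entries` are hypothetical. The real ordering is, as noted, the experimental part.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def candidate_entries(root_url, html, limit=10):
    """Resolve same-site links from the root page, in order, without duplicates."""
    collector = LinkCollector()
    collector.feed(html)
    root_host = urlparse(root_url).netloc
    seen, result = set(), []
    for href in collector.hrefs:
        if len(result) >= limit:
            break
        # Resolve relative links against the root page and drop fragments.
        url = urljoin(root_url, href).split("#", 1)[0]
        if urlparse(url).netloc != root_host or url == root_url or url in seen:
            continue
        seen.add(url)
        result.append(url)
    return result
```

Each URL returned would then be fetched and run through content extraction to become one entry.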

(This is completely different from something like page2rss. They monitor a single page for changes and report the differences within that page. For TBRSS the root page of the synthetic feed is only of interest as a source of metadata and links. The entries are one-to-one with the pages in the site linked to from the root page.)

This is crawling, and we respect the same rules as any crawler: robots.txt and the robots meta tag, rate-limiting our requests, &c.
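A minimal sketch of two of those rules, robots.txt and rate limiting, using Python's standard `urllib.robotparser`. The user-agent string, the delay, and the helper names are assumptions for the example; checking the robots meta information of each fetched page is not shown.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleFeedBot"  # hypothetical user-agent string


def allowed_urls(robots_txt, urls, agent=USER_AGENT):
    """Keep only the URLs that the given robots.txt text permits for this agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(agent, url)]


def polite_fetch(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index:
            sleep(delay_seconds)  # no pause is needed before the first request
        results.append(fetch(url))
    return results
```

Passing `sleep` as a parameter is only a convenience so the pause can be replaced when testing.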

There is of course also the question of copyright. Sites that provide feeds are implicitly giving us permission to display their content. We respect this for synthetic feeds the same way as for truncated feeds: we do not display the content (only a summary) unless the user explicitly asks to see the full content.

Again, synthetic feeds are experimental; but the experiment is in progress.

Reading History

posted on 2014-08-06

TBRSS has a new feature: a reading history. That is, when you read an article, or start to read it, it is added to your reading history, and you can look back by date and see what you have been reading.

The reading history is not forever; there is a limit (at present, 1000) to how many entries are stored. And of course you can clear it yourself.
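A capped history of this kind can be sketched with a bounded queue, where adding an entry beyond the limit silently drops the oldest one. The class and method names below are hypothetical, and the in-memory storage is an assumption made for the example.

```python
from collections import deque
from datetime import date


class ReadingHistory:
    """Keep the most recent `limit` articles read, oldest dropped first."""

    def __init__(self, limit=1000):
        self._entries = deque(maxlen=limit)

    def record(self, url, day):
        """Note that the article at `url` was opened on `day`."""
        self._entries.append((day, url))

    def on(self, day):
        """Return the articles read on a given date, in reading order."""
        return [url for d, url in self._entries if d == day]

    def clear(self):
        """Forget everything, as the user may do at any time."""
        self._entries.clear()

    def __len__(self):
        return len(self._entries)
```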

To my mind, this solves, directly and humanely, the essential problem that starring and full-text search only solve obliquely: “what was that interesting article I read the other day?”

Update

posted on 2014-06-23

Some things I may have forgotten to mention:

  • Billing has changed from a yearly to a monthly schedule.
  • You can subscribe to Facebook pages.
  • You can subscribe to feeds by name, partial URIs (like domain names), or, to some extent, by topic. (This feature is in the crude-but-useful stage.)
  • YouTube and Vimeo videos are now preserved (but with lazy-loading to preserve privacy and save bandwidth).



Unless otherwise credited all material copyright by Paul M. Rodriguez