Content from 2013-07

Languages

posted on 2013-07-17

Another surprise from seeing actual people’s actual lists of subscriptions is how common polyglots are. Better support for languages was always something I planned for, but it turned out to be urgent.

The problem is that an algorithm that correctly ranks content for substance within one language will not necessarily rank content correctly across languages.

The primary reading list – the one you see when you first log in – will remain multilingual. But, when applicable, it will now display a menu for filtering by language.

You can see what the menu looks like on the Top page.

The language is controlled by a lang query parameter, so you can bookmark the reading list for a particular language by adding that parameter to the URL with the language's abbreviation as its value. Your English reading list, for example, is at https://tbrss.com/reading-list?lang=en.
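
As a rough illustration, a bookmarkable URL of this shape could be built as follows. This is only a sketch: the function name reading_list_url is hypothetical, and "en" is the only language abbreviation confirmed above, so any other value is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def reading_list_url(lang, base="https://tbrss.com/reading-list"):
    """Return `base` with its lang query parameter set to `lang`.

    Hypothetical helper: only the lang parameter and the "en" value
    come from the post; everything else is illustrative.
    """
    parts = urlsplit(base)
    # Keep any query parameters already on the URL, then set or replace lang.
    query = dict(parse_qsl(parts.query))
    query["lang"] = lang
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(reading_list_url("en"))
```

Going through urlencode rather than concatenating strings means an existing query string on the base URL is preserved instead of being clobbered.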

Robots.txt

posted on 2013-07-01

Part of the reason for offering temporary accounts is to get my hands on as many actual feeds as possible. Since yesterday I’ve learned another dozen ways a feed can go wrong.

The biggest lesson so far is that it’s a mistake to pay attention to robots.txt. Roughly 7% of new subscriptions are for feeds on hosts where robot exclusion policies forbid access to the feed.

This is presumably not what the feed authors intend. So, although TBRSS will still obey crawl-delay, if one is specified, it will ignore the access rules.
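
The policy can be sketched roughly like this, using Python's standard urllib.robotparser. This is not TBRSS's actual code: the function name fetch_policy and the user agent string are made up for illustration, and the standard parser only recognizes whole-number crawl-delay values.

```python
from urllib.robotparser import RobotFileParser


def fetch_policy(robots_txt, user_agent="example-feed-fetcher"):
    """Decide how to fetch a feed given the host's robots.txt text.

    Hypothetical sketch: honor crawl-delay if present, but never
    consult the Allow/Disallow rules.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # crawl_delay() returns None when no delay applies to this agent.
    delay = parser.crawl_delay(user_agent)
    # can_fetch() is deliberately not called: access rules are ignored.
    return {"allowed": True, "delay_seconds": delay if delay is not None else 0}


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /feed\nCrawl-delay: 10\n"
    print(fetch_policy(robots))
```

Here a host that disallows /feed but asks for a ten-second delay still gets its feed fetched, just no more often than it asked.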


Unless otherwise credited all material copyright by Paul M. Rodriguez