Request for app: smart RSS client that understands editor's publishing choices

2 points by simon_acca 11 hours ago

Essentially I am looking for a smart RSS client; can you recommend one?

RSS readers work well to track simple blogs that publish in a "stream of consciousness" fashion (that is, one idea after another, roughly in the order that they are thought up by the author).

News sites (think BBC, The Economist, Foreign Affairs, The Atlantic, etc.) are a different beast: they publish a lot of content all the time, and it's not all equally important. The top stories linger at the top of the home page for a long time, while minor ones either get pushed to the bottom or quickly leave the front page altogether.

I think there's value in the editor's judgement of their own content, so I'm looking for an app that understands that judgement and keeps track of it in order to make it accessible to the user. For example, I'd want to ask "what were the 10 stories that lingered on the front page of the BBC the longest in the last week" or "all stories that made it to the top 2 spots in the last 3 days" or some combination thereof.
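
To make the two example questions concrete, here is a minimal sketch of how they could be answered, assuming something already records periodic snapshots of the front page as ordered lists of story URLs. The `Snapshot` type and both function names are hypothetical, and counting snapshots only approximates "time on the front page" when the snapshots are evenly spaced.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Snapshot:
    """One observation of a front page (hypothetical structure)."""
    taken_at: datetime
    stories: tuple[str, ...]  # story URLs in page order, index 0 = top slot


def longest_lingering(snapshots, since, n=10):
    """Return the n stories seen in the most snapshots taken at or after `since`."""
    counts = Counter()
    for snap in snapshots:
        if snap.taken_at >= since:
            # dict.fromkeys de-duplicates while keeping page order
            for story in dict.fromkeys(snap.stories):
                counts[story] += 1
    return [story for story, _ in counts.most_common(n)]


def reached_top(snapshots, since, slots=2):
    """Return stories that held one of the first `slots` positions at or after `since`."""
    first_seen = {}
    for snap in sorted(snapshots, key=lambda s: s.taken_at):
        if snap.taken_at >= since:
            for story in snap.stories[:slots]:
                first_seen.setdefault(story, snap.taken_at)
    return list(first_seen)  # ordered by when each story first reached the top
```

With a week of snapshots loaded, the first question would be roughly `longest_lingering(snaps, now - timedelta(days=7), n=10)` and the second `reached_top(snaps, now - timedelta(days=3), slots=2)`.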

Do you know of any such app? (Or do you want to build one? I'd certainly be a paying customer!)

Thanks!

stop50 10 hours ago

I would recommend looking at the raw XML of those sites' feeds. It contains none of the information you want. Most RSS readers simply display the article/episode and let you read it or save it.

stareatgoats 8 hours ago

You would likely need to combine the RSS feed articles with data from a service that scrapes the website to identify where each article is placed at any given time. Sounds doable, even if scraping is always fraught with pitfalls. I haven't heard of any such solution, and building one is not on my todo-list, sorry.
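
A rough sketch of that combination step, assuming the feed is plain RSS 2.0 and that some separate scraper (not shown, and site-specific) has already produced the front page's article links in display order. The function names and the `front_page_links` input are hypothetical.

```python
import xml.etree.ElementTree as ET


def feed_items(rss_xml):
    """Map each RSS 2.0 item's <link> to its <title>."""
    root = ET.fromstring(rss_xml)
    return {
        item.findtext("link", "").strip(): item.findtext("title", "").strip()
        for item in root.iter("item")
    }


def annotate_positions(rss_xml, front_page_links):
    """Pair feed items with their 1-based front-page slot (None if not on the page).

    `front_page_links` is assumed to come from a scraper, in page order.
    """
    position = {}
    for slot, link in enumerate(front_page_links, start=1):
        position.setdefault(link, slot)  # keep the highest placement if a link repeats
    return [
        {"title": title, "link": link, "position": position.get(link)}
        for link, title in feed_items(rss_xml).items()
    ]
```

Matching on the exact link string is the fragile part: front-page URLs often carry tracking parameters or differ from the feed's URLs, so a real version would likely need to normalize them first.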

pavel_lishin 8 hours ago

NewsBlur allegedly allows you to "train" it to suggest certain stories and ignore others, but I've never used that particular functionality.

jonathanyc 8 hours ago

I haven't seen any information that could be used for this in the RSS feeds I've looked at. You could scrape the website, especially if it's all running on your own computer, but if you do it on a server you'll almost certainly be blocked unless you use a third-party scraping service. The WSJ in particular is super aggressive; you'll probably be OK with the NYT, which has a personal use exemption.

Unfortunately Anthropic and OpenAI have kind of ruined scraping for everyone else.