I subscribe to Hacker News’ front page on my feed reader with hnrss. While I skip over most Hacker News items, I always stop when I see something about RSS feeds or related formats. On Labor Day, I read an interesting post on Dogesec about a method for not only gathering full text feeds, but also full archive feeds, meaning everything that the site in question has published. While the tutorial was written with cybersecurity research in mind, the open source tools are available to all. I will let those of you who are interested read about the method and how to implement it, but the post inspired me to go on my own tangent.
I recently wrote about disagreements regarding the best way to read feed items. Some prefer reading full text feeds in a stripped-down reader mode provided by a feed reader or designated read-it-later tool. Others prefer navigating from links in a feed reader to the original website. I took a middle path, stating that while I generally prefer reading full text feeds in my feed reader, there are some sites that are either pretty enough or have site-specific elements that make it worth visiting (writing that inspired me to overhaul the aesthetics of The New Leaf Journal to make the site proper more worth visiting). In the spirit of offering readers a choice, I ensure that The New Leaf Journal has full-text feeds so that our feed subscribers can read our articles from the comfort of their feed readers or swing by for a visit (I think we are worth the occasional visit, but consider my take biased).
However, I only have half of the points deemed ideal in the Dogesec article. While I do offer full text feeds, our feeds are limited to our 12 most recent articles. This is our 1,048th full article, so it suffices to say that our feed comes up well short of being a full archive feed. This reminded me of an interesting article I read on Yukinu Blog titled Guidelines for designing RSS feeds. Yukinu, like The New Leaf Journal, places feeds front and center. Also like The New Leaf Journal, Yukinu Blog offers RSS, ATOM, JSON, and Twtxt feeds (it was actually part of what inspired me to implement Twtxt feeds). But there is one key difference between our feeds. While we both offer full-text feeds, Yukinu Blog has full text and full archive feeds. I quote Yukinu’s reasoning for recommending full archive feeds:
When possible, include all of your articles in the feed instead of limiting the feed to just the most recent articles. The rationale behind this guideline is that by including all of your articles, a new site visitor can subscribe to the feed and quickly get an overview of all of your posts. Furthermore, since all articles are downloaded locally, they can quickly flip through the articles instead of loading potentially hundreds of web pages to read each individual article.
I recall considering the matter when I read Yukinu’s feed guide – having noted to myself that I already followed the other six feed recommendations in the post (you should consider the post essential reading if you run a site and are interested in offering feeds). But I ultimately opted against offering full archive feeds or anything close. Every site is different, but I think many potential subscribers to our site would be taken aback if they added our RSS, ATOM, or JSON feed to their reader and suddenly had more than 1000 unread articles. While I would be honored if someone reacted to finding 1000 New Leaf Journal articles in his or her feed reader by taking the time to read even a quarter of them, I suspect that the vast majority of subscribers would either mark everything as read to get things under control or remove our feed for having made their “unread articles” list explode.
However – do not take my particular concerns as a criticism of the concept of full archive feeds. When I first subscribed to Yukinu’s feed, I took the opportunity to pick out some older blog posts that I was interested in reading and catch up on all the good writing there. It works in that case and in many others. But I think it would be a bad idea for my site given our facts and circumstances.
Even though I am not following the full archive feed advice, I am sympathetic to some of the concerns Yukinu raised about offering partial feeds. For example, I am always concerned that new potential readers may not have a clear and distinct idea of what The New Leaf Journal is (an issue which goes back to our earliest days) based on a small selection of articles – given that my writing interests (not to mention my friend and sometimes-author, Victor V. Gurbo’s) are all over the map. I also indirectly consider the issue of “loading potentially hundreds of web pages to read each individual article.” Below, I offer my own solutions to these problems in lieu of full archive feeds for webmasters who may be considering different options:
- Light, fast-loading web pages: I took an interest in site speed and performance as I gradually learned how to administer a WordPress site. After I moved the site from cheap shared hosting to a cheap unmanaged VPS (a significant upgrade for what was actually a discount compared to continuing with the shared hosting), I worked on further stripping down the site despite having more resources to work with. We do not use third-party libraries. Instead of self-hosting our fonts, we call system font stacks. I monitor the site to make sure that we do not load unnecessary things and I optimize images before uploading them to our server. I also have a good caching set-up to ensure that our site loads quickly without needing to rely on CDNs or anything of the sort. I have in the past confirmed that our site loads well on everything down to my PocketBook Color e-reader. While I am sure there is still room for me to improve, I think that I have demonstrated that one can make a very light WordPress site that respects its users.
- I will note additionally that I would have some performance concerns about having a full archive feed with a few hundred thousand words and many images, which is part of why I have no plans to make a separate full archive feed. But Yukinu has some sensible recommendations for webmasters who have some performance concerns but nevertheless want to offer full archive feeds.
- Internal linking: I am sympathetic to some arguments against internal linking, but I make ample use of it for two reasons. Firstly, I try to keep The New Leaf Journal in conversation with itself. While I write about many wholly unrelated subjects, I develop ideas over time. Secondly, I use internal linking to help new readers discover older New Leaf Journal articles of interest in the same way I use external linking to point readers to other interesting websites. Along these lines, I also started curating our related posts section by hand – a process I wrote about not too long ago. Our categories and tags serve similar purposes.
- Search: I have recently worked on enhancing our on-site search functionality. First, my own poor experience with our search inspired me to make it possible to search with boolean operators, which I have quickly found greatly enhances our search for people who already have an idea of what they are looking for. Secondly, as part of my recent site overhaul, I have placed a prominent (and much prettier than before) search bar above all of our posts. You do not even have to be on our site to search. Simply type https://thenewleafjournal.com/search/?s into your address bar with ?s being your query and you will be directed to the search results page on our site.
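For readers who would rather script the search-from-the-address-bar trick than type it, here is a minimal sketch in Python. It assumes that the query is passed as the value of the `s` parameter (that is, `?s=your+query`, which is the usual WordPress convention but is my assumption about how the URL above is completed). The function name `build_search_url` is purely illustrative.

```python
from urllib.parse import urlencode

# Base search address quoted in the post; the "s" parameter carrying the
# query is an assumption based on the common WordPress convention.
SEARCH_BASE = "https://thenewleafjournal.com/search/"


def build_search_url(query: str) -> str:
    """Return a search URL with the query safely percent-encoded."""
    return f"{SEARCH_BASE}?{urlencode({'s': query})}"


if __name__ == "__main__":
    # Spaces become "+" and special characters are escaped.
    print(build_search_url("full text feeds"))
```

The same function can carry a boolean query such as `"rss AND atom"`; how the operators are interpreted is up to the site's search, not this helper.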
We have a couple of quasi-alternatives to full-archive feeds available. Firstly, our plain text Twtxt feed includes links to our 200 most-recent articles, which I recently noted could be a very minimalist way to dig into a good chunk of our archive. Secondly, I make our full XML sitemap available. This not only includes links to our regular articles but, unlike our main feed, also has links to our custom post type posts and pages. One could combine these tools with something like SingleFile (a web extension which downloads an HTML copy of a webpage), Zotero (SingleFile plus a reference manager), or MarkDownload (the same but for markdown) to get offline copies of our articles (for your own use, of course, copyright et al).
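As a rough illustration of the sitemap route, the sketch below pulls every `<loc>` entry out of a sitemap document using only the Python standard library. The sample XML and its `example.com` addresses are placeholders I made up for the demonstration, and `extract_urls` is a hypothetical helper name. The only thing taken as given is the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Placeholder sitemap used for the demonstration (not real site data).
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/first-post/</loc></url>
  <url><loc>https://example.com/second-post/</loc></url>
</urlset>"""


def extract_urls(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    for url in extract_urls(SAMPLE_SITEMAP):
        print(url)
```

Many sites publish a sitemap index that points to several child sitemaps. In that case the same function returns the child sitemap addresses, and each of those would need to be fetched and parsed in turn. The resulting list of article links is what one would then hand to a tool like SingleFile or MarkDownload.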
If you are running your own website (or considering doing so in the future), I strongly encourage you to not only offer feeds but also make sure that your readers know what they are and how they can be used. Beyond that, I think different feed configurations are appropriate for different sites. I agree entirely with Yukinu in strongly favoring full text feeds, but if one decides for whatever reason that full text feeds are unacceptable as a default option (granting that it is usually not difficult for an end-user to turn a partial feed into a full text feed), partial feeds are better than no feeds at all. As for full archive feeds, I do not think they are the best choice in all cases, but for sites that do not have too many articles or that are likely to have feed subscribers who may expect full archive feeds (or at least not be alarmed by a sudden dump of unread articles), full archive feeds may be the right choice. If you, like me, do not offer full archive feeds – I would still recommend considering the case for said feeds because the benefits of full archive feeds can inform site design decisions for sites such as mine that do not offer them.
Of course, all of this leaves one unanswered question. Why do I only include our 12 most-recent articles in our feeds? Surely there is a middle ground between 12 and more than 1000. I confess that no great thought went into settling on 12 articles. As a general matter, I tend to organize things in numbers divisible by three (note our article archive pages include 12 articles, so new feed subscribers will always start with our full homepage in their readers). I think I used to have 9 articles in the feed and at some point increased it to 12, where it has been for long enough that I do not remember when I made the change. Intuitively, I could see going as high as 21. But as a general matter – I think that any given set of 12 articles should give readers an idea of what we write about and whether we are subjectively worth keeping in a feed reader. I like to hope that at least a quarter of our articles should be of some interest to a classy independent writing website connoisseur, with the particulars depending on the interests of the classy independent writing website connoisseur in question. But perhaps I am wrong. Maybe I should include more articles in our feed. If you subscribe to our feeds (or are planning to) and want more than twelve articles, feel free to tell me why I should bump up the number using our contact page or Guestbook.
I will wrap things up by coming full circle. While I am not going to offer full archive feeds for the reasons I noted above – Dogesec offers a nice guide to creating your own. If you want to use the Dogesec guide to create a full archive feed for The New Leaf Journal, be my guest and do tell me how it works.