Frustrations With Blogging Platforms
John David Pressman
With Twitter, Reddit, and possibly YouTube in fundamental decline, now seems like a good time to reexamine writing on the Internet. I’ve occasionally been asked why I stopped doing public long-form writing, and part of the answer is that I feel there’s less and less to write about. Another part of the answer is that I didn’t stop; I just started posting long threads to Twitter instead of blog posts when The Discourse frustrates me. The reason for this is something like platform dysphoria: every time I consider writing long-form, I get hung up on where to post it. I’d probably get a lot of interest on LessWrong, but then my audience would be LessWrong. In fact almost every venue is characterized by a kind of AI derangement syndrome where sharing ideas based on the deep learning literature invites people to respond in a broken dialect of outdated ’80s cognitive science and ’90s pop culture connectionism that I’m somehow expected to pretend forms a coherent thought. As for my own platforms and sites, I’m not really satisfied with any of them. There’s just some fundamental dissatisfaction or ick factor that keeps me from posting to the blogs I have.
So what do I want out of a platform, anyway?
- Aesthetics And Readability: It doesn’t really matter what you write if nobody reads it, so a basic priority is to make sure the content is readable. On the other hand it doesn’t matter how good the site hosting your content looks if it’s garbage nobody would want to read. Historically I think I’ve overthought this part. The site hosting Paul Graham’s essays is barebones even by Web 1.0 standards. Yet I notice that the awkward left alignment, default link style post index, and small font size didn’t stop me from reading every post when I was 15. The Books Of Sand used a crappy Blogger template and enjoyed my readership anyway. At one point my own blog literally didn’t have CSS and I still got to the top of Hacker News with a Common Lisp tutorial. Unless the graphic design is so bad it makes my writing physically uncomfortable to read I doubt it’s going to make or break things for my audience.
- Accumulation: My writing should accumulate, not just be ephemera for someone else’s amusement. I want to write in a way that builds on itself, that can be updated and improved as my ideas get better. This naturally suggests something like a wiki with topic pages exploring a particular subject in the vein of gwern.net, but as I’ll explain I think this solution is incomplete.
- Smooth Gradient Of Effort: I should be able to start with a very short post or unit of effort and gradually expand until it’s a mature work without switching platforms. One of the things I really enjoy about microblogging is that I can publish basic ideas and insights without having to dress them up. The big problem with microblogging is that it doesn’t really scale well once I have a longer thought. Of the microblogging platforms I’ve tried so far, Pleroma has been the most comfortable because I can set the character limit per post to 1024. 1024 seems to be about the natural length of a short post for me. It’s an anachronism that essayists like Scott Alexander are now frequently regarded as the archetypal blogger. When blogs were popular it was on platforms like LiveJournal that encouraged frequent short posts. Even blogs like LessWrong that were infamous for their verbosity were built on the back of short, frequent posts (e.g. The Parable of the Dagger). Short frequent posts are what builds the audience that lets you get readers for your longer thoughts. They’re not optional and ideally I wouldn’t have to publish them on a separate system that readers have to manage as a 2nd subscription.
- Subscription: It should be easy for my audience to follow me, get updates when I’ve changed something, and engage only with the content they haven’t seen before. This is one of the primary points on which wikis fall down. When I was subscribed to Gwern’s RSS feed it would send you a diff of changes from one page update to the next. I remember finding these diffs inscrutable so I would have to visit the original page to find what changed. But the page is large so finding it meant awkwardly doing control-f to search for a string from the diff and in practice I just didn’t follow updates to Gwern’s site.
- Machine Readability: One of the things that’s changed since I last did regular blogging is the widespread training and deployment of large language models such as GPT-N. I write to multiply the number of agents with my knowledge and values in the universe, so I would obviously like to be included in training sets. Any platform that will try to “protect” my writing from its purpose is working against me. I don’t want any login walls, paywalls, anti-scraping ‘features’, etc. My writing should be easily indexed by search engines and included in any AI training set that wants it.
- Easily Mirrored: I would like to avoid any features that mean my writing is hard to archive, mirror on distributed protocols like IPFS, or even right click and save. This along with machine readability implies I want simply formatted static text files, minimizing server side scripting and JavaScript gunk.
Keeping all these properties in mind, we can roughly score different platforms on how well they satisfy each on a scale from 1-5 [1]:
| Platform | Design | Accumulation | Length | Subs | Indexing | Mirror | Total |
|---|---|---|---|---|---|---|---|
| My Site | 3 | 3 | 2 | 4 | 4 | 5 | 21 |
| gwern.net | 5 | 4 | 2 | 2 | 5 | 5 | 23 |
| Medium | 4 | 3 | 2 | 3 | 3 | 3 | 18 |
| Substack | 4 | 3 | 2 | 3 | 3 | 3 | 18 |
| DreamWidth | 3 | 2 | 3 | 3 | 4 | 4 | 19 |
|  | 4 | 2 | 2 | 3 | 2 | 3 | 16 |
| Pleroma | 5 | 2 | 3 | 4 | 3 | 4 | 21 |
| BlueSky | 4 | 2 | 2 | 2 | 2 | 4 | 16 |
To me the clear standouts are gwern.net and Pleroma. While gwern.net is probably the best designed personal website on the Internet, it still doesn’t have a good answer to subscriptions and the short posting that builds audience. And while Pleroma is the best microblogging platform, it still doesn’t let my writing accumulate properly or scale my efforts from a basic observation to a full article. I imagine the best blog platform design would start from one of these templates and evolve to include what it’s lacking from the other.
I can sketch a proposal from either direction.
Towards An Adequate Platform From Gwern’s Direction
Ultimately the problem that has to be solved with a site like Gwern’s is how you let writing accumulate while still making it possible to ergonomically follow your new ideas. When new ideas get lumped into a large corpus of existing ideas, it becomes hard to parse what’s new unless a reader is intimately familiar with the existing content. And on a site that explores many ideas it is simply not feasible to expect readers to be familiar with every subject covered. From one perspective this is a nonissue: after all, if a reader doesn’t know about a certain subject, they can just not read that update. From an audience-building perspective, however, it’s a real problem: consistent, frequent updates are how a readership is built. And without a readership, the influence of writing is limited.
To get straight to the point, I think the problem that has to be solved is this: updates want to be their own kind of self-contained document. This is why the blog format won out over the older style of personal site with topic pages in the first place: blogging encourages the author to write about updates to their ideas as self-contained posts. This of course has the downside that your writing doesn’t accumulate into full articles, but memetically, if this style of writing builds audience and topic pages don’t, then the popular sites on the Internet are going to be blogs. The high-effort way to square this circle is to simply write the content once for your ongoing article, and then again as an update or blog post. But if the abandoned changes tab on Gwern’s site is anything to go by, it’s simply not sustainable to write content twice. Writing is difficult enough the first time, so it would be ideal if we only did it once.
I see two realistic ways to avoid writing articles and updates as redundant content. The first is to just have AI do it: services like ChatGPT are now advanced enough that you can plausibly feed them the diffs of your page and ask them to write a mediocre summary post about the parts that have changed. This won’t exactly be riveting reading, and you’ll have to review the output to make sure the machine doesn’t subtly botch it, but at least it’s more accessible than a raw diff. The second way is to write your ideas in some kind of update format to begin with that’s easy to merge into larger essays and articles. This is what I was trying to do with Liber Augmen, but I never really figured out the “merge into larger essays and articles” part. I’m not even really sure it’s possible with any less effort than just writing the content twice, honestly.
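To make the first option a bit more concrete, here is a minimal sketch of the plumbing involved: compute a diff between two versions of a page and wrap it in a prompt. The function names, the file name, and the prompt wording are all made up for illustration, and the step of actually sending the prompt to a model is left out because it depends on whichever service you use.

```python
import difflib


def page_diff(old: str, new: str, name: str = "page.md") -> str:
    """Return a unified diff between two versions of a page ("" if unchanged)."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{name} (old)",
        tofile=f"{name} (new)",
    )
    return "".join(lines)


def build_summary_prompt(title: str, diff: str) -> str:
    """Wrap a diff in instructions asking a model for a short update post."""
    if not diff:
        raise ValueError("no changes to summarize")
    # The wording here is a placeholder, not a tested prompt.
    return (
        f"The article '{title}' was updated. Below is a unified diff.\n"
        "Write a short, self-contained update post describing what changed "
        "for a reader who has not read the full article.\n\n"
        f"{diff}"
    )


if __name__ == "__main__":
    old = "Blogs accumulate poorly.\nWikis are hard to follow.\n"
    new = "Blogs accumulate poorly.\nWikis are hard to follow without summaries.\n"
    print(build_summary_prompt("Platforms", page_diff(old, new)))
```

Whatever comes back from the model would still need the manual review described above before it goes out as a post.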
Towards An Adequate Platform From Pleroma’s Direction
My first big problem with Pleroma is that there’s a limit on how many posts I can pin. I know that people traditionally fix this problem by having a “thread of threads” or “best posts” thread that they update with the posts and threads they consider good, but this seems ergonomically tedious to me for both the author and the reader. Ultimately what I want is to be able to do something like Paul Graham’s simple list of posts for the writing that is not ephemera. And of course if you want to actually accumulate your writing into articles you’ll need to solve the update problem now that you have forms of writing on your platform that are not updates.
On the other hand, LLMs also bring a new affordance to microblogging. Namely: It’s not clear that you even need to accumulate your writing into articles and topic pages 5 minutes from now. Given the sheer economic demand for a chatbot that can be customized to answer questions about a corpus (e.g. software documentation), it’s reasonable to expect that if it’s technically possible with current models it will happen soon. And given the common interest people have in maintaining control over their own documentation, if it can be done with open local models it will be. I can imagine a scenario where instead of accumulating into articles you just accumulate into a LoRA and publish it to be used with some standard local model(s) that people can ask questions. The LoRA and the corpus used to train it are updated from a known repository that is published much like git repositories are now. It may in fact literally be a git repository using the Git Large File Storage (LFS) extension.
Implementation Thoughts
I continue to be struck by the fundamental simplicity of what a ‘microblog’ is. Ultimately something like a personal Twitter feed could be represented by indices of flat text files hosted on IPFS. Each file could contain a 280-character chunk with authorship information, who is being replied to, and pointers to other IPFS chunks with the rest of the thread. Authentication could be handled through cryptographic message signing. The only part that would need to be dynamically served is the updated post index for a user. The post index can be handled with the traditional domain name system and a Python script that updates what IPFS hash it points to upon receipt of a signed message. Things like recommendation algorithms would be dynamic layers on top that users subscribe to in their client separately from the core messaging service. From the perspective of the user nothing would change. All the information is still there, and can be rendered exactly as it was before using the same interfaces.
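A rough sketch of that data model, using only the Python standard library, might look like the following. Several things here are stand-ins and not the real thing: a plain dict plays the role of IPFS, a SHA-256 hex digest plays the role of an IPFS content identifier, and HMAC with a shared secret plays the role of public-key message signing (a real system would use something like Ed25519 signatures). The names `Chunk`, `publish_thread`, `update_index`, and `read_thread` are hypothetical.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional

MAX_CHARS = 280  # chunk size from the description above


@dataclass(frozen=True)
class Chunk:
    author: str
    text: str
    reply_to: Optional[str] = None    # hash of the post being replied to
    next_chunk: Optional[str] = None  # hash of the next chunk in the thread

    def serialize(self) -> bytes:
        if len(self.text) > MAX_CHARS:
            raise ValueError("chunk text exceeds 280 characters")
        # Sorted keys keep the bytes, and so the hash, deterministic.
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")


def content_hash(data: bytes) -> str:
    # Stand-in for an IPFS content identifier (assumption, not a real CID).
    return hashlib.sha256(data).hexdigest()


def sign(secret: bytes, message: bytes) -> str:
    # Stand-in for a public-key signature (assumption: shared-secret HMAC).
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def publish_thread(store: Dict[str, bytes], author: str, text: str,
                   reply_to: Optional[str] = None) -> str:
    """Split text into chunks, store each by hash, return the first chunk's hash."""
    pieces = [text[i:i + MAX_CHARS] for i in range(0, len(text), MAX_CHARS)] or [""]
    next_hash = None
    # Walk backwards so each chunk can point at the one after it.
    for i in range(len(pieces) - 1, -1, -1):
        chunk = Chunk(author, pieces[i], reply_to if i == 0 else None, next_hash)
        data = chunk.serialize()
        next_hash = content_hash(data)
        store[next_hash] = data
    return next_hash


def update_index(index: Dict[str, str], secrets: Dict[str, bytes],
                 user: str, head_hash: str, signature: str) -> bool:
    """Point the user's index at a new hash only if the signature verifies."""
    expected = sign(secrets[user], head_hash.encode("utf-8"))
    if not hmac.compare_digest(expected, signature):
        return False
    index[user] = head_hash
    return True


def read_thread(store: Dict[str, bytes], head_hash: str) -> str:
    """Follow next_chunk pointers from the head and reassemble the text."""
    parts, current = [], head_hash
    while current is not None:
        chunk = json.loads(store[current])
        parts.append(chunk["text"])
        current = chunk["next_chunk"]
    return "".join(parts)
```

In this sketch `update_index` is the only piece that has to run on a server; everything in the store is immutable once written, which is what would make it easy to mirror.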
BlueSky seems to be the closest thing to this. An implementation of the BlueSky protocol based mostly on IPFS and flat text files would be a strong way to implement the updates portion of the platform. I know I gave BlueSky a low score in my rubric but that’s mostly because it’s more or less modeled after Twitter. It being in private alpha means it also shares the access problems Twitter has but worse. However from an implementation standpoint the BlueSky protocol was designed with IPFS and static hosting in mind, so that almost the entire protocol can function without dynamic rendering. The use of traditional DNS to verify distributed identity is a clever way to bootstrap the network. And the fact that it lets you participate in the BlueSky network directly might make it appealing over just a plain RSS feed. The RSS feed should of course continue to be offered.
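As a sketch of how DNS-based identity checking could work: my understanding is that the protocol behind BlueSky has a handle's owner publish a TXT record under `_atproto.<handle>` whose value starts with `did=`, but treat both the record name and the prefix as assumptions to check against the current spec. The Python standard library has no TXT lookup, so this example takes the record values as input and does not query DNS; the function names are hypothetical.

```python
from typing import Iterable, Optional

TXT_PREFIX = "did="  # assumed value prefix, see the note above


def txt_record_name(handle: str) -> str:
    """Name of the TXT record to look up for a handle (assumed convention)."""
    return f"_atproto.{handle.strip().rstrip('.').lower()}"


def did_from_txt(values: Iterable[str]) -> Optional[str]:
    """Extract the identifier from TXT values, or None if missing or ambiguous."""
    dids = {
        v.strip()[len(TXT_PREFIX):]
        for v in values
        if v.strip().startswith(TXT_PREFIX)
    }
    # Conflicting records are treated the same as no record at all.
    return dids.pop() if len(dids) == 1 else None


def handle_matches(claimed_did: str, txt_values: Iterable[str]) -> bool:
    """True if the domain's TXT records vouch for the claimed identifier."""
    return did_from_txt(txt_values) == claimed_did
```

The appeal of this design is that whoever controls the domain controls the identity, with no extra registry to run.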
[1] I’m deliberately ignoring things like network effects because my intuition is that overfocusing on them distorts incentives. I can afford not to maximize the size of my audience and should spend that on prosocial moves like avoiding platform lock-in.