1/4
to mean "first out of four".
@bender@twtxt.net I try to avoid editing. I guess I would write 5/4, 6/4, etc, and hopefully my audience would be sympathetic to my failing.
Anyway, I donât think my eccentric decision to number my twts in the style of other social media platforms is the only context where someone might write Âź not meaning a quarter. E.g. January 4, to Americans.
Iâm happy to keep overthinking this for as long as you are :-P
@bender@twtxt.net @prologic@twtxt.net Iâm not exactly asking yarnd to change. If you are okay with the way it displayed my twts, then by all means, leave it as is. I hope you wonât mind if I continue to write things like 1/4
to mean âfirst out of fourâ.
What has text/markdown
got to do with this? I donât think Markdown says anything about replacing 1/4
with Âź, or other similar transformations. Itâs not needed, because Âź is already a unicode character that can simply be directly inserted into the text file.
Whatâs wrong with my original suggestion of doing the transformation before the text hits the twtxt.txt file? @prologic@twtxt.net, I think it would achieve what you are trying to achieve with this content-type thing: if someone writes 1/4
on a yarnd instance or any other client that wants to do this, it would get transformed, and other clients simply wouldnât do the transformation. Every client that supports displaying unicode characters, including Jenny, would then display Âź as Âź.
Alternatively, if you prefer yarnd to pretty-print all twts nicely, even ones from simpler clients, thatâs fine too and you donât need to change anything. My 1/4
-> Âź thing is nothing more than a minor irritation which probably isnât worth overthinking.
@prologic@twtxt.net Iâm not a yarnd user, so it doesnât matter a whole lot to me, but FWIW Iâm not especially keen on changing how I format my twts to work around yarndâs quirks.
I wonder if this kind of postprocessing would fit better between composing (via yarndâs UI) and publishing. So, if a yarnd user types Âź, it could get changed to Âź in the twtxt.txt file for everyone to see, not just people reading through yarnd. But when I type Âź, meaning first out of four, as a non-yarnd user, the meaning wouldnât get corrupted. I can always type Âź directly if thatâs what I really intend.
(This twt might be easier to understand if you read it without any transformations :-P)
Anyway, again, Iâm not a yarnd user, so do what you will, just know you might not be seeing exactly what I meant.
Recent #fiction #scifi #reading:
The Memory Police by YĹko Ogawa. Lovely writing. Very understated; reminded me of Kazuo Ishiguro. Sort of like Nineteen Eighty-Four but not. (I first heard it recommended in comparison to that work.)
Subcutanean by Aaron Reed; https://subcutanean.textories.com/ . Every copy of the book is different, which is a cool idea. I read two of them (one from the library, actually not different from the other printed copies, and one personalized e-book). I donât read much horror so managed to be a little creeped out by it, which was fun.
The Wind from Nowhere, a 1962 novel by J. G. Ballard. A random pick from the sci-fi section; I think I picked it up because it made me imagine some weird 4-dimensional effect (âfrom nowhereâ meaning not in a normal direction) but actually (spoiler) it was just about a lot of wind for no reason. The book was moderately entertaining but there was nothing special about it.
Currently reading Scale by Greg Egan and Inversion by Aric McBay.
More thoughts about changes to twtxt (as if we havenât had enough thoughts):
- There are lots of great ideas here! Is there a benefit to putting them all into one document? Seems to me this could more easily be a bunch of separate efforts that can progress at their own pace:
1a. Better and longer hashes.
1b. New possibly-controversial ideas like edit: and delete: and location-based references as an alternative to hashes.
1c. Best practices, e.g. Content-Type: text/plain; charset=utf-8
1d. Stuff already described at dev.twtxt.net that doesnât need any changes.
We wonât know what will and wonât work until we try them. So Iâm inclined to think of this as a bunch of draft ideas. Maybe later when weâve seen it play out it could make sense to define a group of recommended twtxt extensions and give them a name.
Another reason for 1 (above) is: I like the current situation where all you need to get started is these two short and simple documents:
https://twtxt.readthedocs.io/en/latest/user/twtxtfile.html
https://twtxt.readthedocs.io/en/latest/user/discoverability.html
and everything else is an extension for anyone interested. (Deprecating non-UTC times seems reasonable to me, though.) Having a big long âtwtxt v2â document seems less inviting to people looking for something simple. (@prologic@twtxt.net you mentioned an anonymous comment âyouâve ruined twtxtâ and while I donât completely agree with that commenterâs sentiment, I would feel like twtxt had lost something if it moved away from having a super-simple core.)All that being said, these are just my opinions, and Iâm not doing the work of writing software or drafting proposals. Maybe I will at some point, but until then, if youâre actually implementing things, youâre in charge of what you decide to make, and Iâm grateful for the work.
#fzf is the new emacs: a tool with a simple purpose that has evolved to include an #email client. https://sr.ht/~rakoo/omail/
Iâm being a little silly, of course. fzf doesnât actually check your email, but it appears to be basically the whole user interface for that mail program, with #mblaze wrangling the emails.
Iâve been thinking about how I handle my email, and am tempted to make something similar. (When I originally saw this linked the author was presenting it as an example tweaked to their own needs, encouraging people to make their own.)
This approach could surely also be combined with #jenny, taking the place of (neo)mutt. For example mblazeâs mthread tool presents a threaded discussion with indentation.
@lyse@lyse.isobeef.org Iâd suggest making the whole content-type thing a SHOULD, to accommodate people just using some hosting service they donât have much control over. (The same situation could make detecting followers hard, but IMO âplease email me if you follow meâ is still legit twtxt, even if inconvenient.)
@prologic@twtxt.net Thanks for writing that up!
I hope it can remain a living document (or sequence of draft revisions) for a good long time while we figure out how this stuff works in practice.
I am not sure how I feel about all this being done at once, vs. letting conventions arise.
For example, even today I could reply to twt abc1234 with â(#abc1234) Edit: âŚâ and I think all you humans would understand it as an edit to (#abc1234). Maybe eventually it would become a common enough convention that clients would start to support it explicitly.
Similarly we could just start using 11-digit hashes. We should iron out whether itâs sha256 or whatever but thereâs no need get all the other stuff right at the same time.
I have similar thoughts about how some users could try out location-based replies in a backward-compatible way (append the replyto: stuff after the legacy (#hash) style).
However I recognize that Iâm not the one implementing this stuff, and itâs less work to just have everything determined up front.
Misc comments (I havenât read the whole thing):
Did you mean to make hashes hexadecimal? You lose 11 bits that way compared to base32. Iâd suggest gaining 11 bits with base64 instead.
âClients MUST preserve the original hashâ â do you mean they MUST preserve the original twt?
Thanks for phrasing the bit about deletions so neutrally.
I donât like the MUST in âClients MUST follow the chain of reply-to referencesâŚâ. If someone writes a client as a 40-line shell script that requires the user to piece together the threading themselves, IMO we shouldnât declare the client non-conforming just because they didnât get to all the bells and whistles.
Similarly I donât like the MUST for user agents. For one thing, you might want to fetch a feed without revealing your identty. Also, it raises the bar for a minimal implementation (Iâm again thinking again of the 40-line shell script).
For âwho followsâ lists: why must the long, random tokens be only valid for a limited time? Do you have a scenario in mind where they could leak?
Why canât feeds be served over HTTP/1.0? Again, thinking about simple software. I recently tried implementing HTTP/1.1 and it wasnât too bad, but 1.0 would have been slightly simpler.
Why get into the nitty-gritty about caching headers? This seems like generic advice for HTTP servers and clients.
Iâm a little sad about other protocols being not recommended.
I donât know how I feel about including markdown. I donât mind too much that yarn users emit twts full of markdown, but Iâm more of a plain text kind of person. Also it adds to the length. I wonder if putting a separate document would make more sense; that would also help with the length.
@david@collantes.us Thanks, thatâs good feedback to have. I wonder to what extent this already exists in registry servers and yarn pods. I havenât really tried digging into the past in either one.
How interested would you be in changes in metadata and other comments in the feeds? Iâm thinking of just permanently saving every version of each twtxt file that gets pulled, not just the twts. It wouldnât be hard to do (though presenting the information in a sensible way is another matter). Compression should make storage a non-issue unless someone does something weird with their feed like shuffle the comments around every time I fetch it.
@prologic@twtxt.net Do you have a link to some past discussion?
Would the GDPR would apply to a one-person client like jenny? I seriously hope not. If someone asks me to delete an email they sent me, I donât think I have to honour that request, no matter how European they are.
I am really bothered by the idea that someone could force me to delete my private, personal record of my interactions with them. Would I have to delete my journal entries about them too if they asked?
Maybe a public-facing client like yarnd needs to consider this, but that also bothers me. I was actually thinking about making an Internet Archive style twtxt archiver, letting you explore past twts, including long-dead feeds, see edit histories, deleted twts, etc.
I wrote some code to try out non-hash reply subjects formatted as (replyto ), while keeping the ability to use the existing hash style.
I donât think we need to decide all at once. If clients add support for a new method then people can use it if they like. The downside of course is that this costs developer time, so I decided to invest a few hours of my own time into a proof of concept.
With apologies to @movq@www.uninformativ.de for corrupting jennyâs beautiful code. I donât write this expecting you to incorporate the patch, because it does complicate things and might not be a direction you want to go in. But if you like any part of this approach feel free to use bits of it; I release the patch under jennyâs current LICENCE.
Supporting both kinds of reply in jenny was complicated because each email can only have one Message-Id, and because itâs possible the target twt will not be seen until after the twt referencing it. The following patch uses an sqlite database to keep track of known (url, timestamp) pairs, as well as a separate table of (url, timestamp) pairs that havenât been seen yet but are wanted. When one of those âwantedâ twts is finally seen, the mail file gets rewritten to include the appropriate In-Reply-To header.
Patch based on jenny commit 73a5ea81.
https://www.falsifian.org/a/oDtr/patch0.txt
Not implemented:
- Composing twts using the (replyto âŚ) format.
- Probably other important things Iâm forgetting.
@prologic@twtxt.net I wouldnât want my client to honour delete requests. I like my computerâs memory to be better than mine, not worse, so it would bug me if I remember seeing something and my computer canât find it.
Hmm, but yarnd also isnât showing these twts as being part of a thread. @prologic@twtxt.net you said yarnd respects customs subjects. Shouldnât these twts count as having a custom subject, and get threaded together?
@sorenpeter@darch.dk I like this idea. Just for fun, Iâm using a variant in this twt. (Also because Iâm curious how it non-hash subjects appear in jenny and yarn.)
URLs can contain commas so I suggest a different character to separate the url from the date. Is this twt Iâve used space (also after âreplytoâ, for symmetry).
I think this solves:
- Changing feed identities: although @mckinley@twtxt.net points out URLs can change, I think this syntax should be okay as long as the feed at that URL can be fetched, and as long as the current canonical URL for the feed lists this one as an alternate.
- editing, if you donât care about message integrity
- finding the root of a thread, if youâre not following the author
An optional hash could be added if message integrity is desired. (E.g. if you donât trust the feed author not to make a misleading edit.) Other recent suggestions about how to deal with edits and hashes might be applicable then.
People publishing multiple twts per second should include sub-second precision in their timestamps. As you suggested, the timestamp could just be copied verbatim.
(#hash;#originalHash)
would also work.
Maybe Iâm being a bit too purist/minimalistic here. As I said before (in one of the 1372739 posts on this topic â or maybe I didnât even send that twt, I donât remember đ ), I never really liked hashes to begin with. They arenât super hard to implement but they are kind of against the beauty of the original twtxt â because you need special client support for them. Itâs not something that you could write manually in your
twtxt.txt
file. With @sorenpeter@darch.dkâs proposal, though, that would be possible.
Tangentially related, I was a bit disappointed to learn that the twt subject extension is now never used except with hashes. Manually-written subjects sounded so beautifully ad-hoc and organic as a way to disambiguate replies. Maybe Iâll try it some time just for fun.
@prologic@twtxt.net earlier you suggested extending hashes to 11 characters, but hereâs an argument that they should be even longer than that.
Imagine I found this twt one day at https://example.com/twtxt.txt :
2024-09-14T22:00Z Useful backup command: rsync -a â$HOMEâ /mnt/backup
and I responded with â(#5dgoirqemeq) Thanks for the tip!â. Then Iâve endorsed the twt, but it could latter get changed to
2024-09-14T22:00Z Useful backup command: rm -rf /some_important_directory
which also has an 11-character base32 hash of 5dgoirqemeq. (Iâm using the existing hashing method with https://example.com/twtxt.txt as the feed url, but Iâm taking 11 characters instead of 7 from the end of the base32 encoding.)
Thatâs what I meant by âspoofingâ in an earlier twt.
I donât know if preventing this sort of attack should be a goal, but if it is, the number of bits in the hash should be at least two times log2(number of attempts we want to defend against), where the âtwo timesâ is because of the birthday paradox.
Side note: current hashes always end with âaâ or âqâ, which is a bit wasteful. Maybe we should take the first N characters of the base32 encoding instead of the last N.
Code I used for the above example: https://fossil.falsifian.org/misc/file?name=src/twt_collision/find_collision.c
I only needed to compute 43394987 hashes to find it.
@lyse@lyse.isobeef.org This looks like a nice way to do it.
Another thought: if clients canât agree on the url (for example, if we switch to this new way, but some old clients still do it the old way), that could be mitigated by computing many hashes for each twt: one for every url in the feed. So, if a feed has three URLs, every twt is associated with three hashes when it comes time to put threads together.
A client stills need to choose one url to use for the hash when composing a reply, but this might add some breathing room if thereâs a period when clients are doing different things.
(From what I understand of jenny, this would be difficult to implement there since each pseudo-email can only have one msgid to match to the in-reply-to headers. I donât know about other clients.)
@movq@www.uninformativ.de Another idea: just hash the feed url and time, without the message content. And donât twt more than once per second.
Maybe you could even just use the time, and rely on @-mentions to disambiguate. Not sure how that would work out.
Though I kind of like the idea of twts being immutable. At least, itâs clear which version of a twt youâre replying to (assuming nobody is engineering hash collisions).
In fact, maybe your public key idea is compatible with my last point. Just come up with a url scheme that means âthis feedâs primary URL is actually a public keyâ, and then feed authors can optionally switch to that.
@prologic@twtxt.net Some criticisms and a possible alternative direction:
Key rotation. Iâm not a security person, but my understanding is that itâs good to be able to give keys an expiry date and replace them with new ones periodically.
It makes maintaining a feed more complicated. Now instead of just needing to put a file on a web server (and scan the logs for user agents) I also need to do this. What brought me to twtxt was its radical simplicity.
Instead, maybe we should think about a way to allow old urls to be rotated out? Like, my metadata could somehow say that X used to be my primary URL, but going forward from date D onward my primary url is Y. (Or, if you really want to use public key cryptography, maybe something similar could be used for key rotation there.)
Itâs nice that your scheme would add a way to verify the twts you download, but https is supposed to do that anyway. If you donât trust https to do that (maybe you donât like relying on root CAs?) then maybe your preferred solution should be reflected by your primary feed url. E.g. if you prefer the security offered by IPFS, then maybe an IPNS url would do the trick. The fact that feed locations are URLs gives some flexibility. (But then rotation is still an issue, if I understand ipns right.)
@movq@www.uninformativ.de @prologic@twtxt.net Another option would be: when you edit a twt, prefix the new one with (#
@movq@www.uninformativ.de thanks for getting to the bottom of it. @prologic@twtxt.net is there a way to view yarndâs copy of the raw twt? The edit didnât result in a visible change; being able to see what yarnd originally downloaded would have helped me debug.
@prologic@twtxt.net I guess I thought they were search engines. Anyway, the registry API looks like a decent one for searching for tweets. Could/should yarn.social pods implement the same API?
@movq@www.uninformativ.de Thanks! Looking forward to trying it out. Sorry for the silence; I have become unexpectedly busy so no time for twtxt these past few days.
(@anth@a.9srv.netâs feed almost never works, but I keep it because they told me they want to fix their server some time.)
I guess I can configure neomutt to hide the feeds I donât care about.
@movq@www.uninformativ.de Is there a good way to get jenny to do a one-off fetch of a feed, for when you want to fill in missing parts of a thread? I just added @slashdot@feeds.twtxt.net to my private follow file just because @prologic@twtxt.net keeps responding to the feed :-P and I want to know what heâs commenting on even though I donât want to see every new slashdot twt.
@bender@twtxt.net Based on my experience so far, as a user, I would be upset if my client dropped someone from my follower list, i.e. stopped fetching their feed, without me asking for that to happen.
@bender@twtxt.net Iâm not a yarnd user, but automatically unfollowing on 404 doesnât seem right. Besides @lyse@lyse.isobeef.orgâs example, I could imagine just accidentally renaming my own twtxt file, or forgetting to push it when I point my DNS to a new web server. Iâd rather not lose all my yarnd followers in a situation like that (and hopefully they feel the same).
@movq@www.uninformativ.de The success of large neural nets. People love to criticize todayâs LLMs and image models, but if you compare them to what we had before, the progress is astonishing.
Does anyone care about the 140-char limit recommended by the #twtxt spec? I have been trying to respect it but wonder if itâs wasted effort.
Hello twtxt! Iâm James (or @falsifian@www.falsifian.org). I live in Toronto. Recent interests include space complexity, simple software, and science fiction.