is there consensus on what characters should(n’t) be allowed in nicks? i remember reading somewhere that whitespace should not be allowed, but i don’t see it in the spec on twtxt.dev. in fact, are there any other resources on twtxt extensions outside of twtxt.dev?
@lyse@lyse.isobeef.org @movq@www.uninformativ.de bbycll’s nickname regex is /^([-_\p{N}\p{L}])+$/iu because i don’t like how english-centric only allowing ascii letters/numbers is. this only applies to local users as of now, though: currently all nicknames are tolerated when parsing remote feeds, and i just do mentions how yarn does (just the feed url)
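To make the effect of that pattern concrete, here is a small sketch that runs the quoted regex against a few sample nicks. The helper name `isValidLocalNick` and the sample nicks are made up for illustration; only the regex itself comes from the post above.

```javascript
// Nickname pattern quoted above: one or more Unicode letters (\p{L}),
// Unicode numbers (\p{N}), "-" or "_". The "u" flag enables \p{...}.
const NICK_RE = /^([-_\p{N}\p{L}])+$/iu;

// Hypothetical helper name, used only for this illustration.
function isValidLocalNick(nick) {
  return NICK_RE.test(nick);
}

console.log(isValidLocalNick("zvava"));            // true
console.log(isValidLocalNick("m\u00fcller_42"));   // true (precomposed u-umlaut is a letter)
console.log(isValidLocalNick("\u65e5\u672c\u8a9e")); // true (CJK letters)
console.log(isValidLocalNick("hello world"));      // false (whitespace)
console.log(isValidLocalNick("a.b"));              // false (punctuation)
console.log(isValidLocalNick(""));                 // false (needs at least one character)
```

One caveat: a nick typed with decomposed accents (a base letter plus a combining mark) would be rejected, because combining marks are in \p{M}, not \p{L}. Normalizing to NFC before testing is one way around that.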
in the wild, i’ve noticed a texedus feed with spaces in the nick (where its spec explicitly disallows whitespace in the nick) and feeds with other symbols in the nick too. honestly, i think we should just tolerate arbitrary nicknames for the sake of user expression (while stripping or converting unreasonable characters) and just leave them out of mentions
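A rough sketch of what "tolerate, clean up, and leave out of mentions" could look like. Everything here is an assumption rather than bbycll's actual behavior: the function names `displayNick` and `formatMention` are hypothetical, "unreasonable characters" is taken to mean control characters and odd whitespace, and the nick is kept in a mention only when it passes the local nickname regex.

```javascript
// Local nickname pattern from earlier in the thread.
const LOCAL_NICK_RE = /^([-_\p{N}\p{L}])+$/iu;

// Hypothetical: clean a remote nick for display without rejecting it.
function displayNick(raw) {
  return raw
    .normalize("NFC")
    .replace(/\s/gu, " ")     // any whitespace character becomes a plain space
    .replace(/\p{Cc}/gu, "")  // drop remaining control characters (assumed "unreasonable")
    .replace(/ +/g, " ")      // collapse runs of spaces
    .trim();
}

// Hypothetical: include the nick in a mention only if it is "safe",
// otherwise fall back to the url-only form.
function formatMention(rawNick, url) {
  const nick = displayNick(rawNick);
  return LOCAL_NICK_RE.test(nick) ? `@<${nick} ${url}>` : `@<${url}>`;
}

console.log(displayNick("  Hello\t\tWorld\u0000 "));
// "Hello World"
console.log(formatMention("Hello World", "https://example.com/twtxt.txt"));
// "@<https://example.com/twtxt.txt>"
console.log(formatMention("zvava", "https://example.com/twtxt.txt"));
// "@<zvava https://example.com/twtxt.txt>"
```

Always emitting the url-only form, as described above for the current behavior, is the simpler variant: `formatMention` would then ignore the nick entirely.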
@zvava@twtxt.net @lyse@lyse.isobeef.org @movq@www.uninformativ.de I also was wondering how to handle this.
Currently my regex is like this: /@<((?<nick>[^\s]+)\s)?(?<url>\w+:\/\/[^>]+)>/g
It takes everything until the space and the nick is optional.
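For reference, this is roughly how that regex behaves when run over a twt. The wrapper `findMentions` and the sample text are made up for illustration; the pattern is the one quoted above.

```javascript
// Mention pattern quoted above: optional nick (anything up to the first
// whitespace), then a url of the form scheme://... up to the closing ">".
const MENTION_RE = /@<((?<nick>[^\s]+)\s)?(?<url>\w+:\/\/[^>]+)>/g;

// Hypothetical helper: collect every mention in a piece of text.
function findMentions(text) {
  return Array.from(text.matchAll(MENTION_RE), (m) => ({
    nick: m.groups.nick ?? null, // null when the mention has no nick
    url: m.groups.url,
  }));
}

const twt =
  "hi @<lyse https://lyse.isobeef.org/twtxt.txt> and @<https://example.com/twtxt.txt>";
console.log(findMentions(twt));
// [ { nick: "lyse", url: "https://lyse.isobeef.org/twtxt.txt" },
//   { nick: null,   url: "https://example.com/twtxt.txt" } ]

// A nick containing a space does not match at all, because the text after
// the first space ("World ...") is not a url.
console.log(findMentions("@<Hello World http://example.com>").length); // 0
```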
@zvava@twtxt.net In tt, I recognize umlauts in nicks, but they cannot include whitespace, @, !, #, (, ), [, ], <, >, or " (but ' is okay). Whitespace also acts as a separator between nick and URL. @<Hello World http://example.com> ends up exactly like that and is not a mention.
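The rule described above can be approximated with a single pattern. This is only a sketch, not tt's actual code: the name `parseTtStyleMention` is made up, and the shape of the URL part (scheme://, no whitespace, no ">") is an assumption. Only the list of excluded nick characters comes from the description.

```javascript
// Nick: one or more characters that are not whitespace and not one of
// @ ! # ( ) [ ] < > "  (the apostrophe is deliberately allowed).
// Whitespace separates the optional nick from the url.
const TT_STYLE_MENTION = /@<(?:([^\s@!#()\[\]<>"]+)\s+)?(\w+:\/\/[^\s>]+)>/u;

// Hypothetical helper: returns { nick, url } or null if the text
// does not contain something that parses as a mention.
function parseTtStyleMention(text) {
  const m = TT_STYLE_MENTION.exec(text);
  return m ? { nick: m[1] ?? null, url: m[2] } : null;
}

console.log(parseTtStyleMention("@<lyse https://lyse.isobeef.org/twtxt.txt>"));
// { nick: "lyse", url: "https://lyse.isobeef.org/twtxt.txt" }
console.log(parseTtStyleMention("@<o'brien http://example.com>"));
// { nick: "o'brien", url: "http://example.com" }
console.log(parseTtStyleMention("@<Hello World http://example.com>"));
// null: "World" is not a url, so the text is left as-is
```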