@Leaf Are there any strong arguments against adopting UTS #31 ("Identifier and Pattern Syntax") in the next version of C (C23)? Talking with someone who is apparently discussing the proposal tomorrow, and I recall you had Opinions about emoji in identifiers (which that would seem to preclude).


@aschmitz i think that's roughly what javascript does so that's where i'd look for more details/discussion regarding that

the main argument against is that it means the set of characters allowed in identifiers is constantly expanding as new unicode versions come out. the reason why languages like swift or XML don't take that approach is because they want the lexical grammar to be relatively fixed. (support for emoji is usually an accidental side effect of that, since it comes from allowing anything above U+FFFF.)


@aschmitz it's also very important to not do what python does and interpret things like mathematical characters as the same as their non-mathematical counterparts. please don't do that.
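to make that concrete, here's a small sketch of the python behaviour (python applies NFKC normalization to identifiers while parsing). the variable names `math_x` and `namespace` are just made up for the illustration:

```python
import unicodedata

# U+1D465 MATHEMATICAL ITALIC SMALL X, written as an escape so it survives copy/paste
math_x = "\U0001D465"

# NFKC (compatibility) normalization folds it into plain ASCII "x"
assert unicodedata.normalize("NFKC", math_x) == "x"

# python normalizes identifiers with NFKC when parsing, so the second
# assignment below silently overwrites the first one
namespace = {}
exec("x = 1\n" + math_x + " = 2", namespace)

print(namespace["x"])        # prints 2
print(math_x in namespace)   # prints False: the name was stored as plain "x"
```

so two names that look different in the source end up being the same variable, which is exactly the kind of folding i'd want C to avoid.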

it's been a while since i've looked at it in depth but i think the identifier syntax might be geared more towards things like URLs, where minimizing confusables is important, and not things like programming languages, where those distinctions are often relevant and important


@aschmitz i think generally my take is that you should have a good reason for disallowing characters, and programming languages generally don't. i'm not sure what the justification is for not letting people use whatever characters they like in identifiers (or at least whatever is left after excluding special characters and punctuation), or why any processing is necessary in a programming language context beyond canonical equivalence and composition/decomposition.

those justifications exist for things like URLs, where you need something which is printable and reinputtable, and probably apply to things like databases and so forth. but identifiers in a programming language generally never leave the computer. so i'm perfectly happy with basic rules like “don't start an identifier with a combining character” and “don't include punctuation or whitespace” over complicated identifier requirements.
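a rough sketch of what those basic rules could look like, using python's `unicodedata` module. the function names `is_simple_identifier` and `same_identifier` are hypothetical, and the exact set of excluded categories (including treating all ASCII symbols as off limits) is my assumption, not something from any spec:

```python
import unicodedata


def is_simple_identifier(name: str) -> bool:
    """Check a name against a deliberately small set of rules (an illustration only)."""
    if not name:
        return False
    # rule 1: don't start with a combining character (general categories Mn, Mc, Me)
    if unicodedata.category(name[0]).startswith("M"):
        return False
    for ch in name:
        if ch == "_":
            continue  # underscore is punctuation (Pc) but conventionally allowed
        # assumption: ASCII symbols such as + or $ are reserved for the language itself
        if ch.isascii() and not ch.isalnum():
            return False
        cat = unicodedata.category(ch)
        # rule 2: no punctuation (P*), no separators/whitespace (Z*), no control chars (Cc)
        if cat[0] in "PZ" or cat == "Cc":
            return False
    return True


def same_identifier(a: str, b: str) -> bool:
    """Compare under canonical equivalence only (NFC), with no compatibility folding."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(is_simple_identifier("caf\u00e9"))              # True
print(is_simple_identifier("\u0301abc"))              # False: starts with a combining mark
print(is_simple_identifier("a b"))                    # False: contains whitespace
print(same_identifier("caf\u00e9", "cafe\u0301"))     # True: canonically equivalent
print(same_identifier("x", "\U0001D465"))             # False: mathematical x stays distinct
```

the point of using NFC rather than NFKC for the comparison is that precomposed and decomposed spellings of the same text match, while mathematical letters and the like stay separate identifiers.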
