#elliottcable on 2019-08-24 — irc logs at freenode.irclog.whitequark.org

2017-02-04 23:23 ec changed the topic of #elliottcable to: a 𝕯𝖊𝖓 𝖔𝖋 𝕯𝖊𝖙𝖊𝖗𝖒𝖎𝖓𝖊𝖉 𝕯𝖆𝖒𝖘𝖊𝖑𝖘 slash s͔̞u͕͙p͙͓e̜̺r̼̦i̼̜o̖̬r̙̙ c̝͉ụ̧͘ḷ̡͙ţ͓̀ || #ELLIOTTCABLE is not about ELLIOTTCABLE

01:47 Sgeo_ has quit [Read error: Connection reset by peer]

02:08 Sgeo has joined #elliottcable

02:22 <ec> i wrote a thing, if anybody wants to proofread my technical writing, lol

02:23 <ec> https://twitter.com/ELLIOTTCABLE/status/1165084203039563776

02:23 <ec> 'specially if you don't Know Unicode Crap that well.

02:24 <ec> sigh i love and miss programming

02:51 <ljharb> ec: interesting. so this is for when you’re calling directly into Ocaml-compiled native modules?

02:52 <ec> ehhhhhhm, close. I see where I went wrong with the word "native"

02:52 <ec> there's no native component here; but there's code that was *written for* native platforms.

02:52 <ljharb> but invoked in node? or, js compiled to run in Ocaña

02:52 <ec> which you, the (ab)user, are compiling via BuckleScript to JavaScript instead. and thus need my little horror to adapt it.

02:52 <ljharb> ocaml

02:53 <ljharb> so is this lib something the compiler could inject?

02:53 <ec> I wish lol

02:53 <ljharb> but like ideally

02:53 <ec> so now that I have it working, and can publish working libraries for Current Real-World BuckleScript stuff, as I need to,

02:53 <ec> I'm definitely going to go complain to People

02:53 <ljharb> presumably you could write a babel transform that could be applied to the js output tho

02:53 <ec> but part of the problem is that this is a Can't-Make-Everyone-Happy situation

02:53 <ljharb> so that nobody ever has to manually use your lib

02:54 <ec> out of BuckleScript users, there's "people writing their JavaScript in OCaml", and then there's "people compiling their OCaml to JavaScript" — which sound similar, but aren't the same.

02:54 <ec> the former group know, and expect, one string-semantic; the latter group another

02:55 <ec> the real problem here, it boils down, is not BuckleScript, or JavaScript; it's OCaml. OCaml *doesn't have* a Unicode-handling story. All the UTF-8 handling stuff is very … ad-hoc, and ‘just what people do.’

02:55 <ec> there actually *is* no String type (in the JavaScript, fully-featured, Unicode-aware sense) in OCaml; only a "char array" type that's *named* `string`.

02:56 <ec> unfortunately, people expect to. y'know. use strings. and do string-y stuff with them. so, BuckleScript took a reasonable-if-annoying stance of "We're gonna leverage all of the JavaScript string-machinery, so most of the time, things function as you expect … and so code transpiles to clean, minimal, obvious operations"

02:56 <ec> but, yeah, that totally fucks up Unicode-handling in all these ancient rickety OCaml libraries.
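
A minimal sketch of the mismatch ec describes, assuming the OCaml source text is UTF-8: the same text has one length as a JavaScript string (UTF-16 code units) and another as OCaml-style bytes. `measure` is a hypothetical helper written for this illustration, not part of BuckleScript or any library.

```typescript
// Hypothetical helper: report three different "lengths" of one JS string.
function measure(s: string): { codeUnits: number; codePoints: number; utf8Bytes: number } {
  let codePoints = 0;
  let utf8Bytes = 0;
  for (let i = 0; i < s.length; i++) {
    let cp = s.charCodeAt(i);
    // Combine a high/low surrogate pair into a single code point.
    if (cp >= 0xd800 && cp <= 0xdbff && i + 1 < s.length) {
      const low = s.charCodeAt(i + 1);
      if (low >= 0xdc00 && low <= 0xdfff) {
        cp = 0x10000 + ((cp - 0xd800) << 10) + (low - 0xdc00);
        i++;
      }
    }
    codePoints++;
    // Size of this code point once encoded as UTF-8
    // (a lone surrogate is counted as 3 bytes here, purely for simplicity).
    utf8Bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return { codeUnits: s.length, codePoints, utf8Bytes };
}

// "é" (U+00E9) followed by "𝕯" (U+1D56F, outside the Basic Multilingual Plane).
const m = measure("\u00e9\ud835\udd6f");
// m is { codeUnits: 3, codePoints: 2, utf8Bytes: 6 }
```

Byte-oriented OCaml code would expect the answer 6 here, while a JavaScript-backed string reports 3, so any library that indexes by byte offset can end up reading the wrong position.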

02:57 <ec> in an ideal world it's not BuckleScript, or me, that comes up with a solution, but the *OCaml* community.

02:58 <ec> I'm trying to find a venerable GitHub Issue about this

02:59 <ec> but yeah *ideally* we'd collectively stop using, and maybe even eventually deprecate, the `string` type. (we've already started this in a different direction, for a different reason, with the new `bytes` type.) and have real, type-level encoding information and tooling ......

03:00 <ec> which is exactly the painful transition Ruby made from 1.8 to 1.9, btw. this is a well-documented growing pain for language designers: turns out, you can't make a language without already knowing Literally Everything about encoding and human language and ughhhhhhhhhhhhhhhhhh; otherwise, you're just, just *gonna* have to rebuild everything from scratch after community input from people who Actually Know Encoding

03:00 <ec> Thingies™

04:12 <ljharb> i mean tho, how did the ocaml designers not know about this

04:13 <ec> you mean, in the '80s, before Unicode existed? :P

04:13 <ljharb> is ocaml that old?

04:14 <ec> that's a somewhat facetious response, of course; OCaml, as opposed to the progenitor languages it extended, is younger than that … but also *not* so much, because Unicode also wasn't actually, well, universal, for a long time

04:14 <ec> *but* that said it's not just a matter of knowing Unicode exists. It's more … 'how do we allow developers to ergonomically deal with the real-world landscape of encodings?'

04:15 <ec> which is just a specific instantiation of the single, only Language Development Question that encompasses all language decisions: ‘How much do we hide from our user? How much do we abstract, how much power do we take away for their safety?’

04:16 <ec> what it boils down to is people building *programming* languages are somewhat rarely *human*-language nerds; and tend to belong to the tribe of programmers born of Silicon Valley: "eh, I can type 'LOL', it's good enough"

04:17 <ec> aaaaaaaaand then their languages grow and gain users that have to deal with real-world things like higher-plane glyphs, combining characters, legacy encodings or even outright malformed input, interoperability with systems that won't transit *well*-formed output … and those users get pissed, and kinda by definition-of-the-problem the language is now popular and established enough that those mistakes can't be unmade …

04:17 <ec> aaaaaaaaaaaand now your popular tool is a part of that ecosystem-of-other-shitty-tools-making-encoding-horrible-for-everyone, doing its very darnedest to make everything worse for everybody. great!

04:18 <ec> tl;dr I strongly respect Ruby for literally making the first large breaking-backwards-compatibility (1.8 to 1.9, after what, more than a decade? woah.) because the maintainers finally realised how important this was to The World As A Whole, lol

04:19 <ec> ANYWAY re: ocaml specifically: this is fixable of course, but OCaml is a community of crotchety academics, prolly mostly white, prolly mostly male, not exactly brimming with SJW culture and wokeness … everyone seems to think "uhhh just install Camomile if you have to 'deal with' some unicode crap ... idk? worry about it when it breaks." is good enough

05:19 <jfhbrook> lgpl huh?

05:19 <ec> hm?

05:19 <jfhbrook> idk in python I have to use byte strings and unicode strings

05:20 <jfhbrook> and like ok you have to pull in a lgpl library (camomile) to get unicode strings, but now you have unicode strings and it's fine, right?

05:20 <jfhbrook> though to be fair in your case

05:20 <jfhbrook> probably every library is written to use bytestrings so you'd have to convert in and out all the time anyway

05:21 <ec> nnnnnot quite — it's more "what's the interop story". Are all 'unicode strings' UTF-8 bytes in a byte-array? that should be something the language standardises (and, ideally, provides alternatives/escape-hatches to, as well), not something Some library authors Sometimes do.

05:21 <jfhbrook> it's fair to say that standardization is useful

05:22 <jfhbrook> when something's already a de facto standard is when you need it the least tho

05:22 <ec> Daniel Bünzli had a thing on this that I'm trying to find, in the docs to one of his Unicode-handling modules

05:23 <ec> ugh anyway I've already spent too long on this today, time to actually *use* this effort I put in, back in the place where I unearthed the bug, and get something shipped 🙄

05:23 <jfhbrook> hah, I hear that

05:24 <jfhbrook> work's been a little hectic lately

05:24 <jfhbrook> I mean hectic is the wrong word - busy I guess, stressful

05:24 <ec> oh but anyway you can see one of the fallout effects of that sort of agnosticism-based choice, right now

05:24 <jfhbrook> but I've been learning emacs, that's fun!

05:24 <ec> if BuckleScript were working off of a base that *inherently* differentiated, then it could sanely compile the two things two different ways.

05:25 <jfhbrook> predictably malformed - that's good

05:25 <ec> "the byte-string type" gets compiled to array-handling JavaScript, effort can be expended to maintain semantics for existing byte-array-manipulating-OCaml-code *and* produce idiomatic output; whereas "the user-input-string type" gets compiled with encoding/decoding machinery to massage it into JavaScript UCS-2 yadda yadda yadda.

05:26 <ec> but with this design? from the *language* perspective, the two are indistinguishable. there's no way to satisfy both requirements.
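
A rough sketch of the two compilation targets ec contrasts, written by hand rather than produced by any compiler. The names `ByteString`, `Text`, `toBytes` and `toText` are hypothetical; `TextEncoder` and `TextDecoder` are the standard platform APIs, assumed to be available as globals.

```typescript
// "The byte-string type": indexable bytes, the shape byte-oriented OCaml code expects.
type ByteString = Uint8Array;

// "The user-input-string type": a native JS string (UTF-16 code units).
type Text = string;

// Boundary crossing: JS string -> UTF-8 bytes.
function toBytes(text: Text): ByteString {
  return new TextEncoder().encode(text);
}

// Boundary crossing: UTF-8 bytes -> JS string.
// `fatal: true` makes malformed input throw instead of being replaced with U+FFFD.
function toText(bytes: ByteString): Text {
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

// Byte-level access in the OCaml style becomes plain array indexing.
const raw = toBytes("\u00e9"); // "é" encodes to the two bytes 0xC3 0xA9
const firstByte = raw[0];
const roundTripped = toText(raw);
```

With only one `string` type in the source language, a compiler has no way to tell which of these two representations a given value should get, which is the point being made here.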

05:26 <ec> this exact thing is playing out with mutation — having mutable strings was causing serious problems for both the compiler and the community;

05:26 <ec> sure, you can just say "hey this is a string, and we're not gonna mutate it, and you shouldn't either", and document that at the library-level, maybe mint a type,

05:26 <ec> but that's just not the same.

05:27 <ec> finally things snapped in favour of breaking backwards-compatibility (in a really well-thought-out way, btw, imo!)

05:28 <ec> OCaml 4.02 introduced a new type, `bytes`, for mutable strings, initially just an alias to `string` … and, alongside it, an optional compiler-flag, `-safe-string`, to make `string` immutable, thus opting-in to breaking code that should have already switched from `string` to the explicit `bytes` type if it needed mutation ...

05:28 <ec> then 4.06 swapped the default, leaving iirc `-unsafe-string` to make legacy code work, but defaulting to `string` being an immutable type … and the final step is removing the flag outright, breaking code that wasn't fixed in the intervening years

05:29 <ec> I might be off by one on all those numbers idk lmao

05:29 <ec> but. I appreciated that careful approach. I think processes like that are a good candidate for a replacement for the effectively-defunct SemVer, may ye rest in peace

05:31 <ec> anybody know if you can export/import typescript types? I still don't use typescript often enough to keep any of it in my head between forays ;_;

13:40 englishm has quit [Excess Flood]

13:41 englishm has joined #elliottcable

15:17 Rurik has joined #elliottcable

15:57 <ljharb> ec: lol things that don't follow semver make me facepalm so hard

15:57 <ljharb> ec: yes, you can import and export type space values

15:57 <ljharb> ec: sadly, TS doesn't have `import type` like flow does, so you have no way of knowing lexically at the callsite
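
On the TypeScript question, a small sketch assuming a hypothetical module `shapes.ts`: types are exported with the same `export` keyword as values. The names `Point`, `Segment` and `segmentLength` are invented for illustration.

```typescript
// shapes.ts (hypothetical module)
export interface Point {
  x: number;
  y: number;
}

export type Segment = [Point, Point];

export function segmentLength(seg: Segment): number {
  const dx = seg[1].x - seg[0].x;
  const dy = seg[1].y - seg[0].y;
  return Math.sqrt(dx * dx + dy * dy);
}

// A hypothetical consumer.ts would import all three names in one statement:
//   import { Point, Segment, segmentLength } from "./shapes";
// Nothing at that import site marks `Point` and `Segment` as type-only.

const unit: Segment = [{ x: 0, y: 0 }, { x: 3, y: 4 }];
const d = segmentLength(unit);
```

The interface and the type alias are erased at compile time, so only `segmentLength` exists in the emitted JavaScript.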

17:46 Rurik has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

18:35 Sgeo_ has joined #elliottcable

18:39 Sgeo has quit [Ping timeout: 248 seconds]

20:46 Sgeo__ has joined #elliottcable

20:49 Sgeo_ has quit [Ping timeout: 245 seconds]

22:42 Sgeo has joined #elliottcable

22:44 Sgeo__ has quit [Ping timeout: 245 seconds]

23:03 Sgeo_ has joined #elliottcable

23:06 Sgeo has quit [Ping timeout: 248 seconds]