<FromGitter>
<thelinuxlich> hey guys, help me believe in Crystal's potential: I'm converting a heavy scraper implemented in Node.js, but I found Crystal consumes the same amount of resources and is much slower... I tried optimizing everything according to the official docs, but now I feel like I'm fighting the language to get the holy grail... if you have 5 minutes to spare, please take a look at my code:
<FromGitter>
<watzon> @thelinuxlich the first advice I can give is not to use Halite if you want speed
<FromGitter>
<watzon> I was using it initially for my Telegram bot library. As soon as I refactored it out I saw a major speed boost.
<FromGitter>
<watzon> Another issue I can see is that you're spawning a new client every time a new worker is created. Actually, I see you calling `Halite.get` 3 times in each loop, which is going to create a new `Halite::Client` instance each time. It's much better to use something like https://github.com/watzon/pool to create a connection pool. Then you can check out clients and check them back in when you're done with them.
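The pool shard above has its own API; here is a minimal sketch of the same check-out/check-in idea using only the standard library, with a buffered `Channel` as the fiber-safe queue. The host and sizes are made up:

```crystal
require "http/client"

class ClientPool
  def initialize(size : Int32, host : String)
    @clients = Channel(HTTP::Client).new(size)
    size.times { @clients.send(HTTP::Client.new(host, tls: true)) }
  end

  # Check a client out, yield it, and always check it back in.
  def checkout
    client = @clients.receive # blocks until a client is free
    begin
      yield client
    ensure
      @clients.send(client)
    end
  end
end

pool = ClientPool.new(10, "example.com")
pool.checkout { |client| puts client.get("/").status_code }
```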
<FromGitter>
<watzon> Lastly (for now anyway), you should take any regular expressions/DB queries and separate them out into constants so you're not reinitializing them every time. Probably won't make the biggest difference, but it will definitely make at least some difference.
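For instance, a regex hoisted into a constant is compiled once at program start instead of on every loop iteration (a sketch; the names are illustrative, not from the actual scraper):

```crystal
LINK_RE = /href="([^"]+)"/ # compiled once, reused by every call

def extract_links(body : String) : Array(String)
  body.scan(LINK_RE).map { |md| md[1] }
end
```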
<FromGitter>
<watzon> Oh also `Logger` is deprecated
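Its replacement is the stdlib `Log` module (added in Crystal 0.34); a minimal sketch of typical usage:

```crystal
require "log"

Log.setup(:info)                # default backend at Info level
Log.info { "starting scraper" } # block form: message only built if emitted

log = Log.for("scraper")        # a named source for one component
log.warn { "retrying request" }
```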
<FromGitter>
<ImAHopelessDev_gitlab> that's a lot of damage, wait, channels
<FromGitter>
<didactic-drunk> @thelinuxlich I think most of your time is in the HTTP requests or possibly pg. How does changing WORKERS affect runtime? What does your profiler say?
<FromGitter>
<didactic-drunk> Also possibly redis or rabbit if they're not local.
<FromGitter>
<watzon> Yeah my guess is probably the HTTP requests. Halite is convenient, but it's slow. And they're creating 3 new client instances per loop.
<FromGitter>
<thelinuxlich> @watzon I thought buffering the response might help, but it doesn't; the 10 seconds are from waiting for the RabbitMQ container to come up
<FromGitter>
<thelinuxlich> @watzon thanks for the advice, I will replace Halite with raw `HTTP::Client`. If I reuse the `HTTP::Client` instance between fibers, will I have trouble?
<FromGitter>
<watzon> Ahh gotcha. Well yeah, the other suggestions will probably improve performance a lot.
<FromGitter>
<watzon> In general you need to be careful with allocations
<FromGitter>
<thelinuxlich> @didactic-drunk If I raise the number of workers further, I get a lot of SSL I/O errors, HTTP EOF errors, SSL connection reset by peer, etc.
<FromGitter>
<thelinuxlich> I think PG is not the problem because if I strip everything and just insert data it will go fast
<FromGitter>
<thelinuxlich> the body I'm running `.scan` on is big (a few MB)
<FromGitter>
<thelinuxlich> but anyway, this isn't a problem in Node.js, so it shouldn't be one in Crystal, right?
<FromGitter>
<thelinuxlich> ok, actually I was using Halite just for the redirect following
<FromGitter>
<watzon> @thelinuxlich the issue is likely a difference in underlying architecture. I'd be willing to bet that Node.js reuses connections, whereas with Halite you're creating a new instance every time.
<FromGitter>
<thelinuxlich> Node.js uses keep-alive too; I don't see how to do that in Crystal
<FromGitter>
<watzon> Unfortunately I don't think that's possible with the current `HTTP::Client`
<FromGitter>
<watzon> I believe there's an issue for it. Don't know what's holding it up.
<FromGitter>
<thelinuxlich> but even with Connection: close I don't think it should be *that* much slower; my code is like 10x slower than Node.js
<FromGitter>
<watzon> Like I said, it's most likely the recreation of clients in each loop. Each time you invoke `Halite.get`, it spins up a new `Halite::Client` instance in the background, which really slows things down and increases memory usage.
<FromGitter>
<watzon> You're potentially creating up to 3000 `Halite::Client` instances simultaneously with your 1000 workers
<FromGitter>
<watzon> Another thing that will speed things up is adding the `-Dpreview_mt` flag, if you're not already using it
<FromGitter>
<watzon> That will turn on multithreading
<FromGitter>
<wyhaines> Node.js is quite a lot slower than Crystal, so my expectation is that once you fix the architectural problems (everyone else has covered them pretty well) with your Crystal code, you will see significantly improved performance.
<FromGitter>
<didactic-drunk> More workers means more concurrent requests and more open file descriptors. You can probably hack `.close` into the HTTP requests and increase WORKERS.
<FromGitter>
<didactic-drunk> Someone had worse performance with msgpack in Crystal vs Ruby (I think). Maybe large JSON is causing a problem? Can you `curl` a file and benchmark the parsing without network calls?
<FromGitter>
<thelinuxlich> the JSONs are small; the HTTP body I run the regex on is big
<FromGitter>
<didactic-drunk> Are you compiling with `--release -Dpreview_mt`?
zorp_ has joined #crystal-lang
<FromGitter>
<didactic-drunk> I think @watzon probably solved your problems except for potential network/file descriptor issues.
<FromGitter>
<didactic-drunk> If it's still slow use a profiler.
<FromGitter>
<watzon> Yeah that could definitely do it too
<FromGitter>
<watzon> Speaking of queues, though, I'm trying to figure out my own. This question inspired me to fix arachnid and get it current, and there's a whole slew of things to update, including the way I implemented the URL queue.
<FromGitter>
<watzon> I was using `future` and putting them into a pool, then pulling them out when the pool reached a certain size, but tbh I don't know how this code ever worked lol.
<FromGitter>
<thelinuxlich> @watzon will turning on mt without doing anything else add perf?
<FromGitter>
<watzon> Perf? I can guarantee that just turning on mt won't fix your problems.
<FromGitter>
<thelinuxlich> sure, but with mt turned on, I need to do something to make the code multithreaded, right?
<FromGitter>
<naqvis> you are doing concurrent execution via Fibers; `-Dpreview_mt` will enable parallelism.
<FromGitter>
<naqvis> without this flag, all Fibers run on a single thread
<FromGitter>
<naqvis> `mt` enables multi-threaded execution of Fibers
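In practice that means the same `spawn` code runs concurrently on one thread by default and in parallel under the flag; a minimal sketch (the file name and worker count are made up):

```crystal
# crystal build app.cr --release -Dpreview_mt
# CRYSTAL_WORKERS=8 ./app   (thread pool size; the default is 4)
done = Channel(Nil).new

10.times do |i|
  spawn do
    puts "fiber #{i}" # under -Dpreview_mt these may run on different threads
    done.send(nil)
  end
end

10.times { done.receive } # wait for every fiber to finish
```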
<FromGitter>
<wyhaines> @didactic-drunk It was me who was having performance problems with msgpack. They are 100% due to how the msgpack shard implements its IO reads. So I just implemented some code myself to handle it, and zoom! Back to the races.
<FromGitter>
<wyhaines> So I use the msgpack shard ONLY for packing and unpacking data, and leave putting that stuff out to a socket and getting it back from the socket to my own code.
<FromGitter>
<naqvis> @wyhaines interesting topic. Is that inherent to the msgpack protocol, or was the implementation the bottleneck?
<FromGitter>
<wyhaines> Well, maybe a little of both. The problem with msgpack is that until you read the message, you don't know how much you have to read, so there are times when one has to advance a single byte at a time. ⏎ ⏎ I haven't dug into the Ruby library (which uses a C extension) to see if they do anything clever to optimize for this that the Crystal library isn't doing, but since I am using msgpack simply as an efficient serializer, I just wrote a little framing code that sends the length of the packed message, followed by the message itself. So it is reduced to two reads: one very small one of a fixed size to get the length, and then a second, longer one to read the packed message.
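A sketch of that framing idea, assuming the packed message is already in hand as `Bytes`; the helper names are made up and only the stdlib `IO` API is used:

```crystal
# Write: a 4-byte big-endian length prefix, then the packed payload.
def write_frame(io : IO, packed : Bytes)
  io.write_bytes(packed.size.to_u32, IO::ByteFormat::BigEndian)
  io.write(packed)
  io.flush
end

# Read: one small fixed-size read for the length, one read for the payload.
def read_frame(io : IO) : Bytes
  len = io.read_bytes(UInt32, IO::ByteFormat::BigEndian)
  buf = Bytes.new(len)
  io.read_fully(buf)
  buf
end
```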
<FromGitter>
<naqvis> thanks for the details
<FromGitter>
<wyhaines> I experimented with monkey patching the Crystal library to read from the socket in big chunks and operate from an in-memory buffer as much as possible, but in the end it was too much work for too little gain compared to just adjusting my protocol a little bit, and that worked great.
<jhass>
sorcus: the compiler is not smart enough for this. _value could be nil in the else branch, so it's nilable when closured, since something might change it before the proc is invoked
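A minimal illustration of that rule, with made-up names: once a local is closured, the compiler keeps the union of everything it could hold, even if it looks assigned at the point of capture:

```crystal
value = rand < 0.5 ? "hello" : nil

proc = ->{ value }       # closured here: typed String | Nil
value = "definitely set" # a later reassignment is why the union must stay

puts typeof(proc.call) # => (String | Nil)
```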
<frojnd>
I would like to create a function with 2 arguments, where the first is mandatory and the second optional. 1) What kind of function should I create? Can you point me to the docs? 2) Inside the function I would like logic like: if no second param, then ... else ...
<frojnd>
I know Crystal has some neat definitions for functions, I just can't remember what they're called
<jhass>
sure, keep asking if something is unclear :)
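What frojnd is describing is a default parameter value; a minimal sketch (the method and arguments are made up):

```crystal
def greet(name : String, greeting : String? = nil)
  if greeting
    "#{greeting}, #{name}!"
  else
    "Hello, #{name}!" # fallback when the optional argument is omitted
  end
end

puts greet("Alice")        # => Hello, Alice!
puts greet("Bob", "Howdy") # => Howdy, Bob!
```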
<FromGitter>
<rishavs> I am trying to nest string interpolations, using a foreach loop inside the parent one, but I am not getting any output here. What am I doing wrong? ⏎ https://carc.in/#/r/9aoc
<jhass>
yes, that doesn't work: `each` does not return the block body's value in any way. The minimal change to make that work would be https://carc.in/#/r/9aoe. Better yet: https://carc.in/#/r/9aoh
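The fix in those links is presumably along these lines (array contents made up): `each` returns the receiver and discards the block's values, so inside an interpolation you want `map` plus `join`:

```crystal
items = ["a", "b", "c"]

# Interpolates the array itself; the block's strings are discarded:
puts "#{items.each { |i| "<li>#{i}</li>" }}" # => ["a", "b", "c"]

# map collects the block results and join concatenates them:
puts "<ul>#{items.map { |i| "<li>#{i}</li>" }.join}</ul>"
# => <ul><li>a</li><li>b</li><li>c</li></ul>
```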
<FromGitter>
<rishavs> Thanks @jhass!
<FromGitter>
<rishavs> So, I was actually trying to benchmark nested ECRs vs nested string interpolations. I am used to vanilla JS, where I often use nested interpolations for creating web app views. ⏎ Anyway, pretty happy to say that ECR and string interpolations have roughly the same performance. ⏎ ⏎ ```code paste, see link``` [https://gitter.im/crystal-lang/crystal?at=5eec99413a0d3931fa9fe6d1]
<jhass>
that's not too surprising considering they're both implemented in essentially the same way :)
<FromGitter>
<rishavs> For some reason I was biased against ECRs, thinking that they must have some perf overhead, so I decided to check. This is the code I used. https://carc.in/#/r/9aom
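The carc.in code isn't reproduced here, but a quick comparison like that typically uses the stdlib's `Benchmark.ips`; a sketch under that assumption (the strings are made up, and ECR itself is omitted since it needs a template file at compile time):

```crystal
require "benchmark"

items = ["a", "b", "c"]

Benchmark.ips do |x|
  x.report("interpolation") { "<ul>#{items.map { |i| "<li>#{i}</li>" }.join}</ul>" }
  x.report("string build") do
    String.build do |s|
      items.each { |i| s << "<li>" << i << "</li>" }
    end
  end
end
```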
<FromGitter>
<rishavs> I love how quickly I can benchmark things in Crystal whenever I have any doubts
<jhass>
we can also just compare implementation :)
<raz>
that's the gist, basically. It doesn't amount to much, but it took an eternity to find a combination that has recent enough library versions (zstd/sodium), because the auto-build in the shards failed in strange ways that I didn't feel like debugging.
<raz>
plus relative refs in shard.yml are a bit of a headache (maybe some kind of vendoring mechanism would be good in the future)
<raz>
anyway... next stop, turning this into a static build on Alpine. But I'll need a few more beers for that. And aspirin.
<jhass>
mmh
<jhass>
/home/jhass/crystal/src/atomic.cr:125: undefined reference to `__sync_fetch_and_max_4'
<jhass>
wanna trade?
<jhass>
:P
<raz>
pffft. just add __def_fetch_and_max_4, duh
<raz>
:p
<jhass>
the fun part is there's no mention of that in the entire codebase :P
<raz>
how many times have we told you to stop randomly pulling other people's branches :p
<raz>
jk. that looks and sounds nasty. working on MT stuff?
<jhass>
nah
<jhass>
32 bit arm
<jhass>
I guess it's just an instruction unavailable there
<jhass>
and we still didn't figure out a nice way to include compiler-rt
<raz>
hm yea, that's beyond my area of expertise by at least 32 bits
<raz>
phew. Well, at least the comment may give a hint: "Different instantiations will generate appropriate assembly for ARM and Thumb-2 versions of the functions."
<jhass>
ooor I link /usr/lib/clang/10.0.0/lib/linux/libclang_rt.builtins-armhf.a for now and worry later
<raz>
sounds good. meanwhile i'll just pray to never bump into that type of error message
<Elouin>
Hi, I installed Crystal on openSUSE Tumbleweed as described on the website, and now I get on every zypper run: "Access to 'https://dist.crystal-lang.org/rpm/media.1/media' denied."
<frojnd>
Interesting
<frojnd>
I included module
<frojnd>
At the top of the file
<frojnd>
I then issued user_input = gets
<frojnd>
puts get_verse user_input
<frojnd>
And I get error: no overload matches 'get_verse' with type (String | Nil)
<FromGitter>
<Blacksmoke16> `gets` can return `nil`
<frojnd>
I mean I haven't even entered the id yet
<frojnd>
It didn't even ask me for input and it's already complaining..
<FromGitter>
<Blacksmoke16> prob fine to just do `gets.not_nil!`
<FromGitter>
<Blacksmoke16> right, that was a compile time check
<frojnd>
Ok... not sure why appending `.not_nil!` helped
<FromGitter>
<Blacksmoke16> because it tells the compiler that you know it won't return `nil`
<FromGitter>
<Blacksmoke16> i.e. removes `Nil` from the union
<frojnd>
Aa
<frojnd>
Smart
<FromGitter>
<Blacksmoke16> usually `.not_nil!` is a bit of a smell, but in this case it's fine, since you know it won't be nil
<FromGitter>
<watzon> Yeah, an antipattern for sure unless you're 100% sure it's not going to be nil, and even then it can be nice (for safety's sake) to use a guard clause. But this should be fine.
<FromGitter>
<watzon> Just don't make a habit of it 😉
<raz>
sadly in some situations there is no way to avoid it
<FromGitter>
<watzon> Sadly
<yxhuvud>
There are cases, but as you get more experienced, the number of times that happens gets rarer.
deavmi has quit [Quit: Eish! Load shedding.]
<FromGitter>
<watzon> Definitely
<sorcus>
If I increment a counter `@counter += 1` from multiple fibers / threads, will it be thread-safe by default? :-D
deavmi has joined #crystal-lang
<FromGitter>
<Blacksmoke16> i doubt it
<sorcus>
Blacksmoke16: X-)
<oprypin>
why does nobody ever know about `read_line`
<oprypin>
that's just strictly better, it can't return nil
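The three options for stdin, side by side (a sketch):

```crystal
line = gets              # String | Nil: nil once stdin hits EOF
puts line.upcase if line # narrow the union with a truthiness check

line = gets.not_nil!     # assert non-nil; raises at EOF
puts line.upcase

line = read_line         # always String; raises IO::EOFError at EOF instead
puts line.upcase
```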
<FromGitter>
<watzon> sorcus: probably better to use an `Atom`
<sorcus>
watzon: You mean a code editor?
<FromGitter>
<watzon> 😂
<sorcus>
watzon: :-D
<FromGitter>
<watzon> Sorry, `Atomic`
<sorcus>
watzon: Ok, thanks :-)
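A minimal sketch of the `Atomic` suggestion: the wrapped integer is updated with atomic instructions, unlike a plain `@counter += 1`, which is a read-modify-write race under `-Dpreview_mt`:

```crystal
counter = Atomic(Int32).new(0)
done = Channel(Nil).new

100.times do
  spawn do
    counter.add(1) # atomic fetch-and-add
    done.send(nil)
  end
end

100.times { done.receive }
puts counter.get # => 100
```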
deavmi has quit [Read error: Connection reset by peer]
<FromGitter>
<j8r> I was seeking something like this for Python; I found PyOxidizer
<oz>
re: Alpine package disappearing, I heard that nix was about solving these kind of issues 🤔
<oz>
(or Guix if you're less into haskell and more into schemes)
<oprypin>
nice
<jhass>
it's essentially a minimal snap/flatpak I guess
<jhass>
raz: btw, 0.35 is only in edge; 3.12 still has 0.34
<sorcus>
Can `spawn` be used for parallel jobs? X-)
<jhass>
concurrent or parallel?
<sorcus>
jhass: parallel.
<jhass>
with -Dpreview_mt yes
<jhass>
but note that's still sorta experimental
<sorcus>
jhass: But this is limited to the number of cores, right?
<jhass>
it runs on a thread pool, there's an ENV var to set the size of that iirc, but the default is core count, yes
<sorcus>
jhass: So I can't run hundreds of jobs? :-(
<jhass>
where did I say so?
<jhass>
what's your payload anyways?
<jhass>
if it's IO bound it'll block and Crystal will run something else meanwhile, even without -Dpreview_mt
<sorcus>
jhass: i assumed :-D
<jhass>
if it's CPU bound, what's the point of running more than the number of cores at the same time anyway? They'll just compete with each other for the CPU
<sorcus>
jhass: "what's your payload anyways?" - a thousands of hash sums for strings.
<jhass>
so CPU bound
<jhass>
see above
<jhass>
running a thousand of them in parallel will make things slower, not faster
<sorcus>
jhass: Hmmm... I hadn't thought about that.
<jhass>
parallelization is by no means a magic solution, it's a trade-off
<jhass>
you can easily make things slower compared to smart concurrent execution
<jhass>
because parallelization has a higher synchronization overhead
<jhass>
for most workloads
<sorcus>
jhass: Ok, thanks for the explanation. :-)
zorp_ has quit [Ping timeout: 264 seconds]
<FromGitter>
<wyhaines> @sorcus: If you are CPU bound, your limit is ultimately the number of cores, in any language. Depending on the source of your strings, you might be able to just run multiple processes -- 1 per core -- if you aren't using -Dpreview_mt, and get good results. For IO-bound stuff, multiple threads can actually cost you performance. ⏎ ⏎ I have a project that I am working on that, in its heart of hearts, just receives data packets from myriad clients, does some mild magic to those data packets, and shoves them somewhere else. ⏎ ⏎ In my crude benchmarks, I can run 4 clients that each hammer a million messages to the server as fast as they can (with everything running under Ubuntu/WSL1 on a Windows 10 laptop), and because the major ... [https://gitter.im/crystal-lang/crystal?at=5eed0b6b405be935cdaef986]
<FromGitter>
<wyhaines> If I run it multithreaded, the performance drops.
<FromGitter>
<wyhaines> It takes about 11 seconds to handle 4 million records sent by 4 clients.
<FromGitter>
<wyhaines> In case that wasn't clear: single-threaded, the server handles at least 4,000,000 records in 8 seconds; multithreaded, it handles them in about 11 seconds.
<FromGitter>
<watzon> I really do love how easy this is
bcardiff has joined #crystal-lang
bcardiff has quit [Client Quit]
<FromGitter>
<dscottboggs_gitlab> meh, it's just patches. think I'll wait for it to hit the repos
<yxhuvud>
what do you mean? I just got it through apt update.
<FromGitter>
<dscottboggs_gitlab> I'm on Manjaro. Repo maintainers give packages a couple weeks of no bug reports on the Arch bug tracker before passing them on. Woes of having Crystal *actually* in your stable distro's repos
<FromGitter>
<Blacksmoke16> snap ftw
<FromGitter>
<Blacksmoke16> 😉
<FromGitter>
<dscottboggs_gitlab> (rather than added as a PPA/snap)
<FromGitter>
<dscottboggs_gitlab> not a big snap fan
<FromGitter>
<dscottboggs_gitlab> Had too much trouble with it. Especially on non-ubuntu distros
<FromGitter>
<Blacksmoke16> oh?
<FromGitter>
<dscottboggs_gitlab> yeah, I used to run Ubuntu and had TG and FF installed from snap. Fonts kept breaking, and I'd have to do some weird stuff that took several minutes, and I hated it. Plus I kept running into permissions weirdness. I get that that's a good thing because of security and all, but I still don't like having to think about it.
<FromGitter>
<Blacksmoke16> I really only use it for Crystal and some other small stuff
<FromGitter>
<dscottboggs_gitlab> When I tried snapd on Manjaro, I started it before rebooting and it broke snapd (I think permanently for that installation)
<FromGitter>
<Blacksmoke16> 😬
<FromGitter>
<dscottboggs_gitlab> > mostly cli stuff actually ⏎ ⏎ Yeah I just use docker for that
<FromGitter>
<dscottboggs_gitlab> I have to use docker all the time anyway so it's just easier that way
<FromGitter>
<dscottboggs_gitlab> might not be so for people who aren't used to docker though
ua_ has quit [Ping timeout: 260 seconds]
ua_ has joined #crystal-lang
ua_ is now known as ua
<FromGitter>
<thelinuxlich> @watzon I tried using -Dpreview_mt, but I'm getting a lot of invalid pointer errors
<FromGitter>
<watzon> Happens if things aren't thread safe
<FromGitter>
<watzon> Sometimes even if they are. It's still in preview for a reason 😄
<FromGitter>
<watzon> You actually inspired me to rework my own web crawler framework. Working on that right now.
<FromGitter>
<thelinuxlich> I think it can't save more RAM due to the pools
<FromGitter>
<watzon> Ahh nice, you took my advice and used `pool`
<FromGitter>
<watzon> Yeah I didn't realize you were working with so many urls
<FromGitter>
<watzon> You could save some RAM if you're willing to sacrifice a little speed. You may not even need to sacrifice anything. I'd take the initial pool size down to something like 10.
<FromGitter>
<thelinuxlich> no, I actually want to trade RAM for speed
<FromGitter>
<thelinuxlich> Oh, I forgot to use the new Log
<FromGitter>
<watzon> What I'd actually do is try and see what the maximum size of your pools ends up being. Put the initial size really low, let it run, and then at the end `pp http_pools` and check what size they all end up being.
<FromGitter>
<watzon> It's possible that you don't even end up using all that pool space and that you have unnecessarily allocated clients just sitting in there
<FromGitter>
<thelinuxlich> but that happens at bootstrap, so it won't affect performance, right?
<FromGitter>
<watzon> It will still affect memory usage, since those are being allocated at runtime
<FromGitter>
<watzon> Granted if you're ok with high RAM usage it's fine. It won't kill anything, but I like to squeeze performance out of my apps where I can haha.
<FromGitter>
<bcardiff> @dscottboggs_gitlab was your Manjaro snap experience prior to May 2019? If so, I recall there were some updates around that time regarding the Manjaro distro and its integration with snap.
bcardiff has quit [Client Quit]
<FromGitter>
<dscottboggs_gitlab> > Sometimes even if they are. It's still in preview for a reason ⏎ No, seriously `-Dpreview_mt` is unsound and unsafe. Don't use it unless you're hoping to create more bug reports
<FromGitter>
<dscottboggs_gitlab> @bcardiff not sure TBH. perhaps
<FromGitter>
<watzon> Hopefully that will be fixed soon? I mean we are almost to v1.0.
<FromGitter>
<dscottboggs_gitlab> @watzon the move to 1.0 is largely due to semantic stability. I haven't been around too much lately, but AFAIK MT is still unsound for the foreseeable future
<FromGitter>
<watzon> Sad
<FromGitter>
<dscottboggs_gitlab> Yeah, I was hoping that by reimplementing libcsp (a CSP-style MT lib written in C) I could help a bit, but I got stuck getting tests to pass on my thread-safe RBQ implementation
<FromGitter>
<dscottboggs_gitlab> (turns out channels are just a `RingBufferQueue(Atomic(T))`)