avsm changed the topic of #mirage to: mirage 2 released! party on!
demonimin has quit [Ping timeout: 276 seconds]
demonimin has joined #mirage
demonimin has joined #mirage
demonimin has quit [Remote host closed the connection]
demonimin has joined #mirage
rgrinberg has joined #mirage
rgrinberg has quit [Ping timeout: 244 seconds]
copy` has quit [Quit: Connection closed for inactivity]
tizoc has quit [Ping timeout: 240 seconds]
seangrove has joined #mirage
tizoc has joined #mirage
dexterph has joined #mirage
AltGr has joined #mirage
kensan has quit [Read error: Connection reset by peer]
seangrove has quit [Ping timeout: 252 seconds]
dexterph has quit [Remote host closed the connection]
dexterph has joined #mirage
kensan has joined #mirage
kensan has quit [Client Quit]
kensan has joined #mirage
andreas23 has joined #mirage
mort___ has joined #mirage
mort___ has left #mirage [#mirage]
<mato>
hannes: got a minute? i have an interesting problem i could use some help with
<mato>
hannes: am load-testing static_website, which is always a good way to find bugs, and it seems io-pages are never being recycled :-(
<mato>
hannes: at least that's what i can see by tracing calls to sbrk() in Solo5 -- the unikernel asks for a new io-page for each packet sent or received
<hannes>
mato: uh :/
<hannes>
mato: in the xen world, the allocation is done in mirage-net-xen, and a pool shared between dom0 and domU is used
<mato>
hannes: well, the solo5 netif code comes from the mirage-unix code
<hannes>
mato: ic. is it then running out of memory?
<mato>
eventually, yes
<mato>
and i see several sbrk(0x1000) calls per packet, e.g. when testing with ping
<mato>
which seems wrong
<hannes>
(and how recent was your mirage-net-unix checkout? last summer thomasga and I dug into a leak there... where recv was recursive, but not tail-recursive)
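For readers unfamiliar with the distinction hannes draws here, the following is a minimal plain-OCaml sketch of a non-tail-recursive loop next to a tail-recursive one. It is not the mirage-net-unix code (which is Lwt-based); the function names are hypothetical and only illustrate the general shape of the problem.

```ocaml
(* Illustrative only: NOT the actual mirage-net-unix recv loop.
   Both function names are hypothetical. *)

(* Non-tail-recursive: the addition runs after the recursive call
   returns, so every iteration keeps its stack frame alive. *)
let rec count_packets_non_tail n =
  if n = 0 then 0 else 1 + count_packets_non_tail (n - 1)

(* Tail-recursive: the recursive call is the last operation, so the
   current frame can be reused and memory use stays constant. *)
let count_packets_tail n =
  let rec loop acc n = if n = 0 then acc else loop (acc + 1) (n - 1) in
  loop 0 n
```

In a long-running receive loop the first shape accumulates state for every packet processed, which is one way such a loop can look like a leak.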
<hannes>
mato: who calls sbrk? io-page allocator?
<mato>
hannes: dlmalloc
<mato>
hannes: io-page allocator calls posix_memalign() which is part of dlmalloc
<mato>
just trying to make sense of the mirage-net-solo5 history now, it seems to have the tail-recursion fixes in it, but not clear how they got there
<hannes>
mato: I have to swap those libraries back into my brain... it is currently unclear to me how the OCaml GC should know about the memory allocated by io-page in order to free it.. (since it is allocated out-of-band)
<hannes>
s/out-of-band/directly by calls to malloc and not registered to the GC/
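The "not registered to the GC" point can be illustrated with a standard-library Bigarray, whose data also lives outside the OCaml heap. This is a sketch under assumptions: `alloc_pages` and `heap_bytes` are hypothetical helpers, not the io-page allocator, and the code assumes OCaml 4.07 or later, where Bigarray is part of the standard library.

```ocaml
(* Sketch: the data behind a Bigarray is allocated outside the OCaml heap.
   Assumes OCaml >= 4.07 (Bigarray in the standard library). *)
let page_size = 4096

(* Hypothetical helper standing in for an io-page style allocator. *)
let alloc_pages n =
  Bigarray.Array1.create Bigarray.char Bigarray.c_layout (n * page_size)

(* Size of the OCaml major heap in bytes, as seen by the GC. *)
let heap_bytes () =
  (Gc.quick_stat ()).Gc.heap_words * (Sys.word_size / 8)

let () =
  let before = heap_bytes () in
  let buf = alloc_pages 16384 in (* 64 MiB of off-heap data *)
  let after = heap_bytes () in
  Printf.printf "buffer: %d bytes, OCaml heap grew by: %d bytes\n"
    (Bigarray.Array1.dim buf) (after - before)
```

The GC only sees a small handle for each buffer, so heap statistics alone may not show how much memory such buffers hold.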
<mato>
right, i gathered that much from the comments
<mato>
i'm wondering if i can just kill the io-page stuff, solo5 does not need the buffers to be page-aligned
<hannes>
yes
<hannes>
I argued to kill io-page for a long time
<hannes>
avsm wants to keep it for unknown reasons
<hannes>
you can just delegate to Cstruct.create in the Io_page.get
<mato>
do i need to change page_aligned_buffer in netif.mli?
<mato>
also, what deals with allocating the io-page on the write path?
<hannes>
oh I guess it is a rabbit hole to get rid of io_page properly... yes, the page_aligned_buffer is different... Cstruct.t instead of io_page.t... (which are not the same, damn)
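The type mismatch hannes describes can be sketched with simplified stand-ins. These are not the real `Io_page` or `Cstruct` modules; the definitions below are assumptions modeled on a raw page buffer versus an offset-and-length view onto a buffer, and all names are hypothetical.

```ocaml
(* Simplified stand-ins, NOT the real Io_page / Cstruct modules. *)
type buffer =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Assumption: an io-page is the raw buffer itself. *)
type io_page = buffer

(* Assumption: a cstruct is a view (offset + length) onto a buffer. *)
type cstruct = { buffer : buffer; off : int; len : int }

(* Allocate a zero-filled view of [len] bytes. *)
let cstruct_create len : cstruct =
  let buffer = Bigarray.Array1.create Bigarray.char Bigarray.c_layout len in
  Bigarray.Array1.fill buffer '\000';
  { buffer; off = 0; len }

(* Going from a page to a view is a cheap wrap... *)
let cstruct_of_page (p : io_page) : cstruct =
  { buffer = p; off = 0; len = Bigarray.Array1.dim p }

(* ...but going back only makes sense when the view covers the whole
   underlying buffer, so the conversion is partial. *)
let page_of_cstruct (c : cstruct) : io_page option =
  if c.off = 0 && c.len = Bigarray.Array1.dim c.buffer then Some c.buffer
  else None
```

Because the two types are distinct, an interface that names one cannot silently accept the other, which is why swapping the buffer type ripples through every signature that mentions it.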
<hannes>
on the write path, something in the tcpip library calls Io_page.get and shifts it around a bit to fill in the tcp / ip / ethernet headers
sknebel has quit [Quit: sknebel]
sknebel has joined #mirage
<mato>
hmm, except i can't substitute Cstruct.t for page_aligned_buffer, since then the interfaces don't match up with types/V1_LWT.mli :(
<mato>
ok, with that it survives the load test longer, but still eventually runs out of memory
<mato>
also, it seems to happily allocate all the heap given to it (2GB with ukvm) before (guessing) any kind of gc kicks in
<mato>
the fact that it still runs out of memory eventually suggests that those buffers are not being GC'd
mort___ has joined #mirage
mort___ has quit [Client Quit]
<hannes>
mato: I'd assume that the mirage-net-unix code is not well tested...
<hannes>
since nobody uses it in production... on unix you'd use the socket stack, or use the xen backend and then mirage-net-xen
<hannes>
(and as far as I can tell the mirage-net-xen (1.4.2 is what I use) does not leak)
mort___ has joined #mirage
<hannes>
mato: you can manually force a GC (call `Gc.full_major ()`) and get some GC stats (`Gc.stat ()`, see http://caml.inria.fr/pub/docs/manual-ocaml/libref/Gc.html) to gather evidence whether a) GC does not kick in or b) it is leaking
<mato>
hannes: thx, yeah, just going to experiment with that now.
<hannes>
mato: I used to call it every other second and look into the live_words data.. (ignoring the minor_words)
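A minimal sketch of the measurement hannes describes, using only the standard `Gc` module. The helper names and the list used to create garbage are hypothetical; in a unikernel the `report` call would presumably run from a periodic task rather than inline like this.

```ocaml
(* Sketch: force a full major collection, then read live_words.
   Helper names are hypothetical. *)
let live_words_after_full_major () =
  Gc.full_major ();
  (Gc.stat ()).Gc.live_words

let report label =
  Printf.printf "%s: live_words=%d\n%!" label (live_words_after_full_major ())

let () =
  report "baseline";
  (* Hold a large value, then drop it, to see live_words move. *)
  let junk = ref (List.init 100_000 (fun i -> i)) in
  report "holding 100k-element list";
  junk := [];
  report "after dropping it"
```

If `live_words` keeps climbing across forced collections, something is still reachable (a leak); if it drops back, the GC was simply not running often enough. Note that `live_words` covers the OCaml heap only, so memory held by malloc-backed buffers may not show up in it.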
<hannes>
if there's a unix version of that which reproducibly leaks, maybe the spacetime https://github.com/ocaml/ocaml/pull/585 memory profiler helps (would be convenient to have this available on solo5&xen as well, but not sure how much work that is)
mort___ has joined #mirage
<mato>
i'll test that, and also see if i can get samoht or someone else to help take a look
<mato>
it might be best to debug together around a computer when i'm in cambridge next week
<mato>
going to fix some minor bugs in solo5 sbrk() / malloc() i found along the way...
<hannes>
sure... I'm in .cam and happy to help out on that issue
<hannes>
I might also find some time at some moment to look into the mirage-net-solo5..
<hannes>
while looking into solo5 the other day, I noticed some (non-exported) code like memcmp and friends, which exists both in solo5/ and in ocaml-freestanding/nolibc... not sure whether it makes sense to unify / share the code between those entities...
mort___ has quit [Quit: Leaving.]
mort___ has joined #mirage
agarwal1975 has joined #mirage
<mato>
hannes: that's deliberate -- the code that's in Solo5 is private (used by Solo5 itself) and not intended to be exported.
<hannes>
makes sense... though the implementations differ ;) (sorry for my code duplication OCD) ;)
<mato>
hannes: Oh, they differ, yes. That's a different issue -- the implementations in Solo5 are Dan's and the ocaml-freestanding ones are what I lifted from musl.
<hannes>
ic
<mato>
hannes: I will probably unify them at some stage, although by copying, not via a dependency.
<hannes>
ack
<mato>
hannes: Since the musl implementations are much better.
<hannes>
likely makes sense to look at the compiler output to see whether musl's implementations actually make a difference
<mato>
I'd just treat them as "known good" and go with them. The attention to detail and edge cases in musl is impressive.
<hannes>
:)
rgrinberg has joined #mirage
mort___ has quit [Quit: Leaving.]
mort___ has joined #mirage
copy` has joined #mirage
andreas23 has quit [Quit: Leaving.]
mort___ has quit [Ping timeout: 258 seconds]
mort___ has joined #mirage
andreas23 has joined #mirage
andreas231 has joined #mirage
agarwal1975 has quit [Quit: agarwal1975]
andreas23 has quit [Ping timeout: 276 seconds]
agarwal1975 has joined #mirage
brson has joined #mirage
rgrinberg has quit [Ping timeout: 260 seconds]
mort___ has quit [Quit: Leaving.]
dexterph has quit [Ping timeout: 250 seconds]
agarwal1975 has quit [Quit: agarwal1975]
agarwal1975 has joined #mirage
mort___ has joined #mirage
mort___1 has joined #mirage
mort___ has quit [Read error: Connection reset by peer]
mort___1 has quit [Quit: Leaving.]
rgrinberg has joined #mirage
agarwal1975 has quit [Quit: agarwal1975]
agarwal1975 has joined #mirage
StrykerKKD has joined #mirage
jermar has joined #mirage
insitu has joined #mirage
insitu has quit [Ping timeout: 260 seconds]
jermar has quit [Ping timeout: 240 seconds]
insitu has joined #mirage
insitu has quit [Client Quit]
insitu has joined #mirage
insitu has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rgrinberg has quit [Ping timeout: 260 seconds]
AltGr has left #mirage [#mirage]
abeaumont has quit [Remote host closed the connection]