2013-03-25 00:43
hsbt_away changed the topic of #ruby-core to: check the latest release candidate for 1.9.1 release ftp.ruby-lang.org:/home/yugui/ruby-1.9.1-r26021+1.tar.bz2
00:00
kosaki2 has joined #ruby-core
00:08
nokada has joined #ruby-core
00:11
<
nokada >
drbrain: do you have test cases?
00:17
<
drbrain >
nokada: yes!
00:17
<
drbrain >
I can commit them, too
00:17
<
drbrain >
nokada: but I think you should review to see if my new behavior for bad input is OK
00:18
shiba___ has joined #ruby-core
00:19
hsbt has quit [Ping timeout: 258 seconds]
00:19
<
drbrain >
oh, I forgot to submit the tests
00:20
hsbt has joined #ruby-core
00:22
tenderlo_ has quit [Remote host closed the connection]
00:30
<
nokada >
DecimalInteger needs parentheses?
00:32
shiba___ has quit [Ping timeout: 264 seconds]
00:37
<
drbrain >
nokada: it allows --decimal-integer 1234xyz to have the value 1234
00:37
<
drbrain >
since the regular expression does not use \z
00:38
<
drbrain >
it seemed that allowing 1234xyz as input was intentional
00:39
<
nokada >
I can't remember...
00:39
<
drbrain >
it is a very old bug :D
00:39
nagachika has joined #ruby-core
00:40
<
nokada >
it doesn't seem intentional
00:41
<
drbrain >
I think each regexp should use \z, then
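A minimal Ruby sketch of the anchoring issue discussed above. The two patterns and the parse_decimal helper are hypothetical stand-ins for illustration; they are not the regular expressions or methods used inside optparse. Without \z the match stops at the first non-digit and the trailing "xyz" is silently dropped; with \z the whole input must match.

```ruby
# Hypothetical patterns for illustration only; these are not the
# regular expressions used inside optparse.
UNANCHORED = /\A[-+]?\d+/    # stops matching at the first non-digit
ANCHORED   = /\A[-+]?\d+\z/  # requires the entire input to be digits

# Returns the parsed integer, or nil when the pattern rejects the input.
def parse_decimal(input, pattern)
  match = pattern.match(input)
  match && Integer(match[0], 10)
end

p parse_decimal("1234xyz", UNANCHORED)  # => 1234
p parse_decimal("1234xyz", ANCHORED)    # => nil
p parse_decimal("1234", ANCHORED)       # => 1234
```

A test of the kind nokada asks for later would assert that "1234xyz" is rejected once every pattern ends in \z.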
00:53
charliesome has joined #ruby-core
00:53
hsbt has quit [Read error: Connection reset by peer]
00:54
hsbt_ has joined #ruby-core
01:01
<
drbrain >
bus time, later‼
01:23
nari has joined #ruby-core
01:31
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
01:32
<
_ko1 >
charliesome: r42836
01:32
<
_ko1 >
ah, seq is uniq for each class
01:35
charliesome has joined #ruby-core
01:49
Domon has joined #ruby-core
02:07
headius has quit [Quit: headius]
02:12
shinnya has quit [Ping timeout: 268 seconds]
02:15
tylersmith has joined #ruby-core
02:18
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
02:31
charliesome has joined #ruby-core
02:38
soraher is now known as sorah_
02:38
sorah_ is now known as soraher
02:46
<
nokada >
drbrain: seems fine, and add tests 1234xyz fails, plz
03:18
hsbt_ has quit [Ping timeout: 268 seconds]
03:22
hsbt_ has joined #ruby-core
03:44
headius has joined #ruby-core
03:49
kosaki2 has quit [Remote host closed the connection]
03:56
kosaki2 has joined #ruby-core
04:10
kosaki2 has quit [Remote host closed the connection]
04:18
Domon has quit [Remote host closed the connection]
04:19
Domon has joined #ruby-core
04:23
Domon has quit [Ping timeout: 264 seconds]
04:25
<
drbrain >
nokada: OK!
05:12
Domon has joined #ruby-core
05:13
corundum has quit [Quit: okay bye]
05:15
hsbt_ has quit [Ping timeout: 268 seconds]
05:17
hsbt_ has joined #ruby-core
05:26
DanKnox is now known as DanKnox_away
05:36
headius has quit [Quit: headius]
05:40
tylersmith has quit [Remote host closed the connection]
05:41
tylersmith has joined #ruby-core
05:41
tylersmith has quit [Read error: Connection reset by peer]
05:41
tylersmith has joined #ruby-core
05:46
tylersmith has quit [Ping timeout: 268 seconds]
05:52
marcandre has quit [Remote host closed the connection]
06:32
hsbt_ has quit [Ping timeout: 240 seconds]
06:33
hsbt_ has joined #ruby-core
06:34
hsbt_ is now known as hsbt_away
06:35
hsbt_away is now known as hsbt
06:43
hsbt has quit [Quit: Tiarra 0.1+svn-39209: SIGTERM received; exit]
06:43
hsbt has joined #ruby-core
07:19
corundum has joined #ruby-core
07:25
<
_ko1 >
charliesome: hello
07:33
<
charliesome >
_ko1: howdy
07:34
<
_ko1 >
i read your last commit "foo"f.object_id == "foo"f.object_id
07:34
<
charliesome >
_ko1: yep
07:34
<
_ko1 >
when I read it, i think it is Symbol
07:35
<
_ko1 >
I'm afraid that similar two features confuse users.
07:35
<
_ko1 >
What do you think about?
07:36
<
charliesome >
i see your concern
07:36
<
_ko1 >
IMO, best solution is to merge Symbol and String
07:37
<
_ko1 >
and remove "foo"f and use :foo
07:37
<
charliesome >
but I'm not sure if users will be that confused
07:37
<
charliesome >
"foo"f.object_id == "foo"f.object_id is an implementation detail
07:37
<
nokada >
agree that it is a detail
07:37
<
charliesome >
it doesn't affect behaviour if the object ids are not equal
07:38
<
charliesome >
but if we can deduplicate frozen strings, it may lower memory usage in some cases
07:38
<
nokada >
so I thought it doesn't need testing
07:38
<
charliesome >
especially in larger rails apps
07:38
<
charliesome >
nokada: maybe we should have "implementation detail" tests
07:38
<
charliesome >
because I want to test that MRI does this, but maybe other implementations do not care about it
07:39
<
nokada >
well, we already have many such "tests"
07:39
<
nokada >
now I think it's bad
07:40
<
charliesome >
i agree that it is confusing what is spec and what is implementation detail
07:41
<
charliesome >
but i think we should test implementation details in MRI, even if it is not spec
07:42
<
charliesome >
for example if I write code that deduplicates strings, I want to know if I have accidentally broken it
07:43
<
_ko1 >
charliesome: i understand.
07:43
<
_ko1 >
nobu's concern is spec or not.
07:44
<
_ko1 >
my concern is to introduce suffix f syntax (confusing or not)
07:44
<
_ko1 >
what do you think about merging Symbol and String?
07:45
<
_ko1 >
if we can, we only need :foo
07:45
<
_ko1 >
such proposal (merging Symbol and String) was rejected with trial.
07:46
<
charliesome >
_ko1: I'm not sure about merging symbol and string
07:46
<
charliesome >
for two reasons:
07:47
<
charliesome >
("foo" + "bar").object_id != "foobar".object_id
07:47
<
charliesome >
("foo" + "bar").freeze.object_id != "foobar"f.object_id
07:49
<
charliesome >
lots of libraries use symbols and strings to mean different things
07:49
<
_ko1 >
sorry, mis typing
07:49
<
charliesome >
the :rubygems symbol is a symbolic representation of a string
07:50
<
nokada >
what does 'source :rubygems'?
07:51
<
charliesome >
nokada: no, :rubygems is special case
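A small Ruby sketch of charliesome's first reason above. It assumes that String#freeze can stand in for the proposed "foo"f literal, since the f suffix may not be available in a given Ruby; the two helper method names are made up for this example. A symbol built at run time is the same object as the literal symbol, while a string built at run time and then frozen remains a separate object from a frozen literal with equal contents.

```ruby
# Symbols are interned: a symbol computed at run time is the very same
# object as the literal symbol with that name.
def computed_symbol_is_literal?
  ("foo" + "bar").to_sym.equal?(:foobar)
end

# Freezing a computed string returns that same freshly built object, so
# it is not identical to a frozen literal that has the same contents.
def computed_string_is_literal?
  ("foo" + "bar").freeze.equal?("foobar".freeze)
end

p computed_symbol_is_literal?                   # => true
p computed_string_is_literal?                   # => false
p ("foo" + "bar").freeze == "foobar".freeze     # => true (contents are equal)
```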
07:52
<
_ko1 >
charliesome: let us back to the basic. your concern is to improve performance.
07:52
<
_ko1 >
by reducing generating String objects
07:52
<
charliesome >
_ko1: yes
07:53
<
_ko1 >
did you measure performance?
07:54
<
charliesome >
_ko1: not directly, but I often see many string objects on the heap with identical contents
07:54
<
charliesome >
let me check
07:55
<
_ko1 >
There are two main reasons
07:55
<
dbussink >
btw, rubinius does sharing of strings, i guess kind of how mri does for arrays
07:55
<
dbussink >
so we only dup actual string contents if it changes etc.
07:55
<
dbussink >
just fyi
07:55
<
charliesome >
_ko1: on the github app:
07:55
<
charliesome >
ObjectSpace.each_object(String).to_a.uniq.count # => 148114
07:55
<
charliesome >
ObjectSpace.each_object(String).count # => 386809
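The two one-liners above can be wrapped into a small helper, sketched here in Ruby. The method name string_duplication_stats is an assumption made for this example, and the numbers it reports depend entirely on the process it runs in; ObjectSpace.each_object is also MRI-specific and may be unavailable or disabled on other implementations.

```ruby
# Counts live String objects and how many distinct contents they hold.
# The difference is the number of strings that duplicate another one.
def string_duplication_stats
  strings = ObjectSpace.each_object(String).to_a
  total   = strings.size
  unique  = strings.uniq.size
  { total: total, unique: unique, duplicates: total - unique }
end

stats = string_duplication_stats
puts "total=#{stats[:total]} unique=#{stats[:unique]} duplicates=#{stats[:duplicates]}"
```

With the figures quoted above (386809 strings, 148114 distinct), the duplicates would come to 386809 - 148114 = 238695.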
07:56
<
_ko1 >
(1) object counts -> affect GC count
07:56
<
_ko1 >
(2) duplicate String body (RSTRING_PTR(str)) -> memory consumption
07:56
<
_ko1 >
f suffix solve (1) and (2)
07:57
<
_ko1 >
CoW sharing string ptr (we already use on String#dup) solve (2)
07:58
<
_ko1 >
and RGenGC reduce impact of (1)
07:58
<
_ko1 >
only *reduce*.
07:58
<
_ko1 >
we planned to make more aggressive (2)
07:59
<
_ko1 >
so, my opinion is
07:59
<
_ko1 >
if impact of (1) is small, then frozen suffix is overkill
08:00
<
_ko1 >
> I'm afraid that similar two features confuse users.
08:01
<
_ko1 >
tradeoff: (a) performance improvement and (b) beautiful syntax
08:01
<
dbussink >
isn't this something where it's good to say to measure is to know?
08:02
<
dbussink >
how big is the advantage and is that worth it?
08:02
<
_ko1 >
i agreed your proposal because I don't notice the disadvantage of (b)
08:02
<
_ko1 >
s/don't/didn't/
08:03
<
_ko1 >
dbussink: more easy sentence please!!
08:04
<
dbussink >
_ko1: well, what i'm saying is that if the goal is performance
08:04
<
dbussink >
that should be measured
08:04
<
_ko1 >
easy English yay!
08:05
<
charliesome >
_ko1: i think most of the benefit of frozen strings will not be in user code, but in generated code
08:05
<
charliesome >
like erb
08:06
<
charliesome >
erb templates have a lot of static strings that have to be duped every time
08:08
<
charliesome >
_ko1: even with CoW sharing solving (2), it is still significantly faster to use f-strings instead of relying on CoW
08:08
<
charliesome >
twice as fast in my measurements
08:08
<
charliesome >
with GC.disable so we do not care about extra GC time
08:13
<
_ko1 >
ERb simulation benchmark
08:14
<
_ko1 >
CoW is worst LoL
08:14
<
charliesome >
interesting results
08:15
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
08:15
<
_ko1 >
"same object" simulates f string
08:16
<
_ko1 >
x1.5 for specific application
08:17
<
_ko1 >
however, ERb can make same optimization w/o f string
08:17
<
_ko1 >
this is weak point for your opinion
08:19
<
_ko1 >
but easy way to make such optimization
08:21
<
_ko1 >
with f literal
08:21
<
_ko1 >
same result of "same object"
08:23
<
_ko1 >
ruby 2.1.0dev (2013-09-05 trunk 42845) [x86_64-linux]
08:23
<
_ko1 >
                     user     system      total        real
08:23
<
_ko1 >
normal literal   4.290000   0.010000   4.300000 (  4.306291)
08:23
<
_ko1 >
f literal        3.370000   0.010000   3.380000 (  3.377195)
08:23
<
_ko1 >
same object      3.430000   0.000000   3.430000 (  3.429297)
08:23
<
_ko1 >
26% faster on x64 Linux
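The benchmark script itself is not in the log. The following Ruby sketch is only a guess at its shape, under these assumptions: the template fragment, the SHARED constant, and the fragment_timings method are invented for illustration, and String#freeze on a literal stands in for the proposed f suffix. It appends a static fragment to a buffer many times, the way generated ERB code does, in three variants matching the labels in the results above.

```ruby
require 'benchmark'

# Hypothetical static template fragment, allocated once and reused.
SHARED = "<li>item</li>".freeze

# Appends a static fragment to a buffer n times per variant and returns
# the elapsed wall-clock seconds for each variant.
def fragment_timings(n)
  {
    "normal literal" => Benchmark.realtime {
      buf = String.new
      n.times { buf << "<li>item</li>" }          # new String on every pass
    },
    "frozen literal" => Benchmark.realtime {
      buf = String.new
      n.times { buf << "<li>item</li>".freeze }   # stand-in for the f suffix
    },
    "same object" => Benchmark.realtime {
      buf = String.new
      n.times { buf << SHARED }                   # one shared object
    },
  }
end

fragment_timings(200_000).each do |label, seconds|
  puts format("%-15s %.6f s", label, seconds)
end
```

Timings will vary by Ruby version and machine, so no expected output is given here.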
08:25
<
_ko1 >
Anyway, I'll post this discussion on ruby-core mailing list
08:25
<
_ko1 >
ah, charliesome is not available
08:27
harrow has quit [Ping timeout: 245 seconds]
08:28
harrow has joined #ruby-core
08:33
closer has joined #ruby-core
08:39
nari has quit [Ping timeout: 264 seconds]
08:40
closer009 has quit [*.net *.split]
08:42
wudofyr has quit [Ping timeout: 245 seconds]
08:46
<
dbussink >
_ko1: can i ask you a question about RTYPEDDATA?
08:47
<
dbussink >
especially about how it relates to DATA_PTR
08:49
wudofyr has joined #ruby-core
08:51
charliesome has joined #ruby-core
08:55
<
_ko1 >
dbussink: sure
08:56
<
dbussink >
_ko1: so, there's RDATA() and RTYPEDDATA() to get the internal structs
08:56
<
dbussink >
but doing RDATA() on something that is a typed data is invalid right?
08:57
<
_ko1 >
#define DATA_PTR(dta) (RDATA(dta)->data)
08:58
<
dbussink >
i think it works kind of by accident because the third element is the data for both types
08:58
<
_ko1 >
it is compatible
08:58
<
_ko1 >
s/it is/they are/
08:58
<
dbussink >
i'm working on adding this to the rubinius c-api
08:58
<
_ko1 >
you are right.
08:58
<
_ko1 >
it is intentional
08:58
<
dbussink >
ok, so it means i probably have to make DATA_PTR point to a function in rubinius
08:59
<
dbussink >
because we can't do it like this because of type checks
08:59
<
dbussink >
and have the function check both the data and typed data type
08:59
<
_ko1 >
maybe people use DATA_PTR()
08:59
<
_ko1 >
#define DATA_PTR(dta) (RDATA(dta)->data)
09:01
<
dbussink >
i'll probably do something like #define DATA_PTR(dta) (capi_rdata_data_ptr(dta)) and implement the logic in capi_rdata_data_ptr then
09:01
<
dbussink >
to support the different types
09:02
<
dbussink >
_ko1: is there a reason that there is no RUBY_T_TYPEDDATA ?
09:02
<
dbussink >
but that uses the same type as DATA?
09:02
<
_ko1 >
i don't want to increase another type
09:03
<
dbussink >
since it behaves differently in all other places
09:03
<
_ko1 >
and i think all of T_DATA should move to T_TYPEDDATA
09:07
<
_ko1 >
the purpose of T_DATA and T_TYPEDDATA is same
09:07
<
_ko1 >
extended version of T_DATA
09:07
<
dbussink >
_ko1: is there a way to see the difference between a T_DATA or T_TYPEDDATA for a given object?
09:07
<
dbussink >
whether it's one or the other?
09:07
<
_ko1 >
so T_TYPEDDATA is a part of T_DATA
09:07
<
dbussink >
if you don't know the type in the code that calls it?
09:08
<
_ko1 >
#define RTYPEDDATA_P(v) (RTYPEDDATA(v)->typed_flag == 1)
09:08
<
dbussink >
right, but that would reinterpret the struct right?
09:08
<
_ko1 >
re-interpret
09:08
<
dbussink >
but it assumes that for RDATA the free function never would have 1 as the value right
09:09
<
_ko1 >
you are right
09:09
<
dbussink >
random reinterpretations like that make it really hard to support this stuff :(
09:09
<
_ko1 >
sorry i don't know what is "random reinterpretations"
09:10
<
dbussink >
well, you interpret something that is an RDATA struct as an RTYPEDDATA struct
09:10
<
dbussink >
which puts restrictions on what they can be
09:10
<
dbussink >
so it makes it harder to support the same api in rubinius
09:10
<
_ko1 >
i recommend that
09:10
<
dbussink >
recommend what?
09:11
<
_ko1 >
only support getter for free/mark/type
09:11
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
09:11
<
_ko1 >
for T_DATA (and T_TYPEDDATA)
09:11
<
_ko1 >
ah. and getter for data.
09:11
<
dbussink >
not sure what you mean
09:11
<
dbussink >
not support RTYPEDDATA_P?
09:11
<
dbussink >
problem is that we can't really decide that
09:12
<
dbussink >
since it's in mri people use it
09:12
<
dbussink >
so we have to support it
09:12
<
dbussink >
it's often not a choice for us
09:12
<
_ko1 >
and i don't understand your problem yet.
09:13
<
_ko1 >
what is the problem with checking it dynamically?
09:13
<
dbussink >
well, it means we also have to create compatible struct layouts
09:13
<
dbussink >
and can't use something that would work better for us
09:13
<
_ko1 >
why can't you use the same definition?
09:13
<
dbussink >
because rdata and rtypeddata are different but also not
09:14
<
dbussink >
well, i have to now
09:14
<
dbussink >
i was looking at something else that was easier for us to implement this
09:14
<
_ko1 >
i feel current MRI implementation is easy.
09:15
<
_ko1 >
maybe some problem are there in rubinius to support it.
09:15
<
dbussink >
well, i guess it's easy for mri yes
09:15
<
dbussink >
but reinterpreting types etc. in general will often lead to problems i think
09:15
<
dbussink >
because it makes things not explicit
09:15
<
dbussink >
and forces certain models to be followed
09:16
<
_ko1 >
maybe we don't share the type/data model.
09:16
<
dbussink >
exactly, but these details force a compatibility layer for that
09:16
<
dbussink >
that's exactly my point :)
09:17
<
dbussink >
ideally i would wish the RDATA or RTYPEDDATA struct was not part of the api at all
09:17
<
dbussink >
and everything would work through just using functions
09:17
<
dbussink >
so the implementation details stay hidden
09:17
<
_ko1 >
i can understand.
09:19
<
_ko1 >
sorry i need to go meeting
09:19
<
_ko1 >
maybe i need to know more.
09:19
<
dbussink >
no problem, thanks for the answers!
09:59
soba has quit [Ping timeout: 245 seconds]
10:15
wudofyr has quit [Ping timeout: 264 seconds]
10:17
shiba___ has joined #ruby-core
10:18
wudofyr has joined #ruby-core
10:35
nari has joined #ruby-core
11:12
shiba___ has quit [Ping timeout: 246 seconds]
11:18
Domon has quit [Remote host closed the connection]
11:18
Domon has joined #ruby-core
11:22
Domon has quit [Ping timeout: 240 seconds]
12:37
marcandre has joined #ruby-core
12:58
closer has quit [Ping timeout: 260 seconds]
13:00
closer has joined #ruby-core
14:04
nagachika has quit [Remote host closed the connection]
14:08
hsbt has quit [Ping timeout: 264 seconds]
14:12
nokada has quit [Remote host closed the connection]
14:12
nokada has joined #ruby-core
14:20
hsbt has joined #ruby-core
14:37
nagachika has joined #ruby-core
14:42
enebo has joined #ruby-core
14:42
ahegyi_ has joined #ruby-core
14:50
corundum has quit [Read error: Connection reset by peer]
14:52
charliesome has joined #ruby-core
14:55
nari has quit [Ping timeout: 264 seconds]
15:00
ahegyi_ has quit [Ping timeout: 276 seconds]
15:01
ahegyi_ has joined #ruby-core
15:11
headius has joined #ruby-core
15:13
shinnya has joined #ruby-core
15:13
corundum has joined #ruby-core
15:56
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
15:58
charliesome has joined #ruby-core
16:13
tylersmith has joined #ruby-core
16:16
tylersmith has quit [Remote host closed the connection]
16:31
enebo has quit [Quit: enebo]
16:42
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
16:44
charliesome has joined #ruby-core
16:53
hsbt has quit [Ping timeout: 260 seconds]
16:53
hsbt has joined #ruby-core
17:03
charliesome has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
17:07
DanKnox_away is now known as DanKnox
17:17
nagachika has quit [Remote host closed the connection]
17:23
hsbt has quit [Ping timeout: 276 seconds]
17:24
hsbt has joined #ruby-core
17:41
DanKnox is now known as DanKnox_away
17:47
r0bgl33s0n has quit [Quit: WeeChat 0.4.1]
17:57
DanKnox_away is now known as DanKnox
18:00
tenderlove has joined #ruby-core
18:06
enebo has joined #ruby-core
18:26
rrpy has joined #ruby-core
18:33
rrpy is now known as rrmartins
18:36
corundum has quit [Read error: Connection reset by peer]
18:39
pumper has joined #ruby-core
18:53
pumper has left #ruby-core ["Leaving"]
19:00
rrmartins has quit [Remote host closed the connection]
19:15
tenderlove has quit [Remote host closed the connection]
19:36
nokada has quit [Remote host closed the connection]
20:31
marcandr_ has joined #ruby-core
20:32
marcandre has quit [Read error: Connection reset by peer]
20:32
marcandr_ has quit [Read error: Connection reset by peer]
20:33
marcandre has joined #ruby-core
20:47
DanKnox is now known as DanKnox_away
20:49
tenderlove has joined #ruby-core
20:56
ahegyi_ has quit [Read error: Operation timed out]
21:02
r0bgleeson has joined #ruby-core
21:03
tylersmith has joined #ruby-core
21:04
enebo has quit [Quit: enebo]
21:10
corundum has joined #ruby-core
21:15
corundum has quit [Quit: seeya]
21:21
corundum has joined #ruby-core
21:54
DanKnox_away is now known as DanKnox
22:02
samsaffron has joined #ruby-core
22:03
<
samsaffron >
does anyone know how memsize_of is meant to work ?
22:03
<
samsaffron >
irb(main):005:0> ObjectSpace.memsize_of("hello")
22:03
<
drbrain >
samsaffron: I think you need to compile with a flag to enable it
22:04
<
samsaffron >
oddly though:
22:04
<
samsaffron >
irb(main):011:0> ObjectSpace.memsize_of(String)
22:04
<
samsaffron >
=> 6248
22:04
<
drbrain >
maybe I'm thinking of a different thing then
22:05
<
drbrain >
I'm thinking of CALC_EXACT_MALLOC_SIZE which may do something different
22:17
enebo has joined #ruby-core
22:18
<
samsaffron >
it's embedded strings
22:18
<
samsaffron >
irb(main):032:0> ObjectSpace.memsize_of("a"*23)
22:18
<
samsaffron >
irb(main):033:0> ObjectSpace.memsize_of("a"*24)
22:18
<
drbrain >
oh, right!
22:19
<
samsaffron >
though embedded strings still take up memory, so I need to estimate something there
22:19
<
drbrain >
you need to know the size of a struct RObject { }
22:20
<
drbrain >
you can probably use a fixed value based on ObjectSpace.memsize_of("a"*23)
22:20
<
drbrain >
if you can embed 23 characters you have a 64 bit machine, otherwise 32 bit
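A short Ruby sketch for probing what memsize_of reports at different string lengths. The helper name memsize_by_length and the chosen lengths are assumptions for this example. The reported values and the length at which a string stops being embedded depend on the Ruby version and platform, so no expected output is shown; the 23-character boundary mentioned above is what the participants observed on their build.

```ruby
require 'objspace'

# Maps each requested length to the byte size that memsize_of reports
# for a string of that many "a" characters.
def memsize_by_length(lengths)
  lengths.each_with_object({}) do |n, sizes|
    sizes[n] = ObjectSpace.memsize_of("a" * n)
  end
end

memsize_by_length([0, 23, 24, 100, 1000]).each do |n, bytes|
  puts format("%5d chars -> %d bytes reported", n, bytes)
end
```

A jump in the reported size between two adjacent lengths would suggest where the embedding limit sits on the build being probed.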
22:21
marcandre has quit [Remote host closed the connection]
22:26
nari has joined #ruby-core
22:42
nari has quit [Ping timeout: 264 seconds]
22:45
<
samsaffron >
cannot think of any other way of getting the size of RVALUE
22:46
<
drbrain >
samsaffron: you can calculate it from C and pick the 64 bit or 32 bit size based on how much string you can embed
22:47
<
samsaffron >
I am not sure I want to ship a c extension with my memory profiler :(
22:48
<
drbrain >
you don't need to
22:48
<
drbrain >
it's going to be the same size on all platforms, you compute it today and stick it in a constant
23:06
headius has quit [Quit: headius]
23:08
headius has joined #ruby-core
23:13
tenderlove has quit [Remote host closed the connection]
23:15
headius has quit [Quit: headius]
23:21
nokada has joined #ruby-core
23:24
enebo has quit [Quit: enebo]
23:58
nari has joined #ruby-core