egg|nomz|egg changed the topic of #kspacademia to: https://gist.github.com/pdn4kd/164b9b85435d87afbec0c3a7e69d3e6d | Dogs are cats. Spiders are cat interferometers. | Cosmism today! | Document well, for tomorrow you may get mauled by a Catbus. | <UmbralRaptor> egg|nomz|egg: generally if your eyes are dewing over, that's not the weather. | <ferram4> I shall beat my problems to death with an engineer.
e_14159 has quit [Ping timeout: 198 seconds]
e_14159 has joined #kspacademia
UmbralRaptor has quit [Remote host closed the connection]
UmbralRaptop has joined #kspacademia
<kmath> <3PSboyd> Someone's a shoulder cat now. https://t.co/CmtQAh7RRq
egg|phone|egg has quit [Remote host closed the connection]
<egg> UmbralRaptop: hm, I don't plan on being there, but I guess I could go
<egg> UmbralRaptop: and Iskierka doesn't have a border to cross so that's easier
<egg> UmbralRaptop: ah right GIR2.AN.BAR seems to be another way of writing patru
<egg> !wpn UmbralRaptop
* Qboid gives UmbralRaptop a neap soliton
<egg> !wpn whitequark
* Qboid gives whitequark a transitive death
<egg> um
<egg> whitequark: why is your github profile pic still a black square
TonyC has joined #kspacademia
egg|phone|egg has joined #kspacademia
TonyC1 has quit [Ping timeout: 186 seconds]
egg|cell|egg has joined #kspacademia
egg|phone|egg has quit [Read error: Connection reset by peer]
egg|phone|egg has joined #kspacademia
egg|mobile|egg has joined #kspacademia
egg|phone|egg has quit [Read error: -0x1: UNKNOWN ERROR CODE (0001)]
egg|cell|egg has quit [Read error: Connection reset by peer]
<egg|work|egg> !wpn whitequark
* Qboid gives whitequark a Trojan state machine
<egg|work|egg> if i18n is internationalization, is i12n internationale
<egg|work|egg> s/2n/2e/g
<Qboid> egg|work|egg meant to say: if i18n is internationalization, is i12e internationale
<egg|work|egg> !wpn котя and the котяchrome kitten
* Qboid gives котя and the котяchrome kitten a nilpotent Balmer barber
<egg|work|egg> the nilpotent barber vanishes if he shaves himself sufficiently many times
egg|phone|egg has joined #kspacademia
egg|mobile|egg has quit [Read error: Connection reset by peer]
egg|cell|egg has joined #kspacademia
egg|phone|egg has quit [Read error: Connection reset by peer]
egg|cell|egg has quit [Read error: Connection reset by peer]
egg|phone|egg has joined #kspacademia
egg|cell|egg has joined #kspacademia
egg|mobile|egg has joined #kspacademia
egg|cell|egg has quit [Read error: -0x1: UNKNOWN ERROR CODE (0001)]
egg|phone|egg has quit [Ping timeout: 186 seconds]
egg|phone|egg has joined #kspacademia
egg|mobile|egg has quit [Ping timeout: 186 seconds]
egg|cell|egg has joined #kspacademia
egg|phone|egg has quit [Read error: Connection reset by peer]
egg|phone|egg has joined #kspacademia
egg|cell|egg has quit [Read error: Connection reset by peer]
egg|phone|egg has quit [Read error: Connection reset by peer]
<kmath> <FadAstra> @arxiv Two groups wrote papers in 8 hours based upon #GaiaDR2 https://t.co/qN1S5CO0pg
kmath has quit [Ping timeout: 186 seconds]
APlayer has joined #kspacademia
UmbralRaptop has quit [Quit: Bye]
UmbralRaptop has joined #kspacademia
<egg|work|egg> !wpn UmbralRaptop
* Qboid gives UmbralRaptop a contravariant polynomial
<egg|work|egg> !wpn whitequark
* Qboid gives whitequark a walrus
* egg|work|egg wonders whether котя will eat the walrus
<UmbralRaptop> !wpn egg|work|egg
* Qboid gives egg|work|egg an isothermal tarrasque
UmbralRaptor has joined #kspacademia
UmbralRaptop has quit [Ping timeout: 182 seconds]
* egg pets whitequark
<whitequark> hi
<egg> how are the cats
<egg> whitequark: if the cats were sequenced, what sort of things could we tell from their genomes?
<whitequark> still in hk
<egg> whitequark: do hk cats still flee?
<whitequark> i haven't seen any in a long while
<egg> hmm
<egg> whitequark: there are apparently cat cafes in hk
<egg> so you could look there i guess :-p
<whitequark> too shy
<SnoopJeDi> oh neat, today's speaker is one of the LIGO Nobel recipients and was also chair of COBE's science working group \o/
<UmbralRaptor> !
<APlayer> Are there any powers of two larger than 2⁰ that have an arbitrary first digit and only zeroes after that?
<APlayer> I don't think there are, are there?
<APlayer> Larger than 2³, even
<egg> whitequark: you or the cats
<whitequark> egg: me
<whitequark> APlayer: that means a power of two divisible by a power of ten
<whitequark> which is clearly absurd
<APlayer> Not just by "a" power of ten, but a specific power of ten
<APlayer> But alright, point taken, makes sense
<whitequark> 10=2*5, and no power of 2 is divisible by 5
* APlayer needs to fix his knowledge of primes, division and related subjects
<SnoopJeDi> it takes some practice before it's natural, imo
<SnoopJeDi> powers of 2 in particular are something people have dealt with often, though
<whitequark> i dunno, i don't think i touched that knowledge since primary school
<APlayer> But what if you have 2⁴ котяs?
<SnoopJeDi> whitequark, surely you've thought about powers of 2
<SnoopJeDi> or do you mean explicitly dealing with it in a classroom sense
<SnoopJeDi> vs adjacent to some unrelated task
<egg> there's nothing specific to 2 here, any number not divisible by 10 will do
<SnoopJeDi> yes, that's rather obvious
<SnoopJeDi> but, like most things mathematicians treat with, it's only obvious in hindsight, heh.
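The divisibility argument above (a digit followed only by zeroes is a multiple of 10, hence of 5, and no power of 2 is a multiple of 5) can be sanity-checked by brute force over the powers of two that fit in 64 bits. This is only a sketch; the function names `IsDigitThenZeroes` and `CountRoundPowersOfTwo` are made up for illustration.

```cpp
#include <cstdint>

// True if n is a single nonzero digit followed by at least one zero,
// e.g. 7000 or 20, but not 7 or 1200.
inline bool IsDigitThenZeroes(std::uint64_t n) {
  int zeroes = 0;
  while (n != 0 && n % 10 == 0) {
    n /= 10;
    ++zeroes;
  }
  return zeroes > 0 && n < 10;
}

// Counts how many of 2^1 .. 2^63 have that form. The argument in the chat
// says the answer must be zero, since such a number would be divisible by 5.
inline int CountRoundPowersOfTwo() {
  int count = 0;
  for (int e = 1; e < 64; ++e) {
    if (IsDigitThenZeroes(std::uint64_t{1} << e)) {
      ++count;
    }
  }
  return count;
}
```

The brute force only covers exponents below 64; the divisibility argument is what covers all of them.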
<egg> bofh: is there a difference in behaviour between those snippets? my compiler generates the former usually, but the latter if I do some intrinsics trickery to produce the thing that gets movqed https://hastebin.com/vayesamona.rb
<egg> (or whitequark or anyone who feels like looking at x86-64 nonsense)
<egg> wait I'm missing a chunk from the first one
<egg> yeah nevermind
<egg> (aside from the random mulsd at the beginning of the first one aaargh)
<egg> (and the mov rcx, 0xfffffff000000000 is also irrelevant)
<egg> mul vs. imul, and correspondingly different constants and shifts? Ꙩ_ꙩ why
<egg> oh derp _mm_cvtsi128_si64 returns a signed integer of course
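A small sketch of why the signedness matters: dividing a signed 64-bit value by 3 and dividing the same bits as unsigned are different operations, so a compiler has to emit different multiply-and-shift sequences for them (imul versus mul). The code below uses `std::memcpy` as a portable stand-in for the movq/`_mm_cvtsi128_si64` step; the helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Bit pattern of a double as a signed 64-bit integer, which is the type
// that _mm_cvtsi128_si64 returns.
inline std::int64_t BitsSigned(double x) {
  std::int64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  return bits;
}

// Signed division by 3: rounds toward zero and must handle negative inputs,
// so the generated code is typically based on a signed multiply.
inline std::uint64_t ThirdSigned(double x) {
  return static_cast<std::uint64_t>(BitsSigned(x) / 3);
}

// Casting to unsigned first gives a plain unsigned division by 3.
inline std::uint64_t ThirdUnsigned(double x) {
  return static_cast<std::uint64_t>(BitsSigned(x)) / 3;
}
```

The two agree whenever the sign bit is clear, and differ when it is set, for instance for -2.0.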
* egg stabs egg
<APlayer> Congratulations! You now have egg on a stick
<UmbralRaptor> s/котяs/котяс/
<bofh> egg: there's no reason to use intrinsics for scalar code imho.
<bofh> also those asm snippets in both cases are essentially ~equivalent perf-wise, wide imul and wide mul have ~identical thruput/latency on p. much any Intel/AMD.
<bofh> (the only case where the difference matters is in imuls small enough that the constant fits into its immediate field. but that's blatantly obviously not the case here).
<iximeow> in the last haste there's two mul in the first, only one in the second?
* iximeow rereads
<egg> bofh: well msvc refuses to emit sse2 pand etc. (even movq!) without intrinsics so I use that
<iximeow> ah nvm i see you mentioned that mulsd is unrelated
<egg> bofh: sadly the whole of C + Y / 3 is in the critical path and that I think I have to do with plain x86-64 stuff afaict
<egg> bofh: unless you can see a way to do it in sse2 or 3?
<egg> but without a 64-bit mul it seems infeasible :-/
* egg pokes _mm_mul_epu32 in the 32
<UmbralRaptor> mul_apin64
<bofh> egg: so like, even if you do it in sse2 it won't be faster, C + Y / 3 is fast in integer x86_64 >_>
<bofh> like, integer mul is 3 cycles latency *at most* since Nehalem, I think it might even be 2 now.
<bofh> and that's for all integer muls, incl. 64x64->128.
<egg> bofh: yes but you pay one cycle either way for the movqs
<bofh> yeah but that's all of 1 cycle, if that's actually seriously hurting your overall function perf, then your function is fast enough.
<bofh> :P
<egg> bofh: that's certainly noticeable for things like the clobbering to 16 bits
<bofh> for clobbering to 16 bits, sure, but that's b/c pand/andps makes sense to use there.
<egg> bofh: and for extracting the sign you need a couple of ands, and doing that in sse2 puts only one of them on the critical path
<egg> BM_EggCbrt 31594 ns 30762 ns 21333 +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00; -∛-2 = +nan
<egg> BM_EggSignedIntrinsicCbrt 32308 ns 32645 ns 24889 +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00; -∛-2 = +1.25992104989487319e+00
<egg> BM_EggSignedCbrt 33797 ns 32087 ns 22400 +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00; -∛-2 = +1.25992104989487319e+00
<bofh> wait, how on *earth* is extracting the sign on the critical path? you do it at the start of the function in the same path as the imul
<egg> bofh: wait what? don't I need to extract the sign before I do the linear approximation stuff?
<bofh> and then at the very end either OR it back in, or just do if (sign) x = -x before the return.
<egg> yeah I or it back in
<bofh> egg: uh the start of your f'n should be extract sign, absolute value, linear approx.
<bofh> and those can *all* be done in x86_64 from the initial movq.
<egg> bofh: oh the x86-64 stuff is superscalar too?
<bofh> I mean there are no data dependencies on the signbit once you extract it until the extreme end of the function...
<egg> bofh: yeah obviously
<egg> bofh: then again there's no reason not to do it in sse2 intrinsics (because then I get an m128i that I can just or at the end, instead of having to separately movq it back up)
<egg> bofh: but where to do abs is a good questino
<egg> s/tino/tion/ even
<egg> bofh: ah wait I have a dependency on abs y further down the line
<egg> bofh: so I'm better off doing that bitand in sse2 than movqing abs y back up too
<bofh> so like after the abs at the start of the code you *only* have a dependency on abs y, not y.
APlayer has quit [Read error: Connection reset by peer]
APlayer has joined #kspacademia
Technicalfool has joined #kspacademia
<egg> bofh: yes, but y comes to me in some xmm register, and I need abs y in one too
APlayer has quit [Ping timeout: 182 seconds]
<egg> bofh: so if I compute abs y from y with a pand/pandn I'm fine, otherwise I induce a (mild) dependency on the first movq and I need an additional movq that I wouldn't otherwise
<bofh> okay, I see your point.
<egg> bofh: amusingly (with haswell, and with the caveats of iaca occasionally being confused), the version with intrinsics is 120 cycles like the positives-only version (in practice and on skylake it's slightly slower, but noticeably less so than doing every integer operation through movqs, see above)
<egg> live, uh, lack of cat? https://www.youtube.com/watch?v=O4zKbwnojuY
<SnoopJeDi> um so apparently LIGO is limited at low frequencies by Brownian motion in the reflective optical coating on the test masses
<whitequark> uh
<SnoopJeDi> well, that's the way he phrased it anyhow. I liked it better as he rephrased it: losses in the material correspond directly with thermal noise
<SnoopJeDi> I didn't realize A+/Voyager were a thing, but apparently they're planning to implement a "squeezed light" (goddammit optics) technique this year to push down the radiation pressure noise
<egg> bofh: lolwtf, adding branches for rescaling to prevent over/underflow (and correctly handle subnormal values as a side effect) makes the benchmark (main branch) *faster*?!
<SnoopJeDi> surprising number of orders of magnitude left for a terrestrial GW observatory (if one believes the claims, but I think that's reasonable)
<egg> bofh: at least you're right that it doesn't slow things down at all :-p
<egg> all hail speggulative eggsecution
<Iskierka> for as long as we're allowed to use it
<egg> bofh: it's not like I tried doing anything smart either https://github.com/eggrobin/Principia/blob/cbrt-benchmarks/benchmarks/cbrt.cpp#L178-L188 (I need to refine and prove the values of smol and big, they're likely way too big and smol respectively)
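For readers who cannot open the link, the general idea of rescaling branches can be sketched as follows: inputs outside a trusted range are multiplied by a power of two whose exponent is a multiple of 3, the core routine runs on the rescaled value, and the result is scaled back by the cube root of that power. The thresholds `kSmol` and `kBig` below are hypothetical placeholders, not the values of smol and big in the linked file, and `CoreCbrt` uses `std::cbrt` purely as a stand-in for the routine being benchmarked.

```cpp
#include <cmath>

// Hypothetical thresholds; the linked code has its own (unproven) values.
const double kSmol = std::ldexp(1.0, -900);
const double kBig = std::ldexp(1.0, 900);

// Stand-in for a core routine that is only trusted on [kSmol, kBig].
inline double CoreCbrt(double y) { return std::cbrt(y); }

inline double RescaledCbrt(double y) {
  double const abs_y = std::fabs(y);
  if (abs_y < kSmol) {
    if (abs_y == 0) {
      return y;  // Preserve signed zero.
    }
    // Scale up by 2^300 (exact), then scale the result down by 2^100.
    return CoreCbrt(y * std::ldexp(1.0, 300)) * std::ldexp(1.0, -100);
  }
  if (abs_y > kBig) {
    // Scale down by 2^300 (exact), then scale the result up by 2^100.
    return CoreCbrt(y * std::ldexp(1.0, -300)) * std::ldexp(1.0, 100);
  }
  return CoreCbrt(y);
}
```

Because the scale factors are powers of two, the rescaling itself introduces no rounding error, and subnormal inputs become normal before the core routine sees them. For typical inputs both branches are never taken, which is consistent with the branch predictor making them nearly free.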
<UmbralRaptor> Ellied: what sort of spectral resolution do your glasses get?