egg|nomz|egg changed the topic of #kspacademia to: https://gist.github.com/pdn4kd/164b9b85435d87afbec0c3a7e69d3e6d | Dogs are cats. Spiders are cat interferometers. | Космизм сегодня! | Document well, for tomorrow you may get mauled by a ネコバス. | <UmbralRaptor> egg|nomz|egg: generally if your eyes are dewing over, that's not the weather. | <ferram4> I shall beat my problems to death with an engineer.
<egg|zzz|egg> bofh: um yes but that x^3-y is part of the cbrt implementation,
<egg|zzz|egg> bofh: how many layers of cbrt are you on,
<egg|zzz|egg> bofh: although maybe I could distribute the numerator to make it less cancelly than precisely the thing that I'm trying to make most cancelly
<bofh> egg|zzz|egg: ohh I see what you mean. I don't *think* you're losing too many bits there...
<egg|zzz|egg> bofh: I think that's why I'm not improving things with better guess + Householder 6 vs. Householder 10 and it's all at a ridiculous 0,9 ULPs
<bofh> It's possible (tho I maintain anything < 1ULP is good), actually. Hm. I actually wonder if just doing the *subtraction* in double-double suffices to help at all.
<egg|zzz|egg> bofh: also I appear to be alternating between the two decimal markers defined by CGPM 9 resolution 7/CGPM 22 resolution 10/ISO 31-0 3.3/ISO 80000-1 7.3.2 :-p
<egg|zzz|egg> bofh: nah any egg can do faithfully rounded, it's only interesting if you try to get close to correctly rounded
<egg|zzz|egg> bofh: the subtraction is exact
<egg|zzz|egg> the problem is not the rounding of the subtraction, it is the condition of the subtraction
<bofh> Ahh. Right.
<bofh> also I maintain what is interesting is faithfully rounded with optimal perf :p
<egg|zzz|egg> bofh: the true answer is probably that you need an army of numericists to do the end-to-end error analysis and get optimal perf for whatever precision you need
<egg|zzz|egg> (and optimal precision within that perf if magic numbers & polynomials are tunable)
<egg|zzz|egg> bofh: e.g. if it's the step resizing factor in adaptive stepsize then touching the FPU for your rootn is sinful
<egg|zzz|egg> since you multiply the result with 0,9 anyway :-p
<egg|zzz|egg> bofh: but, https://twitter.com/FioraAeterna/status/968884994167881728 (1), so sub-ULP accuracy is good
<kmath> <FioraAeterna> @f1ac5 there's a few other possibilities! ⏎ ⏎ 1. you have a fetish for floating point ⏎ 2. you need predictable wrongnes… https://t.co/1lxO0i9PIk
<bofh> also you're making me want to revisit and improve the accuracy of my Airy function implementation past ~3ULP worst-case error for negative values near zeroes but fucking hell that thing is a *nightmare*.
<egg|zzz|egg> bofh: uh so distributing that cancellation in the 6th order householder somehow makes it unfaithful Ꙩ_ꙩ
<egg|zzz|egg> did I just introduce a bigger cancellation,
<bofh> ...*wat*
<bofh> how the hell
<egg|zzz|egg> plz2halp with the error analysis of
<egg|zzz|egg> double const numerator =
<egg|zzz|egg> x * ((5 * x³ + 12 * y) * x⁶ - (12 * x³ + 5 * y) * y²);
<egg|zzz|egg> return x - numerator / denominator;
<egg|zzz|egg> double const denominator = (7 * x³ + 42 * y) * x⁶ + (30 * x³ + 2 * y) * y²;
<egg|zzz|egg> oh
<egg|zzz|egg> uh
<egg|zzz|egg> yeah I just made the cancellation worse
<egg|zzz|egg> *sobbing*
<bofh> firsyt off why aren't you precomputing z = x^3 as an intermediate and using that in your numerator/denominator expressions? but otherwise... that numerator looks like it can have nasty cancellation in some cases.
<egg|zzz|egg> bofh: superscript digits are perfectly good characters in identifiers,
<bofh> ...
<egg|zzz|egg> seriously this is way more legible this way
<bofh> okay, so I should assume any high-order superscript digits actually have the common sub-powers precomputed?
<egg|zzz|egg> yeah, probably russian peasant
<egg|zzz|egg> (wait is it called that in english too or does the above sentence sound completely nonsensical)
<bofh> yeah, the russian peasant algo
<egg|zzz|egg> okay, it's called the same in french funnily enough
<egg|zzz|egg> (paysan russe)
<egg|zzz|egg> bofh: I mean it's just perfectly good C++, you should assume it's C++ and so the things that are identifiers are identifiers :-p
<egg|zzz|egg> bofh: and the numerator is not just cancelly in some cases, it's *actively trying to cancel* since you try to get x close to the cube root of y
<egg|zzz|egg> bofh: okay so basically what I'd like is a high-order iterate that is a rational function in (x-y), x, and y with positive coefficients
<egg|zzz|egg> bofh: also a pony
<bofh> Yeah I don't *think* you'll get that, particularly not the "only positive coeffs" bit I don't think.
<egg|zzz|egg> bofh: because the Householder stuff seems to be inherently heavily cancelly,
<egg|zzz|egg> there are methods other than householder though, kahan lists two of order 5
e_14159 has quit [Ping timeout: 198 seconds]
<bofh> egg|zzz|egg: I suspect it's just Householder composed with some smooth function, but am not sure.
e_14159 has joined #kspacademia
<egg|zzz|egg> bofh: hm but can it help with the cancelly bits
<egg|zzz|egg> bofh: like if you can pull off some trick like the conjugate multiplication thing maybe you can make this non-horrible
<egg|zzz|egg> gah we really need to get the cat in here for advice on this stuff
<bofh> egg|zzz|egg: I don't *think* this can be made non-cancellable via conjugate multiplication or similar, but let me think.
<egg|zzz|egg> maybe by linking moar catpics https://twitter.com/whitequark/status/980575216047833089
<kmath> <whitequark> https://t.co/jSmmXQdRfy
<egg|zzz|egg> !wpn котя
* Qboid gives котя a thoracic buffered truffle
ferram4 has quit [Read error: Connection reset by peer]
ferram4 has joined #kspacademia
SnoopJeDi has joined #kspacademia
<kmath> <6b766e> https://t.co/8HQmE64KAE
<kmath> <x86instructions> VGATHERDPS - Summon Very Angry Bears (Circular Formation)
<kmath> <brettmor> Shout out to all of the grads who applied and *didn't* get a fellowship this year. You're in good company! https://t.co/Sq7qhYDg3J
Technicalfool_ has quit [Ping timeout: 383 seconds]
<egg|cell|egg> Bofh: I guess that's why the cat looks for a perfect cube in a table
<bofh> egg|cell|egg: oh?
<bofh> (I *extremely* dislike that table fwiw)
<egg|cell|egg> Bofh: for a perfect cube x³-y will be nice to compute
<egg|cell|egg> But what if you 0 the low bits of your guess?
<egg|cell|egg> Hmm
<egg|cell|egg> Nerd sniping intensifies
<bofh> egg|cell|egg: yes but now you're storing a large table for at best fractional ULP improvement, this is not better at all imho.
<egg|cell|egg> Bofh: but fractional ulps are good,
<egg|cell|egg> Bofh: but 0ing bits of your guess might work
<egg|cell|egg> How many sigbits can you have and still have X*X*X exact
<egg|cell|egg> !Wpn bofh
* Qboid gives bofh a coverage-guided condenser
<bofh> hrm
<bofh> isn't it always going to be exact in the absence of overflow or underflow?
<egg|cell|egg> Wat
<egg|cell|egg> Bofh: if it were exact there would be no cancellation issue
armed_troop has quit [Ping timeout: 182 seconds]
armed_troop has joined #kspacademia
<bofh> okay, right, but okay I'm actually having trouble thinking of when multiplication is inexact in the absence of over/underflow
<egg|cell|egg> When the result has more than 53 bits?
<egg|cell|egg> So I guess the guess must have 17 bits or so
<egg|zzz|egg> bofh: okay I actually forgot this part in the kahan paper, he covers this (that explains why I got shitty precision on his method) "must be accurate to almost, and *rounded* to at most, a third as many figures as the arithmetic carries"
<egg|zzz|egg> bofh: so if I do that maybe I don't get 0,89 ULPs where he claims 0,59
<bofh> Ahh, I see.
<bofh> I'm.fairly certain the approximate_rootn.pdf guess has between 5 & 6 bits, & one Newton iterate won't put that above 12, so you're good?
<egg|zzz|egg> bofh: well still need to actually do the bitand with 0xFFFF'FFF0'0000'0000
<egg|zzz|egg> bofh: once I do that it is *a lot* better
<egg|zzz|egg> wow
<egg|zzz|egg> also what is up with the cat's implementation, it has comparatively pretty bad rounding?
<egg|zzz|egg> bofh: so, max ULPs is a bit noisy on 10 000 values, looking at frequency of incorrect rounding instead
<egg|zzz|egg> Atlas: 0.0861
<egg|zzz|egg> Householder 10: 0.0147
<egg|zzz|egg> Newton on the inverse + Householder 6: 0.0136
<egg|zzz|egg> Kahan 0.0011 (!!)
<egg|zzz|egg> Microsoft: 0.2803 (lol)
<bofh> wait what the shit, how did clobbering bits we don't care about *improve* the result? the multiplication of 17 bits^3 vs 53 bits^3 where 24 are garbage should have the same amount of useful bits: the garbage ones get shifted out in the multiply anyway.
<bofh> let alone that good of a result, damn.
<egg|zzz|egg> for max ULPs on my 10 000 test values:
<egg|zzz|egg> Atlas: 0.71814
<egg|zzz|egg> Householder 10: 0.57580
<egg|zzz|egg> Newton then Householder 6: 0.58524
<egg|zzz|egg> Kahan: 0.55148
<egg|zzz|egg> Microsoft: 1.2779 (lol)
<egg|zzz|egg> bofh: well because clobbering the bits means you have no error in x³-y instead of a massive cancellation
<egg|zzz|egg> bofh: so your correction term from x is much better
<egg|zzz|egg> otherwise your correction term is mostly rounding errors from the computation of x³
<bofh> ohh, right, rounding is a thing.
<bofh> duh.
<egg|zzz|egg> :D
<bofh> also I'm morbidly curious what MS does now, it's slower and not even accurate to <1ULP.
<egg|zzz|egg> bofh: I'm really surprised that atlas's method is that badly rounded; I'm actually using the C implementation from the ARM directory which uses his polynomial and table but is not by him, maybe it has additional errors?
<bofh> Possibly, I'm not sure. I wonder if expressions got replaced with mathematically equivalent but ill-conditioned ones or something, hm.
<bofh> (alternatively it's possible he didn't care at the time. he *did* say recently he needs to revisit it :P)
<egg|zzz|egg> now with 100 000 random values: incorrectness frequencies for the same: {0.08777, 0.01624, 0.01384, 0.001, 0.27913}
<egg|zzz|egg> and max errors: 0.72486, 0.60568, 0.58524, 0.56219, 1.3531
<bofh> Not bad.
<egg|zzz|egg> bofh: obviously the high householders from a shitty guess are worse than Kahan's 4th order iterate from a near-optimal guess
<egg|zzz|egg> bofh: not sure how to get a 17-bit guess without a division though :-/
<bofh> inverse cbrt then 1 Newton iterate then do x * z * z with clever masking?
<bofh> wait, I thought Kahan and your Householders all used the exact same starting guess code?
<egg|zzz|egg> bofh: Kahan does a Halley then a 4th order iterate
<egg|zzz|egg> and rounds to 17 bits after the Halley
<egg|zzz|egg> that gives you a 17 bits guess which is how you get almost always correct rounding
<bofh> Ahh. Actually I'm curious how Halley, round to 17, Halley compares.
<egg|zzz|egg> bofh: that's not going to be faithful
<egg|zzz|egg> bofh: the bit clobbering trickery is only relevant when you're near the limits of the precision, otherwise you're far from caring about rounding
<egg|zzz|egg> bofh: and Halley can't get you from 17 bits to all so you can't use it as your last step
<bofh> Right, Halley will be at best non-faithful, at worst 1-3 bits off outright.
ferram4 has quit [Ping timeout: 198 seconds]
<egg|zzz|egg> bofh: well >1 ULP off is what unfaithful means right?
<egg|zzz|egg> bofh: I wonder whether you can do 2 rounds of newton on the inverse faster by Estrining it as a single horrible polynomial, then use that as your guess for a 4th order iteration
<bofh> So I considered doing that before but never tried it, it seems worth it tho.
<egg|zzz|egg> *nerd sniping intensifies*
tawny has quit [Ping timeout: 190 seconds]
APlayer has joined #kspacademia
ferram4 has joined #kspacademia
ferram4 has quit [Ping timeout: 198 seconds]
awang has quit [Ping timeout: 182 seconds]
awang has joined #kspacademia
tawny has joined #kspacademia
tawny has quit [Ping timeout: 182 seconds]
<kmath> <DrPhiltill> #TFW you get a calendar invitation to interview an applicant in the middle of the night — 2:30 a.m. on Saturday mor… https://t.co/yntg412Euj
ferram4 has joined #kspacademia
ferram4 has quit [Ping timeout: 198 seconds]
ferram4 has joined #kspacademia
ferram4 has quit [Ping timeout: 198 seconds]
Snoozee has quit [Ping timeout: 190 seconds]
Snoozee has joined #kspacademia
Snoozee is now known as Majiir
Technicalfool has joined #kspacademia
<UmbralRaptor> Humanity Star git Incredible Journey'd? https://twitter.com/Marco_Langbroek/status/981538723841150977
<kmath> <Marco_Langbroek> (1/5) ⏎ Remember @TheHumanityStar, and how its builders @RocketLab claimed it would be visible for 9 months? While in… https://t.co/6Wt9fwy1Ia
<APlayer> Huh, what? It reentered already?
<APlayer> I was totally planning to photograph it somewhen in early summer, and kind of put it on my long term to-do list... And now it's gone, just like that, half a year earlier than I was told
<APlayer> Well, whatever
ferram4 has joined #kspacademia
tawny has joined #kspacademia
ferram4 has quit [Ping timeout: 198 seconds]
ferram4 has joined #kspacademia
APlayer has quit [Ping timeout: 383 seconds]
<UmbralRaptor> s/git/got/
<Qboid> UmbralRaptor meant to say: Humanity Star got Incredible Journey'd? https://twitter.com/Marco_Langbroek/status/981538723841150977
<kmath> <Marco_Langbroek> (1/5) ⏎ Remember @TheHumanityStar, and how its builders @RocketLab claimed it would be visible for 9 months? While in… https://t.co/6Wt9fwy1Ia
APlayer has joined #kspacademia
<iximeow> from that thread
<iximeow> >(* and to validate my orbital lifetime estimates for an upcoming launch)
<iximeow> :D
tawny has quit [Ping timeout: 190 seconds]
<UmbralRaptor> \o/ (I hope)
Majiir has quit [Ping timeout: 190 seconds]
<kmath> <mcclure111> Why do we use floating point coordinates for game physics engines anyway. Why not fixed point
<SnoopJeDi> LOL
<UmbralRaptor> I think Doom used fixed point?
Snoozee has joined #kspacademia
Snoozee is now known as Majiir
<UmbralRaptor> !wpn Majiir
* Qboid gives Majiir a thorium symbol
<egg|zzz|egg> bofh: I am amused by that bit-clobbering trick to get an exact cube
<egg|zzz|egg> bofh: also I reinvented it this morning and then I found it in Kahan's thing, maybe I should read Kahan's thing more carefully :-p
* UmbralRaptor pokes Gaussian units in the random 4π.
<bofh> rofl. I mean same, since I completely overlooked it myself.
Technicalfool has quit [Ping timeout: 182 seconds]
tawny has joined #kspacademia
<egg|zzz|egg> !wpn rqou
* Qboid gives rqou a caffeinated metric with a radiator attachment
* rqou meows
UmbralRaptor is now known as NomalRaptor
<APlayer> Random (literally) question: You have a probability experiment with n different, equally likely outcomes, and you do it n times. I noticed that the probability of one specific outcome occurring at least once is very close to 2/3. Is there some sort of theorem or something for that?
<APlayer> Or is it a false assumption that the probability is next to 2/3?
<egg|zzz|egg> !wpn NomalRaptor
* Qboid gives NomalRaptor a toasted panzer
<NomalRaptor> So, E&M is this week, not last.
* egg|zzz|egg gives NomalRaptor a gold cabinet
<NomalRaptor> ???
<NomalRaptor> And I'm certain that one of the homework problems involved using a non-constructive proof to generate an equation for the potential around an object.
<SnoopJeDi> APlayer, the probability that you *don't* observe one of the outcomes is ((n-1)/n)**n, which in the lim n->∞ is 1/e
<SnoopJeDi> 1 - 1/e is close to 2/3
<SnoopJeDi> "close"
<NomalRaptor> ^
<NomalRaptor> e ≈ 3 ≈ π
<SnoopJeDi> n.b. ((n-1)/n)**n) = (1 - 1/n)**n
<APlayer> π ≈ 4?
<egg|zzz|egg> rounding to nearest to 1 bit, yes
<NomalRaptor> π ≡ 3
* NomalRaptor ducks.
<egg|zzz|egg> wasn't it bofh who had tales of π = 1?
<APlayer> Wait, is lim n -> inf ((n-1)/n)^n not the definition of e?
<SnoopJeDi> (1 + 1/n)**n, not -
<SnoopJeDi> hence 1/e
<APlayer> Ah, yes
<APlayer> Alright, so what I was seeing was actually approaching 1/e and not 1/3, haha
<APlayer> Well, this is interesting
<SnoopJeDi> n.b. this is only true if these are independent events, but I assumed that's what you meant
<APlayer> Because I used to (for no reason at all, don't ask me why) intuitively assume the probability in such cases to be 1/2
<APlayer> And used that to estimate things in life
<APlayer> Will correct to 2/3 for estimation purposes and 1 - 1/e for calculation purposes
<APlayer> Thanks for the explanation!
<kmath> <stephentyrone> I have no idea what’s happening here, but it’s wonderful. https://t.co/7PVDGPnQnx
<kmath> <stephentyrone> https://t.co/FTuvYktX9d
<egg|zzz|egg> NomalRaptor: well I gave Nomal a gold cabinet
<NomalRaptor> yay
<SnoopJeDi> APlayer, it vaguely reminds me of the Monty Hall problem but I'm not sure it's related
<APlayer> Was it the one with two goats and a car?
<SnoopJeDi> yea, that's the problem
<APlayer> Ah, I remember that
<APlayer> It was a fun one, though I couldn't solve it (not sure if I could now, but I had no knowledge about probabilities back then)
<awang> Hey guys, I have a present for you
<awang> I'm pretty sure this is supposed to be the EULA thing for an OS upgrade
<awang> I have no idea how it ended up like this
<APlayer> The Unicode gods were displeased with your sacrifice
* awang hurries to the store to buy more �
<APlayer> !u �
<Qboid> U+FFFD REPLACEMENT CHARACTER (�)
<APlayer> What
<APlayer> This was not informative at all, haha
<APlayer> Ah, wait, so this is /supposed/ to be a tofu?
<awang> Um
<awang> !u �
<Qboid> U+FFFD REPLACEMENT CHARACTER (�)
<awang> ^That's what it is on my computer, idk about tofu
<APlayer> Some random chinese site I found on a Google search: https://houkanshan.com/img/fontonload/fffd.png
egg|cell|egg has quit [Ping timeout: 383 seconds]
Technicalfool has joined #kspacademia
* NomalRaptor throws Mario head first at that page,
<SnoopJeDi> "Mac: um we're indistinguishable from PC now but we don't have chromatic abberation?"
<SnoopJeDi> tangentially: PlayerUnknown's Battlegrounds has some of the most satisfying rifle scopes I've used in a videogame now that they model parallax and spherical/chromatic aberration (although I don't play milsim so...?)
<egg|zzz|egg> bofh: I have something which is correctly rounded on 100 000 random values
<egg|zzz|egg> bofh: it is slightly slower than Kahan's method though
<egg|zzz|egg> bofh: okay wat double-newton on the inverse is actually slower than Halley despite its fdiv? how the hell did I write my double-newton to get that
<bofh> What on *earth*? I want to see how that works on your Sandy Bridge desktop too fwiw
<egg|zzz|egg> bofh: I suspect I just wrote that polynomial badly and am getting unpleasant deps
<egg|zzz|egg> bofh: okay any obviousness here (the divisions in the constant obviously are evaluated at compile time so don't worry about those) https://github.com/eggrobin/Principia/blob/fa4d2d4/benchmarks/cbrt.cpp#L159-L167
<egg|zzz|egg> obviously an annoying thing is that you can't get to the polynomial til you've evaluated r³y
<egg|zzz|egg> hmm
<egg|zzz|egg> is it iacatime again
<egg|zzz|egg> i do not like iacatime
<bofh> iacatime?
<bofh> also sec, I have to run for a bit atm
<egg|zzz|egg> bofh: IACA
<egg|zzz|egg> IACA?
<Qboid> egg|zzz|egg: [IACA] => Intel® Architecture Code Analyzer
* egg|zzz|egg stares at the graph
<egg|zzz|egg> this tree is not very wide
<egg|zzz|egg> maybe I should look at the polynomial x * x * y directly
<egg|zzz|egg> oooor maybe looking at something that summons 32nd powers is not such a good idea
<egg|zzz|egg> but this chain of three muls at the end is shit
<egg|zzz|egg> bofh: IACA says the Halley is 53 cycles on Haswell, 55 on Sandy, whereas the above double-newton-inverse is 60 on both
<egg|zzz|egg> yeah I can bring it down a bit more but it just isn't a very nice polynomial tbh
* NomalRaptor sobs.
NomalRaptor is now known as UmbralRaptor
<egg|zzz|egg> bofh: okay further care brings it faster than Halley, sanity is restored
<bofh> egg|zzz|egg: okay yeah I figured. mind uploading the optimized expression? (just got back).
<egg|zzz|egg> bofh: still slower than non-Estrin Householder, but now faster than Kahan
<egg|zzz|egg> and better rounded
<egg|zzz|egg> I beat the wolf on all metrics, but beating the cat similarly might be harder
awang has quit [Ping timeout: 182 seconds]
awang has joined #kspacademia
<egg|zzz|egg> bofh: okay I arguably beat the cat with just 10th order householder, but that's not very well rounded
<egg|zzz|egg> bofh: I vaguely wonder whether we can compute the cube of the guess in integer arithmetic
<bofh> it's only ~6 bits precision, no? so your mantissa product becomes just a regular multiply plus some shifts. I *highly* doubt it'll be faster than the fp cube tho.
<egg|zzz|egg> bofh: for the 6-bit one you're nicely using your throughput with (r * y) * (r * r) so that's fine
<egg|zzz|egg> bofh: more annoying is x * x * x on the 17-bit one
<egg|zzz|egg> it's exact, but you can't do anything until you have this cube
<egg|zzz|egg> wait nevermind there are plenty of constant * y you can do
<egg|zzz|egg> bofh: anyway I should try Householder 5th instead of 6th to see if it's still correctly rounded
<egg|zzz|egg> bofh: amusingly, Householder 6th with Estrin is faster than Kahan's Hornery 4th order method?!
<egg|zzz|egg> Horner in main()
<egg|zzz|egg> !wpn UmbralRaptor
* Qboid gives UmbralRaptor a FITS radiography
<egg|zzz|egg> that makes a surprising amount of sense
APlayer has quit [Ping timeout: 383 seconds]
<bofh> egg|zzz|egg: I mean I'm not *too* surprised, Estrin is fast and I've come to the conclusion people need to use it much more.
<egg|zzz|egg> bofh: high order is good, estrin is good, forget everything you learned about hornering all the things and cubic splines and leaprogging newtons
<egg|zzz|egg> :D
<kmath> <FakeUnicode> ᴀ ⏎ 𝝖ͣᵃᴀ A ⏎  ͣᴬAᴀ𝝖ͣ ⏎   ͣ Aᴬᴀ   A ⏎   ͣ ᴬAᴀͣᴬ𝝖ᴀ ⏎    𝝖ͣᴀᵃAₐ ᴬ ⏎ A  ᴬᴀ·Aͣᴀ ᴬA ⏎    ᴬ Aₐᴀᵃ𝝖ᴀ𝝖 ͣ ⏎   A 𝝖 ᴀͣᴬAₐᴀ ⏎     A 𝝖 ᴬͣᴀ A ᴀ… https://t.co/Cmcz1Zosye
<bofh> egg|zzz|egg: so cubic splines always sucked for transcendental functions :P my issue with Estrin is it generally only started beconing useful for higher orders, which used to be not that good perf-wise. But now that they *are*...
<bofh> becoming*
<egg|zzz|egg> ΑАA
<egg|zzz|egg> bofh: well it can be useful for low orders tbh
<egg|zzz|egg> it's just that for low orders it's not called Estrin, it's called "as written in the monomial basis") :-p
<egg|zzz|egg> OK I can get a bit more speed if I go down to 4th order Householder (5th performs as 6th), but then I'm not correctly-rounded anymore on 100 000 random inputs, I have 0,021 % incorrect roundings and 0,502 ULPs max on those test values (probably bigger max in general, also I don't have 2 sig dec on that percentage)
<egg|zzz|egg> bofh: not worth it for a perf improvement from 37ish ns to 36ish ns
<egg|zzz|egg> (Atlas's method and Householder 10 are 25ish)
<egg|zzz|egg> bofh: also Atlas's method is way worse than Householder 10 rounding-wise so basically it's all householders? until the cat invents something smarter
<egg|zzz|egg> bofh: I should ask the cat whether that 8,8 % makes sense or is the implementer screwing up his fancy polynomials though
<bofh> Yeah, I'm curious myself. Totally unsurprised Householder works well, you can imagine my confusion a few days ago when it seemed otherwise :P
<egg|zzz|egg> bofh: it's all about that 36-bit clobbering :-p
awang has quit [Ping timeout: 383 seconds]
<bofh> Apparently :P
<egg|zzz|egg> bofh: the implementation of Atlas's method is by https://twitter.com/jakevortex it seems
egg|phone|egg has joined #kspacademia
UmbralRaptor has quit [Quit: Bye]
UmbralRaptop has joined #kspacademia
<bofh> egg|zzz|egg: no idea who that is, but let me check over both atlas's code and the impl myself since I do know arm asm myself.
<egg|zzz|egg> bofh: the arm implementation is C, not arm asm
<egg|zzz|egg> that's why I use this one
<egg|zzz|egg> because feeding the intel asm for the whole function to MSVC is probably not going to be a pleasant experience
UmbralRaptor has joined #kspacademia
<bofh> egg|zzz|egg: so you linked me an impl in assembly, and I didn't scroll down far enough to see what ISA it was.
<UmbralRaptor> Jackson: Use the boundary conditions to find a unique solution. Also Jackson, let me give an example with a power series that's unlike the Taylor series you would actually find. https://photos.app.goo.gl/wOjc2OaefkSvqfYy1
<bofh> oh, it's x86_64 asm? rofl.
UmbralRaptor has quit [Client Quit]
<egg|zzz|egg> bofh: cbrtf.s is atlas's, for Intel
<bofh> that's easy why didn't I peer at it sooner
UmbralRaptor has joined #kspacademia
<egg|zzz|egg> cbrt.c is a reimplementation in C for ARM
<bofh> (I blame being busy and a bit exhausted)
<bofh> mind relinking cbrtf.s?
<egg|zzz|egg> and I use the ARM implementation and compile it for intel because linking the intel one doesn't seem fun
UmbralRaptop has quit [Ping timeout: 198 seconds]
<bofh> Right, you're on windows--wait the calling convention for floats is ~the same, so you should be fine.
<bofh> UmbralRaptor: wait, what's the weird power series expansion that you're referring to?
<bofh> UmbralRaptor: like I'm seeing the exact series solution he mentions, multiplied by a Legendre polynomial in cos(\theta)
<egg|zzz|egg> bofh: not sure what the calling convention is, but also even if it works just figuring out how to link it in sounds unfun :-p
<bofh> just add the object file into the call to link.exe or w/e MSVC uses nowadays? :P
<UmbralRaptor> bofh: \frac{1}{x-x'} = \frac{1}{r_\gt} \sum_{l=0}^\infin \left( \frac{r_\lt}{r_\gt} \right)^l
<SilverFox> what is the best method of while(true) without maxing the CPU?
<bofh> \frac{r_\lt}{r_\gt} <- the fuck is this?
<egg|zzz|egg> r</r> ? this is sounding like HTML
<bofh> is that r_</r_>???
<bofh> UmbralRaptor: I'm not sure this makes any sense, how do r_< & r_> relate to x & x'?
<UmbralRaptor> It's that sum about 1/3 of the way down page 103.
<bofh> (like this is half my problem with Jackson, he *loves* doing complicated coordinate transformations without any intermediate steps and you're like WTF)
<egg|zzz|egg> bofh: I hear Euler otoh goes into painstaking detail and copies giant expressions where he changes one thing :-p
<bofh> OH FUCK NEVERMIND I HATE THIS NOTATION SO MUCH
<bofh> r_< is min(|x|,|x'|) and r_< is max(|x|,|x'|)
<bofh> er, that second one should be a r_>
<egg|zzz|egg> bofh: J. Twit. Numer. Anal. https://twitter.com/eggleroy/status/981679873537298432
<kmath> <eggleroy> @stephentyrone I seem to have found a method faster than Kahan’s & which correctly rounds on those 100 000 inputs:… https://t.co/ugvPFJLIQr
<egg|zzz|egg> (which this tweet is too short to contain?)
<Majiir> !wpn egg|zzz|egg
* Qboid gives egg|zzz|egg a Chomskyan COME FROM
<Majiir> !wpn hatbot
* Qboid gives hatbot a bipartite pharmacy
<egg|zzz|egg> !wpn Majiir
* Qboid gives Majiir an explosive Poké Ball/sole hybrid
<Majiir> RIP hatbot
<egg|zzz|egg> it's been a while since we've seen hatbot
<egg|zzz|egg> ;seen hatbot
<kmath> egg|zzz|egg: hatbot (hatbot!~hatbot@ec2-54-167-168-157.compute-1.amazonaws.com) was last seen joining in #bottorture at 2017-10-14 01:05:05 +0000
<bofh> hatbot?
<bofh> egg|zzz|egg: nice!
* UmbralRaptor stares at the lack of > and < in LaTeX.
<UmbralRaptor> Oh, they work directly. o_O
<kmath> <bofh453> OH: "I contend that the SI unit of suffering is the milliJackson."
* UmbralRaptor was expecting some special command was needed.
<UmbralRaptor> Pretty much.
<UmbralRaptor> AN EXAMPLE SHOULD NOT LEAD THE STUDENT TO ABANDON THE METHOD THAT YOU'RE SHOWING BECAUSE YOU SEEMINGLY CONTRADICTED THE ENTIRE POINT OF BOUNDARY CONDITIONS AND UNIQUENESS!
<UmbralRaptor> Relatedly, I probably failed that test.
UmbralRaptop has joined #kspacademia
<bofh> UmbralRaptor: like now that I'm done WTFing at the horrible abuse of notation I'm EXTRA-WTFING at his choice of example
<egg|zzz|egg> bofh: would the notation be improved by CJK
<egg|zzz|egg> bofh: also the cat has liked my tweet, is this peer review,
UmbralRaptor has quit [Ping timeout: 182 seconds]