egg|nomz|egg changed the topic of #kspacademia to: https://gist.github.com/pdn4kd/164b9b85435d87afbec0c3a7e69d3e6d | Dogs are cats. Spiders are cat interferometers. | Космизм сегодня! | Document well, for tomorrow you may get mauled by a ネコバス. | <UmbralRaptor> egg|nomz|egg: generally if your eyes are dewing over, that's not the weather. | <ferram4> I shall beat my problems to death with an engineer.
<egg|zzz|egg>
bofh: um yes but that x^3-y is part of the cbrt implementation,
<egg|zzz|egg>
bofh: how many layers of cbrt are you on,
<egg|zzz|egg>
bofh: although maybe I could distribute the numerator to make it less cancelly than precisely the thing that I'm trying to make most cancelly
<bofh>
egg|zzz|egg: ohh I see what you mean. I don't *think* you're losing too many bits there...
<egg|zzz|egg>
bofh: I think that's why I'm not improving things with better guess + Householder 6 vs. Householder 10 and it's all at a ridiculous 0,9 ULPs
<bofh>
It's possible (tho I maintain anything < 1ULP is good), actually. Hm. I actually wonder if just doing the *subtraction* in double-double suffices to help at all.
<egg|zzz|egg>
bofh: also I appear to be alternating between the two decimal markers defined by CGPM 9 resolution 7/CGPM 22 resolution 10/ISO 31-0 3.3/ISO 80000-1 7.3.2 :-p
<egg|zzz|egg>
bofh: nah any egg can do faithfully rounded, it's only interesting if you try to get close to correctly rounded
<egg|zzz|egg>
bofh: the subtraction is exact
<egg|zzz|egg>
the problem is not the rounding of the subtraction, it is the condition of the subtraction
<bofh>
Ahh. Right.
<bofh>
also I maintain what is interesting is faithfully rounded with optimal perf :p
<egg|zzz|egg>
bofh: the true answer is probably that you need an army of numericists to do the end-to-end error analysis and get optimal perf for whatever precision you need
<egg|zzz|egg>
(and optimal precision within that perf if magic numbers & polynomials are tunable)
<egg|zzz|egg>
bofh: e.g. if it's the step resizing factor in adaptive stepsize then touching the FPU for your rootn is sinful
<egg|zzz|egg>
since you multiply the result with 0,9 anyway :-p
<kmath>
<FioraAeterna> @f1ac5 there's a few other possibilities! ⏎ ⏎ 1. you have a fetish for floating point ⏎ 2. you need predictable wrongnes… https://t.co/1lxO0i9PIk
<bofh>
also you're making me want to revisit and improve the accuracy of my Airy function implementation past ~3ULP worst-case error for negative values near zeroes but fucking hell that thing is a *nightmare*.
<egg|zzz|egg>
bofh: uh so distributing that cancellation in the 6th order householder somehow makes it unfaithful Ꙩ_ꙩ
<egg|zzz|egg>
did I just introduce a bigger cancellation,
<egg|zzz|egg>
yeah I just made the cancellation worse
<egg|zzz|egg>
*sobbing*
<bofh>
first off why aren't you precomputing z = x^3 as an intermediate and using that in your numerator/denominator expressions? but otherwise... that numerator looks like it can have nasty cancellation in some cases.
<egg|zzz|egg>
bofh: superscript digits are perfectly good characters in identifiers,
<bofh>
...
<egg|zzz|egg>
seriously this is way more legible this way
<bofh>
okay, so I should assume any high-order superscript digits actually have the common sub-powers precomputed?
<egg|zzz|egg>
yeah, probably russian peasant
<egg|zzz|egg>
(wait is it called that in english too or does the above sentence sound completely nonsensical)
<bofh>
yeah, the russian peasant algo
<egg|zzz|egg>
okay, it's called the same in french funnily enough
<egg|zzz|egg>
(paysan russe)
<egg|zzz|egg>
bofh: I mean it's just perfectly good C++, you should assume it's C++ and so the things that are identifiers are identifiers :-p
<egg|zzz|egg>
bofh: and the numerator is not just cancelly in some cases, it's *actively trying to cancel* since you try to get x close to the cube root of y
<egg|zzz|egg>
bofh: okay so basically what I'd like is a high-order iterate that is a rational function in (x-y), x, and y with positive coefficients
<egg|zzz|egg>
bofh: also a pony
<bofh>
Yeah I don't *think* you'll get that, particularly not the "only positive coeffs" bit I don't think.
<egg|zzz|egg>
bofh: because the Householder stuff seems to be inherently heavily cancelly,
<egg|zzz|egg>
there are methods other than householder though, kahan lists two of order 5
e_14159 has quit [Ping timeout: 198 seconds]
<bofh>
egg|zzz|egg: I suspect it's just Householder composed with some smooth function, but am not sure.
e_14159 has joined #kspacademia
<egg|zzz|egg>
bofh: hm but can it help with the cancelly bits
<egg|zzz|egg>
bofh: like if you can pull off some trick like the conjugate multiplication thing maybe you can make this non-horrible
<egg|zzz|egg>
gah we really need to get the cat in here for advice on this stuff
<bofh>
egg|zzz|egg: I don't *think* this can be made non-cancellable via conjugate multiplication or similar, but let me think.
<kmath>
<brettmor> Shout out to all of the grads who applied and *didn't* get a fellowship this year. You're in good company! https://t.co/Sq7qhYDg3J
Technicalfool_ has quit [Ping timeout: 383 seconds]
<egg|cell|egg>
Bofh: I guess that's why the cat looks for a perfect cube in a table
<bofh>
egg|cell|egg: oh?
<bofh>
(I *extremely* dislike that table fwiw)
<egg|cell|egg>
Bofh: for a perfect cube x³-y will be nice to compute
<egg|cell|egg>
But what if you 0 the low bits of your guess?
<egg|cell|egg>
Hmm
<egg|cell|egg>
Nerd sniping intensifies
<bofh>
egg|cell|egg: yes but now you're storing a large table for at best fractional ULP improvement, this is not better at all imho.
<egg|cell|egg>
Bofh: but fractional ulps are good,
<egg|cell|egg>
Bofh: but 0ing bits of your guess might work
<egg|cell|egg>
How many sigbits can you have and still have X*X*X exact
<egg|cell|egg>
!Wpn bofh
* Qboid
gives bofh a coverage-guided condenser
<bofh>
hrm
<bofh>
isn't it always going to be exact in the absence of overflow or underflow?
<egg|cell|egg>
Wat
<egg|cell|egg>
Bofh: if it were exact there would be no cancellation issue
armed_troop has quit [Ping timeout: 182 seconds]
armed_troop has joined #kspacademia
<bofh>
okay, right, but okay I'm actually having trouble thinking of when multiplication is inexact in the absence of over/underflow
<egg|cell|egg>
When the result has more than 53 bits?
<egg|cell|egg>
So I guess the guess must have 17 bits or so
<egg|zzz|egg>
bofh: okay I actually forgot this part in the kahan paper, he covers this (that explains why I got shitty precision on his method) "must be accurate to almost, and *rounded* to at most, a third as many figures as the arithmetic carries"
<egg|zzz|egg>
bofh: so if I do that maybe I don't get 0,89 ULPs where he claims 0,59
<bofh>
Ahh, I see.
<bofh>
I'm fairly certain the approximate_rootn.pdf guess has between 5 & 6 bits, & one Newton iterate won't put that above 12, so you're good?
<egg|zzz|egg>
bofh: well still need to actually do the bitand with 0xFFFF'FFF0'0000'0000
<egg|zzz|egg>
bofh: once I do that it is *a lot* better
<egg|zzz|egg>
wow
<egg|zzz|egg>
also what is up with the cat's implementation, it has comparatively pretty bad rounding?
<egg|zzz|egg>
bofh: so, max ULPs is a bit noisy on 10 000 values, looking at frequency of incorrect rounding instead
<egg|zzz|egg>
Atlas: 0.0861
<egg|zzz|egg>
Householder 10: 0.0147
<egg|zzz|egg>
Newton on the inverse + Householder 6: 0.0136
<egg|zzz|egg>
Kahan 0.0011 (!!)
<egg|zzz|egg>
Microsoft: 0.2803 (lol)
<bofh>
wait what the shit, how did clobbering bits we don't care about *improve* the result? the multiplication of 17 bits^3 vs 53 bits^3 where 24 are garbage should have the same amount of useful bits: the garbage ones get shifted out in the multiply anyway.
<bofh>
let alone that good of a result, damn.
<egg|zzz|egg>
for max ULPs on my 10 000 test values:
<egg|zzz|egg>
Atlas: 0.71814
<egg|zzz|egg>
Householder 10: 0.57580
<egg|zzz|egg>
Newton then Householder 6: 0.58524
<egg|zzz|egg>
Kahan: 0.55148
<egg|zzz|egg>
Microsoft: 1.2779 (lol)
<egg|zzz|egg>
bofh: well because clobbering the bits means you have no error in x³-y instead of a massive cancellation
<egg|zzz|egg>
bofh: so your correction term from x is much better
<egg|zzz|egg>
otherwise your correction term is mostly rounding errors from the computation of x³
<bofh>
ohh, right, rounding is a thing.
<bofh>
duh.
<egg|zzz|egg>
:D
<bofh>
also I'm morbidly curious what MS does now, it's slower and not even accurate to <1ULP.
<egg|zzz|egg>
bofh: I'm really surprised that atlas's method is that badly rounded; I'm actually using the C implementation from the ARM directory which uses his polynomial and table but is not by him, maybe it has additional errors?
<bofh>
Possibly, I'm not sure. I wonder if expressions got replaced with mathematically equivalent but ill-conditioned ones or something, hm.
<bofh>
(alternatively it's possible he didn't care at the time. he *did* say recently he needs to revisit it :P)
<egg|zzz|egg>
now with 100 000 random values: incorrectness frequencies for the same: {0.08777, 0.01624, 0.01384, 0.001, 0.27913}
<egg|zzz|egg>
and max errors: 0.72486, 0.60568, 0.58524, 0.56219, 1.3531
<bofh>
Not bad.
<egg|zzz|egg>
bofh: obviously the high householders from a shitty guess are worse than Kahan's 4th order iterate from a near-optimal guess
<egg|zzz|egg>
bofh: not sure how to get a 17-bit guess without a division though :-/
<bofh>
inverse cbrt then 1 Newton iterate then do x * z * z with clever masking?
<bofh>
wait, I thought Kahan and your Householders all used the exact same starting guess code?
<egg|zzz|egg>
bofh: Kahan does a Halley then a 4th order iterate
<egg|zzz|egg>
and rounds to 17 bits after the Halley
<egg|zzz|egg>
that gives you a 17 bits guess which is how you get almost always correct rounding
<bofh>
Ahh. Actually I'm curious how Halley, round to 17, Halley compares.
<egg|zzz|egg>
bofh: that's not going to be faithful
<egg|zzz|egg>
bofh: the bit clobbering trickery is only relevant when you're near the limits of the precision, otherwise you're far from caring about rounding
<egg|zzz|egg>
bofh: and Halley can't get you from 17 bits to all so you can't use it as your last step
<bofh>
Right, Halley will be at best non-faithful, at worst 1-3 bits off outright.
ferram4 has quit [Ping timeout: 198 seconds]
<egg|zzz|egg>
bofh: well >1 ULP off is what unfaithful means right?
<egg|zzz|egg>
bofh: I wonder whether you can do 2 rounds of newton on the inverse faster by Estrining it as a single horrible polynomial, then use that as your guess for a 4th order iteration
<bofh>
So I considered doing that before but never tried it, it seems worth it tho.
<kmath>
<DrPhiltill> #TFW you get a calendar invitation to interview an applicant in the middle of the night — 2:30 a.m. on Saturday mor… https://t.co/yntg412Euj
<kmath>
<Marco_Langbroek> (1/5) ⏎ Remember @TheHumanityStar, and how its builders @RocketLab claimed it would be visible for 9 months? While in… https://t.co/6Wt9fwy1Ia
<APlayer>
Huh, what? It reentered already?
<APlayer>
I was totally planning to photograph it sometime in early summer, and kind of put it on my long-term to-do list... And now it's gone, just like that, half a year earlier than I was told
<kmath>
<Marco_Langbroek> (1/5) ⏎ Remember @TheHumanityStar, and how its builders @RocketLab claimed it would be visible for 9 months? While in… https://t.co/6Wt9fwy1Ia
APlayer has joined #kspacademia
<iximeow>
from that thread
<iximeow>
>(* and to validate my orbital lifetime estimates for an upcoming launch)
<kmath>
<mcclure111> Why do we use floating point coordinates for game physics engines anyway. Why not fixed point
<SnoopJeDi>
LOL
<UmbralRaptor>
I think Doom used fixed point?
Snoozee has joined #kspacademia
Snoozee is now known as Majiir
<UmbralRaptor>
!wpn Majiir
* Qboid
gives Majiir a thorium symbol
<egg|zzz|egg>
bofh: I am amused by that bit-clobbering trick to get an exact cube
<egg|zzz|egg>
bofh: also I reinvented it this morning and then I found it in Kahan's thing, maybe I should read Kahan's thing more carefully :-p
* UmbralRaptor
pokes Gaussian units in the random 4π.
<bofh>
rofl. I mean same, since I completely overlooked it myself.
Technicalfool has quit [Ping timeout: 182 seconds]
tawny has joined #kspacademia
<egg|zzz|egg>
!wpn rqou
* Qboid
gives rqou a caffeinated metric with a radiator attachment
* rqou
meows
UmbralRaptor is now known as NomalRaptor
<APlayer>
Random (literally) question: You have a probability experiment with n different, equally likely outcomes, and you do it n times. I noticed that the probability of one specific outcome occurring at least once is very close to 2/3. Is there some sort of theorem or something for that?
<APlayer>
Or is it a false assumption that the probability is next to 2/3?
<egg|zzz|egg>
!wpn NomalRaptor
* Qboid
gives NomalRaptor a toasted panzer
<NomalRaptor>
So, E&M is this week, not last.
* egg|zzz|egg
gives NomalRaptor a gold cabinet
<NomalRaptor>
???
<NomalRaptor>
And I'm certain that one of the homework problems involved using a non-constructive proof to generate an equation for the potential around an object.
<SnoopJeDi>
APlayer, the probability that you *don't* observe one of the outcomes is ((n-1)/n)**n, which in the lim n->∞ is 1/e
<SnoopJeDi>
1 - 1/e is close to 2/3
<SnoopJeDi>
"close"
<NomalRaptor>
^
<NomalRaptor>
e ≈ 3 ≈ π
<SnoopJeDi>
n.b. ((n-1)/n)**n) = (1 - 1/n)**n
<APlayer>
π ≈ 4?
<egg|zzz|egg>
rounding to nearest to 1 bit, yes
<NomalRaptor>
π ≡ 3
* NomalRaptor
ducks.
<egg|zzz|egg>
wasn't it bofh who had tales of π = 1?
<APlayer>
Wait, is lim n -> inf ((n-1)/n)^n not the definition of e?
<SnoopJeDi>
(1 + 1/n)**n, not -
<SnoopJeDi>
hence 1/e
<APlayer>
Ah, yes
<APlayer>
Alright, so what I was seeing was actually approaching 1/e and not 1/3, haha
<APlayer>
Well, this is interesting
<SnoopJeDi>
n.b. this is only true if these are independent events, but I assumed that's what you meant
<APlayer>
Because I used to (for no reason at all, don't ask me why) intuitively assume the probability in such cases to be 1/2
<APlayer>
And used that to estimate things in life
<APlayer>
Will correct to 2/3 for estimation purposes and 1 - 1/e for calculation purposes
* NomalRaptor
throws Mario head first at that page,
<SnoopJeDi>
"Mac: um we're indistinguishable from PC now but we don't have chromatic abberation?"
<SnoopJeDi>
tangentially: PlayerUnknown's Battlegrounds has some of the most satisfying rifle scopes I've used in a videogame now that they model parallax and spherical/chromatic aberration (although I don't play milsim so...?)
<egg|zzz|egg>
bofh: I have something which is correctly rounded on 100 000 random values
<egg|zzz|egg>
bofh: it is slightly slower than Kahan's method though
<egg|zzz|egg>
bofh: okay wat double-newton on the inverse is actually slower than Halley despite its fdiv? how the hell did I write my double-newton to get that
<bofh>
What on *earth*? I want to see how that works on your Sandy Bridge desktop too fwiw
<egg|zzz|egg>
bofh: still slower than non-Estrin Householder, but now faster than Kahan
<egg|zzz|egg>
and better rounded
<egg|zzz|egg>
I beat the wolf on all metrics, but beating the cat similarly might be harder
awang has quit [Ping timeout: 182 seconds]
awang has joined #kspacademia
<egg|zzz|egg>
bofh: okay I arguably beat the cat with just 10th order householder, but that's not very well rounded
<egg|zzz|egg>
bofh: I vaguely wonder whether we can compute the cube of the guess in integer arithmetic
<bofh>
it's only ~6 bits precision, no? so your mantissa product becomes just a regular multiply plus some shifts. I *highly* doubt it'll be faster than the fp cube tho.
<egg|zzz|egg>
bofh: for the 6-bit one you're nicely using your throughput with (r * y) * (r * r) so that's fine
<egg|zzz|egg>
bofh: more annoying is x * x * x on the 17-bit one
<egg|zzz|egg>
it's exact, but you can't do anything until you have this cube
<egg|zzz|egg>
wait nevermind there are plenty of constant * y you can do
<egg|zzz|egg>
bofh: anyway I should try Householder 5th instead of 6th to see if it's still correctly rounded
<egg|zzz|egg>
bofh: amusingly, Householder 6th with Estrin is faster than Kahan's Hornery 4th order method?!
<egg|zzz|egg>
Horner in main()
<egg|zzz|egg>
!wpn UmbralRaptor
* Qboid
gives UmbralRaptor a FITS radiography
<egg|zzz|egg>
that makes a surprising amount of sense
APlayer has quit [Ping timeout: 383 seconds]
<bofh>
egg|zzz|egg: I mean I'm not *too* surprised, Estrin is fast and I've come to the conclusion people need to use it much more.
<egg|zzz|egg>
bofh: high order is good, estrin is good, forget everything you learned about hornering all the things and cubic splines and leapfrogging newtons
<kmath>
<FakeUnicode> ᴀ ⏎ 𝝖ͣᵃᴀ A ⏎ ͣᴬAᴀ𝝖ͣ ⏎ ͣ Aᴬᴀ A ⏎ ͣ ᴬAᴀͣᴬ𝝖ᴀ ⏎ 𝝖ͣᴀᵃAₐ ᴬ ⏎ A ᴬᴀ·Aͣᴀ ᴬA ⏎ ᴬ Aₐᴀᵃ𝝖ᴀ𝝖 ͣ ⏎ A 𝝖 ᴀͣᴬAₐᴀ ⏎ A 𝝖 ᴬͣᴀ A ᴀ… https://t.co/Cmcz1Zosye
<bofh>
egg|zzz|egg: so cubic splines always sucked for transcendental functions :P my issue with Estrin is it generally only started becoming useful for higher orders, which used to be not that good perf-wise. But now that they *are*...
<egg|zzz|egg>
ΑАA
<egg|zzz|egg>
bofh: well it can be useful for low orders tbh
<egg|zzz|egg>
it's just that for low orders it's not called Estrin, it's called "as written in the monomial basis") :-p
<egg|zzz|egg>
OK I can get a bit more speed if I go down to 4th order Householder (5th performs as 6th), but then I'm not correctly-rounded anymore on 100 000 random inputs, I have 0,021 % incorrect roundings and 0,502 ULPs max on those test values (probably bigger max in general, also I don't have 2 sig dec on that percentage)
<egg|zzz|egg>
bofh: not worth it for a perf improvement from 37ish ns to 36ish ns
<egg|zzz|egg>
(Atlas's method and Householder 10 are 25ish)
<egg|zzz|egg>
bofh: also Atlas's method is way worse than Householder 10 rounding-wise so basically it's all householders? until the cat invents something smarter
<egg|zzz|egg>
bofh: I should ask the cat whether that 8,8 % makes sense or is the implementer screwing up his fancy polynomials though
<bofh>
Yeah, I'm curious myself. Totally unsurprised Householder works well, you can imagine my confusion a few days ago when it seemed otherwise :P
<egg|zzz|egg>
bofh: it's all about that 36-bit clobbering :-p
<bofh>
egg|zzz|egg: no idea who that is, but let me check over both atlas's code and the impl myself since I do know arm asm myself.
<egg|zzz|egg>
bofh: the arm implementation is C, not arm asm
<egg|zzz|egg>
that's why I use this one
<egg|zzz|egg>
because feeding the intel asm for the whole function to MSVC is probably not going to be a pleasant experience
UmbralRaptor has joined #kspacademia
<bofh>
egg|zzz|egg: so you linked me an impl in assembly, and I didn't scroll down far enough to see what ISA it was.
<UmbralRaptor>
Jackson: Use the boundary conditions to find a unique solution. Also Jackson, let me give an example with a power series that's unlike the Taylor series you would actually find. https://photos.app.goo.gl/wOjc2OaefkSvqfYy1
<bofh>
oh, it's x86_64 asm? rofl.
UmbralRaptor has quit [Client Quit]
<egg|zzz|egg>
bofh: cbrtf.s is atlas's, for Intel
<bofh>
that's easy why didn't I peer at it sooner
UmbralRaptor has joined #kspacademia
<egg|zzz|egg>
cbrt.c is a reimplementation in C for ARM
<SilverFox>
what is the best method of while(true) without maxing the CPU?
<bofh>
\frac{r_\lt}{r_\gt} <- the fuck is this?
<egg|zzz|egg>
r</r> ? this is sounding like HTML
<bofh>
is that r_</r_>???
<bofh>
UmbralRaptor: I'm not sure this makes any sense, how do r_< & r_> relate to x & x'?
<UmbralRaptor>
It's that sum about 1/3 of the way down page 103.
<bofh>
(like this is half my problem with Jackson, he *loves* doing complicated coordinate transformations without any intermediate steps and you're like WTF)
<egg|zzz|egg>
bofh: I hear Euler otoh goes into painstaking detail and copies giant expressions where he changes one thing :-p
<bofh>
OH FUCK NEVERMIND I HATE THIS NOTATION SO MUCH
<bofh>
r_< is min(|x|,|x'|) and r_> is max(|x|,|x'|)
<kmath>
<eggleroy> @stephentyrone I seem to have found a method faster than Kahan’s & which correctly rounds on those 100 000 inputs:… https://t.co/ugvPFJLIQr
<egg|zzz|egg>
(which this tweet is too short to contain?)
<Majiir>
!wpn egg|zzz|egg
* Qboid
gives egg|zzz|egg a Chomskyan COME FROM
<Majiir>
!wpn hatbot
* Qboid
gives hatbot a bipartite pharmacy
<egg|zzz|egg>
!wpn Majiir
* Qboid
gives Majiir an explosive Poké Ball/sole hybrid
<Majiir>
RIP hatbot
<egg|zzz|egg>
it's been a while since we've seen hatbot
<egg|zzz|egg>
;seen hatbot
<kmath>
egg|zzz|egg: hatbot (hatbot!~hatbot@ec2-54-167-168-157.compute-1.amazonaws.com) was last seen joining in #bottorture at 2017-10-14 01:05:05 +0000
<bofh>
hatbot?
<bofh>
egg|zzz|egg: nice!
* UmbralRaptor
stares at the lack of > and < in LaTeX.
<kmath>
<bofh453> OH: "I contend that the SI unit of suffering is the milliJackson."
* UmbralRaptor
was expecting some special command was needed.
<UmbralRaptor>
Pretty much.
<UmbralRaptor>
AN EXAMPLE SHOULD NOT LEAD THE STUDENT TO ABANDON THE METHOD THAT YOU'RE SHOWING BECAUSE YOU SEEMINGLY CONTRADICTED THE ENTIRE POINT OF BOUNDARY CONDITIONS AND UNIQUENESS!
<UmbralRaptor>
Relatedly, I probably failed that test.
UmbralRaptop has joined #kspacademia
<bofh>
UmbralRaptor: like now that I'm done WTFing at the horrible abuse of notation I'm EXTRA-WTFING at his choice of example
<egg|zzz|egg>
bofh: would the notation be improved by CJK
<egg|zzz|egg>
bofh: also the cat has liked my tweet, is this peer review,