#kspacademia on 2018-04-02 — irc logs at esper.irclog.whitequark.org

00:15 <bofh> https://twitter.com/vesselofstar/status/980217558178586624 thank

00:15 <kmath> <vesselofstar> happy trans day of visibility y'all https://t.co/5M6jCzceZG

00:32 <egg> !wpn bofh

00:32 * Qboid gives bofh a killing guillotine

00:33 <egg> um

00:33 <whitequark> that's a weapon alright

00:36 * bofh stuffs some US senators into it

00:38 <egg> bofh: so I wonder whether there are higher-order methods that have an effect on the relative error whose derivative does not vanish except at 0

00:39 <bofh> I don't think so, why would that be the case?

00:39 <egg> bofh: well it's the case of Newton and Halley

00:40 <egg> so maybe one can craft other methods like that?

00:40 <bofh> Hmm. Point. Hm. Not sure. Still poring over the result from approximate_rootn.pdf in the hopes of trying to figure out why the magic constants are linear to a huge degree

00:41 <egg> if it's the case it means that the singular points of the relative error after iteration are at the same places as the extrema of the relative error before iterating, which makes it easy to do a γ optimization

00:42 e_14159 has quit [Ping timeout: 190 seconds]

00:43 Profound has joined #kspacademia

00:45 e_14159 has joined #kspacademia

00:52 Snoozee is now known as Majiir

01:10 <egg|cell|egg> !Wpn bofh

01:10 * Qboid gives bofh a pie squirrel

01:10 * egg|cell|egg zzz

01:28 <UmbralRaptor> !choose paperwork|research|study|other

01:28 <Qboid> UmbralRaptor: Your options are: paperwork, research, study, other. My choice: research

01:28 * UmbralRaptor chooses paperwork.

01:29 <UmbralRaptor> With physical paper, because it's somehow still the XXth century.

01:30 <Majiir> !wpn egg|cell|egg

01:30 * Qboid gives egg|cell|egg a stern resistor

01:30 <Majiir> Not a weapon, but a neat item. "You may *not* pass through here with such high voltage. Simmer down, young electron."

01:33 <UmbralRaptor> whitequark: so, I had to close a checking account. This entailed typing up a "formal statement" (no, they didn't have templates), printing it out, signing, and mailing the statement to their office. They then mailed me a check with the remaining funds.

01:37 Majiir is now known as Snoozee

01:40 UmbralRaptor has quit [Remote host closed the connection]

01:43 UmbralRaptop has joined #kspacademia

02:56 <UmbralRaptop> ;c 3500/10672

02:56 <kmath> UmbralRaptop: 0.327961019490255

02:56 <UmbralRaptop> ;c 3500/18e3

02:56 <kmath> UmbralRaptop: 0.194444444444444

02:56 <UmbralRaptop> ;c 4500/18e3

02:56 <kmath> UmbralRaptop: 0.25

02:58 <whitequark> UmbralRaptop: lol

02:58 <whitequark> I bet convincing my bank that my name and gender have changed will be "fun"

02:58 <whitequark> especially doing that cross-country

02:59 <UmbralRaptop> Yeah, that'll be Fun.

03:02 <UmbralRaptop> Bonus Fun if your name breaks assumptions about how many a person has (mine do). Also, orthography and/or unicode shenanigans?

03:18 <whitequark> oh lol no

03:18 <whitequark> I'm selecting my new name algorithmically specifically to avoid that

03:32 * UmbralRaptop imagines the new name looking like a password for some reason.

06:20 <bofh> https://twitter.com/jccwrt/status/980676752346439680 holy shit

06:20 <kmath> <jccwrt> I might have found the #Tiangong1 reentry in Himawari-8 images...

07:19 StCypher has quit [Ping timeout: 182 seconds]

08:28 <egg|cell|egg> !Wpn bofh

08:28 * Qboid gives bofh a Peregrine apple-space-like metric

09:15 <egg> !u フラクタル

09:15 <Qboid> U+30D5 KATAKANA LETTER HU (フ)

09:15 <Qboid> U+30E9 KATAKANA LETTER RA (ラ)

09:15 <Qboid> U+30AF KATAKANA LETTER KU (ク)

09:15 <Qboid> U+30BF KATAKANA LETTER TA (タ)

09:15 <Qboid> U+30EB KATAKANA LETTER RU (ル)

09:15 <egg> !wpn whitequark

09:15 * Qboid gives whitequark a deuterium action with a snail attachment

10:15 Guest has joined #kspacademia

10:18 Profound has quit [Ping timeout: 383 seconds]

10:44 <egg> yay more котяpics https://twitter.com/whitequark/status/980575216047833089

10:44 <kmath> <whitequark> https://t.co/jSmmXQdRfy

10:48 tawny- has quit [Ping timeout: 383 seconds]

11:41 <egg> bofh: so, ran some benchmarks, predictably the cat's cbrt (or rather the arm one which is written in C but uses the cat's polynomial) is much faster than Kahan's default approach which assumes division isn't much costlier

11:42 <egg> bofh: Kahan does mention computing rootn(y, -3), but doesn't do the error analysis for it and only suggests a 2nd order iteration

11:44 <egg> bofh: but I would guess the z * z * y would screw with any attempt at getting the last few ULPs?

11:50 <egg> yeah Halley involves a division so that's a mess

12:00 <egg> bofh: and since you need four rounds of Newton in binary64 it ends up being slower than the method with divisions, let alone the cat's polynomial
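
A minimal sketch of the division-free path discussed above, assuming an untuned integer starting guess (not Kahan's constant and not the tuned γ from the PDF). It runs Newton on f(z) = z⁻³ − y to approximate y^(-1/3), then forms cbrt(y) as z * z * y. The function names and the constant 1364 (4/3 of the binary64 exponent bias) are illustrative choices, and the default of five rounds reflects the cruder guess; the log mentions four with a better one.

```python
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret a binary64 value as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_float(q: int) -> float:
    """Reinterpret an unsigned 64-bit integer as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", q))[0]


def inverse_cbrt_guess(y: float) -> float:
    # Assumption: untuned magic constant, 4/3 * 1023 = 1364 in the exponent
    # field. Valid only for positive, finite, normal y.
    return bits_to_float((1364 << 52) - float_to_bits(y) // 3)


def cbrt_without_division(y: float, rounds: int = 5) -> float:
    z = inverse_cbrt_guess(y)
    third = 1.0 / 3.0  # a constant, so no division happens inside the loop
    for _ in range(rounds):
        # Newton on f(z) = z**-3 - y:  z <- z + z * (1 - y * z**3) / 3
        z += z * third * (1.0 - y * z * z * z)
    # The final multiplications are where the last few ULPs can be lost.
    return z * z * y
```

Every operation in the loop depends on the previous one, which is consistent with the latency cost egg reports for this variant.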

12:11 <egg> ---------------------------------------------------------

12:11 <egg> Benchmark              Time       CPU  Iterations

12:11 <egg> ---------------------------------------------------------

12:11 <egg> BM_AtlasCbrt       24808 ns  24102 ns       27876  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

12:11 <egg> BM_KahanCbrt       35135 ns  32959 ns       21333  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

12:11 <egg> BM_KahanNoDivCbrt  41263 ns  40806 ns       17231  +1.00000000000000000e+00; ∛2 = +1.25992104989487297e+00

12:11 <egg> bofh: ^

12:38 <egg> bofh: phl tells me that back when he was writing Sqrt for Apex's Ada.Numerics the tradeoff was the same so he went with remezed polynomials (apparently he managed to find magic endpoints for his polynomials so that the error was 0.501 ULPs or so)

12:39 <egg> bofh: those tradeoffs move back and forth as the cat says https://twitter.com/stephentyrone/status/891326241663594496

12:39 <kmath> <stephentyrone> @kittenpies3 mostly went out of fashion when HW multiplication became fast. Expect them to be more widely used agai… https://t.co/aFtKyzvDAz

12:39 awang has joined #kspacademia

12:39 <egg> awang!

12:39 <egg> awang: RSS got released

12:40 <egg> awang: also principia is being fun https://github.com/pleroy/Principia/blob/79e09d276b35d543c4f9f06b762301565b7a3f6e/install_deps.sh#L82-L88

12:43 APlayer has joined #kspacademia

12:44 <awang> \o

12:44 <awang> Sorry, it's been one heck of a weekend

12:45 <awang> Just worked my way through GH notifications

12:45 <awang> Going through scrollback now

12:45 <awang> Yay, a *real* optional implementation :P

12:45 <awang> Was MSVC unable to compile it earlier?

12:51 <APlayer> Hey there!

12:54 <awang> Hello!

13:00 <egg> awang: since VS2017 we have <optional> in msvc

13:00 <egg> awang: same on ubuntu bionic

13:00 <egg> awang: the curl is for mac, where there's only <experimental/optional> which is old and shitty

13:00 <egg> so we just fetch a fresh one :-p

13:08 <awang> egg: Ah, makes sense

13:08 <awang> I remember having fun with that when unwittingly switching between real Clang and Apple Clang

13:10 <awang> egg: Also, I agree on the lunar release cycle

13:10 <awang> Release EVERYTHING at once

13:10 <egg> !csharp bofhtime()

13:10 <Qboid> 2018-04-02T08:10:57,105

13:10 <egg> yay

13:11 <awang> btw, do you still need a 1.4.2 compile for Principia?

13:11 <egg> awang: oh yeah we need to upgrade that at some point yes

13:11 <egg> awang: probably still need 1.4.1 just in case (but maybe with a deprecation message), so that means yet another config

13:12 <egg> *sobbing* four release configurations

13:12 <egg> though we're working to get rid of 1.2.2 so there's that

13:12 <egg> bofh: is bofhtime() accurate

13:13 <awang> Oh boy

13:14 <awang> Maybe release 1.4.2 and get rid of 1.2.2 in the same release?

13:14 <egg> nah we want to at least have one release where we still release but with a deprecation message

13:14 <awang> Ah

13:14 <egg> (so that it says upgrade KSP rather than upgrade principia)

13:15 <awang> Fair enough

13:15 * egg nebula https://www.spacetelescope.org/images/opo9603a/

13:28 <egg> awang: not sure whether it would end up being full moons or new moons for RealWhatever

13:28 <egg> if it's full moons it would nicely leapfrog principia

13:36 <egg> https://twitter.com/sigfig/status/980555405393883136

13:36 <kmath> <sigfig> u ever spend a few hours calculating a quantity you know to be zero by definition just to make sure the world hasnt collapsed beneath u

13:40 <awang> egg: I vote we split RealWhatever into three parts, and release on the full moon and quarter moons too :P

13:42 tawny- has joined #kspacademia

13:48 <UmbralRaptop> full/new, and apogee/perigee

13:48 <egg> UmbralRaptop: aaaaaa

13:48 <egg> UmbralRaptop: what about the nodes,

13:49 <egg> !wpn UmbralRaptop

13:49 * Qboid gives UmbralRaptop a split sturgeon

13:49 <egg> !wpn awang

13:49 * Qboid gives awang a furious-sounding csh-compatible abstraction

13:49 <egg> !wpn ferram4

13:49 * Qboid gives ferram4 a mu tesseract

13:49 <egg> !wpn whitequark

13:49 * Qboid gives whitequark a metabotropic squirrel/king hybrid

13:49 <awang> !u ï

13:49 <Qboid> U+00EF LATIN SMALL LETTER I WITH DIAERESIS (ï)

13:49 <egg> all hail the metabotropic squirrel king

13:49 <UmbralRaptop> egg: add those once the project is split into more parts?

13:50 <egg> awang: I'm really impressed at how promptly the MSVC devs react to our bug reports

13:50 <awang> UmbralRaptop: Does that mean major releases fall on supermoons?

13:50 <egg> nah principia has no concept of major or minor releases

13:51 <awang> egg: But the other ones do

13:51 <egg> awang: hm

13:51 <egg> clearly they will lose that concept and only have lunatic releases

13:51 <UmbralRaptop> Drop version support on blood moons?

13:51 <egg> !csharp bofhtime()

13:51 <Qboid> 2018-04-02T08:51:59,607

13:52 <awang> egg: I mean, compiler weirdness is pretty important

13:52 <awang> And the reports are probably better than most of the stuff they get

13:52 <egg> awang: now it's very usable

13:52 <awang> UmbralRaptop: What should happen for blue moons?

13:52 <egg> we still have moar bug reports coming up though

13:52 <egg> we have an issue with template variables

13:52 <awang> egg: If I ever try making a compiler, I'm using Principia in the test suite :P

13:53 <egg> translate principia to whatever language and use it as the test suite*

13:53 <egg> *: language must have a Turing-complete type system

13:53 <awang> s/translate/cross-compile

13:53 <Qboid> awang meant to say: I like how Chrome asked to cross-compile that fortran file

13:53 <awang> Um

13:53 <awang> ...That was definitely not what I expected

13:54 <UmbralRaptop> egg: so, PostScript?

13:54 <egg> is that bofh's italian FORTRAN

13:54 <awang> egg: Principia in Idris

13:54 <egg> aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

13:54 <awang> Yeah, I think it was the Italian Fortran

13:56 Guest has quit [Quit: ( www.nnscript.com :: NoNameScript 4.22 :: www.esnation.com )]

14:18 tawny- has quit [Ping timeout: 383 seconds]

14:42 Norgg has quit [Quit: Changing server]

15:03 <egg> !wpn awang

15:03 * Qboid gives awang a Lyman excess

15:03 <egg> !wpn bofh

15:03 * Qboid gives bofh a Maxwell-Boltzmann Laplacian

15:04 <egg> UmbralRaptop: котя https://twitter.com/whitequark/status/980575216047833089

15:04 <kmath> <whitequark> https://t.co/jSmmXQdRfy

15:05 <egg> also smol cat

15:05 <egg> whitequark: do the kittens have names? or indices?

15:05 <UmbralRaptop> cat tensor?

15:11 <egg> whitequark: there seems to be something in котя's hair, did she get wounded or something?

15:11 <UmbralRaptop> GPI?

15:12 <UmbralRaptop> !acr -add:GPI Gemini Planet Imager (pronounced gee-π)

15:12 <Qboid> UmbralRaptop: I added the explanation for this acronym.

15:13 <egg> UmbralRaptop: ... how do you pronounce gee-π

15:13 <UmbralRaptop> uh

15:19 <APlayer> !see Ellied

15:19 <APlayer> !seen Ellied

15:19 <Qboid> APlayer: I haven't seen the user Ellied yet.

15:19 <Ellied> ahoy

15:20 <APlayer> Bad bot

15:20 <APlayer> Hi there, Ellied

15:20 <Ellied> hi

15:20 <APlayer> Do you have a few minutes to spare?

15:20 <Ellied> a few, yeah

15:21 <APlayer> If you are busy, I can ask later, because I have no idea how long it will take

15:21 <APlayer> Basically, I have trouble interfacing a sensor with an Arduino through I2C and no idea how to fix it

15:22 <SnoopJeDi> https://github.com/curl/curl/pull/2444

15:22 <Qboid> [#2444] title: curl: add support for a "--rootme" command line parameter | Passing this parameter will download the specified URLs and execute them via `sudo(8)` using `sh(1)`, saving countless keystrokes when installing modern software.... | https://github.com/curl/curl/issues/2444

15:23 <APlayer> It's not the Hardware, because I am certain it works, including the connections. But the sensor is not responding at all to I2C and I am not sure how to proceed looking for the error

15:25 UmbralRaptop has quit [Quit: Bye]

15:25 <APlayer> Also, I have an Oscilloscope on hand, if I can use it in any way

15:25 UmbralRaptop has joined #kspacademia

15:41 <UmbralRaptop> This is an argument against 2FA, right? https://photos.app.goo.gl/oohH8XszcNMPwuUR2

15:42 <Ellied> hrm, I haven't actually used I2C before but I have a basic understanding of how it works

15:44 <Ellied> APlayer: first thing to try would be to make sure that the clock line (SCL) is actually producing a clock and the data line (SDA) is actually showing data

15:45 <APlayer> Here is what I asked initially, because it has more information: "So, I have an Arduino Pro Mini and a GY-91 sensor (https://i.ebayimg.com/images/g/UJ4AAOSwjkVZg4DO/s-l1600.jpg). They should communicate via I2C, but they don't. In tracking down the problem, I found out that Arduino's Wire library hangs the script on calling Wire.endTransmission().

15:45 <APlayer> Inspecting the hardware, it seems to be fine, both components have lit LEDs (I guess they are powered properly and not fried, then), SCL is connected to Arduino pin A5, SDA is connected to Arduino pin A4 and both lines have a 4.7 kOhm pull-up to 5V. And that's basically where I am stuck, because I have no idea what to do next - how can I troubleshoot this? Or do you already have an idea of what might be going

15:45 <APlayer> on?"

15:46 <Ellied> what is this sensor measuring?

15:46 <APlayer> Gyroscope, Accelerometer, Magnetometer

15:47 <APlayer> But it's not only sending no data, it's not responding to any calls at its address at all

15:47 <APlayer> Let me see with the oscilloscope (let me also see if I can use the scope at all, heh)

15:58 <APlayer> Well, dammit

15:58 <APlayer> I am stupid

15:59 <APlayer> I found an I2C scan script which would scan for all possible I2C addresses and see if they respond

16:00 <APlayer> I forgot that the sensor is powered from a pin which I have to set high first and of course it did not respond. Setting the pin to high made it find them (the same board also has a barometer on it)

16:00 <APlayer> But I can't get the scope to show anything useful

16:04 <Ellied> ah yeah sensors need power

16:04 <Ellied> generally speaking anyway

16:05 <Ellied> what kind of scope are we talking about here? brand name/model number?

16:07 <APlayer> It's something tiny handheld, probably no name. https://electrodrome.net/wp-content/uploads/elektronik/messen-steuern-regeln/oszilloskope/jye-tech-dso-150-696x866.jpg

16:07 UmbralRaptop has quit [Remote host closed the connection]

16:07 <Ellied> ah, very minimal

16:07 <APlayer> I can see some sort of signal when the V/div is set to 0.1V, weirdly

16:07 <Ellied> probably noise

16:08 <Ellied> brb, gotta go between buildings

16:08 <APlayer> Not even

16:08 <APlayer> The noise is smaller

16:08 <APlayer> It's a clear, occasional drop to 0 V

16:10 UmbralRaptop has joined #kspacademia

16:11 <APlayer> Voltmeter says the line is at 4V rather than 5 (as opposed to 3.5 or so V shown by the scope, if I read that correctly)

16:16 <APlayer> https://i.imgur.com/Qe6qGxF.jpg ignore the current Sec/div, the signal arrived at a setting of 2 sec/div and did not scale when I changed that

16:56 <APlayer> Well, forget the scope, I can't get it to work

16:56 <APlayer> The sensor does seem to report things, though

16:56 <APlayer> Not that those things are terribly correct, but it's something

16:57 <APlayer> (I removed the communication checks from the sketch)

16:59 <APlayer> The arduino tried to read some sort of "WHO_AM_I" register and would use its value to check if the communication was good. The value was incorrect, but the communication seems to work

17:17 <rqou> Ellied are you diodelass on Twitter?

17:24 <APlayer> Yes

17:48 <bofh> 11:44:19 <@egg> bofh: but I would guess the z * z * y would screw with any attempt at getting the last few ULPs?

17:48 <bofh> you'd need to do the last iteration I think in double-double (or at least part of the expression)

17:49 <bofh> 12:39:16 <@egg> bofh: those tradeoffs move back and forth as the cat says https://twitter.com/stephentyrone/status/891326241663594496

17:49 <kmath> <stephentyrone> @kittenpies3 mostly went out of fashion when HW multiplication became fast. Expect them to be more widely used agai… https://t.co/aFtKyzvDAz

17:51 <bofh> I doubt it ever will, there's a reason I avoid using Padé Approximants unless I'm working in the neighbourhood of a pole or a critical point (e.g. the Padé Approximant of Gamma(1+x), |x| < ~0.5 is much easier to get accurate than the minimax polynomial)

17:51 <bofh> 11:41:21 <@egg> bofh: so, ran some benchmarks, predictably the cat's cbrt (or rather the arm one which is written in C but uses the cat's polynomial)

17:51 <bofh> I am curious if there was a way to combine these two approaches, so you didn't need the gigantic table of mantissa values.

17:52 <egg|cell|egg> Meow

17:53 <bofh> (Also for rootn(x, -3), I really want to see how https://en.wikipedia.org/wiki/Muller%27s_method will compare against Newton, since I think the division there goes away as well)

17:53 <egg|cell|egg> What are your 3 points

17:53 <bofh> (The division in Halley's won't go away since in Halley you're finding roots of a linear/linear Padé Approximation to the function)

17:54 <bofh> That's the annoying bit, if it used *two* points it'd be perfect since you'd use initial and one round of Newton.

17:55 <bofh> But for three it'd be too slow in the binary64 case over 4x Newton to matter.

17:55 <bofh> Hm.

17:56 <egg|cell|egg> Clearly this just means we're back to the age of remezing all the things

17:57 <egg|cell|egg> And tweaking nodes to find ones that give you moar bits

17:58 <egg|cell|egg> !Wpn -add:adj Remez

17:58 <Qboid> egg|cell|egg: Adjective added!

17:58 <egg|cell|egg> Bofh: no news from the cat?

17:59 <bofh> I'm not *yet* convinced minimax polynomials alone are optimal, but I still need to look at some things. First I really want to actually find out what the periodicity in the Approximate rootn constant order graph is.

18:00 <bofh> egg|cell|egg: nope, but given https://twitter.com/stephentyrone/status/980611396420493314 it makes sense

18:00 <kmath> <stephentyrone> Current status: cooking Riley’s lunches for the week before leaving on a business trip tomorrow. ⏎ ⏎ Respect for singl… https://t.co/lCrYcJUsif

18:02 <egg|cell|egg> Hmmm but that's the human cooking, should leave free time to the cat

18:05 tawny- has joined #kspacademia

18:09 <UmbralRaptop> Presumably one measures food in gULPs?

18:11 <UmbralRaptop> meep https://vulpine.club/@rey/99791031982647668

18:37 <egg> bofh: have you reproduced that graph?

18:43 <egg> bofh: also have you done this already https://twitter.com/Zaikarion/status/973690462036004866

18:43 <kmath> <Zaikarion> @bofh453 Why do I have the sudden urge to write code with comments in Akkadian?

18:57 <bofh> egg: my Akkadian isn't good enough for that yet, sadly. I'll have to poke Zaikarion :P

19:09 <egg> bofh: have you managed to reproduce the γ(n) graph or should I add more details to that pdf

19:10 StCypher has joined #kspacademia

19:29 <bofh> egg: haven't yet tried, attempting now

19:39 <egg> bofh: so fwiw, all of the implementations mentioned above are better than the microsoft std::cbrt (probably not a fair comparison, I'm running the Kahan stuff without branches for rescaling and denormals)

19:39 <egg> ---------------------------------------------------------

19:39 <egg> Benchmark              Time       CPU  Iterations

19:39 <egg> ---------------------------------------------------------

19:39 <egg> BM_KahanCbrt       34741 ns  35295 ns       20364  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

19:39 <egg> BM_AtlasCbrt       25239 ns  24850 ns       26408  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

19:39 <egg> BM_KahanNoDivCbrt  40705 ns  40806 ns       17231  +1.00000000000000000e+00; ∛2 = +1.25992104989487297e+00

19:39 <egg> BM_MicrosoftCbrt   65994 ns  64523 ns        8960  +1.00000000000000000e+00; ∛2 = +1.25992104989487297e+00

19:43 <bofh> ehh, those branches will be (hopefully) marked with unlikely branch hints, so should have negligible effect. I think. Still, am not surprised, MSVCRT's libm is far from fast (or in some cases, even accurate).

19:43 <egg> yeah true

19:43 <egg> bofh: also the cat is good as Fiora has told us many times

19:47 <bofh> Indeed, not surprised.

19:52 <awang> !u Θ

19:52 <Qboid> U+0398 GREEK CAPITAL LETTER THETA (Θ)

19:59 <egg> bofh: also, those are latency benchmarks (x = cbrt(x)), for 1000 cbrt + one FP add and one integer increment

20:04 <bofh> !u ᘛ⁐̤ᕐᐷ

20:04 <Qboid> U+161B CANADIAN SYLLABICS CARRIER JA (ᘛ)

20:04 <Qboid> U+2050 CLOSE UP (⁐)

20:04 <Qboid> U+0324 COMBINING DIAERESIS BELOW (◌̤)

20:04 <Qboid> U+1550 CANADIAN SYLLABICS R (ᕐ)

20:04 <Qboid> U+1437 CANADIAN SYLLABICS CARRIER HI (ᐷ)

20:06 <bofh> egg: interesting, I wonder how y = cbrt(x) with y not having a dependency chain on x (i.e. x is for instance rand() / (double)RAND_MAX) compares.

20:12 <egg> bofh: so fwiw the 10th order Householder method for cbrt maxes out the precision and the 9th order one comes close

20:12 <egg> (on the guess using the approximate rootn thing)

20:13 <bofh> How does the 9th order method compare with 4 Newton rounds in terms of perf?

20:13 <egg> well I'd have to do common subeggspression elimination on it

20:13 <egg> also I'm afraid to ask :-p

20:14 <bofh> why? :P

20:15 <bofh> (like the eggspression is so godawfully complicated iirc that I'd be *extremely* surprised if it's faster in any capacity)

20:15 <egg> yeah it's probably going to be silly slow

20:15 <egg> bofh: also it's not 4 rounds of newton, I'm doing it for the cbrt approach not the inverse cbrt

20:16 <bofh> oh, does cbrt require *less* rounds?

20:16 <egg> bofh: well it's more that if you're doing a division anyway there's no point going through the inverse cbrt path

20:18 <bofh> what? I was simply wondering how starting Kahan guess + enough order-2 Householder iterations to get 1ULP or better (which I think will be still 4) compares perf-wise to starting Kahan guess + 1 iteration of the order-9 or order-10 Householder method.

20:18 <bofh> :P

20:18 <bofh> ...WE JUST GOT ELEVEN CENTIMETERS OF SNOW WHY ARE WE ON TORNADO WATCH

20:19 <egg> bofh: if you go the inverse cbrt path you need to do your x * x * y at the end, losing ULPs; if you're paying the price of the expensive householder you might as well do it directly

20:20 <egg> also how do I horner/estrin a polynomial in 2 variables D:

20:20 <bofh> Yuck. Wait, what 2-var polynomial?

20:20 <egg> Ideally Estrin, since we're dealing with high degree

20:22 <egg> bofh: so you have a giant rational function

20:22 <egg> $\frac{9 x (x^{3} - y) (5 x^{15} + 98 x^{12} y + 323 x^{9} y^{2} + 256 x^{6} y^{3} + 46 x^{3} y^{4} + y^{5})}{55 x^{18} + 1452 x^{15} y + 6765 x^{12} y^{2} + 8350 x^{9} y^{3} + 2850 x^{6} y^{4} + 210 x^{3} y^{5} + y^{6}}$

20:23 <egg> bofh: if you're ever going to get speed out of that it's by benefiting from throughput so Horner is out

20:23 <bofh> For starters, z = x^3 and replace everywhere

20:23 <egg> (well I guess if you Horner the numerator and denominator they don't depend on each other so at least there's that)

20:23 <bofh> But otherwise I don't think this looks very nice for any sort of Horner-type scheme since it kinda looks like a binomial-ish expansion
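
A rough sketch of how egg's rational function could be evaluated, assuming it is the correction term that gets subtracted from the current iterate x (an assumption, though it is consistent with the expression vanishing when x³ = y and reducing to the Newton step near the root). It applies bofh's substitution z = x³ and a Horner-style recurrence over the homogeneous polynomials in (z, y). The helper names are hypothetical, and this is not the code from the linked cbrt.cpp.

```python
NUMERATOR = (5.0, 98.0, 323.0, 256.0, 46.0, 1.0)
DENOMINATOR = (55.0, 1452.0, 6765.0, 8350.0, 2850.0, 210.0, 1.0)


def homogeneous(coefficients, z, y):
    """Evaluate sum(c[k] * z**(n - k) * y**k) with a Horner recurrence in z."""
    result = 0.0
    y_power = 1.0
    for c in coefficients:
        result = result * z + c * y_power
        y_power *= y
    return result


def householder_step(x, y):
    """One high-order Householder step towards cbrt(y), starting from x."""
    z = x * x * x  # substitution z = x^3
    numerator = 9.0 * x * (z - y) * homogeneous(NUMERATOR, z, y)
    denominator = homogeneous(DENOMINATOR, z, y)
    # Assumption: the rational function is the amount subtracted from x.
    return x - numerator / denominator
```

The numerator and denominator recurrences do not depend on each other, which is the independence egg points out; only one division is needed per step.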

20:26 <egg> bofh: so fwiw integer arithmetic + one Halley iterate is faster than Atlas's method

20:27 <egg> (it's also something like 5 sig. dec. obviously)
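
For illustration, a sketch of "integer arithmetic + one Halley iterate", assuming an untuned magic constant (2/3 of the exponent bias, i.e. 682 in the exponent field) rather than whatever tuned constant egg benchmarked. With this cruder guess the result is good to roughly three or four significant digits rather than five. Function names are hypothetical.

```python
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret a binary64 value as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_float(q: int) -> float:
    """Reinterpret an unsigned 64-bit integer as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", q))[0]


def cbrt_guess(y: float) -> float:
    # Dividing the bit pattern by 3 roughly divides log2(y) by 3; adding
    # 682 << 52 restores the exponent bias. Positive normal y only.
    return bits_to_float(float_to_bits(y) // 3 + (682 << 52))


def cbrt_one_halley(y: float) -> float:
    x = cbrt_guess(y)
    x3 = x * x * x
    # Halley's iteration for x**3 - y = 0; note the single division.
    return x * (x3 + 2.0 * y) / (2.0 * x3 + y)
```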

20:28 <egg> bofh: but it means that you can afford one division; the question is how much around that

20:28 <bofh> egg: I think that might suffice for single prec, actually.

20:28 <bofh> Hm.

20:32 <egg> bofh: nah it doesn't

20:33 <egg> maybe the next householder, probably the one after that

20:47 <UmbralRaptop> bofh: apparently you're getting a really weird blizzard?

20:52 <egg> bofh: okay it's better than the Kahan method with two fdivs

20:52 <egg> modern machines are weird

20:52 <egg> (it's worse than Atlas of course)

20:52 <egg> -----------------------------------------------------------------

20:52 <egg> Benchmark                      Time       CPU  Iterations

20:52 <egg> -----------------------------------------------------------------

20:52 <egg> BM_AtlasCbrt               25876 ns  23996 ns       28000  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

20:52 <egg> BM_OneHalleyIterateCbrt    17139 ns  16741 ns       44800  +9.99995112999074731e-01; ∛2 = +1.25994621743568636e+00

20:52 <egg> BM_HouseholderOrder10Cbrt  31353 ns  31808 ns       23579  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

20:52 <egg> BM_KahanCbrt               36295 ns  34494 ns       19478  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

20:52 <egg> BM_KahanNoDivCbrt          41837 ns  39550 ns       16593  +1.00000000000000000e+00; ∛2 = +1.25992104989487297e+00

20:52 <egg> BM_MicrosoftCbrt           65839 ns  64174 ns       11200  +1.00000000000000000e+00; ∛2 = +1.25992104989487297e+00

20:53 <Ellied> inb4 floodout?

20:53 <egg> bofh: wait I haz a miscalculation

20:53 <egg> wait no I don't

20:53 <egg> it's last-bit correct on cbrt(2)

20:53 <egg> Ellied: apparently that's not too much flooding :-p

20:54 <egg> Ellied: also how do you like my ascii art equations

20:54 <Ellied> they are difficult to follow but I'm not sure how much better they would be on paper/chalkboard

20:54 <bofh> Huh, the order 10 Householder is that fast? what the everliving fuck.

20:55 <egg> bofh: https://github.com/eggrobin/Principia/blob/cbrt-benchmarks/benchmarks/cbrt.cpp#L50-L75

20:55 <egg> bofh: this implementation thereof at least

20:55 <egg> bofh: there is *a lot* to be gained from throughput, high degree polynomials

20:55 <bofh> sec. (will take a look at it in an hour, tutorial)

20:55 <egg> bofh: it can probably be Estrined to make it faster

20:56 <bofh> I'm perplexed, high-degree polynomials should not be this much faster than multiple iterates of lower-degree ones.

20:56 <APlayer> !wpn egg

20:56 * Qboid gives egg a graphite exception

20:56 <egg> bofh: dependencies!

20:56 <egg> bofh: four newtons has dependencies on *every operation*

20:56 <egg> so you're waiting for the latency of all of the things

20:58 <egg> bofh: like seriously, Estrin those polynomials in y and you might have a chance at beating the cat
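
To make the dependency argument concrete, here is a small sketch contrasting Horner's rule with Estrin's scheme for a single-variable polynomial. Python will not show any speedup; the point is only the shape of the computation. In Horner every step waits on the previous one, while in Estrin the pairwise terms within each pass are independent of each other. The function names are illustrative.

```python
def horner(coefficients, x):
    """Evaluate a polynomial (lowest degree first) as one serial chain."""
    result = 0.0
    for c in reversed(coefficients):
        result = result * x + c  # each step depends on the previous one
    return result


def estrin(coefficients, x):
    """Evaluate the same polynomial with Estrin's pairwise scheme.

    Expects a non-empty coefficient sequence, lowest degree first.
    """
    terms = list(coefficients)
    power = x
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0.0)  # pad so the terms pair up
        # Every pair in this pass can be computed independently.
        terms = [terms[i] + terms[i + 1] * power
                 for i in range(0, len(terms), 2)]
        power *= power  # x, x^2, x^4, ...
    return terms[0]
```

For a degree-n polynomial the Horner chain is n multiply-adds long, whereas Estrin needs about log2(n) dependent passes, which is the throughput-over-latency trade egg describes.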

21:00 <APlayer> Anyway, see you tomorrow folks!

21:02 <egg> bofh: fwiw those benchmarks are run on my laptop which is a skylake

21:02 <egg> Core i7-6600U

21:02 <egg> might get different results on my sandy desktop

21:02 rqou has left #kspacademia [AndroIRC]

21:03 rqou has joined #kspacademia

21:03 <egg> !wpn rqou

21:03 * Qboid gives rqou a shift/reduce diapsid

21:03 <egg> rqou: how goes the VHDL grammar

21:03 <rqou> friiggin androirc

21:03 * egg pets rqou

21:03 APlayer has quit [Ping timeout: 182 seconds]

21:03 <rqou> um... i haven't worked on that in months

21:03 * rqou meows

21:04 <egg> bofh: but in general, never underestimate the high order methods; they're a lot better than one might think (see also the integrators etc.)

21:04 <egg> bofh: it's just that people are stuck using leapfrog because everybody in the literature does that

21:04 <egg> Since Newton :-p

21:04 <egg> rqou: how do you like my notation https://github.com/eggrobin/Principia/blob/rootn/documentation/Approximate rootn.pdf

21:04 <egg> aaargh space

21:05 <egg> rqou: https://github.com/eggrobin/Principia/blob/rootn/documentation/Approximate%20rootn.pdf

21:09 <rqou> my phone refuses to open that

21:09 <egg> rqou: https://github.com/eggrobin/Principia/raw/rootn/documentation/Approximate%20rootn.pdf

21:10 <egg> that should just point to an actual PDF

21:12 <rqou> oh wtf

21:12 <rqou> initially i thought that was some weird mojibake

21:13 <egg> rqou: it's floating-point/fixed-point unchecked conversions :D

21:13 <egg> (well, idealized to reals, but same)

21:14 <rqou> as in, i thought the chinese characters were mojibake at first

21:14 <egg> yeah, but no, they're notation :D

21:14 <rqou> but yeah, "internationalized" math is quite unexpected

21:14 <egg> for "interpret float as fixed-point" and "interpret fixed-point as float" respectively :D
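
One possible reading of those two conversions, sketched for binary64. The PDF's actual definitions are not quoted in the log, so the function names, the choice of 52 fractional bits, and the removal of the exponent bias are all assumptions made for illustration. Read this way, the fixed-point value is a piecewise-linear approximation of log2(x), which is why dividing it by 3 gives a usable cube-root guess.

```python
import struct


def float_as_fixed(x: float) -> float:
    """Read the bits of a positive binary64 as fixed-point, bias removed."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits / 2**52 - 1023  # exponent + mantissa fraction, about log2(x)


def fixed_as_float(q: float) -> float:
    """Inverse conversion: read a fixed-point value as binary64 bits."""
    bits = int((q + 1023) * 2**52)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def rough_cbrt(y: float) -> float:
    # Divide the approximate logarithm by 3 and convert back.
    return fixed_as_float(float_as_fixed(y) / 3)
```

The approximation is exact at powers of two and linear in between, so for example 1.5 maps to 0.5 although log2(1.5) is about 0.585.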

21:14 <egg> rqou: not for greek and hebrew and fancy german fonts though

21:14 <egg> so why not CJK :D

21:15 <rqou> something something imperialism

21:15 <egg> rqou: also have you seen #1787

21:15 <Qboid> [#1787] title: Just say no to romanization | | https://github.com/mockingbirdnest/principia/issues/1787

21:15 <egg> DormandالمكاوىPrince1986RKN434FM, 鈴木1990, and 吉田1990Order8D are good identifiers :D

21:16 <egg> rqou: in the process I found the ja version of http://sci-hub.hk/10.1016/0375-9601(90)90962-N, https://www.jstage.jst.go.jp/article/soken/82/3/82_KJ00004703731/_pdf

21:16 <rqou> i assume you've seen https://en.m.wikipedia.org/wiki/Modern_Arabic_mathematical_notation ?

21:17 <egg> yeah that's nice

21:17 <egg> !u مجــــــــــــ

21:17 <Qboid> egg: Too many characters! (Maximum: 10)

21:18 <egg> tatweel ftw

21:19 <egg> rqou: I mean just with the latin script you get fun stuff with the name of trigonometric and hyperbolic functions

21:19 <egg> rqou: in french, (sin x)/(cos x) = tg x

21:19 <egg> not tan

21:19 StCypher has quit [Read error: Connection reset by peer]

21:19 <egg> rqou: and the hyperbolic functions are sh ch th

21:20 <egg> rqou: also in older french texts you see things where the word is properly abbreviated, with a full stop, so "tang."

21:21 <rqou> so i used to think "eh, internationalizing stem stuff isn't _that_ useful"... and then i visited my cousin in china who was learning html+css

21:22 <rqou> css is unnecessarily inconsistent with naming things and uses far too many "complicated" words

21:25 <egg> rqou: so fwiw I learned Ada (and VB but let's not talk about that) without speaking english usefully, writing french identifiers

21:26 <egg> (and got extremely pissed off at the IDE which counted accented letters as digits and capitalized the following letter)

21:26 <egg> (running the pretty-printer fixed it though because my father did his job right)

21:53 <egg> bofh: Estrin on Householder 10 is ~as fast as the cat's method

21:53 <egg> on the one hand this makes sense, on the other hand WTF

21:53 <egg> -----------------------------------------------------------------------

21:53 <egg> Benchmark                            Time       CPU  Iterations

21:53 <egg> -----------------------------------------------------------------------

21:53 <egg> BM_AtlasCbrt                     24564 ns  24554 ns       28000  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

21:53 <egg> BM_OneHalleyIterateCbrt          16606 ns  16392 ns       44800  +9.99995112999074731e-01; ∛2 = +1.25994621743568636e+00

21:53 <egg> BM_HouseholderOrder10EstrinCbrt  24673 ns  24554 ns       28000  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

21:53 <egg> BM_HouseholderOrder10Cbrt        30204 ns  29994 ns       22400  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

21:53 <egg> BM_KahanCbrt                     34314 ns  34424 ns       21333  +1.00000000000000000e+00; ∛2 = +1.25992104989487319e+00

21:53 <egg> BM_KahanNoDivCbrt                40118 ns  38365 ns       17920  +1.00000000000000000e+00; ∛2 = +1.25992104989487297e+00

21:53 <egg> BM_MicrosoftCbrt                 64221 ns  64174 ns       11200  +1.00000000000000000e+00; ∛2 = +1.25992104989487297e+00

21:54 <egg> bofh: implementation: https://github.com/eggrobin/Principia/blob/cbrt-benchmarks/benchmarks/cbrt.cpp#L77-L99
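
For readers unfamiliar with the term, here is a minimal sketch of Estrin's scheme next to Horner's rule. It is written in Python for readability, not speed, and the names `horner` and `estrin` are illustrative; nothing here is taken from the linked cbrt.cpp. Horner's rule is a single serial chain of multiply-adds, whereas each round of Estrin's scheme consists of mutually independent multiply-adds that a superscalar CPU can overlap. That is presumably part of why the Estrin variant of the order-10 Householder method benchmarks faster, though the log does not say so explicitly.

```python
def horner(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) with Horner's rule.

    Every step depends on the previous one, so the work is strictly serial.
    """
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def estrin(coeffs, x):
    """Evaluate the same polynomial with Estrin's scheme.

    Adjacent coefficients are paired as c[2i] + c[2i+1] * x; the pairs are
    then the coefficients of a polynomial in x**2, and the process repeats.
    The pairs within one round do not depend on each other.
    """
    terms = list(coeffs)
    if not terms:
        return 0.0
    power = x  # x, then x**2, x**4, ...
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0.0)  # pad so that every term has a partner
        terms = [terms[i] + terms[i + 1] * power
                 for i in range(0, len(terms), 2)]
        power *= power
    return terms[0]
```

Both functions take the coefficients in increasing order of degree, so a degree-10 polynomial needs 11 of them and takes four Estrin rounds instead of ten dependent Horner steps.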

21:55 <SnoopJeDi> oh no https://twitter.com/SerhiyStorchaka/status/980385335766024192

21:55 <kmath> <SerhiyStorchaka> Empty set literal: {*()} ⏎ #python

21:58 tawny- has quit [Quit: 「Roundabout」 - To Be Continued]

21:58 tawny has joined #kspacademia

22:42 <egg> !wpn bofh

22:42 * Qboid gives bofh a Молния woomera

22:46 * egg pokes bofh with benchmarks

22:52 UmbralRaptor has joined #kspacademia

22:53 <egg> !wpn UmbralRaptop

22:53 * Qboid gives UmbralRaptop a snub woomera

22:53 <bofh> egg: yeah, I just got back, and like, I'm amazed since for me this is *un*intuitive. like, high-order integration techniques are useful, but I'm shocked high-order root finders can be as well.

22:53 <egg> bofh: high-order *everything* is good actually

22:54 <egg> bofh: see also https://twitter.com/whitequark/status/950921688488738816

22:54 <kmath> <whitequark> OH: "at which point I have to ask why [are you] evaluating 11th-order polynomials in your critical path?"

22:54 <egg> bofh: also Estrin is magic

22:54 <egg> bofh: also should I tweet those benchmark results

22:54 <egg> s/tweet/publish/ or something

22:54 <whitequark> egg: yeah, котя was outside and got a few scratches

22:54 <egg> whitequark: ow, is she OK?

22:55 <whitequark> sure

22:55 <whitequark> котя is an outside cat, I would personally be scared for whatever creature decided to attack her

22:55 UmbralRaptop has quit [Ping timeout: 190 seconds]

22:55 <egg> whitequark: :D

22:55 <bofh> 22:53:22 <@egg> bofh: high-order *everything* is good actually

22:55 <whitequark> the gray kitten isn't very smart

22:56 <whitequark> the котя-colored kitten is almost exactly like котя

22:56 <bofh> which both surprises me and goes against what my intuition used to say

22:56 <egg> whitequark: well it's a kitten, I think you've observed that they're very dumb initially?

22:56 <whitequark> but the котя-colored kitten is much smarter.

22:56 <egg> oh the котяkitten has inherited the smarts?!

22:56 <bofh> egg: and yes you prolly should post those benchmark results somewhere at some point

22:56 <egg> is there a gene that controls both the colour and the smarts

22:56 <whitequark> the latter is already eating kitten food and exploring everything

22:56 <whitequark> the former is just sucking milk and sleeping

22:57 <egg> bofh: as we said, those tradeoffs change all the time

22:57 <whitequark> also it might be related to gender

22:57 <whitequark> we originally intended to keep only the котя-colored kitten (the initial size of litter was five but no way in hell we were gonna raise five kittens)

22:57 <whitequark> but then printer decided to keep the (male) gray kitten too

22:57 <whitequark> we'll give that one away later

22:58 UmbralRaptor has quit [Quit: Bye]

22:58 UmbralRaptop has joined #kspacademia

23:00 UmbralRaptor has joined #kspacademia

23:00 <egg> bofh: https://twitter.com/eggleroy/status/980942746037948416 (Twit. J. Numer. Anal.)

23:00 <kmath> <eggleroy> Interesting ∛ benchmark results: Kahan’s 2-div method from “Computing a Real Cube Root” is soundly beaten by… https://t.co/XewMX49eX3

23:01 UmbralRaptor has quit [Client Quit]

23:01 UmbralRaptor has joined #kspacademia

23:01 UmbralRaptop has quit [Ping timeout: 182 seconds]

23:02 <bofh> egg: heh.

23:06 UmbralRaptop has joined #kspacademia

23:06 UmbralRaptop has quit [Client Quit]

23:06 UmbralRaptor has quit [Read error: -0x1: UNKNOWN ERROR CODE (0001)]

23:06 UmbralRaptop has joined #kspacademia

23:10 <egg> bofh: now optimizing γ for 10th order Householder might actually be hard

23:10 UmbralRaptor has joined #kspacademia

23:12 <egg> because its effect on the relative error will have singular points outside 0, which may become extrema of the error; you'd have to either prove that they're not global extrema or to find them (and finding them entails computing roots of high degree polynomials)

23:12 UmbralRaptop has quit [Ping timeout: 190 seconds]

23:13 <egg> I mean you can just compute those roots numerically, invert the error function to find the value, and check the error there but that seems really annoying (and doing the error analysis on that process seems even worse)

23:13 <egg> (although I guess since it's offline you can throw 100 sig. dec. at it and say that the result will be good enough :-p)
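
The difficulty egg describes is that the map from the relative error before an iteration to the relative error after it may have critical points away from 0. For comparison, here is a sketch of that map for the much simpler Halley iteration on x³ − a, scanned exactly with rationals. The function names and the grid are assumptions made for illustration, and this is not egg's actual analysis; the 10th-order Householder map would need its own, messier treatment.

```python
from fractions import Fraction


def halley_cbrt_step(x, a):
    """One Halley iteration for f(x) = x**3 - a."""
    x3 = x * x * x
    return x * (x3 + 2 * a) / (2 * x3 + a)


def relative_error_after(eps):
    """Relative error after one Halley step, given relative error eps before.

    Taking a = 1 (true root 1) loses no generality: writing x = cbrt(a) * (1 + eps),
    the factor cbrt(a) cancels and the map depends only on eps.
    """
    return halley_cbrt_step(1 + eps, 1) - 1


# Scan an (arbitrarily chosen) grid of input errors in [-1/2, 1/2] exactly.
grid = [Fraction(k, 64) for k in range(-32, 33)]
errors = [relative_error_after(e) for e in grid]

# If the map is strictly increasing, it has no interior extrema on the grid,
# so extrema of the output error sit at extrema of the input error.
is_strictly_increasing = all(lo < hi for lo, hi in zip(errors, errors[1:]))
```

Expanding the step gives the closed form ε³(2 + ε) / (2(1 + ε)³ + 1) for this map, whose derivative vanishes only at ε = 0; a grid scan like this is only a sanity check, not a proof.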

23:14 <bofh> Yeah, also I question if it matters since unoptimized γ looks to be good enough and I don't think optimizing γ will make the 9th-order one good enough.

23:14 <bofh> I'm curious what order is sufficient for single precision. I want to say 5th?

23:24 <egg> bofh: yeah it doesn't matter at all

23:24 <egg> bofh: for single precision, let me see

23:25 <bofh> sec, brb again

23:29 StCypher has joined #kspacademia

23:36 <egg> bofh: yeah 5th order will do in single precision

23:40 <egg> bofh: .... which means that if the error is decent with this method (caveat, it might not be, atlas's method is correctly rounded in binary32), that would give you a cbrtf faster than the cat's