Just tested a second-order ADAA version of this. I imagine the code could be optimized, but as it stands it's pretty expensive: roughly 6x the CPU of the first-order version, and roughly 1.5x the cost of first-order tanh.
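
For anyone who wants a baseline to compare against, here is a minimal sketch of first-order ADAA applied to tanh. It is not the code that was benchmarked above. The class name `ADAA1Tanh`, the helper `log_cosh`, and the `eps` threshold are all made up for illustration, and it assumes the usual first-order form: the difference of the antiderivative over the difference of the input, with a fallback to plain tanh at the midpoint when two consecutive samples are nearly equal.

```python
import math


def log_cosh(x):
    # Antiderivative of tanh. Written as |x| + log1p(exp(-2|x|)) - log(2)
    # so that large |x| does not overflow cosh().
    a = abs(x)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


class ADAA1Tanh:
    """First-order ADAA tanh (hypothetical name, illustrative only)."""

    def __init__(self, eps=1e-5):
        self.eps = eps             # assumed threshold for the ill-conditioned case
        self.x1 = 0.0              # previous input sample
        self.F1 = log_cosh(0.0)    # antiderivative at the previous input

    def process(self, x):
        F0 = log_cosh(x)
        dx = x - self.x1
        if abs(dx) < self.eps:
            # Inputs nearly equal: the divided difference is ill-conditioned,
            # so evaluate tanh at the midpoint instead.
            y = math.tanh(0.5 * (x + self.x1))
        else:
            # Divided difference of the antiderivative.
            y = (F0 - self.F1) / dx
        self.x1 = x
        self.F1 = F0
        return y
```

Per sample this is one antiderivative evaluation, one subtract and one divide, plus the occasional tanh in the fallback branch, so it is a reasonable reference point when timing a higher-order version.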
Statistics: Posted by Dogue — Tue May 28, 2024 2:01 am