AlmightySnoo 🐢🇮🇱🇺🇦

Yoko, Shinobu ni, eto… 🤔

The people of Israel live. Slava Ukraini 🇺🇦 ❤️ 🇮🇱

  • 6 Posts
  • 15 Comments
Joined 2 years ago
Cake day: June 14th, 2023

  • Automatic differentiation in C++17. I had to do it from scratch, as they weren’t open to third-party libraries and it had to integrate with many symbolic calculations that are also done in the library (so even with a third-party library you’d still have to rewrite parts of it to deal with the symbolic stuff). It worked, and the performance was on par with what people in the industry would expect: all derivatives at once in around 3 times the evaluation time, sometimes much less if the calculation code has no dynamic parts and can be differentiated entirely at compile time.

    It was pretty cool because it was a fun opportunity to really abuse template meta-programming, especially expression templates (you’re essentially building expression trees at compile time), along with compile-time lazy evaluation, static polymorphism and lots of SFINAE dark magic, and to play around with custom memory allocators.

    Then you get scolded by the CI guys because your template nightmare makes the build times 3x slower, so the project also becomes an occasion to try out a bunch of tricks to speed up compilation.
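
    To give a flavor of the expression-template trick, here is a minimal sketch (not the actual library; the operators are left unconstrained for brevity where the real thing would SFINAE-guard them). Each overloaded operator returns a new node type, so the whole expression tree lives in the type system and both evaluation and differentiation happen at compile time (forward mode, d/dx):

    ```cpp
    #include <cstdio>

    struct Var {                       // the variable we differentiate w.r.t.
        double v;
        constexpr double eval()  const { return v; }
        constexpr double deriv() const { return 1.0; }  // dx/dx = 1
    };

    struct Const {
        double v;
        constexpr double eval()  const { return v; }
        constexpr double deriv() const { return 0.0; }
    };

    template <class L, class R>
    struct Add {
        L l; R r;
        constexpr double eval()  const { return l.eval() + r.eval(); }
        constexpr double deriv() const { return l.deriv() + r.deriv(); }
    };

    template <class L, class R>
    struct Mul {                       // product rule encoded in the node type
        L l; R r;
        constexpr double eval()  const { return l.eval() * r.eval(); }
        constexpr double deriv() const {
            return l.deriv() * r.eval() + l.eval() * r.deriv();
        }
    };

    template <class L, class R> constexpr Add<L, R> operator+(L l, R r) { return {l, r}; }
    template <class L, class R> constexpr Mul<L, R> operator*(L l, R r) { return {l, r}; }

    int main() {
        constexpr Var x{3.0};
        constexpr auto f = x * x + Const{2.0} * x + Const{1.0};  // f(x) = x^2 + 2x + 1
        static_assert(f.eval()  == 16.0);  // f(3) = 16, checked at compile time
        static_assert(f.deriv() == 8.0);   // f'(3) = 2*3 + 2 = 8
        std::printf("f(3) = %g, f'(3) = %g\n", f.eval(), f.deriv());
    }
    ```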

  • I’d say since you’re a beginner, it’s much better to start by implementing your regression functions and any necessary helper functions (train/test split, etc.) yourself. Learn the necessary linear algebra and quadratic programming and try to implement linear regression, logistic regression and SVMs using only numpy and cvxpy.

    Once you get the hang of it, you can jump straight into sklearn and be confident that you understand sort of what those “blackboxes” really do and that will also help you a lot with troubleshooting.
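
    To make “implement it yourself” concrete: simple linear regression has a closed-form solution that fits in a dozen lines. A toy sketch (in C++ only to keep one language across the snippets on this page; the numpy version is the same formulas on arrays):

    ```cpp
    #include <cstdio>
    #include <vector>

    struct Fit { double slope, intercept; };

    // Ordinary least squares for y ~= slope*x + intercept, via the
    // closed-form normal equations for the single-feature case.
    Fit fit_ols(const std::vector<double>& x, const std::vector<double>& y) {
        const double n = static_cast<double>(x.size());
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            sx  += x[i];        sy  += y[i];
            sxx += x[i] * x[i]; sxy += x[i] * y[i];
        }
        const double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        return { slope, (sy - slope * sx) / n };
    }

    int main() {
        std::vector<double> x{1, 2, 3, 4}, y{3.1, 4.9, 7.2, 8.8};  // roughly y = 2x + 1
        const Fit f = fit_ols(x, y);
        std::printf("slope = %.3f, intercept = %.3f\n", f.slope, f.intercept);
    }
    ```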

    For neural networks and deep learning, pytorch is establishing itself as the industry standard right now. Look up “adjoint automatic differentiation” (“backpropagation” doesn’t do it justice, as pytorch actually implements a very general dynamic AAD) and you’ll understand the “magic” behind the gradients that pytorch gives you. Karpathy’s YouTube tutorials are a really good intro to AAD/autodiff in the context of deep learning.
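
    The core of that dynamic AAD fits in a few dozen lines. A micrograd-style sketch (again in C++; the general idea, not pytorch’s actual implementation): every operation records its value plus a closure that pushes gradients back to its inputs, and backward() replays those closures in reverse order.

    ```cpp
    #include <cstdio>
    #include <functional>
    #include <memory>
    #include <vector>

    struct Node {
        double value = 0.0;
        double grad  = 0.0;
        std::function<void()> backprop = [] {};  // pushes grad into the parents
    };

    using Ptr = std::shared_ptr<Node>;

    std::vector<Ptr>& tape() {  // global tape, recorded in creation order
        static std::vector<Ptr> t;
        return t;
    }

    Ptr leaf(double v) {        // leaves don't need to go on the tape
        auto n = std::make_shared<Node>();
        n->value = v;
        return n;
    }

    Ptr add(Ptr a, Ptr b) {
        auto out = std::make_shared<Node>();
        out->value = a->value + b->value;
        Node* o = out.get();    // raw pointer to avoid a shared_ptr cycle
        out->backprop = [a, b, o] { a->grad += o->grad; b->grad += o->grad; };
        tape().push_back(out);
        return out;
    }

    Ptr mul(Ptr a, Ptr b) {
        auto out = std::make_shared<Node>();
        out->value = a->value * b->value;
        Node* o = out.get();
        out->backprop = [a, b, o] {
            a->grad += b->value * o->grad;  // d(a*b)/da = b
            b->grad += a->value * o->grad;  // d(a*b)/db = a
        };
        tape().push_back(out);
        return out;
    }

    void backward(const Ptr& out) {
        out->grad = 1.0;        // seed: d out / d out = 1
        for (auto it = tape().rbegin(); it != tape().rend(); ++it)
            (*it)->backprop();  // reverse sweep over the tape = the adjoint pass
    }

    int main() {
        auto x = leaf(3.0), y = leaf(4.0);
        auto f = add(mul(x, x), mul(x, y));  // f = x^2 + x*y
        backward(f);                         // one-shot sketch: no zero_grad here
        // f(3,4) = 21, df/dx = 2x + y = 10, df/dy = x = 3
        std::printf("f = %g, df/dx = %g, df/dy = %g\n", f->value, x->grad, y->grad);
    }
    ```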

  • Reimplementing stuff from scratch, overengineering and, if you’re coding in a compiled language, knowing a bit of assembly to be able to make better performance-related decisions.

    EDIT to clarify the overengineering part: the idea is obviously not to do it at work, because there you have to meet deadlines and spend your time on what’s most important; the idea is to do it when working on your own personal projects. Try to make every component of your project as “perfect” as possible, as if you were trying to rebuild a Mercedes-AMG car. That’s how you’ll learn a bunch of new tricks.

  • If you want multi-line code, you need to wrap it in a fenced code block (triple backticks), like this:
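
    ````
    ```cpp
    // your multi-line code goes here
    int x = 42;
    ```
    ````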

    For these kinds of questions, your best friend is the documentation. In particular, `man 3 printf` yields:

    > **Format of the format string**
    >
    > The format string is a character string, beginning and ending in its initial shift state, if any. The format string is composed of zero or more directives: ordinary characters (not %), which are copied unchanged to the output stream; and conversion specifications, each of which results in fetching zero or more subsequent arguments. Each conversion specification is introduced by the character %, and ends with a conversion specifier. In between there may be (in this order) zero or more flags, an optional minimum field width, an optional precision and an optional length modifier.
    >
    > The overall syntax of a conversion specification is:
    >
    > `%[$][flags][width][.precision][length modifier]conversion`
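
    Putting the pieces of that syntax together, a small illustrative example (the positional `%n$` form in the last line is a POSIX extension, so it works with glibc but not on every platform):

    ```cpp
    #include <cstdio>

    int main() {
        const double pi = 3.14159265;
        // %[flags][width][.precision]conversion
        std::printf("[%10.3f]\n", pi);   // width 10, precision 3  -> [     3.142]
        std::printf("[%-10.3f]\n", pi);  // '-' flag: left-justify -> [3.142     ]
        std::printf("[%+08.2f]\n", pi);  // '+' and '0' flags      -> [+0003.14]
        std::printf("%2$s %1$s\n", "world", "hello");  // POSIX %n$: prints "hello world"
    }
    ```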