• 0 Posts
  • 16 Comments
Joined 2 years ago
Cake day: June 11th, 2023



  • The one case where I prefer video is when I know next to nothing about the topic and the other choice is mediocre to low-quality writing. Most people aren’t great technical writers, and it’s easy to skip over steps, either because the writer assumes too much prior knowledge or simply because it takes effort to put that information in. Videos are the opposite: it takes effort to cut stuff out, so you usually get all the steps, which is what I need when I don’t know anything.

    If I have the option of a well-written, step-by-step tutorial though, or if I already know the topic and have a vague idea of what I’m looking for, then text is much better for being able to search/skim/go back and forth at my own pace.




  • The behavior is defined; the behavior is whatever the processor does when you read memory from address 0.

    If that were true, there would be no problem. Unfortunately, what actually happens is that compilers use the undefined behavior as an excuse to mangle your program far beyond what mere variation in processor behavior could cause, in the name of optimization. In the kernel bug, the issue wasn’t that the null pointer dereference was undefined per se; the real issue was that the subsequent null check got optimized out because of the preceding undefined behavior.



  • My main point is that PRQL makes no distinction. If you didn’t inspect that SQL output and already know about the difference between WHERE and HAVING, you would have no idea, because in PRQL they’re both just “filter”.

    Hmm, I have to disagree here. PRQL has no distinction in keyword, but it does have a distinction in where the filter goes relative to the aggregation. Given that the literal distinction being made is whether the filter happens before or after the aggregation, PRQL’s position-based distinction seems a lot clearer than SQL’s keyword-based distinction. Instead of seeing two different keywords, remembering that one happens before the aggregation and the other after, and then deducing the performance impacts from that, you immediately see that one comes before the aggregation and the other after, and deduce the performance impacts from there.

    As far as removing arbitrary SQL features, I agree that that is its main advantage. However, I think either the developers or else the users of PRQL will discover that far fewer of SQL’s complexities are arbitrary than you might first assume.

    That’s fair, I was just thinking of things that frustrate me with SQL, but I admittedly haven’t thought too hard about why things are that way.


  • What are the implications of WHERE vs HAVING? I thought the only primary difference was that one happens before the aggregation and the other happens after, and all the other implications stem from that fact. PRQL’s simplification, rather than obscuring, seems like a clearer and more reasonable way to express that distinction.

    I don’t know if PRQL supports all SQL features, but I think it could while being less complex than SQL by removing arbitrary SQL complications like different keywords for WHERE vs HAVING, only being able to use column aliases in certain places, needing to recompute a transformation to use it in multiple clauses, forcing queries to be in SELECT… FROM… WHERE… order, etc.
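
    To make the before/after distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its rows, and the helper names are all made up for illustration. The only difference between the two queries is whether the filter runs on individual rows (WHERE) or on the aggregated groups (HAVING), which is the same thing PRQL expresses by where its filter sits relative to the aggregation.

```python
import sqlite3


def make_orders() -> sqlite3.Connection:
    """Build a small in-memory table (hypothetical data for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("a", 10), ("a", 200), ("b", 50), ("b", 60), ("c", 500)],
    )
    return conn


def filter_before(conn: sqlite3.Connection) -> list:
    # WHERE drops individual rows first; only the surviving rows are summed.
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "WHERE amount > 50 GROUP BY customer ORDER BY customer"
    ).fetchall()


def filter_after(conn: sqlite3.Connection) -> list:
    # HAVING runs on the grouped result: every row is summed,
    # then whole groups are dropped.
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer HAVING SUM(amount) > 150 ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    conn = make_orders()
    print(filter_before(conn))
    print(filter_after(conn))
```

    The two queries give different totals for customer "a" because the row filter removes the small order before it is ever summed, while the group filter only looks at the finished sums.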



  • Agreed, smartness is about what it can do, not how it works. As an analogy, if a chess bot could explore the entire game tree hundreds of moves ahead, it would be pretty damn smart (easily the best in the world, probably strong enough to solve chess) despite just being dumb minimax plus absurd amounts of computing power.
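
    For a sense of how little "dumb minimax" involves, here is a rough sketch on a toy game rather than chess. The game is an assumption picked for brevity: players alternate taking 1 to 3 stones, and whoever takes the last stone wins. The search just tries every move to the end of the game, with no heuristics at all.

```python
def minimax(stones: int) -> int:
    """Return +1 if the player to move can force a win, -1 otherwise.

    Written in negamax form: the opponent's best outcome is negated
    to get the value from the current player's point of view.
    """
    if stones == 0:
        # The previous player took the last stone, so the mover has lost.
        return -1
    # Exhaustively try every legal move and keep the best outcome.
    return max(-minimax(stones - take) for take in (1, 2, 3) if take <= stones)


def best_move(stones: int) -> int:
    """Pick the number of stones to take that leads to the best outcome."""
    legal = [take for take in (1, 2, 3) if take <= stones]
    return max(legal, key=lambda take: -minimax(stones - take))


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, minimax(n), best_move(n))
```

    On a game this small the exhaustive search plays perfectly. Chess differs only in that its tree is far too large to search this way.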

    The fact that ChatGPT works by predicting the most likely next word isn’t relevant to its smartness except as far as its mechanism limits its outputs. And predicting the most likely next word has proven far less limiting than I expected, so even though I can think of lots of reasons why it will never scale to true intelligence, how could I be confident that those are real limits and not just me being mistaken yet again?



  • Ask it a question about basketball. It looks through all documents it can find about basketball…

    I get that this is a simplified explanation but want to add that this part can be misleading. The model doesn’t contain the original documents and doesn’t have internet access to look up the documents (that can be added as an extra feature, but even then it’s used more as a source to show humans than as something for the model to learn from on the fly). The actual word associations are all learned during training, and during inference it just uses the stored weights. One implication of this is that the model doesn’t know about anything that happened after its training data was collected.
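
    A toy illustration of the training/inference split, with the big caveat that this is a word-pair counter and nothing like a real language model (the function names and the corpus are invented for the example). The point is only that `predict_next` never sees the corpus: it works from the stored counts, so anything absent from the training text simply isn't there.

```python
from collections import Counter, defaultdict
from typing import Optional


def train(corpus: str) -> dict:
    """'Training': count which word follows which, then discard the text."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    # Only these counts (the stand-in for weights) are kept.
    return {word: dict(followers) for word, followers in counts.items()}


def predict_next(weights: dict, word: str) -> Optional[str]:
    """'Inference': look up the most frequent follower in the stored counts."""
    followers = weights.get(word)
    if not followers:
        return None  # the word never appeared (with a follower) in training
    return max(followers, key=followers.get)


if __name__ == "__main__":
    weights = train("the cat sat on the mat and the cat ran")
    print(predict_next(weights, "the"))
    print(predict_next(weights, "basketball"))
```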



  • I disagree with the author on operator overloading. They claim that this function in C

    float foo(float a, float b) {
    	return a+b;
    }
    

    is perfectly clear because you know it’s doing floating point addition, while this function in Python isn’t

    def foo(a, b):
    	return a + b
    

    because you don’t know if it’s floating point addition, integer addition, or string concatenation, and what happens if the inputs are different types?

    I think that’s fundamentally mistaken. You could also ask of the C version if it’s doing normalized floating point addition, denormalized floating point addition, infinity addition, or NaN propagation. What happens if you mix different types of floats? And the answer is that it doesn’t matter. These are all just aspects of floating point addition. It returns the most sensible result in whatever format is best to hold that value, and you don’t need to worry yourself about how floats are stored under the hood.
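
    As a rough illustration of those cases, here is the same function in Python, assuming Python floats behave as IEEE 754 doubles (true on common platforms, but an assumption nonetheless). The type hints and sample values are mine, not the article's.

```python
import math
import sys


def foo(a: float, b: float) -> float:
    return a + b


# Smallest positive subnormal (denormalized) double.
tiny = 5e-324

if __name__ == "__main__":
    # Subnormal + subnormal is still just addition.
    print(foo(tiny, tiny) == 2 * tiny, tiny < sys.float_info.min)
    # Infinity absorbs finite values.
    print(foo(math.inf, 1.0))
    # inf + (-inf) has no sensible value, so the result is NaN.
    print(foo(math.inf, -math.inf))
    # NaN propagates through the addition.
    print(foo(math.nan, 1.0))
```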

    The same is true of the Python version. It doesn’t matter if it’s integer addition or floating point addition or string concatenation. Those are just different aspects of the addition operator and it returns the most sensible result in whatever type is best to hold that value.
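
    A quick sketch of what that looks like in practice. `try_foo` is a hypothetical helper added only so the failing case can be shown without crashing the script. One thing worth noticing is that Python does not guess when the operand types have no shared notion of addition: it raises an error instead.

```python
def foo(a, b):
    return a + b


def try_foo(a, b):
    """Return foo(a, b), or the name of the exception if the types don't mix."""
    try:
        return foo(a, b)
    except TypeError as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(try_foo(1, 2))        # integer addition
    print(try_foo(1, 2.5))      # int and float mix; the result is a float
    print(try_foo("ab", "cd"))  # string concatenation
    print(try_foo([1], [2]))    # list concatenation
    print(try_foo(1, "a"))      # no sensible result, so Python raises TypeError
```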