• 0 Posts
  • 67 Comments
Joined 1 year ago
Cake day: August 12th, 2023


    1. Yes, but devil’s advocate: you also need a program to read text files, so needing a program to read SQLite files is not worse.

    2. I am confused by your requirements. Why do you need to store your data as JSON or XML? Would it suit your requirements to read in text files, convert them to SQLite for processing and then save the result as a text file? What do you gain by being able to edit the files in a text editor, as opposed to a table editor? Do you maybe just need a config file (e.g. in TOML format) and don’t actually do much data processing?
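To make the "read text, process in SQLite, write text" idea concrete, here is a minimal sketch using only Python's standard library. The table layout, the field names (`name`, `category`, `price`) and the function name `process_records` are made up for illustration; your data will look different.

```python
import json
import sqlite3


def process_records(json_text: str) -> str:
    """Load JSON records into an in-memory SQLite table, aggregate, return JSON."""
    records = json.loads(json_text)
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    try:
        # Hypothetical schema, chosen only for this example.
        conn.execute("CREATE TABLE items (name TEXT, category TEXT, price REAL)")
        conn.executemany(
            "INSERT INTO items VALUES (:name, :category, :price)", records
        )
        rows = conn.execute(
            "SELECT category, SUM(price) FROM items "
            "GROUP BY category ORDER BY category"
        ).fetchall()
    finally:
        conn.close()
    # Back to plain text, so the result stays editable in a text editor.
    return json.dumps([{"category": c, "total": t} for c, t in rows], indent=2)


SAMPLE = json.dumps([
    {"name": "apple", "category": "fruit", "price": 1.5},
    {"name": "pear", "category": "fruit", "price": 0.5},
    {"name": "leek", "category": "veg", "price": 0.75},
])
```

The file on disk stays human-readable, and SQLite is only a processing step in between.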





  • I’m not here to sell you something. In fact, the reason it took me so long to reply is that I only have access to ChatGPT at work and had to wait until I had free time there. I’m not paying closed AI any money either, but despite that I can accept that their flagship product is actually really good.

    I am criticising that your post is based on a mediocre model (which version and temperature did you use?), but written as if it were representative of the whole field. And if I’m being honest I’m kinda salty that I was downvoted based on examples from such a meh model.

    Llama 3 was released a few days ago. On ai.nvidia.com you can test out different models, including the new 8B and 70B versions. I only did a quick check, but even Llama 3 8B beats the examples you gave here.



  • Ok, so I finally got to check this and I simply can’t reproduce your results at all.

    GPT4 turbo preview, temperature 0.2.

    It answers all questions correctly. When pressed for details it did not lie to me, but instead correctly explained why Dijkstra can’t be used to find the longest path, and pointed out that this is an NP-hard problem. It also correctly stated that Dijkstra can’t be used for graphs with negative weights. It correctly suggested Bellman-Ford as an alternative to Dijkstra and knows their respective runtime complexities (for Dijkstra it differentiated between the og version and one with a Fibonacci heap). When I told it my data type for distances does not support infinity, it correctly stated the bound to be “larger than any possible path length in your graph”.

    My initial opinion was that you simply should not use a tool for something it can’t do. I assumed that GPT is simply not knowledgeable enough to answer such domain-specific questions.
    I have now changed my opinion. I don’t know what your version of GPT is, but GPT4 turbo preview with a temperature of 0.2 answers all the questions in your post correctly. Therefore I think GPT can be a good teacher even for domain-specific problems if they are sufficiently entry level (but still domain-specific, which is impressive!).
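For anyone wondering what the "larger than any possible path length" bound looks like in practice, here is a rough sketch of Dijkstra with a finite stand-in for infinity. It assumes non-negative integer weights and that every node appears as a key in `graph`; the sentinel is the sum of all edge weights plus one, which no simple path can reach. The names `dijkstra` and `unreachable` are mine, not from the thread.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source to every node.

    graph: dict mapping node -> list of (neighbor, weight) pairs,
           with non-negative integer weights.
    Returns (dist, unreachable), where unreachable is the finite sentinel.
    """
    # Finite stand-in for infinity: strictly larger than any simple path.
    unreachable = sum(w for edges in graph.values() for _, w in edges) + 1
    dist = {node: unreachable for node in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist, unreachable


GRAPH = {
    "a": [("b", 2), ("c", 5)],
    "b": [("c", 1)],
    "c": [],
    "d": [("a", 1)],  # d points into the graph but cannot be reached from a
}
```

Nodes that still hold the sentinel after the run were never reached, which is the job infinity normally does.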




  • LLMs are pretty good at stuff that an untrained human can do as well. Algorithms and data structures are wayyy too specialized.

    I recently asked gpt4 about semiconductor physics - not a chance, it simply does not know.

    But for general topics it’s really good. For one reason that you simply glossed over - you can ask it specific questions and it will always be happy to answer.

    Okay, at least it’s not incorrect, there are no lies in this, although I would nitpick two things:

    1. It doesn’t state what the actual goal of the algorithm is. It says “fundamental method used in computer science for finding the shortest paths between nodes in a graph”, but that’s not precise; it finds the shortest paths from a node to all other nodes, whereas the wording could be taken to imply it’s between two nodes.
    2. “infinity (or a very large number)” is very weird without explanation. Dijkstra doesn’t work if you put “a very large number”, you have to make sure it’s larger than any possible path length (for example, sum of all weights of edges would work).

    Those nitpicks are something you can ask it to clarify! Wikipedia doesn’t do that. If you are looking for something specific and it’s not in the Wikipedia article - tough luck, have fun digging through different articles or book excerpts to piece the missing pieces together.

    The meme about stack overflow being rude to totally valid questions does not come from nothing. And ChatGPT is the perfect answer to that.

    Edit: I’m late, but need to add that I can’t reproduce OP’s experience at all. Using GPT4 turbo preview, temperature 0.2, the AI correctly describes Dijkstra’s algorithm (distance from one node to all other nodes, picking the next node to process, initializing the nodes, etc.).
    To respond to one of the nitpicks, I asked the AI what to do when my “distance” data type does not support infinity (a weak point of the answer that does not require me to know the actual bound to question the answer). It correctly told me a value larger than any possible path length is required.

    It also correctly states that Dijkstra’s algorithm can’t find the longest path in a graph and that the problem is NP-hard for general graphs.

    For negative weights it explains why Dijkstra doesn’t work (Dijkstra assumes once a node is marked as completed it has found its shortest distance to the start. This is no longer a valid assumption if edge weights can be negative) and recommends the Bellman-Ford algorithm instead. It also gives a short overview of the Bellman-Ford algorithm.
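To illustrate the negative-weight point, here is a small Bellman-Ford sketch (my own, not the AI's output). In the sample graph, Dijkstra would mark `b` as completed at distance 1 before ever looking at the cheaper route through `c`; Bellman-Ford keeps relaxing every edge and finds it. The edge-list format and the names are assumptions for this example.

```python
def bellman_ford(nodes, edges, source):
    """Shortest distances from source; edge weights may be negative.

    nodes: iterable of node names
    edges: list of (u, v, weight) tuples for directed edges u -> v
    Raises ValueError if a negative cycle is reachable from source.
    """
    nodes = list(nodes)
    dist = {n: float("inf") for n in nodes}
    dist[source] = 0
    # Any shortest path uses at most len(nodes) - 1 edges.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # converged early
    # If an edge can still be relaxed, there is a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from source")
    return dist


NODES = ["a", "b", "c"]
# a -> b costs 1 directly, but a -> c -> b costs 5 + (-6) = -1.
EDGES = [("a", "b", 1), ("a", "c", 5), ("c", "b", -6)]
```

The price is the runtime: every edge is relaxed up to |V| - 1 times, instead of each node being settled once.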





  • A new database specifically designed for financial transactions.

    I’m not an expert on finance software, so I can’t critically assess how good they really are. But they claim much much higher throughput than traditional databases, higher fault tolerance, self-healing networks if several replicas are running, etc.
    From a purely technical standpoint it’s interesting for being written in Zig. Because the database scope is so narrow, they know exactly how much memory they will need, so they allocate all of it on startup and never allocate more, nor free the acquired memory.
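The allocate-everything-at-startup pattern can be sketched roughly like this. This is only an illustration of the idea in Python, not how the database actually does it, and Python itself still allocates small objects behind the scenes; `FixedPool` and its methods are hypothetical names.

```python
class FixedPool:
    """Fixed number of equally sized slots, backed by one up-front buffer."""

    def __init__(self, slot_size: int, slot_count: int) -> None:
        self._slot_size = slot_size
        # The only big allocation: done once at startup, never grown or freed.
        self._buffer = bytearray(slot_size * slot_count)
        # Stack of free slot indices; slot 0 is handed out first.
        self._free = list(range(slot_count - 1, -1, -1))

    def acquire(self) -> int:
        """Hand out a free slot index, or fail instead of growing."""
        if not self._free:
            raise MemoryError("pool exhausted; capacity is fixed at startup")
        return self._free.pop()

    def release(self, slot: int) -> None:
        """Return a slot to the pool; the backing buffer itself is kept."""
        self._free.append(slot)

    def view(self, slot: int) -> memoryview:
        """Writable view onto the bytes of one slot."""
        start = slot * self._slot_size
        return memoryview(self._buffer)[start:start + self._slot_size]
```

The trade-off is that running out of slots becomes an explicit error you have to plan for, rather than memory usage quietly growing.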


  • They never would have been able to get the same performance from any solution that incorporates a general purpose database.

    Their requirements/explicitly-not-required-ments include that it’s fine to drop 1s of data. That would be an insane proposition for any other database. Also their read/write rates and latency requirements are unusual to say the least.

    It’s the same thing as TigerBeetle. Ridiculously narrow domains allow for ridiculous performance improvements compared to off-the-shelf solutions.



  • My comment was not asking for clarification, I am contradicting your claim.

    Granted, my experience is mostly limited to Python and Rust. But I find that in Python you reach the end of “jump to definition” much much sooner. Fundamental core libraries of Python are written in C, simply because the performance required cannot be reached with Python alone. So after jumping two levels you are through the thin wrapper type and your editor will give you an “I don’t know, it’s a compiled binary”.
    In Rust I have yet to encounter this. Precompiled binaries are rarely used as a dependency, because compiling whatever is needed is no issue - you’re compiling anyway - and actually can allow a few more optimizations to be performed.

    Edit: since wasm is not yet widespread, JavaScript may be the best language to dig deep into libraries.
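The Python side of this is easy to check yourself. A minimal sketch, assuming a regular CPython install that ships the standard library as `.py` files: `inspect.getsource` works for functions written in Python and gives up on ones implemented in C. The helper name `source_available` is made up.

```python
import inspect
import json


def source_available(obj) -> bool:
    """True if Python source for obj can be retrieved, False otherwise."""
    try:
        inspect.getsource(obj)
        return True
    except (TypeError, OSError):
        # TypeError: built-in implemented in C, there is no Python source.
        # OSError: the source file could not be found or read.
        return False


# json.dumps is written in Python, len is a C built-in.
checks = {
    "json.dumps": source_available(json.dumps),
    "len": source_available(len),
}
```

This is the same wall that “jump to definition” runs into once you are past the Python wrapper.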