• 0 Posts
  • 11 Comments
Joined 2 years ago
cake
Cake day: June 2nd, 2023

help-circle
  • On the topic of exposing sequence number in APIs, this has been a security issue in the past. Here is one I remember: https://www.reuters.com/article/us-cyber-travel-idUSKBN14G1I6/

    From the article:

    Two of the three big booking systems - Amadeus and Travelport - assign booking codes sequentially, making brute-force computer guesswork easier. Of the three, Amadeus, through its web portal CheckMyTrip, is especially vulnerable, Nohl said.

    The PNRs (flight booking code) have many more security issues, but at least nowadays, their sequential aspect should no longer be exposed.

    So that’s one more reason to be careful when exposing DB id in APIs, even if converted to a natural looking key or at least something easier to remember.


  • The main problem is that dynamic linking is hard. It is not just easier for the maintainers of the languages to ignore it, it removes an entire class of problems.

    Dynamic linking does not even reliably work with C++, an “old” language with decades of tooling and experience on the matter. You get into all kind of UB when interacting with a separate DSO, especially since there are minimal verification of the ABI compatibility when loading a dynamic library. So you have to wait for a crash to be certain you got it wrong. Unless you control the compilation of your dependencies, it’s fairly hard to be certain you won’t encounter dynamic linking related issues. At which point you realize that, if the license allows it, you’re better off static linking everything, including the C++ library itself: it makes it much more predictable, you’re not forcing an additional dependency on your users and most UB are now gone (especially the one about raising exception across DSO boundaries, which can happen behind your back, unless you control the compilation of all your dependencies…).

    That’s especially true if you are releasing a library where you do not know it’s runtime: it might be dynamically loaded via dlopen by a C++ binary that will load its own C++ library first, but some of your users use the version that is stuck on C++14 and your codebase is in C++23. This can be solved, by playing with LD_LIBRARY_PATH, but the application is already making use of it to load the C++ library it comes with instead of the one provided by the system (which only provides C++11 runtime), and it completely ignores the initial state of the environment variable (how could it do otherwise? It would have to guess the path to the libstdc++ is for a newer version and not the older one provided by the system). Now imagine the same issue with your own transitive dependency on top of that: it’s a nightmare.

    So dynamic linking never really worked, except maybe for C when you expect a single level of dependency, all provided by the system. And even then that’s mostly thanks to C simpler ABI and runtime.

    So I expect that is the main reason newer languages do not bother with dynamic linking: it introduces way too many issues. Look at your average rust program and how many version of a same dependency it loads, transitively. How would you solve that problem as to be able to load different versions when it matters but try first and foremost to load only one if possible? How would you be able to make the right call? By using semver? If nobody made any mistake why not, but you will rather be required to provide escape hatches that, much like LD_LIBRARY_PATH and LD_PRELOAD, will be misused. And by then, you only “solved” the simplest problem.

    Nowadays, based on how applications are delivered on Windows and OSX, and with the advent of docker, flatpack/snap and appimage, I do not see a way back to dynamic linking anytime soon. It’s just too complicated of a problem, especially as the number of dependency grows.



  • I’ve used scratch to introduce kids (6~10) to programming. It works quite well IMO. They had a laptop with windows. I recommend a touch screen if possible, especially for younger kids. Though at 8~12yo that should not be as much of an issue.

    I used it with the microbit from the BBC. While not required, a dedicated piece of hardware makes it much more interactive and fun, for a basic introduction. Basically, the microbit can be turned into a remote control for your characters in scratch for example.

    Though, kids get fond of the ability to create pre-programmed scenes. That are not very logic intensive, more like an animated movie. And since they can add their own drawings and voice, they can get very engaged on this sole basis. So the microbit is not required at all.

    Though if you want to use it, Microsoft has its own scratch for microbit that is more annoying to use IMO (you need to flash the program every time, which is not easy for younger kids that have trouble with the mouse), but it unlocks all the capabilities of the microbit for even more interactive applications. You can make them communicates through a basic protocol over 2.4GHz radio, control led strips or even robots for example (though the robots are far from cheap for what they are 😕).

    Both scratch and makecode (the links mentioned above) have plenty of resources if you want to get a lab going. Personally, I would set my expectation fairly low and plan for many additional small features that kids that are really interested could implement on their own. In my experience, some kids will not be interested at all, not until they see a feature they want to interact with at least. Others will try to see what they can do by themselves, before the lab even begin. But usually, the older they get, the less likely they are to experiment by themselves and they’d rather wait for instructions. Which is a shame, but that’s how it is I guess.

    Also, try to make sure they can continue their work from home. Scratch is available on many platforms (though makecode sucks on Android last time I checked) and is trivial to get up and running. That said, importing a project is another matter for kids barely familiar with computers, which is why I would distribute a document aimed at their parent to get them set up.


  • I feel like since Glassdoor has a job board, there is an obvious conflict of interest that cannot be solved and ultimately yields to the company review part to be mostly useless. If you browse through a few company profiles, some that offer jobs on Glassdoor (and reply to reviews) and some that do not, you will quickly see what I’m talking about.

    I mean, who is going to post job offers on a board that also list your company at <4 stars with plenty of mention of “bad company culture” and “horrendous working condition”?

    I also do not like how most of the would-be-intersting-info is gatekeep behind you sharing various info on your own employer. I mean, it seems to make sense on the surface, but with the above, the whole thing feels like a scam. Where both employers and employees get scammed.

    How does it compare to alternatives ? Well, LinkedIn is terrible, with abysmal search function, results completely irrelevant to your profile, and the same seems to be true for recruiters seeing how often I’m proposed jobs completely outside of my skillset.

    So I guess Glassdoor is a viable source of job offer, especially if you have already selected companies you’d like to work at and those companies do not have their own board. But take all reviews and comments with a huge grain of salt.


  • I disagree. The question is not really “should we give programmer more power at the cost of yet another UB” but more “should we grow the API and add another UB for the select few for whom it might matter”. When you consider choices made on other parts of the STL, such as std::unordered_map, then you realize the STL is not about being the most performant things around, but rather a collection of reliable tools covering basic usage for 80% of the user base.

    With that in mind, I am against adding yet another function, which has its pitfalls, for minimal benefits. Again, such a function would be made almost entirely obsolete by a safe function that works with iterators/generators of known sizes. So I see even less benefit in adding a function that will just become yet another liability down the line.


  • The benchmark looks off. The msvc one may be the only one vaguely reliable. I suspect clang and GCC were able to optimize the synthetic benchmark to a little more than a loop doing additions. At 96ns for 1000 iteration, you are looking at 10G iterations per seconds. Which can only be achieved by a loop of two instructions executing at 2 inst/s on a 5GHz processor. And you will not get a 5x just for removing a highly predictable branch.

    So yes, std::vector leaves performance on the table, but no more than 10~15% for trivial loops that are not that uncommon but are rarely a bottleneck.

    Then you have to ask yourself, is it worth it to add yet another function that can crash your program if misused just for that 10% in a situation where they might not even matter. I mean, I know, it’s c++, zero cost abstraction, yadi yada, but if you’re looking for consistent performance you should have moved away from the STL already. As this post shows, your STL vendor already has a huge impact on the performance, and there are widely available options to optimize specific cases.

    So I’d rather keep the STL fairly simple. Add one function to work well with generators/iterators that have a known size if you want, but adding unchecked versions of every insertion function of every STL container is not worth it IMO.


  • First of, especially in C, you should very carefully read the documentation of the functions you use. It then should be obvious to you you are currently misusing it on two accounts:

    • You are not checking for errors
    • You are assuming the presence of a \n that might never be there (this one leads to your unexpected behavior)

    The manual tells you it will insert a \0 at the end of what it reads within the limits of the buffer. So this \0 is what you will need to look for when determining the size of the input.

    If there is a \n, it will precede the \0. Just make sure the \0 is not at index 0 before trying to erase the \n. If there is no \n before the \0, you are in either of two cases (again, this is detailed in the documentation): the input is truncated (you did not read the full line, as in your unexpected behavior above) or you are reaching the end of the stream. Note that even if the stream ends with \n, you might need to issue an additional fgets to know you are at the end of the stream in which case a \0 will be placed as the first byte of your buffer.

    If you really want to handle input that exceeds your initial buffer, then you need to dynamically allocate one and grow it as needed. A well behaved program will have an upper limit to the size of the input anyway (and this is why you don’t use gets). So you will need a combination of malloc/realloc and string concatenation. That means you need to learn all the pitfalls of dynamic memory allocation in C and how to use valgrind. For the string concatenation, even though strcat should be OK in your case, I’d recommend against it.

    In order to use strcat properly you need to keep track of the usage of the dynamically allocated buffer by hand anyway because you want to know when you will attempt to store more bytes in the buffer than is currently allocated. And once you know the number of bytes stored in the buffer, copying over the bytes that fgets returns by hand is fairly trivial and has less pitfalls. This also circumvent one of the performance pitfall of strcat: it needs to find the \0 in the destination buffer for every call. So effectively, it can transform all by itself a trivial usecase such as yours, that one would expect to be linear in algorithmic complexity, to be of O(N^2) complexity.

    On a final note: fgets does not allow you to handle binary data properly because you wont be able to tell apart a legitimate \0 coming from your input from a \0 inserted by fgets. So you will need to use fread in this case. I actually recommend using fread instead of fgets because it directly returns the number of bytes read, no need to use strlen to guess it and it makes error handling easier. Though you’ll need to add the trailing \0 yourself.


  • I tried to introduce tests to one of the team I worked at. I was somewhat successful in the end but it took some time and effort.

    Basically, I made sure to work with the people interested in testing their code first. It’s good to have other people selling testing instead of being the only voice claiming testing will solve all some problems.

    Then I made examples: I tried to show that testing some code, believed to be untestable, was actually not that hard.

    I was also very clear that testing everything was not the end goal, but, new projects especially, should try and leverage testing. Both as a way to allow for regression testing later on and to improve the design. After all, a test often is the first user of a feature. (This was for internal libraries, I expect it would be a harder sell for GUI where the end design might come from a non programmer such as a UX designer).

    At this point, It was seen as a good measure to add a regression tests for most bugs found and fixed.

    Also, starting from the high level, while harder (it’s difficult to introduce reliable integration and end to end tests), usually yields benefits that are more obvious to most. People are much less nervous reworking a piece of code that has a testing harness, even if they are not in a habit of testing their code.

    I did point at bugs that could have been easily prevented by a little bit of testing, without blaming anyone. Once the framework is in place and testing has already caught a couple of mistakes, it’s much harder to defend the argument that time spent testing could be better spent elsewhere. And that’s where we started to get discussions on the balance to strike between feature work and testing. It felt like a win.

    It took two years to get to a point where most people would agree that testing has its uses and most new projects were making use of UT.


  • Contrieved examples you usually see in design pattern tutorials do not properly highlight the use case and usefulness of a specific pattern.

    What you did might have been fine. You only rarely need the builder design pattern.

    It’s useful when the object you want to build has a complex constructor, or several ones, or you want to more easily enforce the self consistency of the created object (some languages make it hard to share initialization code) or you want to hide some internal details (I.e., you are at an API boundary and want to be able to freely change the objects your main object is composed of).

    In short, Builder is a way to have a handy interface to an object constructor. It’s nice for users of a library, or if you find yourself repeating a complex constructor invocation often, but you usually do not want to have to maintain such an interface that is, at the end of the day, mostly boilerplate code.


  • I second that. You can also go a fair way without coding much. So you can learn the engine as you learn programming.

    Also, look at some example projects. There are plenty, like a vampire survivor like. Try to understand the structure and add some very small feature.

    There are also many resources online to get started. Don’t hesitate to create many small projects where you only try out a concept or idea. Godot is very composable once you know your way around, this will make things easier to move your learnings from one project to the next.