Husband, father, kabab lover, history buff, chess fan and software engineer. Believes creating software must resemble art: intuitive creation and joyful discovery.

🌎 linktr.ee/bahmanm

Views are my own.

  • 8 Posts
  • 33 Comments
Joined 1 year ago
Cake day: June 26th, 2023

  • First off, I was ready to close the tab at the slightest suggestion of using Velocity as a metric. That didn’t happen 🙂


    I like the idea that metrics should be contained and sustainable. Though I don’t agree w/ the suggested metrics.

    In general, it seems they are all designed around the process and not the product. In particular, there’s no mention of the “value unlocked” in each sprint: that’s an important one for an Agile team as it holds Product accountable for understanding the $$$ value of the team’s effort.

    The suggested set, to my mind, is formed around the idea of a feature factory line and its efficiency (assuming that is even measurable.) It leaves out the “meaning” of what the team achieves w/ that efficiency.

    My 2 cents.


    Good read nonetheless 👍 Got me thinking about this intriguing topic after a few years.



  • When I read the title, my immediate thought was “Mojolicious project renamed? To a name w/ an emoji!?” 😂


    We plan to open-source Mojo progressively over time

    Yea, right! I can’t believe that there are people who prefer to work on/with a closed-source programming language in 2023 (as if it were the ’80s.)

    … can move faster than a community effort, so we will continue to incubate it within Modular until it’s more complete.

    Apparently it was “complete” enough to ask the same “community” for feedback.

    I genuinely wonder how they managed to convince enthusiasts to give them free feedback/testing (on github/discord) for something whose source code they couldn’t even access.


    PS: I didn’t downvote. I simply got upset to see this happening in 2023.



  • I work primarily on the JVM & the projects (personal/corporate) I work w/ can be summarised as follows:

    1. Building & running the repo is done on the host using a build tool such as Gradle or SBT.
    2. The external dependencies of the repo, such as Redis, are managed via a docker-compose.yml.
    3. The README contains a short series of commands for the various tasks in (1)

    However, one approach that I’ve always been fond of (& apply/advocate wherever I can) is to replace (3) w/ a Makefile containing a bunch of standard targets shared across all repos, eg test, integration-test. Each repo’s Makefile is then thinly customised to fit its particular toolchain (a sketch follows the list below.)

    This has proven to be very helpful wrt cognitive load (and also CI/CD pipelines): ALL projects, regardless of the toolchain, use the same set of commands, namely

    • make test
    • make integration-test
    • make compose-up
    • make run
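
    A minimal sketch of what such a shared Makefile could look like - the variable and task names (BUILD_TOOL, integrationTest) are illustrative, not a standard:

    # shared targets; each repo only overrides the variables (or
    # individual recipes) to fit its own toolchain
    # NB: recipe lines must be indented w/ tabs
    BUILD_TOOL ?= ./gradlew

    .PHONY: test integration-test compose-up run

    test:
            $(BUILD_TOOL) test

    integration-test: compose-up
            $(BUILD_TOOL) integrationTest

    compose-up:
            docker-compose up -d

    run: compose-up
            $(BUILD_TOOL) run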

    In short (quoting myself here):

    Don’t repeat yourself. Make Make make things happen for you!


  • could not resize shared memory

    That means too many parallel maintenance workers are grabbing large chunks of memory at the same time (see max_parallel_maintenance_workers and maintenance_work_mem.)

    VACUUMing is a very important part of how PG works; can you try setting max_parallel_maintenance_workers to 1, or even 0 to disable parallelism altogether, and retry the experiment?
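
    Something along these lines, for instance (the table name is only an example):

    -- disable parallel maintenance for the current session only
    SET max_parallel_maintenance_workers = 0;
    VACUUM (VERBOSE, ANALYZE) comment;

    -- or persist the setting and reload the configuration
    ALTER SYSTEM SET max_parallel_maintenance_workers = 0;
    SELECT pg_reload_conf();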

    I did increase shared_buffers and effective_cache_size with no effect.

    That probably rules out the theory of thrashed indices.

    https://ctxt.io/2/AABQciw3FA https://ctxt.io/2/AABQTprTEg https://ctxt.io/2/AABQKqOaEg

    Since those stats are cumulative, it’s hard to tell anything w/o knowing when the SELECTs were run. It’d be very helpful if you could run those queries a few times at 1-minute intervals and share the output.
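
    If you’re on psql, \watch can do the repetition for you, eg:

    -- samples the counters every 60s; diff consecutive snapshots
    SELECT now(), * FROM pg_stat_bgwriter \watch 60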

    I did install Prometheus with PG exporter and Grafana…Anything specific you can suggest that I should focus on?

    I’d start w/ the 3 tables I mentioned in the previous point and try to find anomalies esp under different workloads. The rest, I’m afraid, is going to be a bit of an investigation and detective work.

    If you like, you can give me access to the Grafana dashboard so I can take a look and we can take it from there. It’s going to be totally free of charge of course as I am quite interested in your problem: it’s both a challenge for me and a way to help a fellow Lemmy user. The only thing I ask is that we report back the results and solution here so that others can benefit from the work.


  • I used to be in a relatively similar position years ago so I totally relate to what you’ve got to do on a daily basis.

    These are the titles that come to my mind (leaving the seniority level up to you):

    • Software engineer
    • Software engineer in Integrations
    • Software engineer in Enterprise Applications
    • ERP Implementation consultant


  • A few things off the top of my head in order of importance:

    • How frequently do you VACUUM the database? Have you tried VACUUMing a few times over a 5-minute span & checking whether the disk I/O changes afterwards?

    • I’ve got no idea how Lemmy works but “seeding content”, to my mind, possibly means a lot of INSERT/UPDATEs. Is that correct? If yes, there’s a chance you may be thrashing & invalidating your indices too frequently, triggering a lot of rebuilding which could swallow a very large portion of the shared_buffers. To rule that out, you can simply bump shared_buffers (eg 16GB) & effective_cache_size and see if it makes any difference.

    • Please include a bit more information about PG activity, namely from pg_stat_activity, pg_stat_bgwriter & pg_stat_wal (queries sketched after this list.)

    • You’ve got quite a high value for max_connections, though I don’t believe that’s the culprit here.
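
    To make those concrete, here’s roughly what I mean - the 16GB/32GB values are examples to be adjusted to your machine:

    -- (1) a manual VACUUM; watch the disk I/O afterwards
    VACUUM (VERBOSE, ANALYZE);

    -- (2) bump the caches; shared_buffers requires a restart to take effect
    ALTER SYSTEM SET shared_buffers = '16GB';
    ALTER SYSTEM SET effective_cache_size = '32GB';

    -- (3) the stats views I mentioned
    SELECT * FROM pg_stat_activity;
    SELECT * FROM pg_stat_bgwriter;
    SELECT * FROM pg_stat_wal;  -- PG 14+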

    And finally, if possible, I’d highly recommend that you take a few minutes & install Prometheus, Prometheus node exporter, Prometheus PG exporter and Grafana to monitor the state of your deployment. It’s way easier to find correlations between data points using that toolset.


  • potential to reuse

    I have a feeling that it’s going to make a noticeable difference; it’s way cheaper than a JOIN ... GROUP BY query.


    order they are updated in so that the deepest child gets count updated first

    Given the declarative nature of SQL, I’m afraid that’s not possible - at least to my knowledge.

    But worry not! That’s why there are stored procedures in almost every RDBMS; to add an imperative flair to the engine.

    In purely technical terms, implementing what you’re thinking about is rather straightforward in a stored procedure using a CURSOR. This could possibly be the quickest win (plus the idea of COUNT(*), if applicable.)
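
    For instance, something along these lines. I’m guessing at the schema (comment.path as an ltree, comment_aggregates w/ comment_id & child_count) so treat it as a sketch rather than lemmy’s actual tables; the FOR loop is backed by an implicit CURSOR:

    CREATE OR REPLACE FUNCTION rebuild_child_counts(p_post_id bigint)
    RETURNS void
    LANGUAGE plpgsql AS $$
    DECLARE
      rec RECORD;
    BEGIN
      -- deepest comments first so every child's count is final
      -- before its parent's is computed
      FOR rec IN
        SELECT c.id, c.path
        FROM comment c
        WHERE c.post_id = p_post_id
        ORDER BY nlevel(c.path) DESC
      LOOP
        UPDATE comment_aggregates ca
        SET child_count = (
          -- direct children + everything beneath them
          SELECT count(*) + COALESCE(sum(child.child_count), 0)
          FROM comment c2
          JOIN comment_aggregates child ON child.comment_id = c2.id
          WHERE c2.path ~ (rec.path::text || '.*{1}')::lquery
        )
        WHERE ca.comment_id = rec.id;
      END LOOP;
    END;
    $$;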


    Now, I’d like to suggest a possibly longer route which I think may be more scalable. The idea is based around the fact that the comments themselves are far more important than the number of child comments.

    1. The first priority should be to ensure INSERT/UPDATE/SELECT are super quick on comment and post.
    2. The second priority should be to ensure child_count is eventually correctly updated when (1) happens.
    3. The last priority should be to try to make (2) as fast as we can while making sure doing so doesn’t interfere w/ the performance of (1) and (2).

    Before rambling on, I’d like to ask: do the priorities make sense to you? If they do, I can elaborate on the implementation.


    First off, IIRC, COUNT(*) used to be slightly faster (~10-15%) than COUNT(some_column) in PG. There’s a chance that recent versions of PG have fixed this inconsistency but it’s still worth benchmarking.
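
    It’s cheap enough to benchmark on any sizeable table (comment is just an example here):

    EXPLAIN ANALYZE SELECT COUNT(*) FROM comment;
    EXPLAIN ANALYZE SELECT COUNT(id) FROM comment;

    Keep in mind the two aren’t strictly equivalent either: COUNT(some_column) skips NULLs whereas COUNT(*) doesn’t.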


    Now to the query:

    To my mind, we’ve already got comment_aggregate which is supposed to store the result of the query shared above, right? Why do we need to run that SELECT again instead of simply:

    -- pseudo-code
    SELECT 
      ca.id, ca.child_count, ca.path
    FROM 
      comment_aggregate ca
    WHERE 
      ca.post_id = :post_id
    

    I think I’m confusing matters here b/c I don’t know lemmy’s DB structure. Is there a link to an ERD/SQL/… you could share so I could take a look and leave more educated replies?


  • DISCLAIMER: I’ve never looked at lemmy’s code base. 🤦‍♂️

    I think that, regardless of any possible optimisations to the query, the current design may not scale very well given it traverses the entire comment × comment space every time a comment is added.

    To my mind, it works well when there are many shallow comments (ie little nesting/threading) which might not be the best strategy for the content lemmy serves.

    Can you share the structures of comment_aggregates and comment? I feel there’s a good opportunity for denormalisation there which may mean better performance.

    That said, here’s one concrete idea that crossed my mind and could be worth benchmarking:

    • Create an AFTER UPDATE trigger on comment_aggregates which updates a comment’s immediate parent’s child_count (basically incrementing it by 1.)
    • Re-write the posted query to only update the immediate parents of the comment being added.

    That results in the trigger being run exactly m times, where m is the number of comments on the path from the new comment up to the root, ie its ancestors (sketched below.)
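
    Here’s a rough sketch of that cascading trigger. Again, the schema (comment.path as an ltree, comment_aggregates w/ comment_id & child_count) is my guess, not lemmy’s actual one:

    CREATE OR REPLACE FUNCTION bump_parent_child_count()
    RETURNS trigger
    LANGUAGE plpgsql AS $$
    DECLARE
      parent_path ltree;
    BEGIN
      IF NEW.child_count > OLD.child_count THEN
        -- the updated comment's immediate parent, if any
        SELECT subpath(c.path, 0, nlevel(c.path) - 1)
        INTO parent_path
        FROM comment c
        WHERE c.id = NEW.comment_id AND nlevel(c.path) > 1;

        IF parent_path IS NOT NULL THEN
          -- this UPDATE fires the same trigger on the parent's row,
          -- walking the increment up the tree one level at a time
          UPDATE comment_aggregates ca
          SET child_count = ca.child_count + (NEW.child_count - OLD.child_count)
          FROM comment c
          WHERE c.path = parent_path
            AND ca.comment_id = c.id;
        END IF;
      END IF;
      RETURN NULL;  -- the return value is ignored for AFTER triggers
    END;
    $$;

    CREATE TRIGGER comment_aggregates_bump_parent
    AFTER UPDATE OF child_count ON comment_aggregates
    FOR EACH ROW EXECUTE FUNCTION bump_parent_child_count();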

    Does that make sense?