Weeks ago I had my moment of facing the attitude of keeping all this secret.

If join_collapse_limit was tried behind the scenes a month ago, as was just casually mentioned, then why are there zero posts or comments in the entire Lemmy search for join_collapse_limit? I searched the entire GitHub project - no mention of join_collapse_limit. But they were ready on the spot to reveal that secret private communications tried join_collapse_limit long ago.

You know what join_collapse_limit is telling you? Too many JOINs are a performance problem! The entire ORM Rust code and its reliance on new JOINs is going to lead to other unpredictable performance problems that vary when there are 10,000 posts vs 2 million posts! And that’s exactly the history of 2023… watching the code performance swing wildly based on the size of the communities being queried, etc.
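
To put a rough number on why the planner has a setting like this at all: the count of possible join orders grows factorially with the number of tables. The sketch below is only an illustration, not PostgreSQL's planner logic. `left_deep_join_orders` is a hypothetical helper, and 8 is the documented default of join_collapse_limit.

```python
import math

# Documented PostgreSQL default for join_collapse_limit.
DEFAULT_JOIN_COLLAPSE_LIMIT = 8


def left_deep_join_orders(table_count: int) -> int:
    """Number of left-deep join orders for `table_count` tables (n!).

    Illustration only: a real planner prunes this space and also
    considers bushy plans, so this is not PostgreSQL's actual count.
    """
    if table_count < 1:
        raise ValueError("need at least one table")
    return math.factorial(table_count)


if __name__ == "__main__":
    for n in (4, DEFAULT_JOIN_COLLAPSE_LIMIT, 15):
        print(f"{n:>2} tables -> {left_deep_join_orders(n):,} left-deep orders")
```

At 15 tables the exhaustive space is over a trillion orderings, so the planner cannot search it fully. Past the limit it stops freely reordering explicit JOINs, and the plan leans more on the order the query generator happened to write them in.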

What I see is that pull requests for ideas get created only after noise is made on a subject. There is a lack of openness to making mistakes in public.

For me, **the server crashes are what annoy me**, not human beings working on solutions. But for most of the people on the project, what seems to annoy them is needing to have proper tabs vs. spaces in source code, and even adding layers of SQL formatting tools in the middle of what can clearly be described as an SQL performance crisis.

Things keep getting broken: the HTML sanitization took a few hours to add to the project, but it has now meant weeks of broken titles and HTML code blocks, and even URL parameters are now broken on everyday links. The changes to delete behavior have orphaned comments, and that has gone on for weeks now.

  • RoundSparrow @ BTOPM
    11 months ago

    PostgreSQL keeps failing

    And I feel like the project keeps ignoring that basic fact. The servers crashing isn’t a feature, it’s a bug! Yes, now there are 1500 instances to brag about, but they are all pulling data from lemmy.world and all the broken things in federation are smoldering issues.

    join_collapse_limit is the PostgreSQL design team telling you: don’t build apps with 15 JOINs on real-time, no-caching queries. And look what happens: it goes off into wild behaviors depending on the amount of data that has built up on a given server. And new instances starting with zero data give the illusion that the problem is solved… but once data starts getting into that database, the overhead of all that JOIN logic and counting grows and grows.

  • RoundSparrow @ BTOPM
    11 months ago

    The reason join_collapse_limit needs OPEN DISCUSSION is because it highlights the core of the problem: too many JOINs in the primary logic of listing posts. The ‘too many fields’ part was kind of obvious; the size of the SELECT statement is huge! It’s machine generated.

    And I can’t even REMOVE joins that aren’t needed for anonymous users. The Rust objects are so binding that “saved posts”, which cannot exist for an anonymous user, can’t be decoupled.

    The servers crashing isn’t treated like an actual problem… like a siren going off saying the code design is faulty. The fact that join_collapse_limit is being ignored as a topic shows the lack of design concern. Now instance blocking is being added, another new layer of work for this query.

  • RoundSparrow @ BTOPM
    11 months ago

    Back to Basics

    All this INSERT overhead, real-time counting, real-time votes. But all it does is generate dead tuples through constant rewrites of PostgreSQL rows to +1 every single thing on the site to give non-cached results.

    And it isn’t benefiting the SELECT side of reading that data, it’s burdening it.

    The subscribed table is likely merged for federated and local users. But when it comes time to list posts, having to sort through remote users’ data in the same table is overhead for every post listing. Same goes for votes, and yes - every SELECT looks at granular votes - because it wants to show the UI which items were already voted on. But it’s a huge amount of data in that table to filter out all the votes on outdated posts, votes from users not even on this server, etc.

    And there are no limits… you could block every person and make the database labor away filtering out all the people you blocked. You can block a community. The testing code to reproduce these edge cases alone is a lot of work that isn’t being done… and it creates a ticking time bomb where some user who hits ‘save’ on every post or ‘block’ on every user throws queries into wild behaviors.

    I think some sanity has to be considered, such as organizing data around “2 weeks’ worth of posts”… then at least there is a cut-off for someone who goes wild with saving posts or blocking users.

    I think the personalization of data should pretty much be an entire post-production layer of the app. The core engine should be focused on post and comment storage and retrieval. “saved post” lists, blocking of instances, blocking of persons… let post-production deal with that.
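
    A minimal sketch of what that split could look like, assuming everything here is hypothetical: `Post`, `Personalization`, `core_list_posts` and `personalize` are illustration names, not Lemmy's actual types or API, and the in-memory list stands in for the real database query.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Post:
    id: int
    creator_id: int
    community_id: int
    title: str


@dataclass
class Personalization:
    """Per-user state kept outside the core post query (hypothetical shape)."""
    blocked_person_ids: set[int] = field(default_factory=set)
    blocked_community_ids: set[int] = field(default_factory=set)
    saved_post_ids: set[int] = field(default_factory=set)


def core_list_posts(all_posts: list[Post], limit: int) -> list[Post]:
    """Stand-in for the core engine: same result for every viewer, no user joins."""
    return all_posts[:limit]


def personalize(posts: list[Post], prefs: Personalization) -> list[tuple[Post, bool]]:
    """Post-production step: drop blocked items and tag saved ones.

    Returns (post, is_saved) pairs in the original order.
    """
    result = []
    for post in posts:
        if post.creator_id in prefs.blocked_person_ids:
            continue
        if post.community_id in prefs.blocked_community_ids:
            continue
        result.append((post, post.id in prefs.saved_post_ids))
    return result
```

    The core listing can then be shared or cached across all viewers, and an anonymous user simply gets an empty `Personalization`. One trade-off of this sketch: filtering after the fact can return fewer posts than the requested limit, so the core step would need to over-fetch or page again.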

    There will be major world news events where people want to get in and see the latest comments, and the code will be crashing left and right because of personal block lists that some handful of users built up to 80,000 people (on a single account) with some script file. Meanwhile, nobody has made a test script file to see what happens at 80,000 people on a block list…

    • RoundSparrow @ BTOPM
      11 months ago

      Ok… so where to begin?

      1. Language choices. I think it’s a noble gesture, but it’s hard to ignore the overhead factor and all the end users who accidentally hide their posts and comments by getting confused by it.

      2. All sorts except “Most Comments”, “Old”, and “Controversial” come down to recent posts. Nobody is complaining about a 3 week old post not appearing… with one exception, featured. I think I have some tricks to play with featured. Can some basic sanity be added to the project by putting a limit on time? 3 days? Are most people here to browse the most recent 3 days of content? 7 days? Can all data be divided and organized around this? With the exception being: single community?

      3. Is there a limp mode? Can something short of Beehaw and Lemmy.world turning off their entire front page be built into the app? I think it needs to be done. In emergency / limp mode, you could cut off old data, or cut off personalization.
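
      A rough sketch of the limp mode idea from point 3, combined with the time limit from point 2. Everything here is an assumption for illustration: `ListingPolicy`, `choose_policy` and `published_cutoff` are made-up names, and the 14-day and 3-day windows are placeholders, not settings Lemmy has.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ListingPolicy:
    max_age: timedelta             # how far back a post listing may look
    personalization_enabled: bool  # saved posts, block lists, etc.


# Placeholder values, chosen only for this illustration.
NORMAL = ListingPolicy(max_age=timedelta(days=14), personalization_enabled=True)
LIMP = ListingPolicy(max_age=timedelta(days=3), personalization_enabled=False)


def choose_policy(limp_mode: bool) -> ListingPolicy:
    """Pick the listing policy; an admin or load monitor would set limp_mode."""
    return LIMP if limp_mode else NORMAL


def published_cutoff(policy: ListingPolicy, now: datetime) -> datetime:
    """Oldest publish time a listing query is allowed to consider."""
    return now - policy.max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for limp in (False, True):
        policy = choose_policy(limp)
        print(limp, published_cutoff(policy, now), policy.personalization_enabled)
```

      The point of the design is that the server degrades instead of going dark: in limp mode every listing gets the same narrow time window and skips the per-user work, rather than the whole front page being switched off.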

      I think the project has fundamentally misinformed the population that servers are too busy because of too many users. I just don’t see that many users!! Everything I see is too many JOIN statements! Moving to new virgin servers starts with zero data; that’s why it worked. Lemmy.world has way more data than some empty instance that is 3 weeks old. And the project leaders have failed to understand or communicate this basic issue.