• wonderingwanderer@sopuli.xyz
    2 days ago

    because their underlying design requires some randomization to reflect human conversation.

    That’s just false. Although the first step of creating an LLM from scratch is to initialize the weight matrices with random values drawn from a Gaussian distribution, those matrices are overwritten many times over the course of pre-training and fine-tuning, as the parametric weights are finely adjusted against the training data.
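    A toy sketch of that point (NumPy; the "gradient" here is faked purely for illustration, since real training computes it from a loss over training data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: weights start as random draws from a Gaussian distribution.
W = rng.normal(loc=0.0, scale=0.02, size=(4, 4))
W_init = W.copy()

# Step 2: training repeatedly adjusts those weights via gradient descent.
# The gradient below is a stand-in; real training derives it from the data.
for _ in range(1000):
    fake_gradient = rng.normal(size=W.shape)
    W -= 0.01 * fake_gradient

# After training, the weights no longer resemble the random initialization.
print(np.allclose(W, W_init))
```

    The random draw only sets the starting point; what the model ends up with is determined by the updates, not the initial noise.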

    During inference, tokens pass through the model’s layers along specific embedding vectors, weighted for relevance. It’s not random at all. It’s non-deterministic, but that’s not the same thing as random.
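    The distinction shows up clearly in a toy sketch (NumPy, made-up logits): the forward pass that produces next-token probabilities is a fixed computation; randomness only enters if you then sample from those probabilities as a decoding choice.

```python
import numpy as np

def next_token_probs(logits):
    """Softmax: a deterministic function of the model's output logits."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # made-up scores for a 3-token vocabulary

# Same input, same output, every time: the forward pass is deterministic.
p1 = next_token_probs(logits)
p2 = next_token_probs(logits)
assert np.array_equal(p1, p2)

# Randomness appears only at the sampling step, which is layered on top
# of the network, not part of it.
rng = np.random.default_rng()
token = rng.choice(len(logits), p=p1)
```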

    If the training data all came from JSTOR or DevDocs or even Wikipedia, it’s going to make much more accurate inferences than if it were trained on Reddit, Quora, and Yahoo Answers.

    I’m not defending AI here, but let’s keep our criticisms factual.

    • SparroHawc@piefed.world
      2 days ago

      Except that if you make the output sampling temperature too cold, the model has a higher tendency to get stuck in loops and the like. A little bit of actual randomness is important.
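      A sketch of what temperature does (NumPy, toy logits; names are illustrative): low temperature sharpens the next-token distribution toward always picking the top token, which is what lets a model fall into repetition.

```python
import numpy as np

def sample_probs(logits, temperature):
    """Temperature-scaled softmax over next-token logits."""
    scaled = np.asarray(logits, dtype=float) / temperature
    e = np.exp(scaled - scaled.max())
    return e / e.sum()

logits = [2.0, 1.5, 0.5]  # toy scores for three candidate tokens

cold = sample_probs(logits, temperature=0.1)  # near-greedy
warm = sample_probs(logits, temperature=1.0)  # probability mass spread around

# At T=0.1 the top token dominates; as T -> 0 this approaches argmax, so
# the same context keeps producing the same token -- hence the loop risk.
print(cold.round(3), warm.round(3))
```

      A bit of sampling randomness (higher temperature, or schemes like top-p) breaks those repetitive cycles.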

      • wonderingwanderer@sopuli.xyz
        2 days ago

        That’s just adding noise, and it’s not unique to AI. It’s also used in audio and visual design, and even cryptography.

        • SparroHawc@piefed.world
          1 day ago

          It’s not unique to AI, no, but no one said it was. My point is that the noise is important to the functioning of the AI - and makes it even less deterministic, which also makes it poorly suited to automation in critical systems.