Ars Technica, the Condé Nast-owned technology outlet, fired senior AI reporter Benj Edwards after it retracted one of his stories over the use of AI-fabricated quotes.

  • Greddan@feddit.org · 26 points · 2 days ago

    The guy had several explanations rolled into one, so it seems more like deliberate dishonesty than an honest mistake. The guy the article was about had a decent explanation of how it happened, though. His blog has AI scraping protections enabled, so when the so-called journalist asked an AI to write an article for him citing the blog post, the AI couldn’t access it and did what AIs do: made shit up.

    • 0x0@lemmy.zip · 7 points · 2 days ago

      His blog has AI scraping protections enabled,

      Tell me more.
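
      We don’t know exactly what that blog uses, but the most common form of “AI scraping protection” is a robots.txt that disallows known AI crawler user-agents (GPTBot and CCBot are real examples; compliant bots honor this, though many scrapers simply ignore it, which is why sites often also block by User-Agent at the server):

      ```
      # robots.txt — ask known AI crawlers not to fetch anything
      User-agent: GPTBot
      Disallow: /

      User-agent: CCBot
      Disallow: /
      ```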

    • hexagonwin@lemmy.today · 2 points · 2 days ago

      lmao, didn’t know the ‘ai’ tool is that stupid to not handle website blocks/exceptions…

      • deadbeef79000@lemmy.nz · 9 points · 2 days ago

        It’s not quite like that. The tools used to scrape the web for training data couldn’t access the site to scrape the data, so it’s not encoded in the model.

        The query interface for the model just hallucinates when there’s a ‘vacuum’.

        • hexagonwin@lemmy.today · 2 points · 2 days ago

          i was thinking of some “automated browser” type: if the browser returns an error page saying it’s blocked, the LLM would get the “blocked from website”-ish error as the page content. shouldn’t it then say something like “I’m sorry, I couldn’t access the website” instead of “Sure! Here’s a summary of that webpage” followed by hallucinated bs? well, maybe that’s not the case here?
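
          The failure mode described above can happen when the fetch step of a summarization pipeline never checks what it actually got back. A minimal, purely hypothetical sketch (not any particular tool’s code):

          ```python
          def build_summary_prompt(status_code: int, body: str) -> str:
              """Naive prompt builder: ignores the HTTP status entirely.

              Whether the fetch returned the real article (200) or a block
              page (403), the same "summarize this" prompt is produced, so
              the model is never told the content is missing -- it just
              completes the prompt with whatever seems plausible.
              """
              return f"Summarize this webpage:\n\n{body}"

          # A blocked fetch still yields a plausible-looking prompt:
          prompt = build_summary_prompt(403, "403 Forbidden: bots not allowed.")
          ```

          A pipeline that instead branched on `status_code != 200` could surface the “I couldn’t access the website” answer the comment above expects.
          
          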

          • atrielienz@lemmy.world · 7 points · edited · 2 days ago

            It doesn’t say something like that specifically because it isn’t an algorithm that receives X input and spits out Y. It’s an algorithm that receives a query and spits out the most common word that comes after the query. If there isn’t a most common word that makes sense to a human, the AI doesn’t know that, so it still gives the most common word in its training set.

            If the query is “Juicy” it may output “melons”. If “melons” were not available in its training set it might output “grapes” or “cherries”, but if those weren’t available it might output “apple bottom jeans”, which would have made sense in 2003 but likely wouldn’t make sense to the average kid today who’s never heard of Juicy Couture.

            It doesn’t understand anything. It can’t reason.
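
            The “most common next word” idea can be sketched as a toy bigram model. This is nothing like a real LLM’s neural network, but it shows the key property: the model always emits *something*, even for input it has never seen, because there is no “I don’t know” branch:

            ```python
            from collections import Counter, defaultdict

            # Tiny "training set": the model only ever sees these words.
            corpus = "juicy melons juicy melons juicy grapes juicy couture".split()

            # Count, for each word, which word follows it (a bigram table).
            following = defaultdict(Counter)
            for prev, nxt in zip(corpus, corpus[1:]):
                following[prev][nxt] += 1

            def predict_next(word: str) -> str:
                """Return the most common follower of `word` in the training data.

                If `word` was never seen, fall back to the globally most common
                word. The model still answers confidently -- the toy analogue
                of hallucinating into a 'vacuum'.
                """
                if following[word]:
                    return following[word].most_common(1)[0][0]
                return Counter(corpus).most_common(1)[0][0]

            print(predict_next("juicy"))    # most common continuation: "melons"
            print(predict_next("blocked"))  # never seen, but it answers anyway
            ```

            Asking it about a word it couldn’t “scrape” doesn’t produce an error, just the statistically safest filler, which is roughly what happens at vastly larger scale when an LLM is asked about a page it never ingested.
            
            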