April 27, 2026 (Mainichi Japan)

TOKYO (Kyodo) – ChatGPT scored the highest marks in this year’s entrance exams of the University of Tokyo and Kyoto University, two of Japan’s top universities, surpassing those of the actual top scorers, an AI venture said Monday.

According to LifePrompt Inc., the generative AI chatbot scored 50 points higher than the top test-taker on the University of Tokyo’s most competitive Natural Sciences III medical track exam and received a perfect score in mathematics. The achievement follows the AI’s failure to pass all of the school’s entrance exams in 2024. ……

  • NekoKoneko@lemmy.world
    18 days ago

    Stochastic parrot parrots stochastically. News at 11.

    According to LifePrompt Inc., the generative AI chatbot scored 50 points higher than the top test-taker … …Since the answers included essay responses, they were graded by teachers from major cram school Kawai Juku.

    So it’s an AI company that conducted its own administration of the tests and had them graded by people who are not official graders, so the headline is just false since it’s not even official or controlled. I wonder, did Mainichi consider that maybe, just maybe, they may have a self-interested bias in making the AI seem more capable than it actually is?

    I get it - the AI company wants to exaggerate and there’s always a press outlet willing to push a story that’s too good to fact-check - but I’m tired, boss.

    • givesomefucks@lemmy.world
      18 days ago

      Yeah, it’s not impressive. AI just has to try it thousands of times and eventually it will put out what the graders want. Hell, this was more likely a shotgun approach: have it submit 10,000 variations, then brag about the best one.

      It’s like “man on the street” interviews. It doesn’t matter if 10 thousand people said the moon was real if I only air the clips of the 5 who said it’s really a balloon put there by ancient Egyptians.

      This “perfect” score doesn’t matter because it almost certainly didn’t happen in a vacuum; they’re just reporting the result they want us to hear.

    • XLE@piefed.social
      18 days ago

      Training data definitely changed in two years too. If OpenAI management knew about a prominent test that would generate good PR, they could rig the results by pushing in data meant for it. And even if they didn’t, these things naturally bubble into scraped datasets on their own.