• @smayonak@lemmy.world
      15
      1 day ago

      There are a number of open-weight, open-source models out there with all their data sourced from the public domain. Look up BLOOM and Falcon. There are others.

    • Fonzie!
      5
      1 day ago

      JetBrains’ AI code suggestions were trained only on code whose authors gave explicit permission for it, but that’s the only one I know off the top of my head. Most chat-oriented LLMs (ChatGPT, Claude, Gemini…) were almost certainly trained using corporate piracy.