• ashtrix@lemmy.ca
    21 upvotes · 2 years ago

    Yeah, it’s already too late. Why didn’t they provide this before they scraped all those websites?

    • P03 Locke@lemmy.dbzer0.com
      13 upvotes · 2 years ago

      You think Google thought about robots.txt before they developed their search engine? Nah, it’s all public Internet, and they scraped away.

      A non-zero percentage of web sites will bother to follow these instructions, but it might as well be zero.

      • Scrubbles@poptalk.scrubbles.tech
        8 upvotes · 2 years ago

        Yeah, I always assumed robots.txt only told them to hide your site from search results, while Google still scrapes everything it can from you. It just gives the illusion that they skipped over you.

      • The Doctor@beehaw.org
        1 upvote · 2 years ago

        Very early on, at least, their spiders respected robots.txt.

        I know there are folks who list all of the Big G's crawlers in their robots.txt files on principle. You might want to ask them whether it works or not.
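
For anyone curious what "putting the Big G in robots.txt" might look like, here is a minimal sketch using Python's standard `urllib.robotparser`. The two user-agent tokens (`Googlebot`, `Google-Extended`) and the `example.com` URL are assumptions chosen for illustration, not a complete list of Google's crawlers, and the helper name `allowed` is made up. It only shows what the file asks for; whether a crawler honors it is exactly what this thread is arguing about.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block two Google tokens, allow everyone else.
# The token list is an assumption for illustration, not an exhaustive one.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "Google-Extended", "SomeOtherBot"):
        print(agent, allowed(agent, "https://example.com/page"))
```

This only checks what a well-behaved crawler would conclude from the file. robots.txt is advisory, so nothing here stops a scraper that ignores it.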