• P03 Locke@lemmy.dbzer0.com
    2 years ago

    You think Google thought about robots.txt before they developed their search engine? Nah, it’s all public Internet, and they scraped away.

    A non-zero percentage of web sites will bother to follow these instructions, but it might as well be zero.

    • Scrubbles@poptalk.scrubbles.tech
      2 years ago

      Yeah, I always assumed robots.txt only told them to hide you from search results, while Google still scrapes everything it can from you. It just gives you the illusion that they skipped over you.

    • The Doctor@beehaw.org
      2 years ago

      Very early on, at least, their spiders respected robots.txt.

      I know there are folks who block all of the Big G in their robots.txt files on principle; you might want to ask them whether it works.
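
For anyone wondering what "respecting robots.txt" looks like on the crawler side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, and the example.com URLs are made up for illustration; only the `Googlebot` user-agent token is Google's real one. It shows the check a well-behaved crawler would run before fetching a page. Nothing in it forces a crawler to run that check, which is the point being argued above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Google's main crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is purely advisory: a crawler has to choose to call it.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(agent, "https://example.com/page"))
```

Whether a given crawler actually honors the result is something you can only confirm from your own server logs.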