Cynicus Rex@lemmy.ml to Privacy@lemmy.ml · English · 7 months ago
How to block AI Crawler Bots using robots.txt file (www.cyberciti.biz)
49 comments
NullPointer@programming.dev · 7 months ago
robots.txt will not block a bad bot, but you can use it to lure the bad bots into a “bot-trap” so you can ban them in an automated fashion.
Dave.@aussie.zone · 7 months ago
I’m guessing something like:
Robots.txt: Do not index this particular area.
Main page: invisible link to particular area at top of page, with alt text of “don’t follow this, it’s just a bot trap” for screen readers and such.
Result: any access to said particular area equals insta-ban for that IP. Maybe just for 24 hours so nosy humans can get back to enjoying your site.
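The three steps above could be sketched roughly like this. It is only an illustration under assumptions: the `/bot-trap/` path, the `BotTrap` class, and the returned status codes are made up for the example and do not come from the linked article or any particular web server. The idea is that robots.txt disallows the trap path, the main page carries a hidden link to it, and any IP that requests it anyway gets refused for 24 hours.

```python
import time

# Hypothetical trap location; any path no legitimate visitor needs would do.
TRAP_PREFIX = "/bot-trap/"
BAN_SECONDS = 24 * 60 * 60  # temporary ban, so curious humans get back in

# Step 1: tell well-behaved crawlers to stay out of the trap area.
ROBOTS_TXT = f"User-agent: *\nDisallow: {TRAP_PREFIX}\n"

# Step 2: a link humans will not see, with a warning for screen readers.
TRAP_LINK_HTML = (
    f'<a href="{TRAP_PREFIX}" rel="nofollow" style="display:none">'
    "don't follow this, it's just a bot trap</a>"
)


class BotTrap:
    """Step 3: remember which IPs touched the trap and refuse them for a while."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so the expiry can be tested
        self._banned_until = {}      # ip -> timestamp when the ban ends

    def is_banned(self, ip):
        expires = self._banned_until.get(ip)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._banned_until[ip]   # ban has run out, forget the IP
            return False
        return True

    def handle(self, ip, path):
        """Return the HTTP status a server might send for this request."""
        if self.is_banned(ip):
            return 403
        if path.startswith(TRAP_PREFIX):
            self._banned_until[ip] = self._clock() + BAN_SECONDS
            return 403
        return 200
```

In a real setup the ban would more likely be pushed to a firewall or a tool that watches the access log than kept in application memory; the in-memory dictionary here just keeps the sketch self-contained.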