# Format of this file is described at
# http://info.webcrawler.com/mak/projects/robots/robots.html
# Most of these web-scraping spam email collectors probably ignore
# the robots exclusion protocol, but they do so at their peril on
# this web site. These user agents all get special treatment if
# they come to www.discord.org.
# Also see http://www.psychedelix.com/agents1.html

User-agent: EmailWolf
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: Mozilla.*NEWT
Disallow: /

User-agent: Crescent
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: WebBandit
Disallow: /

User-agent: NICErsPRO
Disallow: /

User-agent: Microsoft.URL
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: DaviesBot/1.7
Disallow: /

# Hammering my site at night with over 300 GET requests.
# http://www.yama.info.waseda.ac.jp/~yamana/es/index_eng.htm
User-agent: e-SocietyRobot
Disallow: /

User-agent: ichiro/2.0
Disallow: /

User-agent: RufusBot
Disallow: /

# Added 2013-03-26 (first change since Nov 5 2005)
User-agent: AhrefsBot
Disallow: /

# http://www.become.com/site_owners.html
User-agent: BecomeBot
Crawl-Delay: 10
Disallow: /cgi-bin/

# Added 2023-11-13
User-agent: VelenPublicWebCrawler
Disallow: /cgi-bin/

# Added 2024-11-25
User-agent: Scrapy
Disallow: /

User-agent: *
Disallow: /cgi-bin/