My own site has genuinely been crawled to a standstill by Facebook's crawler and by Amazonbot, so I now add a robots.txt file to every WordPress site I run.
It's worth looking up the names of the crawlers out there. Beyond the familiar search-engine bots, there are plenty of junk crawlers that just burn your server resources, like bots that crawl your site purely to feed someone else's SEO analysis. Who can put up with that?
To name the two that took my site down: Facebook's crawler is meta-externalagent, and Amazon's is Amazonbot. OpenAI's GPTBot is frightening too; the AI companies' crawlers are all extremely aggressive.
If you can't wait for a robots.txt block to take effect (crawlers may take up to 24 hours to re-fetch the file), look up the bots' IPs in your site's access logs and ban their IP ranges directly. A tutorial on banning IP ranges is here: https://www.shoushai.com/p/982
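To find which ranges to ban, you can tally where the bad bots are coming from. A minimal sketch in Python: the log format, sample lines, and the `bad_bot_prefixes` helper are my own illustration, not from the original article; it assumes a common Nginx/Apache access-log layout where the client IP is the first field and the user agent appears later on the line.

```python
import re
from collections import Counter

# Bots named in this article; extend the tuple as needed.
BAD_BOTS = ("meta-externalagent", "Amazonbot", "GPTBot")

def bad_bot_prefixes(log_lines):
    """Tally /24 prefixes of requests whose user agent matches a bad bot."""
    counts = Counter()
    for line in log_lines:
        if any(bot in line for bot in BAD_BOTS):
            # Client IP is assumed to be the first field on the line.
            m = re.match(r"(\d+\.\d+\.\d+)\.\d+", line)
            if m:
                counts[m.group(1) + ".0/24"] += 1
    return counts

# Hypothetical sample lines in a common access-log shape:
sample = [
    '57.141.0.12 - - [01/Jan/2025] "GET / HTTP/1.1" 200 "-" "meta-externalagent/1.1"',
    '57.141.0.98 - - [01/Jan/2025] "GET /feed HTTP/1.1" 200 "-" "meta-externalagent/1.1"',
    '66.249.66.1 - - [01/Jan/2025] "GET / HTTP/1.1" 200 "-" "Googlebot/2.1"',
]
print(bad_bot_prefixes(sample).most_common())  # → [('57.141.0.0/24', 2)]
```

The prefixes with the highest counts are the ranges worth feeding into your firewall ban list.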
robots.txt settings (for WordPress):
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-login.php?redirect_to=*
Disallow: /go?_=*
Allow: /wp-admin/admin-ajax.php

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: YisouSpider
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: SemrushBot-SA
Disallow: /

User-agent: SemrushBot-BA
Disallow: /

User-agent: SemrushBot-SI
Disallow: /

User-agent: SemrushBot-SWA
Disallow: /

User-agent: SemrushBot-CT
Disallow: /

User-agent: SemrushBot-BM
Disallow: /

User-agent: SemrushBot-SEOAB
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: Uptimebot
Disallow: /

User-agent: MegaIndex.ru
Disallow: /

User-agent: ZoominfoBot
Disallow: /

User-agent: Mail.Ru
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: ExtLinksBot
Disallow: /

User-agent: aiHitBot
Disallow: /

User-agent: Researchscan
Disallow: /

User-agent: DnyzBot
Disallow: /

User-agent: spbot
Disallow: /

User-agent: YandexBot
Disallow: /
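Before deploying rules like these, you can sanity-check them with Python's standard-library robots.txt parser. This is my own quick check, not part of the original article; the `rules` string is a shortened excerpt of the file above, and `example.com` is a placeholder domain.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the robots.txt shown above.
rules = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: GPTBot
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# GPTBot is blocked from everything; normal visitors keep access
# except to /wp-admin/.
print(rp.can_fetch("GPTBot", "https://example.com/"))               # → False
print(rp.can_fetch("Mozilla/5.0", "https://example.com/"))          # → True
print(rp.can_fetch("Mozilla/5.0", "https://example.com/wp-admin/")) # → False
```

Remember that robots.txt is only a request: well-behaved bots honor it, while rogue crawlers ignore it, which is why the IP-range ban above is the fallback.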
This article was a reader submission and does not represent the views of Shoushai. If you repost it, please credit the source: https://www.shoushai.com/p/985