Aggressive or Bad Bots We Have Blocked on Our Shared Hosting Network

PLEASE NOTE: This article currently applies ONLY to our Shared hosting servers.

Our Cloud, Business Enterprise, WordPress Business, WordPress Developer/Agency and Dedicated Servers are not affected by this.

 

We block certain web bots because their crawling is often aggressive enough to consume significant bandwidth and server resources. The following bots, identified by User-Agent pattern, are blocked at the time of writing:

allenai\.org 
Amazonbot/0.1 
anthropic\.com 
babbar.tech 
Barkrowler 
Bytespider 
CCBot 
ClaudeBot 
cohere\.ai 
DotBot 
facebookexternalhit/1.1 
GPTbot 
meta-externalagent/1.1 
MJ12bot 
mistral\.ai 
openai\.com/bot 
omgili\.com 
perplexity\.ai 
Petalbot 
Petalsearch 
spider\.cloud 
sqlmap.org 
^Mozilla$ 
Yandex 
^Go-http-client 
^test-bot$ 
^my-tiny-bot$ 
^thesis-research-bot$ 
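
To check which entry in the list would catch a particular crawler, the patterns can be tried against a User-Agent string locally. The following is a minimal Python sketch, not the mechanism our servers use: it assumes the entries are regular expressions matched case-insensitively anywhere in the User-Agent header (as SetEnvIfNoCase does), and the function name matching_patterns is hypothetical.

```python
import re

# Patterns copied from the list above. Treating them as case-insensitive,
# unanchored regular expressions is an assumption of this sketch.
BLOCKED_PATTERNS = [
    r"allenai\.org", r"Amazonbot/0.1", r"anthropic\.com", r"babbar.tech",
    r"Barkrowler", r"Bytespider", r"CCBot", r"ClaudeBot", r"cohere\.ai",
    r"DotBot", r"facebookexternalhit/1.1", r"GPTbot",
    r"meta-externalagent/1.1", r"MJ12bot", r"mistral\.ai",
    r"openai\.com/bot", r"omgili\.com", r"perplexity\.ai", r"Petalbot",
    r"Petalsearch", r"spider\.cloud", r"sqlmap.org", r"^Mozilla$",
    r"Yandex", r"^Go-http-client", r"^test-bot$", r"^my-tiny-bot$",
    r"^thesis-research-bot$",
]


def matching_patterns(user_agent):
    """Return every listed pattern that matches the given User-Agent."""
    return [
        pattern
        for pattern in BLOCKED_PATTERNS
        if re.search(pattern, user_agent, re.IGNORECASE)
    ]


if __name__ == "__main__":
    # Illustrative User-Agent strings, not taken from real traffic logs.
    for ua in ("Mozilla", "Go-http-client/1.1", "Mozilla/5.0 (X11) Firefox/125.0"):
        print(repr(ua), "->", matching_patterns(ua))
```

Note that anchored entries such as ^Mozilla$ only match a User-Agent that is exactly "Mozilla"; an ordinary browser string that merely begins with "Mozilla/5.0" is not caught by that entry.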

 

If you must allow one or more of these bots, add an entry for each bot to the .htaccess file of every website that needs it, for example:

Whitelist Web Bots

SetEnvIfNoCase User-Agent facebookexternalhit/1.1 Whitelist

SetEnvIfNoCase User-Agent meta-externalagent/1.1 Whitelist


We also block requests that do not set a User-Agent (that is, the header is blank). To allow those requests, add the following entry to each website's .htaccess file:

SetEnvIfNoCase User-Agent ^-?$ Whitelist
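
The pattern ^-?$ matches a User-Agent that is either completely empty or a single hyphen, and nothing else. As a rough illustration (a sketch using Python's re module, which should behave the same as Apache's regex engine for a pattern this simple; the helper name is_blank_user_agent is hypothetical):

```python
import re

# Same pattern as the .htaccess entry above: optional single hyphen,
# anchored at both the start and the end of the string.
BLANK_UA = re.compile(r"^-?$")


def is_blank_user_agent(user_agent):
    """Return True if the User-Agent is empty or just a hyphen."""
    return BLANK_UA.search(user_agent) is not None


if __name__ == "__main__":
    for ua in ("", "-", "--", "curl/8.5.0"):
        print(repr(ua), "->", is_blank_user_agent(ua))
```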

It is important to note that:

  • The bots listed can have a negative impact on server resources and affect website operation and uptime, so we advise allowing a blocked bot only if absolutely necessary.
  • If, after a bot is allowed access, we find that it uses resources aggressively or causes instability on the server, we may block it again. In that case you may need to move the website to a Cloud VPS or Dedicated server.

 

