Interesting set of hits using forged searchbot User Agents

You may recall that I am IP-blocking a large percentage of the IPv4 address space (currently 34.7%) from hitting my web server. That leaves a reduced IPv4 universe, which still seems to host a small and diverse set of rogue hosting players that get through, such as this:

23.161.169.62 (AS400529 Infraly, LLC)

Today's several dozen hits from that IP consisted of blindly asking for files in various /.env, /.git and /api paths, plus JSON files like config.json, firebase-adminsdk.json, google-credentials.json, secrets.json and service-account.json - none of which I have. It also crawled part of my site (HTML files only). I never see that sort of combination (probe for vulnerabilities and then crawl the site a little).
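For anyone who wants to pick these probes out of their own logs, here is a rough Python sketch of how I'd classify a request path. The file names and the .env/.git segments come straight from the hits described above; the function name is_probe is made up for illustration, and I've left /api out on the assumption that some sites legitimately serve it.

```python
from urllib.parse import urlsplit

# File names and path segments taken from the probes described in this post.
PROBE_FILENAMES = {
    "config.json",
    "firebase-adminsdk.json",
    "google-credentials.json",
    "secrets.json",
    "service-account.json",
}
PROBE_SEGMENTS = {".env", ".git"}


def is_probe(request_target: str) -> bool:
    """Return True if the requested path looks like a secrets/config probe.

    Hypothetical helper: /api is deliberately not matched, because whether
    /api is a probe depends on what the site actually serves.
    """
    path = urlsplit(request_target).path.lower()
    segments = [s for s in path.split("/") if s]
    if not segments:
        return False
    # Any .env or .git directory or file, including variants like .env.production
    if any(s in PROBE_SEGMENTS or s.startswith(".env.") for s in segments):
        return True
    # Otherwise flag only when the final segment is one of the known file names
    return segments[-1] in PROBE_FILENAMES
```

On a site like mine that has none of these files, any True result is a request nobody legitimate should be making.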

But here's the thing - it alternated these hits using a variety of user-agents. The complete list:

CCBot/2.0 (https://commoncrawl.org/faq/)
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)
Mozilla/5.0 (compatible; DeepSeekBot/1.0; +https://www.deepseek.com/bot)
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (compatible; Google-CloudVertexBot; +https://cloud.google.com/vertex-ai-bot)
Mozilla/5.0 (compatible; OAI-SearchBot/1.3; +https://openai.com/searchbot)
Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
Mozilla/5.0 (compatible; xAI-SearchBot/1.0; +https://x.ai)
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ChatGPT-User/1.0; +https://openai.com/bot)
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.3; +https://openai.com/gptbot)

(I don't think I've seen Google-CloudVertexBot before, a topic for another thread?)

I find this list very useful - because of the presence of these two UA's:

Mozilla/5.0 (compatible; DeepSeekBot/1.0; +https://www.deepseek.com/bot)
Mozilla/5.0 (compatible; xAI-SearchBot/1.0; +https://x.ai)

I have only seen them once before - in April this year from a rogue IP (208.92.235.45 - AS399244) - another crackpot entity (now IP-blocked). It asked for a handful of .env and .git files. So I'm counting those as fake DeepSeek and xAI hits.

I have rarely (and I mean rarely) seen a forged hit claiming a main-line search-bot UA. And even then it was not part of a session that systematically worked through a list of search bots.
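That "one IP working through a list of search bots" pattern is easy enough to flag mechanically. A minimal Python sketch, assuming you have already parsed your log into (ip, user_agent) pairs: the token list is taken from the UAs above, while the function names and the threshold of 3 are my own arbitrary choices, not anything standard.

```python
from collections import defaultdict

# Crawler tokens taken from the user-agent list in this post.
BOT_TOKENS = (
    "CCBot", "Baiduspider", "bingbot", "ClaudeBot", "DeepSeekBot",
    "Googlebot", "Google-CloudVertexBot", "OAI-SearchBot", "PerplexityBot",
    "xAI-SearchBot", "YandexBot", "ChatGPT-User", "GPTBot",
)


def claimed_bot(user_agent):
    """Return the first crawler token found in the UA string, or None."""
    ua = user_agent.lower()
    for token in BOT_TOKENS:
        if token.lower() in ua:
            return token
    return None


def rotating_ips(hits, threshold=3):
    """Map each IP to the crawler tokens it claimed, keeping only IPs
    that claimed at least `threshold` distinct crawlers.

    `hits` is an iterable of (ip, user_agent) pairs; `threshold` is an
    assumed cutoff, tune it to taste.
    """
    seen = defaultdict(set)
    for ip, ua in hits:
        token = claimed_bot(ua)
        if token:
            seen[ip].add(token)
    return {ip: sorted(tokens) for ip, tokens in seen.items()
            if len(tokens) >= threshold}
```

A real crawler should only ever claim one identity from a given address, so an IP that shows up here is almost certainly forging. This only looks at the UA strings; it does not verify that a single claimed identity is genuine.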

But the more important thing here for me is the DeepSeek and xAI searchbot UAs. I have never seen an actual legit hit from either of those 2 bots, so I can only wonder whether the UAs above are the real ones those bots use or just someone's guess at what they might look like.

Has anyone here ever had hits from those 2 bots? Legit hits, not forged?

Including DeepSeek and xAI (?!)

SumGuy

lucy24

Martin Potter
