Forum Moderators: phranque

Random, 5-letter requests hitting many of my sites.

googlebot, search engines, random non-existent links

         

Andova Begarin

4:19 pm on Mar 1, 2018 (gmt 0)

10+ Year Member



Why do you think Googlebot would make these requests when they do not exist? (There are no such links on my sites - yup, more than one.)

66.249.75.154 "GET /QXYoZ/ HTTP/1.1" 404 160 "-" "... Googlebot/2.1..."
66.249.79.148 "GET /ZlPmZ/ HTTP/1.1" 404 160 "-" "... Googlebot/2.1..."
66.249.69.116 "GET /OVQfZ/ HTTP/1.1" 404 160 "-" "... Googlebot/2.1..."
66.249.79.88 "GET /RPKLZ/ HTTP/1.1" 404 160 "-" "... Googlebot/2.1..."


Then I discovered other Bots doing the same:

"GET /XNmLZ/ HTTP/1.0" 404 160 "-" "CCBot/2.0 (http://commoncrawl.org/faq/)"
"GET /UcLaZ/ HTTP/1.1" 404 160 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; ..."
"GET /KfcTZ/ HTTP/1.1" 404 160 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; ..."


Then random IPs (typical exploit bots) try for similar URIs. Here is a really odd one:

 "GET /LfYgZ/MPUfZ/XfhOZ/XeXLZ/XhhdZ/RZXbZ/QdpXZ/MhUMZ/ HTTP/1.1"


Somewhere, methinks, someone - who knows who or why - has links to my sites like these. (I've seen SemrushBot publish pages to an old site of mine with invalid links.)

For there have been some referers like:

"GET / HTTP/1.1" 200 11592 "http://www.example.com/SRjUZ/"
"GET / HTTP/1.1" 200 408 "http://www.example.com/SLhcZ/"


And these obvious exploiters:

"GET /wp-login.php HTTP/1.1" 404 3 "http://www.example.com/PTfRZ/wp-login.php"
"GET /wp-login.php HTTP/1.1" 404 3 "http://www.example.com/UeWfZ/wp-login.php"
"GET /KOlQZ/wp-login.php HTTP/1.1" 404 3 "http://www.example.com/wp-login.php"


Exploits grow exponentially...

--
I pose rhetorical questions and data that others might know about or have interest in.
If my posts are lame, please, just don't reply. Thank you.

Andova Begarin

4:22 pm on Mar 1, 2018 (gmt 0)

10+ Year Member



(I used "code" blocks, re, the highlighting. What should I have used for a "pre/plain text" block?)

Peter_S

4:30 pm on Mar 1, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



Are you on a dedicated IP?

If so, it's possible that the previous site running at this IP is still pointing to it.

Verify that your site's configuration is answering only to requests made to your domain name.
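To illustrate what "answering only to requests made to your domain name" means, here is a rough Python sketch of the check. In practice this is normally handled in the web server configuration (a catch-all default host that refuses anything not addressed to you) rather than in application code. The names `ALLOWED_HOSTS`, `normalize_host` and `host_is_allowed` are made up for the example, and www.example.com just stands in for the real domain.

```python
from typing import Iterable

# Hypothetical list of the names this site is supposed to answer to.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def normalize_host(host_header: str) -> str:
    """Lower-case the Host header and drop any port and trailing dot."""
    host = host_header.strip().lower()
    if host.startswith("["):
        # Bracketed IPv6 literal, e.g. "[::1]:8080": keep only the bracketed part.
        end = host.find("]")
        return host[: end + 1] if end != -1 else host
    host = host.split(":", 1)[0]
    return host.rstrip(".")


def host_is_allowed(host_header: str, allowed: Iterable[str] = ALLOWED_HOSTS) -> bool:
    """Return True only if the request was addressed to one of our own names."""
    return normalize_host(host_header) in set(allowed)
```

A request sent to the bare IP address, or carrying the previous tenant's domain in its Host header, fails this check and can be refused instead of being served as if it were yours.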

TorontoBoy

5:07 pm on Mar 1, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



Bing often sends me random test queries for exploits. I was once hacked and all the search engines continued looking for the hacked URLs long after the fix. This continued for over 6 months.

not2easy

6:31 pm on Mar 1, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I used "code" blocks

Code blocks are fine. If the formatted color is distracting, you can use "quote" blocks (as I did, above). Just click "Preview" rather than "Submit" and you can have easier access to the post formatting tools.

In case you missed getting a Welcome to WebmasterWorld [webmasterworld.com], it offers a lot of helpful tips on using the forums.

lucy24

7:38 pm on Mar 1, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Why do you think Googlebot would make these requests when they do not exist?
Pro tip: Questions that contain the words “why” and “Google” will rarely have satisfactory answers. You’re doing the right thing by returning a 404.

The 5-character format does make it sound as if they’re asking for someone else’s files, either due to a DNS hiccup or following someone else’s erroneous links. It’s entirely different from the routine Soft 404 request, which uses a string of 8-16* lower-case letters (random) at root level, like this:
/wxnxozulnwkj.html
/xhlxdtrkg.html
/slkbgyovyzrset.html


* Funny. I always thought it was exactly 12, but I checked and it varies.
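
If you want to separate the two kinds of request in your own logs, something like the following might do as a starting point. It is only a sketch based on the samples quoted in this thread: the patterns and the function name `classify_path` are my own guesses at the shapes, not anything the search engines document. Note that a real directory with a five-letter name (say /about/) would also match the first pattern, so check it against your actual URL structure.

```python
import re

# One or more 5-letter mixed-case segments, each followed by a slash,
# at the start of the path: /QXYoZ/ or /LfYgZ/MPUfZ/ or /KOlQZ/wp-login.php
FIVE_LETTER = re.compile(r"^/(?:[A-Za-z]{5}/)+")

# Root-level string of 8-16 lower-case letters plus .html: /wxnxozulnwkj.html
SOFT_404_PROBE = re.compile(r"^/[a-z]{8,16}\.html$")


def classify_path(path: str) -> str:
    """Label a request path as one of the two patterns discussed, or 'other'."""
    if SOFT_404_PROBE.match(path):
        return "soft-404 probe"
    if FIVE_LETTER.match(path):
        return "five-letter"
    return "other"


if __name__ == "__main__":
    # Paths taken from the log lines quoted in this thread.
    for sample in ["/QXYoZ/", "/KOlQZ/wp-login.php", "/wxnxozulnwkj.html", "/wp-login.php"]:
        print(sample, "->", classify_path(sample))
```

Running the request paths from an access log through `classify_path` and counting the labels per user agent would show whether the five-letter requests come mostly from the search engine bots or from the exploit crowd.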