
Forum Moderators: Ocean10000 & incrediBILL & keyplyr


sitemap.xml requests

     
11:42 pm on Jan 11, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5459
votes: 3


I do realize there is another forum for this topic [webmasterworld.com].

Does anybody see unusual requests for this file?

A request from a legitimate SE would, IMO, be valid.

A solitary request from a standard user's IP would NOT be a valid request.
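
For anyone who wants to check their own logs for this pattern, here is a rough Python sketch. It assumes an Apache common/combined-format access log and treats "solitary" as "this IP requested the sitemap and nothing else in the lines scanned". The function name and the regex are my own illustration, not anything standard, so adjust them to your log format.

```python
import re
from collections import defaultdict

# Assumed log layout (Apache common/combined):
# IP ident user [time] "METHOD path protocol" status ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)[^"]*" (\d{3})')


def solitary_sitemap_ips(lines, target="/sitemap.xml"):
    """Return the sorted IPs whose only logged requests were for `target`."""
    paths_by_ip = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        ip, path, _status = match.groups()
        # Ignore any query string so /sitemap.xml?x=1 counts as the same file.
        paths_by_ip[ip].add(path.split("?", 1)[0])
    return sorted(ip for ip, paths in paths_by_ip.items() if paths == {target})
```

Feed it the lines of a log file, e.g. `solitary_sitemap_ips(open("access.log"))` (the filename is a placeholder). An IP that also fetched pages is left out, so search engines that crawl the site normally should not show up here.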
8:56 am on Jan 12, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2004
posts:1851
votes: 49


Yesterday I got one from 131.253.38.67 for BingSiteAuth.xml; I never had that file to begin with.

I usually get a dozen or so requests a month: referrals from goog/bing for "add+exchange link" from IP ranges in India & Vietnam (which are 403'd anyhow), followed by sitemap.xml requests from the same IPs.
12:38 am on Jan 13, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13218
votes: 348


A solitary request by a standard user's IP would NOT be a valid request.

Depends whether "standard user" means someone who has already set foot on the site and established themselves as human. Heck, I once got a request for "humans.txt" ;)

Yesterday got one from 131.253.38.67 for BingSiteAuth.xml

That's the standard Bing wmt file, like the google or yandex versions with all the numbers, from the now-standard IP for this request. I don't know when exactly they transferred this task to the 131.253 range. Somewhere between early June and early November, based on before-and-after around a big hole in my logs.

Much of the worry goes away when you add to htaccess something like

# Requires mod_headers. Asks compliant crawlers not to index these non-page files.
<FilesMatch "\.(js|txt|xml|php)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>


(adjusted for any non-page extensions you may happen to use) because really I'm more worried about google indexing my robots.txt than about malign robots reading it :)
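
If you edit the extension list, a quick way to sanity-check which filenames the pattern catches is to try the same regex in Python. This is only a sketch: it assumes Python's `re` behaves like Apache's regex engine for a pattern this simple, and `gets_noindex` is a made-up helper name.

```python
import re

# Same pattern as in the FilesMatch directive above.
PATTERN = re.compile(r"\.(js|txt|xml|php)$")


def gets_noindex(filename):
    """Return True if the filename ends in one of the listed extensions."""
    return PATTERN.search(filename) is not None


if __name__ == "__main__":
    for name in ("robots.txt", "sitemap.xml", "BingSiteAuth.xml", "index.html"):
        print(name, gets_noindex(name))
```

Note that the `$` anchor means something like `notes.txt.bak` is not matched, and ordinary `.html` pages are left alone.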