

Search Engine Spider and User Agent Identification Forum

    
sitemap.xml requests
wilderness




msg:4535382
 11:42 pm on Jan 11, 2013 (gmt 0)

I do realize there is another forum for this topic [webmasterworld.com].

Does anybody see unusual requests for this file?

A request from a legitimate SE would, IMO, be a valid request.

A solitary request by a standard user's IP would NOT be a valid request.
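
One common way to tell a "legitimate SE" from everything else is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. Below is a minimal sketch of that idea. The function name `is_verified_crawler` and its parameters are made up for illustration, and the list of accepted domain suffixes is an assumption you would have to fill in from each search engine's own documentation. It only handles IPv4.

```python
import socket


def is_verified_crawler(ip, allowed_suffixes, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check (IPv4 only).

    `allowed_suffixes` is a caller-supplied list of domains, for example
    ["search.msn.com"]; the right values are an assumption to verify
    against each search engine's documentation.
    `reverse` and `forward` are optional resolver hooks, mainly so the
    logic can be exercised without touching the network.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])

    # Step 1: IP -> hostname.
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False

    # Step 2: hostname must sit under one of the accepted domains.
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False

    # Step 3: hostname -> IPs, which must include the original IP.
    try:
        return ip in forward(host)
    except OSError:
        return False
```

Called as `is_verified_crawler(ip, ["search.msn.com"])` it performs live DNS lookups, so it is better suited to an occasional log review than to running on every request.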

 

blend27




msg:4535449
 8:56 am on Jan 12, 2013 (gmt 0)

Yesterday got one from 131.253.38.67 for BingSiteAuth.xml; never had that file to begin with.

I usually get a dozen or so requests a month with goog/bing referrers for "add+exchange link", from IP ranges in India & Vietnam (which are 403'd anyhow), followed by sitemap.xml requests from the same IPs.
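
For anyone who wants to check their own logs for this pattern, here is a rough sketch that groups sitemap.xml requests by client IP. It assumes the Apache "combined" log format; the function name `sitemap_requests` and the sample log lines (which use documentation-only IP addresses) are invented for illustration.

```python
import re
from collections import defaultdict

# Assumes Apache "combined" log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def sitemap_requests(lines, target="/sitemap.xml"):
    """Return {ip: [(time, status, user_agent), ...]} for requests to `target`."""
    hits = defaultdict(list)
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not look like combined-format entries
        path = m.group("path").split("?", 1)[0]  # ignore any query string
        if path == target:
            hits[m.group("ip")].append(
                (m.group("time"), m.group("status"), m.group("agent"))
            )
    return dict(hits)


# Made-up sample lines for illustration only.
sample = [
    '203.0.113.9 - - [12/Jan/2013:08:56:01 +0000] "GET /sitemap.xml HTTP/1.1" 403 199 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [12/Jan/2013:08:57:10 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for ip, requests in sitemap_requests(sample).items():
    print(ip, requests)
```

With a real log you would pass an open file object instead of `sample`, then compare the resulting IPs against the ones that sent the odd search referrers.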

lucy24




msg:4535569
 12:38 am on Jan 13, 2013 (gmt 0)

A solitary request by a standard user's IP would NOT be a valid request.

Depends whether "standard user" means someone who has already set foot on the site and established themselves as human. Heck, I once got a request for "humans.txt" ;)

Yesterday got one from 131.253.38.67 for BingSiteAuth.xml

That's the standard Bing wmt file, like the google or yandex versions with all the numbers, from the now-standard IP for this request. I don't know when exactly they transferred this task to the 131.253 range. Somewhere between early June and early November, based on before-and-after around a big hole in my logs.

Much of the worry goes away when you add to htaccess something like

<FilesMatch "\.(js|txt|xml|php)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>


(adjusted for any non-page extensions you may happen to use) because really I'm more worried about google indexing my robots.txt than about malign robots reading it :)
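
To sanity-check which files a pattern like that actually covers, the same regular expression can be tried against a few filenames. This is only a sketch: it uses Python's `re` module, which behaves the same as Apache's regex engine for a pattern this simple, and the helper name `gets_noindex` and the filenames are just examples.

```python
import re

# Same pattern as in the FilesMatch directive above.
NOINDEX_PATTERN = re.compile(r"\.(js|txt|xml|php)$")


def gets_noindex(filename):
    """True if the filename would be matched by the pattern, False otherwise."""
    return NOINDEX_PATTERN.search(filename) is not None


for name in ["robots.txt", "sitemap.xml", "BingSiteAuth.xml", "index.html"]:
    print(name, gets_noindex(name))
```

The first three names match and `index.html` does not, so ordinary pages stay indexable while the support files get the header.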

