
Sitemaps, Meta Data, and robots.txt Forum

    
What if your domain's robots.txt
was pointed to your domain?
Daily Sparring




msg:1527997
 10:59 pm on Sep 26, 2004 (gmt 0)

I have a site that was banned from Y and MSN. The site's robots.txt (www.mydomain.com/robots.txt) would take you to the home page: if you entered the above URL, you would see the home page. Could this be why I was banned, as spam?

Please help...

 

Terabytes




msg:1527998
 2:45 am on Sep 27, 2004 (gmt 0)

The robots.txt file does NOT have the ability to redirect.

The redirect was most probably done with a META refresh tag in the HTML...

(or with mod_rewrite)

questions? 8-)

Daily Sparring




msg:1527999
 2:47 am on Sep 27, 2004 (gmt 0)

If it was done with mod_rewrite, would that affect the bots' ability to spider or crawl?

Terabytes




msg:1528000
 3:22 am on Sep 27, 2004 (gmt 0)

mod_rewrite (in easy-to-understand terms...) is basically a way of redirecting a user or robot from behind the scenes (that's one of its functions; there are many).

The user/robot interprets the page it gets back as the genuine page it requested, even though it's been sent to another page.

(E.g., the user/robot is looking for "pagename.htm", but that page no longer actually exists on the website. mod_rewrite can serve another page without telling the user/robot that it was redirected, so it's not really looking at the page it requested, but it can't tell the difference. The user/robot receives the same OK code it would have received if it had gone to the originally named page...)

(confusing?)
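
For anyone following along, here is a toy Python model of the behaviour described above. It is not Apache and not mod_rewrite itself, just a sketch of the two outcomes a rewrite rule can have: a silent internal rewrite (the client sees a normal 200 for the URL it asked for) and an external redirect (the client is told the new location). The names `PAGES`, `RULES` and `serve` are made up for illustration.

```python
# Toy model of a web server; PAGES, RULES and serve() are hypothetical names.
PAGES = {"/index.htm": "<html>home page</html>"}

# Requests for the old path should end up at the new one.
RULES = {"/pagename.htm": "/index.htm"}


def serve(path, external=False):
    """Return (status, headers, body) for a requested path.

    external=False models an internal rewrite: the target is substituted
    silently and the client sees 200 for the URL it asked for.
    external=True models a redirect: the client gets a 301 plus a Location
    header and has to make a second request itself.
    """
    target = RULES.get(path)
    if target is not None:
        if external:
            return 301, {"Location": target}, ""
        path = target  # silent substitution, invisible to the client
    if path in PAGES:
        return 200, {}, PAGES[path]
    return 404, {}, "not found"


if __name__ == "__main__":
    print(serve("/pagename.htm"))                 # internal rewrite
    print(serve("/pagename.htm", external=True))  # external redirect
```

In the first call the client has no way to tell that "/pagename.htm" was answered with the home page, which is the situation described for robots.txt in this thread.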

jdMorgan




msg:1528001
 3:22 am on Sep 27, 2004 (gmt 0)

An HTML page would be seen as an invalid robots.txt file format, so the 'bot could decide that either
A) It should crawl the site
B) It should not crawl the site
The interpretation could vary between different 'bots, and even between different revisions of the same 'bot.
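
As one illustration of how a parser might land on interpretation A, here is a small sketch using Python's standard-library `urllib.robotparser`. It only shows what that particular parser does when handed an HTML page in place of robots.txt; it says nothing about how Y, MSN, or any other real 'bot behaves. The `allowed` helper, the `ExampleBot` agent name, and the sample bodies are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_body, url, agent="ExampleBot"):
    """Parse a robots.txt body and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(agent, url)


# A genuine robots.txt and an HTML home page served in its place.
REAL = "User-agent: *\nDisallow: /private/\n"
HTML = "<html><head><title>Home</title></head><body>Welcome</body></html>"

if __name__ == "__main__":
    url = "http://www.mydomain.com/private/a.htm"
    print("real robots.txt:", allowed(REAL, url))
    print("HTML instead:   ", allowed(HTML, url))
```

This parser skips lines it does not recognise, so the HTML body leaves it with no rules at all and it treats every URL as crawlable. A stricter 'bot could just as reasonably go the other way.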

Don't waste time worrying about it; just remove the redirect and continue looking for any other problems, such as multiple subdomains resolving to the same pages, i.e. more than just domain.com and www.domain.com resolving to the same content.

After any change to robots.txt, and even just periodically, it's a good idea to validate it [searchengineworld.com].
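
If you want a quick local sanity check as well, something like the following Python sketch would flag the problem in this thread. It is not the validator linked above and is far less thorough: it only reports lines that are not of the form `field: value` with a field name it knows. The `KNOWN` set and the `lint_robots` function are assumptions for illustration, and `allow`, `sitemap` and `crawl-delay` are extensions that not every 'bot honours.

```python
# Field names this sketch accepts; an assumption, not an official list.
KNOWN = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def lint_robots(body):
    """Return (line_number, text) pairs for lines that do not look like robots.txt."""
    problems = []
    for number, raw in enumerate(body.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank and comment-only lines are fine
        field, sep, _value = line.partition(":")
        if not sep or field.strip().lower() not in KNOWN:
            problems.append((number, raw))
    return problems


if __name__ == "__main__":
    html = "<html>\n<body>Home</body>\n</html>"
    for number, text in lint_robots(html):
        print(f"line {number}: not a robots.txt directive: {text!r}")
```

An HTML page served at /robots.txt fails on every line, so a check like this would have caught the redirect straight away.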

Jim


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved