Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt code sensitive to missed spaces?
Marketing Guy




msg:3696174
 9:37 am on Jul 11, 2008 (gmt 0)

Quick question about robots.txt files: are these two lines treated the same, or is the space required?

(1) disallow: /folder/
(2) disallow:/folder/

I know that (1) is the correct way to write it, but would (2) be ignored, or will it do the same job?

Cheers
MG

 

goodroi




msg:3696247
 12:22 pm on Jul 11, 2008 (gmt 0)

When dealing with robots.txt there is no room for error. I don't mean to scare you but I have seen some huge websites that earn millions become deindexed because of a typo in their robots.txt.

It is true that some search engine bots do a better job than others with error handling and can accommodate minor typos with no damage to your site. Why take the risk? Be careful and make sure your robots.txt validates 100% properly.
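As a rough illustration of the kind of check a validator performs, here is a minimal Python sketch that flags lines with a missing colon or an unrecognised field name. The function name `lint_robots` and the `KNOWN_FIELDS` list are assumptions made for this example, not part of any official tool, and the field list is deliberately short.

```python
# Hypothetical field list for illustration; real crawlers may accept more or fewer.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def lint_robots(text):
    """Return a list of (line_number, message) tuples for suspicious lines."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue  # blank lines separate records, nothing to check
        field, sep, _value = line.partition(":")
        if not sep:
            problems.append((number, "missing ':' separator"))
        elif field.strip().lower() not in KNOWN_FIELDS:
            problems.append((number, f"unknown field {field.strip()!r}"))
    return problems
```

Running it on a file containing a misspelled `Dissallow:` line reports that line number, which is the sort of typo that is easy to miss by eye.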

jdMorgan




msg:3696261
 12:57 pm on Jul 11, 2008 (gmt 0)

The only way to find out is to test... You first. :)

I once had a major problem with a third-tier search robot, because of a missing blank line at the end of the file. Since the definition of a "record" in robots.txt is that it ends with a blank line, it was understandable -- the robot considered that record to be "unclosed." But it came as a shock, nonetheless.

Jim
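To guard against the "unclosed record" case described above, a small Python sketch like the following could normalise the end of the file before it is uploaded. The helper name `ensure_trailing_blank_line` is made up for this example, and it assumes the whole file is already in memory as a string.

```python
def ensure_trailing_blank_line(text):
    """Return text whose last record is followed by exactly one blank line."""
    # Strip any existing trailing newlines, then add the line ending
    # plus one empty line so the final record is explicitly closed.
    return text.rstrip("\r\n") + "\n\n"
```

The function is safe to run repeatedly, since a file that already ends correctly comes back unchanged.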

g1smd




msg:3696311
 2:17 pm on Jul 11, 2008 (gmt 0)

Looking at the repository of robots.txt files (and especially at the summary data of user-agents that people are blocking in their robots files) over at the BotSeer project, it is clear that a large percentage, certainly into double-digits, of robots.txt files are hosed in one way or another; often in multiple ways.

I would think that most bots follow the old mantra of "Be liberal in what you accept, and conservative in what you send", but I would never want to test it. The problem evidently troubles Google, as they have a whole section of Webmaster Tools dedicated to verifying and checking your robots.txt file.

Google tripped me up a few years ago, when I tried something new at the time: [webmasterworld.com...]

Lord Majestic




msg:3696321
 2:26 pm on Jul 11, 2008 (gmt 0)

A well-programmed bot won't require that space to be present (and I believe the robots.txt standard does not require it either). However, it is best to include the space, as you don't want to take chances. It's not hard to add a space and sleep well at night.
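For one concrete data point, the two lines from the original question can be fed to the robots.txt parser in Python's standard library (`urllib.robotparser`). This is only a sketch of how one tolerant parser behaves; it says nothing about how any particular search engine bot handles the same input. The `is_blocked` helper and the `ExampleBot` agent name are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, path, agent="ExampleBot"):
    """Return True if this parser considers `path` disallowed for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from a string, no network access
    return not parser.can_fetch(agent, path)


with_space = "User-agent: *\ndisallow: /folder/\n"
without_space = "User-agent: *\ndisallow:/folder/\n"

# Compare how the same parser treats both spellings of the rule.
for label, text in (("with space", with_space), ("without space", without_space)):
    print(label, is_blocked(text, "/folder/page.html"))
```

This parser splits each line on the first colon and strips surrounding whitespace from both halves, so it treats the two forms identically. Other parsers are free to be stricter, which is why including the space remains the safer habit.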

Key_Master




msg:3696342
 2:48 pm on Jul 11, 2008 (gmt 0)

Lord Majestic is correct:
[robotstxt.org...]

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved