
Sitemaps, Meta Data, and robots.txt Forum

    
Including specific robots necessary or not?
silverbytes




msg:1528095
 1:21 pm on Jul 25, 2005 (gmt 0)

I received a newsletter with this declaration:

What follows is a "back to the basics" on getting good rankings

"Having a robots.txt to include the pages that you want the search engines to include"

My understanding is that you can ban undesired bots, and those you don't even mention will crawl anyway. What I read there is different: "include those you want to crawl your site". Is that really necessary?

 

Clint




msg:1528096
 3:02 pm on Jul 25, 2005 (gmt 0)

Sounds like it just may have been a spam ploy, if it was spam you received. All I ever heard of for ALLOWING is:

User-agent: *
Disallow:

But I don't know much about the robots.txt file. I'd be interested in knowing more.
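
For anyone who wants to check this behaviour, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot names (`AnyBot`, `BadBot`, `OtherBot`), the `can_crawl` helper and the example.com URL are made up for illustration. It suggests that an empty `Disallow:` permits everything, and that a bot which is never mentioned is allowed by default.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_lines, agent, url):
    """Parse robots.txt lines and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)

# The "allow everything" file quoted in the post above.
ALLOW_ALL = ["User-agent: *", "Disallow:"]

# A file that bans one hypothetical bot and mentions nobody else.
BAN_ONE = ["User-agent: BadBot", "Disallow: /"]

URL = "http://www.example.com/page.html"

print(can_crawl(ALLOW_ALL, "AnyBot", URL))   # empty Disallow: nothing is blocked
print(can_crawl(BAN_ONE, "BadBot", URL))     # the banned bot is refused
print(can_crawl(BAN_ONE, "OtherBot", URL))   # an unmentioned bot is not restricted
```

In other words, there is no need to list the robots you want; listing only matters for the ones you want to keep out.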

jimbeetle




msg:1528097
 3:11 pm on Jul 25, 2005 (gmt 0)

Not much to it, the protocol is very, very simple: Web Server Administrator's Guide to the Robots Exclusion Protocol [robotstxt.org].

One often overlooked point to keep in mind is:

Note also that regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "Disallow: /tmp/*" or "Disallow: *.gif".
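
To illustrate the quoted rule, here is a rough sketch of prefix matching as the original protocol describes it. The `is_disallowed` function is a hypothetical helper written for this example, not part of any library, and it models only the rule quoted above; some crawlers later added their own wildcard extensions, which this sketch deliberately ignores.

```python
def is_disallowed(path, disallow_values):
    """Original-protocol matching: each Disallow value is a literal path prefix.

    An empty value blocks nothing. Characters such as '*' have no special
    meaning here; they are compared literally.
    """
    return any(value and path.startswith(value) for value in disallow_values)

# A plain prefix blocks everything underneath it.
print(is_disallowed("/tmp/cache.html", ["/tmp/"]))

# "/tmp/*" is treated as the literal text "/tmp/*", so it does not match.
print(is_disallowed("/tmp/cache.html", ["/tmp/*"]))

# "*.gif" is not a path prefix of anything on the site, so it blocks nothing.
print(is_disallowed("/images/logo.gif", ["*.gif"]))
```

Under these assumptions, only the first check reports the path as blocked, which is why `Disallow: /tmp/` is the form to use.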

Clint




msg:1528098
 5:36 pm on Jul 25, 2005 (gmt 0)

So what you are saying is that the email "silverbytes" received, with its "include those you want to crawl your site", is bogus?

elcapitan




msg:1528099
 10:52 pm on Aug 28, 2005 (gmt 0)

Bottom line: is this useful for anything? robots.txt, that is.

I don't see any use in using it :)

Dijkgraaf




msg:1528100
 11:01 pm on Aug 28, 2005 (gmt 0)

robots.txt is useful if you don't want bots requesting certain URLs.
Reasons for banning bots from requesting certain resources include:
1) They would use up too much bandwidth, e.g. images
2) They would cause problems in your logging, tracking or counting of users
3) The resource has content you don't want indexed (Note: robots.txt is not ideal for this use, as they can still index the URL)
4) They would trigger events that you don't want (e.g. CGI script calls, shopping baskets, etc.)
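
As a small sketch of the list above, the following checks a made-up robots.txt with Python's standard `urllib.robotparser`. The directory names (`/images/`, `/cgi-bin/`, `/cart/`), the bot name `ExampleBot` and the example.com host are assumptions chosen for illustration. Keep in mind the caveat in point 3: this only asks well-behaved bots not to request those URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all bots away from bandwidth-heavy images,
# CGI scripts, and shopping-basket URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(path):
    """Return True if the hypothetical ExampleBot may request `path`."""
    return parser.can_fetch("ExampleBot", "http://www.example.com" + path)

for path in ["/index.html", "/images/photo.jpg", "/cgi-bin/counter.cgi", "/cart/add"]:
    print(path, "allowed" if allowed(path) else "blocked")
```

Ordinary pages stay crawlable, while anything under the three listed directories is off limits to bots that honour the file.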

elcapitan




msg:1528101
 8:27 am on Aug 29, 2005 (gmt 0)

In my opinion, the only use I see for this robots.txt is if you are an online drug dealer or firearm dealer or whatever in this area and you don't need to show up on Google, as you have your own buyer network. In this case robots.txt is pure gold.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved