

Including specific robots necessary or not?

   
1:21 pm on Jul 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I received a newsletter with this declaration:

What follows is a "back to the basics" on getting good rankings

"Having a robots.txt to include the pages that you want the search engines to include"

My understanding is that you can ban unwanted bots, and those you don't even mention will crawl anyway. What I read there is different: "include those you want to crawl your site". Is that really necessary?

3:02 pm on Jul 25, 2005 (gmt 0)



Sounds like it may just have been a spam ploy, if it was spam you received. The only thing I've ever heard of for ALLOWING is:

User-agent: *
Disallow:

But I don't know much about the robots.txt file. I'd be interested in knowing more.
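For anyone who wants to check this rather than take it on trust, here is a minimal sketch using Python's standard urllib.robotparser module. The bot names (AnyBot, BadBot, OtherBot) and the example.com URL are made up for illustration. It suggests that the empty Disallow above permits everything, and that a file which bans one named bot leaves unmentioned bots free to crawl.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text):
    """Build a parser from robots.txt text held in a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# The "allow everything" file: an empty Disallow blocks nothing.
allow_all = parse_rules("User-agent: *\nDisallow:\n")

# A file that bans a single (hypothetical) bot and mentions nobody else.
ban_one = parse_rules("User-agent: BadBot\nDisallow: /\n")

url = "http://www.example.com/page.html"  # hypothetical URL

print(allow_all.can_fetch("AnyBot", url))   # True
print(ban_one.can_fetch("BadBot", url))     # False
print(ban_one.can_fetch("OtherBot", url))   # True: unmentioned bots are not blocked
```

This only shows how one parser reads the file. A real crawler is free to ignore robots.txt entirely, since the protocol is advisory.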

3:11 pm on Jul 25, 2005 (gmt 0)

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Not much to it, the protocol is very, very simple: Web Server Administrator's Guide to the Robots Exclusion Protocol [robotstxt.org].

One often-overlooked point to keep in mind is:

Note also that regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "Disallow: /tmp/*" or "Disallow: *.gif".
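
To make the quoted point concrete, here is a rough sketch of literal prefix matching, which is how the original protocol treats a Disallow value. The function name is_disallowed and the sample paths are my own invention, not part of any library. Some crawlers have since added their own wildcard extensions, so treat this as the baseline behaviour only.

```python
def is_disallowed(path, disallow_values):
    """Return True if path starts with any non-empty Disallow value.

    This is plain prefix matching: '*' and '.' are ordinary characters here,
    not wildcards. An empty value is skipped because it blocks nothing.
    """
    return any(path.startswith(value) for value in disallow_values if value)


# A plain directory prefix behaves as expected.
print(is_disallowed("/tmp/cache.html", ["/tmp/"]))    # True

# "*.gif" is compared literally, so it never matches a real path.
print(is_disallowed("/images/logo.gif", ["*.gif"]))   # False

# "/tmp/*" only matches paths that literally begin with "/tmp/*".
print(is_disallowed("/tmp/cache.html", ["/tmp/*"]))   # False
```

In other words, under the basic protocol you block a directory by giving its path prefix, not by adding a star.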
5:36 pm on Jul 25, 2005 (gmt 0)



So what you are saying is that the email "silverbytes" received is bogus: "include those you want to crawl your site"?
10:52 pm on Aug 28, 2005 (gmt 0)

5+ Year Member



Bottom line: is this useful for anything? robots.txt, that is.

I don't see any point in using it :)

11:01 pm on Aug 28, 2005 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



robots.txt is useful if you don't want bots requesting certain URLs.
Reasons for banning bots from requesting certain resources include:
1) The requests would use up too much bandwidth (e.g. images)
2) The requests would cause problems in your logging, tracking or counting of users
3) The resource has content you don't want indexed (Note: robots.txt is not ideal for this use, as search engines can still index the URL itself)
4) The requests would trigger events you don't want (e.g. CGI script calls, shopping baskets etc.)
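
As a rough illustration of the reasons listed, here is a sketch of a robots.txt that keeps all bots out of a few directories, checked with Python's standard urllib.robotparser. The directory names (/images/, /cgi-bin/, /cart/), the bot name and the example.com URLs are assumptions picked to match the list, not anything from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one Disallow line per directory to keep bots out of.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "http://www.example.com"  # hypothetical site

# Bandwidth-heavy images, script calls and basket pages are off limits.
print(parser.can_fetch("AnyBot", base + "/images/logo.gif"))   # False
print(parser.can_fetch("AnyBot", base + "/cgi-bin/order.pl"))  # False
print(parser.can_fetch("AnyBot", base + "/cart/add?id=1"))     # False

# Everything else stays crawlable.
print(parser.can_fetch("AnyBot", base + "/index.html"))        # True
```

Only well-behaved bots will honour these lines, so this does not protect anything that truly needs to be private.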
8:27 am on Aug 29, 2005 (gmt 0)

5+ Year Member



In my opinion, the only use I see for robots.txt is if you are an online drug dealer or firearms dealer or something in that area, and you don't need to show up on Google because you have your own buyer network. In that case robots.txt is pure gold.
 
