

Sitemaps, Meta Data, and robots.txt Forum

    
How to get spider to visit parts of site more frequently
Is this possible?
Pimpernel




msg:1526619
 9:00 am on Apr 30, 2004 (gmt 0)

I have a client who has the following within their robots.txt file:

<meta name="robots" content="all,follow">
<meta name="distribution" content="global">
<META name="revisit-after" content="10 days">

We are now creating a dynamic part of their site (i.e. changing a lot, not database driven) and we want the spiders to revisit every day or even more frequently. They are PR8 so it should not be a problem, but they want to keep the rule that the spider visits the rest of the site every 10 days. The section that we want the spider to revisit more frequently will be in its own directory with a link from the home page, so the URL link on the home page will be to www.domainname.com/directory. I assume that we need the spiders to visit both the home page and all the pages within this directory in order for the directory to gain the benefit of PageRank / rating from the home page link. 3 questions:

1. Is it possible?
2. If so, what is the command?
3. Will the search engines pick up this command immediately, i.e. do the spiders look for robots.txt every time they revisit or is it just an occasional thing?

Thanks

 

ukgimp




msg:1526620
 9:19 am on Apr 30, 2004 (gmt 0)

>>I have a client who has the following within their robots.txt file:

Really? That should not be in there. What you have there should be in the head of your document. Think of robots.txt as an exclusion protocol; it won't make the spidering process any quicker.
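As a rough illustration of "in the head of your document", here is a minimal Python sketch that reads named meta tags out of a page's head using only the standard library. The sample page and the names `MetaCollector` and `head_meta` are made up for this example; the three tags are the ones quoted in the original post.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects <meta name=... content=...> pairs found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META ...> works too.
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            name = attributes.get("name")
            if name:
                self.meta[name.lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


# Hypothetical page: the meta tags sit in the HTML head, not in robots.txt.
PAGE = """<html><head>
<title>Example page</title>
<meta name="robots" content="all,follow">
<meta name="distribution" content="global">
<META name="revisit-after" content="10 days">
</head><body><p>Content</p></body></html>"""


def head_meta(html):
    """Return a dict of meta name -> content for tags found in the head."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.meta


if __name__ == "__main__":
    for name, content in head_meta(PAGE).items():
        print(name, "=", content)
```

This only shows where the tags live and how a program could read them; it says nothing about which tags any particular search engine actually honours.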

>>and we want the spiders to revisit every day or even more frequently.

You mention every ten days; well, I am afraid you can't really dictate how often they come. A simple way to get crawled more regularly is to add regular content and get a few links into that section.

>> benefit of PageRank / rating from the home page link.

Forget PageRank; you are after being crawled. If you have enough beef in your page you will rank regardless of it. :)

If you want Google and others to crawl all your pages you don't even really need to have a robots.txt file. It is best to have one though, even if it is just blank.
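To make the "exclusion protocol" point concrete, here is a small sketch using Python's standard `urllib.robotparser`. It assumes a made-up bot name (`ExampleBot`) and a made-up `/private/` path. It shows how this particular parser behaves, which is not necessarily how every search engine's spider behaves: a robots.txt can only say what may or may not be fetched, a blank file allows everything, and meta tags pasted into the file are simply not recognised as rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with one exclusion rule; /private/ is an invented path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# The meta tags from the first post, as if they had been saved in robots.txt.
META_ONLY = """\
<meta name="robots" content="all,follow">
<meta name="distribution" content="global">
<META name="revisit-after" content="10 days">
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "http://www.domainname.com"
    print(can_crawl(ROBOTS_TXT, "ExampleBot", base + "/directory/page.html"))
    print(can_crawl(ROBOTS_TXT, "ExampleBot", base + "/private/page.html"))
    print(can_crawl("", "ExampleBot", base + "/directory/page.html"))
    print(can_crawl(META_ONLY, "ExampleBot", base + "/directory/page.html"))
```

Nothing in the file format used here expresses "come back every N days"; the only question the parser answers is whether a URL may be fetched.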

I would be tempted to lose the top and bottom of the two you have, but put them where they should be :). I don't know what the global one does. Never used it.

Cheers

Pimpernel




msg:1526621
 10:16 am on Apr 30, 2004 (gmt 0)

ukgimp, thanks for that. I am a bit puzzled though. This site is a major site on the Internet. At the moment they only want the robot to visit every 10 days due to the huge number of pages they have and to preserve bandwidth (questionable motives but there you are). But for the new directory we need the robot to visit at least every day. So I suppose that what I am asking is how can you get the robot to visit /directory as often as it chooses but to visit the rest of the site only every 10 days? Bear in mind the rest of the site is not structured by directory folders, but is straight URLs off the domain name.

makemetop




msg:1526622
 10:27 am on Apr 30, 2004 (gmt 0)

Simple answer?

You can't (unless you pay some engines for frequent spidering). The revisit tag is completely spurious IMO.

There is no way to dictate to spiders how often you want them to crawl. They work out frequency themselves.

ukgimp




msg:1526623
 10:35 am on Apr 30, 2004 (gmt 0)

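If you want to save bandwidth you need to look at the If-Modified-Since request header. When a page has not changed since the date the spider sends, the server can answer 304 Not Modified without resending the page body.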

[webmasterworld.com...]
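For anyone unsure how If-Modified-Since saves bandwidth, here is a minimal Python sketch of the server-side decision. The function name `conditional_response` and the sample dates are invented for illustration; a real web server normally handles this for static files, so treat this as a picture of the logic rather than something to deploy.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(if_modified_since, last_modified):
    """Decide between 200 and 304 for a GET request.

    if_modified_since: raw header value sent by the client, or None.
    last_modified: timezone-aware datetime of the page's last change.
    Returns (status_code, headers).
    """
    # HTTP dates are in GMT with one-second resolution.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: ignore it and send the page
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                # Unchanged since the client's copy: no body needs to be sent.
                return 304, headers
    return 200, headers


if __name__ == "__main__":
    changed = datetime(2004, 4, 30, 9, 0, 0, tzinfo=timezone.utc)
    print(conditional_response("Fri, 30 Apr 2004 09:00:00 GMT", changed))
    print(conditional_response("Thu, 29 Apr 2004 09:00:00 GMT", changed))
    print(conditional_response(None, changed))
```

The saving comes from the 304 case: the spider still makes the request, but pages that have not changed cost only a few headers instead of the full page.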

