Forum Moderators: goodroi

my robots file

need help

         

experienced

7:27 am on Dec 11, 2006 (gmt 0)

10+ Year Member



I have blocked Google from crawling the pages below, but my URLs are still listed as URL-only results. This file has been in place since the site was first uploaded, so why is Google crawling the pages blocked through the robots.txt file?

User-agent: *
Disallow: /admin/
Disallow: /faq.php
Disallow: /groupcp.php
Disallow: /login.php
Disallow: /memberlist.php
Disallow: /modcp.php
Disallow: /posting,
Disallow: /posting.php
Disallow: /privmsg,
Disallow: /privmsg.php
Disallow: /search.php
Disallow: /viewforum,
Disallow: /viewforum.php
Disallow: /viewonline.php
Disallow: /viewtopic,
Disallow: /viewtopic.php

User-agent: *
Disallow: /viewtopic
Disallow: /posting
Disallow: /viewforum
Disallow: /privmsg

need your help on this

thanks a lot

exp...

jdMorgan

7:57 am on Dec 11, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



1) Google is not crawling your pages if they are shown as URL-only. By "Crawling," Google means fetching and reading your page. They don't have to fetch or read your page to list a link to it.

Basically, if Google finds a link to one of your pages, either on your own site or anywhere on the Web, they may include a URL-only listing for it. Yahoo! will do the same, except that they often use the link-text of the 'best' link pointing to your page as the title for the search listing (Yahoo! defines 'best' link here, not me.)

I disagree strongly with this approach, as it makes it difficult to prevent people from landing on a page out of context. This might mean landing in the middle of an article whose logic depends heavily on the previous pages having been read, or it might mean landing in the middle of the checkout process for a simple shopping cart, with all of the purchased items 'undefined'. Neither of these could be said to enhance the user's experience, but this is the search engines' decision, and I just live with it.

My consolation is that it's easier to make other pages rank higher for the link-text Yahoo! uses, and that very few people use the site:example.com-type searches in Google.

You can cloak these pages, and rewrite the robots' requests to a password-required page. But I've never bothered.

2) Your robots.txt is not quite valid. Only one "User-agent: *" record should appear in the file. I suggest combining all of your disallows into one record under a single "User-agent: *" directive.

Jim
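As a rough illustration of point 2 above, here is a minimal Python sketch that folds records naming the same User-agent into a single record. The function name merge_records and the shortened sample file are made up for this example, and the parsing is deliberately simple (it ignores comments and does not handle every edge case of the robots.txt format), so treat it as a starting point rather than a validator.

```python
def merge_records(robots_txt):
    """Combine records that name the same User-agent into one record."""
    merged = {}        # user-agent -> list of rule lines (insertion-ordered)
    agents = []        # user-agents of the record currently being read
    in_rules = False   # True once the current record has at least one rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new record.
                agents = []
                in_rules = False
            agents.append(value)
            merged.setdefault(value, [])
        elif agents:
            in_rules = True
            rule = f"{field}: {value}"
            for agent in agents:
                if rule not in merged[agent]:   # skip exact duplicates
                    merged[agent].append(rule)
    return "\n\n".join(
        "\n".join([f"User-agent: {agent}"] + rules)
        for agent, rules in merged.items()
    )


# Shortened sample with two "User-agent: *" records, as in the file above.
sample = """\
User-agent: *
Disallow: /admin/
Disallow: /viewtopic.php

User-agent: *
Disallow: /viewtopic
Disallow: /posting
"""

print(merge_records(sample))
```

Since Disallow values match by URL prefix, a rule such as "Disallow: /viewtopic" already covers "/viewtopic," and "/viewtopic.php", so the longer variants could be dropped by hand after merging.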

phranque

12:26 am on Dec 12, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Google has a robots.txt testing tool if you are subscribed.
You can view what Google has cached for your robots.txt file and enter test URLs.
You can also temporarily modify the cached version (in the form) and submit it for testing before you install changes.
check out [google.com...]
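For a quick local check before uploading changes, something like the following sketch may also help. It uses Python's standard urllib.robotparser module, which is not Google's parser and may interpret edge cases differently, so it complements the tool above rather than replacing it. The combined record is a shortened, assumed version of the rules in this thread, and the example.com URLs and the is_allowed helper are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumed combined record: a shortened version of the rules in this thread,
# placed under a single "User-agent: *" line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /login.php
Disallow: /viewtopic
Disallow: /posting
Disallow: /viewforum
Disallow: /privmsg
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())   # parse the text without fetching anything


def is_allowed(url, agent="Googlebot"):
    """Return True if this parser would let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


# Placeholder test URLs; substitute paths from your own site.
for url in (
    "http://example.com/index.php",
    "http://example.com/viewtopic.php?t=1",
    "http://example.com/admin/index.php",
):
    print(url, "allowed" if is_allowed(url) else "blocked")
```

Keep in mind that a "blocked" result only means the page should not be fetched. As noted earlier in the thread, a blocked URL can still appear as a URL-only listing if something links to it.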