
Google SEO News and Discussion Forum

    
Why does Google index search-results urls?
Aimee




msg:741668
 3:14 am on Mar 22, 2006 (gmt 0)

I'm very surprised to see Googlebot crawling URLs like this one on our site:
/DG/cecdg_do_search.php3?keywords=8810152342&wherekey=USER_ID&orderby=approve_date+desc

That's a search result page.

I don't know how Googlebot found this URL.

 

konrad




msg:741669
 4:06 am on Mar 22, 2006 (gmt 0)

Yes! I'd like to know that too! The same happens to me.

frox




msg:741670
 8:20 am on Mar 22, 2006 (gmt 0)

How many of these pages are there?

If it's just a few random ones, then it might be that someone (maybe outside your site) is linking to that specific search result.

whatcartridge




msg:741671
 8:54 am on Mar 22, 2006 (gmt 0)

Search engines find them in referrer logs too.

Aimee




msg:741672
 9:01 am on Mar 22, 2006 (gmt 0)

How many of these pages are there?

About 85% or more.

Aimee




msg:741673
 9:11 am on Mar 22, 2006 (gmt 0)

How many of these pages are there?

Some numbers, to be clear.
March 21:
Total Googlebot crawl: 20,111
URLs like the one mentioned above: 19,943 (about 99.2%)
March 20:
Total Googlebot crawl: 21,851
URLs like the one mentioned above: 21,614 (about 98.9%)
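
As a rough cross-check of these figures, the share can be recomputed from the raw counts. This is only a minimal Python sketch: the `share` helper is an illustrative name, and the counts are the ones quoted in this post.

```python
def share(matching: int, total: int) -> float:
    """Percentage of crawl requests that hit search-result URLs, to one decimal."""
    return round(100 * matching / total, 1)


# (search-result URLs crawled, total Googlebot requests), as quoted in the post
crawl = {
    "March 21": (19943, 20111),
    "March 20": (21614, 21851),
}

for day, (matching, total) in crawl.items():
    print(f"{day}: {share(matching, total)}% of Googlebot requests were search-result URLs")
```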

kaled




msg:741674
 11:29 am on Mar 22, 2006 (gmt 0)

I don't know how Googlebot found this URL

That would be the Google Spybar.

If the PR tool is enabled, every page you visit is tracked. How Google use this data is questionable but they have this data nonetheless. This may be one of the reasons Google fought the DoJ so hard in court recently. If forced to hand over search records, they might be forced to hand over this data too - that would be even more contentious.

Incidentally, if you use the PR extension for Firefox, Google still gets the same data - this is probably the reason that Google tolerates its use.

Kaled.

zCat




msg:741675
 11:36 am on Mar 22, 2006 (gmt 0)

If the PR tool is enabled, every page you visit is tracked.

This is why I have a separate Firefox profile (with the official toolbar) which I only use on "special" occasions. (It also has the added benefit of sparing me from "green bar angst".)

BillyS




msg:741676
 11:56 am on Mar 22, 2006 (gmt 0)

>>That would be the Google Spybar.

This was a mistake I made early on. I kept that darn spybar on all the time when working on my site and testing pages. What a nightmare.

Now I just use the Firefox extension and only visit pages I want Google to know about.

Web_speed




msg:741677
 12:07 pm on Mar 22, 2006 (gmt 0)

Sounds like the Google Spybar.

Aimee




msg:741678
 12:46 am on Mar 23, 2006 (gmt 0)

Any more information?
Google Spybar? How does it work?
I don't think anyone would search for that "keyword" on our site.

g1smd




msg:741679
 12:53 am on Mar 23, 2006 (gmt 0)

Always put a <meta name="robots" content="noindex"> tag on every page of a site that is not supposed to be indexed, and then it never will be.

See also the related forums [webmasterworld.com] discussion.
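
To make the noindex suggestion concrete, here is a minimal sketch of a template helper that emits the tag only on search-result pages. The site in this thread runs PHP, so the Python below is purely illustrative; `robots_meta` and `SEARCH_SCRIPTS` are made-up names, and the script path is taken from the URL in the first post.

```python
from urllib.parse import urlsplit

NOINDEX_TAG = '<meta name="robots" content="noindex">'

# Assumption: search results are all served by this one script (path from the first post).
SEARCH_SCRIPTS = ("/DG/cecdg_do_search.php3",)


def robots_meta(url: str) -> str:
    """Return the noindex tag for search-result URLs, or an empty string otherwise."""
    path = urlsplit(url).path  # drops the query string before comparing
    return NOINDEX_TAG if path in SEARCH_SCRIPTS else ""


print(robots_meta("/DG/cecdg_do_search.php3?keywords=8810152342&wherekey=USER_ID"))
print(repr(robots_meta("/articles/index.html")))  # hypothetical article path
```

The page template would print the returned string inside `<head>`, so ordinary pages stay indexable while search results carry the tag.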

tedster




msg:741680
 1:00 am on Mar 23, 2006 (gmt 0)

I don't think anyone would search for that "keyword" on our site.

From monitoring site searches done on a few sites, it's clear to me that some users will search on anything at all, thinking that it's web search. Not all users are clear that the box will only search the site.

kaled




msg:741681
 1:30 am on Mar 23, 2006 (gmt 0)

If all search results have the appearance of existing in a single directory (according to the url) then excluding it using robots.txt is perfectly viable. My robots.txt file excludes all bots from the cgi-bin - that's a pretty common strategy I think.

Kaled.
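
A robots.txt rule of this kind can be checked before deploying it. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt content is hypothetical (it blocks the search script from the first post rather than a cgi-bin), and `example.com` and the article path are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the path is the search script from the first post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /DG/cecdg_do_search.php3
"""


def allowed(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the robots.txt above lets the given agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


search_url = "http://www.example.com/DG/cecdg_do_search.php3?keywords=8810152342&wherekey=USER_ID"
article_url = "http://www.example.com/articles/index.html"

print("search page allowed:", allowed(search_url))
print("article page allowed:", allowed(article_url))
```

Keep in mind that a bot obeying robots.txt never fetches the blocked page, so it never sees any meta tag placed on it.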

Aimee




msg:741682
 1:33 am on Mar 23, 2006 (gmt 0)

The problem is that all the pages worth crawling (the home page, sub-pages and article pages) are not being crawled by Google at all.
So what does Googlebot crawl? Forum user profiles and the search results for forum users' posts. That's all.

What can I do?

g1smd




msg:741683
 1:40 am on Mar 23, 2006 (gmt 0)

See my post above for one solution to the problem.

abates




msg:741684
 4:31 am on Mar 23, 2006 (gmt 0)

Does it not make sense to use method="post" instead of method="get" for search forms? That way the parameters will never appear in the URL and never appear in anyone's log files for Googlebot to find.
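
The difference between the two methods can be seen without touching the network. This is a sketch using Python's standard `urllib`; the host `example.com` is a placeholder, and the field names are borrowed from the URL in the first post.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder host; the script path comes from the first post.
SEARCH_URL = "http://www.example.com/DG/cecdg_do_search.php3"
params = {"keywords": "8810152342", "wherekey": "USER_ID"}

# method="get": the form fields become part of the URL itself,
# so they show up in access logs, referrers and links.
get_request = Request(SEARCH_URL + "?" + urlencode(params))

# method="post": the same fields travel in the request body; the URL stays bare.
post_request = Request(SEARCH_URL, data=urlencode(params).encode("ascii"))

# Nothing is sent here; the Request objects are only inspected.
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```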

pageoneresults




msg:741685
 4:37 am on Mar 23, 2006 (gmt 0)

Does it not make sense to use method="post" instead of method="get" for search forms?

It sure does and thanks for bringing it up. The above along with this (as mentioned above by g1smd)...

<meta name="robots" content="none">

...will keep your search results page out of the index. I've done it, and I've done it many times successfully.

There is one drawback: search results submitted by POST cannot be bookmarked, emailed, etc. Not by the basic user, anyway. ;)

And, if you really want to get granular with blocking the bots...

<meta name="googlebot" content="noindex, nofollow, noarchive">

<meta name="msnbot" content="noindex, nofollow">

In theory, and in practice, the standard...

<meta name="robots" content="none">

...should be sufficient to block all bots. There may be times, though, when you only want to block Googlebot and/or MSNBot.
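
To check which directives a page actually sends to a particular bot, something like the following sketch could be used. It relies on Python's standard `html.parser`; `RobotsMetaParser` and `directives_for` are illustrative names, and merging the generic robots tag with the bot-specific one is an assumption about how a crawler combines them, not documented behaviour of any engine.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the comma-separated content of every named <meta> tag."""

    def __init__(self):
        super().__init__()
        self.tags = {}  # lowercased meta name -> set of lowercased directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name:
            directives = {d.strip() for d in content.split(",") if d.strip()}
            self.tags.setdefault(name, set()).update(directives)


def directives_for(html: str, bot: str) -> set:
    """Directives that apply to `bot`: generic robots tag plus the bot's own tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.tags.get("robots", set()) | parser.tags.get(bot.lower(), set())
    if "none" in found:  # "none" is shorthand for "noindex, nofollow"
        found = (found - {"none"}) | {"noindex", "nofollow"}
    return found


page = """<html><head>
<meta name="googlebot" content="noindex, nofollow, noarchive">
<meta name="msnbot" content="noindex, nofollow">
</head><body>search results</body></html>"""

for bot in ("googlebot", "msnbot", "slurp"):
    print(bot, sorted(directives_for(page, bot)))
```

With only the bot-specific tags present, a bot that is not named gets no directives at all, which is the "block just Googlebot and/or MSNBot" case.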


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved