Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Why does Google index search-results urls?
Aimee

10+ Year Member



 
Msg#: 33583 posted 3:14 am on Mar 22, 2006 (gmt 0)

I'm very surprised to see Googlebot crawling URLs on our site like this one:
/DG/cecdg_do_search.php3?keywords=8810152342&wherekey=USER_ID&orderby=approve_date+desc

That's a search result page.

I don't know how Googlebot found this URL.

 

konrad

10+ Year Member



 
Msg#: 33583 posted 4:06 am on Mar 22, 2006 (gmt 0)

Yes! I'd like to know that too! The same happens to me.

frox

5+ Year Member



 
Msg#: 33583 posted 8:20 am on Mar 22, 2006 (gmt 0)

How many of these pages are there?

If it's just a few random ones, then it might be that someone (maybe outside your site) is linking to that specific search result.

whatcartridge

5+ Year Member



 
Msg#: 33583 posted 8:54 am on Mar 22, 2006 (gmt 0)

Search engines find them in referrer logs too.

Aimee

10+ Year Member



 
Msg#: 33583 posted 9:01 am on Mar 22, 2006 (gmt 0)

How many of these pages are there?

About 85% or more.

Aimee

10+ Year Member



 
Msg#: 33583 posted 9:11 am on Mar 22, 2006 (gmt 0)

How many of these pages are there?

Some numbers, to be clear.
March 21:
Total Googlebot requests: 20,111
URLs like the one mentioned above: 19,943 (about 99.2%)
March 20:
Total Googlebot requests: 21,851
URLs like the one mentioned above: 21,614 (about 98.9%)
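
A minimal sketch of how figures like these could be pulled from an access log. It assumes a combined-format log in which the request line and user agent are quoted. The sample lines and the function name `googlebot_share` are made up for illustration; only the script path comes from the URL quoted earlier in the thread.

```python
import re

# Path of the search script, taken from the URL quoted earlier in the thread.
SEARCH_PATH = "/DG/cecdg_do_search.php3"

# Matches the quoted request line of a combined-format log entry (assumed format).
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def googlebot_share(log_lines):
    """Return (Googlebot requests, requests to the search script, percent)."""
    total = 0
    search_hits = 0
    for line in log_lines:
        if "Googlebot" not in line:
            continue  # only count lines that mention Googlebot
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines without a recognisable request
        total += 1
        if match.group(1).startswith(SEARCH_PATH):
            search_hits += 1
    percent = 100.0 * search_hits / total if total else 0.0
    return total, search_hits, percent


# Hypothetical sample lines, not real log data.
sample = [
    '66.249.0.1 - - [21/Mar/2006:03:14:00 +0000] "GET /DG/cecdg_do_search.php3?keywords=1&wherekey=USER_ID HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '66.249.0.1 - - [21/Mar/2006:03:15:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Googlebot/2.1"',
    '10.0.0.5 - - [21/Mar/2006:03:16:00 +0000] "GET /DG/cecdg_do_search.php3?keywords=2 HTTP/1.1" 200 512 "-" "Mozilla/4.0"',
]
print(googlebot_share(sample))
```

Matching on the user-agent string alone is the simplest check and can be fooled by anything that claims to be Googlebot, so treat the result as a rough estimate.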

kaled

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 33583 posted 11:29 am on Mar 22, 2006 (gmt 0)

I don't know how googlebot find this url

That would be the Google Spybar.

If the PR tool is enabled, every page you visit is tracked. How Google use this data is questionable but they have this data nonetheless. This may be one of the reasons Google fought the DoJ so hard in court recently. If forced to hand over search records, they might be forced to hand over this data too - that would be even more contentious.

Incidentally, if you use the PR extension for Firefox, Google still gets the same data - this is probably the reason that Google tolerates its use.

Kaled.

zCat

10+ Year Member



 
Msg#: 33583 posted 11:36 am on Mar 22, 2006 (gmt 0)

If the PR tool is enabled, every page you visit is tracked.

This is why I have a separate Firefox profile (with the official toolbar) which I only use on "special" occasions. (It also has the added benefit of sparing me from "green bar angst".)

BillyS

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 33583 posted 11:56 am on Mar 22, 2006 (gmt 0)

>>That would be the Google Spybar.

This was a mistake I made early on. I kept that darn spybar on all the time when working on my site and testing pages. What a nightmare.

Now I just use the Firefox extension and only visit pages I want Google to know about.

Web_speed



 
Msg#: 33583 posted 12:07 pm on Mar 22, 2006 (gmt 0)

Sounds like the Google Spybar.

Aimee

10+ Year Member



 
Msg#: 33583 posted 12:46 am on Mar 23, 2006 (gmt 0)

Any more information?
Google Spybar? How does it work?
I don't think anyone would search for that "keyword" on our site.

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 33583 posted 12:53 am on Mar 23, 2006 (gmt 0)

Always put a <meta name="robots" content="noindex"> tag on every page of a site that is not supposed to be indexed, and then it never will be.

See also the related forums [webmasterworld.com] discussion.

tedster

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 33583 posted 1:00 am on Mar 23, 2006 (gmt 0)

I don't think some one would search the "keyword" in our site.

From monitoring site searches done on a few sites, it's clear to me that some users will search on anything at all, thinking that it's web search. Not all users are clear that the box will only search the site.

kaled

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 33583 posted 1:30 am on Mar 23, 2006 (gmt 0)

If all search results have the appearance of existing in a single directory (according to the url) then excluding it using robots.txt is perfectly viable. My robots.txt file excludes all bots from the cgi-bin - that's a pretty common strategy I think.

Kaled.
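
A rough sketch of how a rule like that could be checked before relying on it, using Python's standard `urllib.robotparser`. The robots.txt content here is hypothetical: the cgi-bin rule mirrors the post above, and the second rule is an assumed equivalent for the search script from the first post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The second Disallow line is an assumption:
# it applies the same prefix-based exclusion to the search script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /DG/cecdg_do_search.php3
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="Googlebot"):
    """True if the parsed rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, path)


search_url = "/DG/cecdg_do_search.php3?keywords=8810152342&wherekey=USER_ID"
for path in (search_url, "/cgi-bin/search.cgi", "/index.html"):
    print(path, "allowed" if allowed(path) else "blocked")
```

Because Disallow rules are matched as path prefixes, one line covers every query-string variation of the search script, while the home page stays crawlable.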

Aimee

10+ Year Member



 
Msg#: 33583 posted 1:33 am on Mar 23, 2006 (gmt 0)

The problem is that all the pages worth crawling (the home page, sub-pages and article pages) are not being crawled by Google at all.
So what does Googlebot crawl? Forum user profiles and the search results for forum users' posts. That's all.

What can I do?

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 33583 posted 1:40 am on Mar 23, 2006 (gmt 0)

See my post above for one solution to the problem.

abates

10+ Year Member



 
Msg#: 33583 posted 4:31 am on Mar 23, 2006 (gmt 0)

Does it not make sense to use method="post" instead of method="get" for search forms? That way the parameters will never appear in the URL and never appear in anyone's log files for Googlebot to find.
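
To illustrate the difference, here is a small sketch using Python's standard `urllib` (no request is actually sent). The host name is hypothetical; the script path and parameters come from the URL in the first post.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical host; path and parameters are taken from the thread's URL.
SEARCH_URL = "http://www.example.com/DG/cecdg_do_search.php3"
params = {"keywords": "8810152342", "wherekey": "USER_ID"}

# GET: the parameters become part of the URL, so they end up in
# access logs, referrers, and any link that someone copies.
get_request = Request(SEARCH_URL + "?" + urlencode(params))

# POST: the same parameters travel in the request body instead,
# and the URL stays the same for every search.
post_request = Request(SEARCH_URL, data=urlencode(params).encode("ascii"))

for req in (get_request, post_request):
    print(req.get_method(), req.full_url, req.data)
```

With POST there is only one URL for a crawler to discover, and it carries no search terms.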

pageoneresults

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 33583 posted 4:37 am on Mar 23, 2006 (gmt 0)

Does it not make sense to use method="post" instead of method="get" for search forms?

It sure does and thanks for bringing it up. The above along with this (as mentioned above by g1smd)...

<meta name="robots" content="none">

...will keep your search results page out of the index. I've done it, and I've done it many times successfully.

There is one drawback: with POST, those search results cannot be bookmarked, emailed, etc. For the basic user anyway. ;)

And, if you really want to get granular with blocking the bots...

<meta name="googlebot" content="noindex, nofollow, noarchive">

<meta name="msnbot" content="noindex, nofollow">

In theory, and in practice, the standard...

<meta name="robots" content="none">

...should be sufficient to block all bots. There may be times, though, when you just want to block Googlebot and/or MSNBot.
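
As a quick way to confirm that a search results page really carries one of the tags above, here is a minimal sketch built on Python's standard `html.parser`. The class and function names (`RobotsMetaParser`, `is_noindexed`) and the sample pages are invented for illustration. It treats `none` as shorthand for `noindex, nofollow`.

```python
from html.parser import HTMLParser

BOT_NAMES = ("robots", "googlebot", "msnbot")


class RobotsMetaParser(HTMLParser):
    """Collect the directives of robots-style meta tags, keyed by bot name."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name not in BOT_NAMES:
            return
        content = attr.get("content") or ""
        tokens = {t.strip().lower() for t in content.split(",")}
        tokens.discard("")
        if "none" in tokens:
            # "none" is shorthand for "noindex, nofollow".
            tokens |= {"noindex", "nofollow"}
        self.directives.setdefault(name, set()).update(tokens)


def is_noindexed(html, bot="googlebot"):
    """True if the page asks `bot`, or all robots, not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives.get("robots", set()) | parser.directives.get(bot, set())
    return "noindex" in found


# Hypothetical pages for illustration.
blocked_for_all = '<html><head><meta name="robots" content="none"></head></html>'
blocked_for_msn = '<html><head><meta name="msnbot" content="noindex, nofollow"></head></html>'

print(is_noindexed(blocked_for_all))
print(is_noindexed(blocked_for_msn, bot="googlebot"))
print(is_noindexed(blocked_for_msn, bot="msnbot"))
```

The generic `robots` tag applies to every bot, so the check combines it with the bot-specific tag before looking for `noindex`.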

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved