Forum Moderators: Robert Charlton & goodroi


Odd Google query for my site - www.example.com/+site:www.example.com

         

crobb305

1:19 am on Mar 10, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This may be obvious, but not to me. Can someone tell me what this search reveals?

www.example.com/+site:www.example.com
(no spaces)

I have seen someone or something viewing cached pages of my site stemming from this search. My initial guess was someone at G, but the visits are 0 seconds, and only the homepage. Maybe a bot? I can't tell what the search reveals, because the Google serps only let me see about 30% of the results. The site in question has 107 pages, but this search query shows 465 results. Once I get to the 10th page of results, that number drops back down to 107. I simply have no idea what I can glean from this, or why someone would search for my site using this query.

C

gn_wendy

8:05 am on Mar 10, 2010 (gmt 0)

10+ Year Member



It will search for occurrences of "www" "example" "com" on the domain http://www.example.com.

crobb305

5:34 pm on Mar 10, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think you're right, but there is more to it that I don't understand. I ran this search for another domain, and got a 1:1 ratio of returned serps and actual pages on the site. For my domain, however, this search now shows 459 results, whereas I only have 107 pages on the site. As I mentioned, beyond page 10 in the serps, the page count that Google shows drops back to 107. So somewhere in the original 459 pages found is a supplemental bed that I can't see (nothing new), though I did manage to see 2 urls with query strings (my site is static html), which are creating duplicate content that competes with the canonical, AND Google is displaying my robots.txt file under this search. So it seems to me this query shows EVERYTHING for the domain.

When I search just using site:www.example.com, there are only 107 results -- not 459.

So, I have no idea what this should be telling me or why someone would search my site using this query, and look at a cached page for 0 seconds. A bot, I could understand staying for 0 seconds, but why would a bot search this query? Coincidentally, the same IP that performed this search was only used once before: 2 days before my penalty hit January 11.

Finally, if there is a correlation between this search and my penalty, AND if you're correct Wendy, then maybe I have over-optimized my site for its own name. I have never once considered this possibility, but I do often use the site name in anchor text, and my logo always displays the site name in the alt tag. Is it possible to overuse your own site name and trip a filter? Maybe someone at G was following up, possibly per my reconsideration request. Otherwise, I am at a loss for understanding what this query would tell anyone (since you can't see the supplemental bed).

tedster

6:17 pm on Mar 10, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



this search now shows 459 results, whereas I only have 107 pages on the site


What kind of URLs make up the "extras"? Do they still use your domain? Are they duplicates?

crobb305

6:29 pm on Mar 10, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What kind of URLs make up the "extras"? Do they still use your domain? Are they duplicates?


That's what I can't determine. When I initially perform this query, Google displays "results 1 through 10 of about 459 from www.example.com". As I click to the end, the count drops to 107 (the true number of pages on my site AND the number displayed on the normal site: search) and I can't see what's in the middle. On the very last serps page, right before my robots.txt file, I can see two urls with query strings indexed. Each of these has a query string like ?ref=pislikcs or ?referer=www appended to the intended url. These are the only two I am able to see before the results shown drop from 459 back to the actual 107. I can only assume the entire supplemental bed is made up of duplicate content because of query strings. My fear is that the initial 459 is telling me something about how my site is being seen by Gbot, and may be a signal regarding my filtering, especially if each of my pages AND their duplicates use my site's name in the logo. I saw Whitney's comments in the yo-yo thread about duplicate alt tags, and that got me concerned.


(Sorry for the edits, my browser is being screwy and duplicating my posts somehow)

[edited by: crobb305 at 6:39 pm (utc) on Mar 10, 2010]

crobb305

6:29 pm on Mar 10, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



By the way, every url that this query allows me to see uses my domain name. If there are some in the supplemental bed that display a different domain name, then I am unable to see it.
C

tedster

6:46 pm on Mar 10, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



of about 459


Ah, now I see - that "459" number is just an artifact of the way Google estimates the number of pages for most queries. It's usually an inflated number, and often grossly so. The back end structure at Google does not easily allow precision for the number of pages.

From the good old days of a visible "supplemental index" we know that Google indexes information for URL PLUS crawl date, not just URL alone. There are other variants involved as well that are filtered out by the time you get to the "last" page of results.

Ever notice these days that, even for queries with millions or billions of results, you often don't get even 999 pages when you click through to the last available page? The available results often stop somewhere between 850 and 950.

crobb305

6:54 pm on Mar 10, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Should I be concerned about using the same alt tag on my logo (risk of on-site OOP for my company name, esp if a lot of the IBLs use my company name)? Also, should I be concerned with the query strings such as ?ref=pislikcs or ?referer=www -- or do you think G handles them ok? I ask the latter because I am not an htaccess expert, but I do my best and feel like I keep adding more and more rules to handle canonical issues. It seems I could go forever and never catch everything.
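As a rough illustration of the query-string problem described above, here is a minimal Python sketch that maps URL variants such as ?ref=pislikcs or ?referer=www back to one canonical URL and groups the duplicates. The function names and the TRACKING_PARAMS set are assumptions made up for this example (only the two parameter names come from the thread), and it only analyzes a list of URLs, for instance taken from server logs. It does not redirect anything; on a live site the same mapping would normally be enforced with a 301 redirect or a rel="canonical" link.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters to ignore. "ref" and "referer" are the two
# seen in the thread; anything else would have to be added by hand.
TRACKING_PARAMS = {"ref", "referer"}


def canonical_url(url: str) -> str:
    """Return the URL with ignored parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # An empty query string produces no trailing "?" in the result.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_duplicates(urls):
    """Map each canonical URL to the list of variants that collapse onto it."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return groups


if __name__ == "__main__":
    sample = [
        "http://www.example.com/page.html",
        "http://www.example.com/page.html?ref=pislikcs",
        "http://www.example.com/page.html?referer=www",
    ]
    for canonical, variants in group_duplicates(sample).items():
        print(canonical, "<-", len(variants), "variant(s)")
```

Running something like this over the requested URLs in an access log would at least show how many distinct query-string variants exist per page, which is easier than trying to catch them one rule at a time.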

I still wonder why someone would search for my site using this odd query. At least it's odd to me. Just such a coincidence that the exact same user IP came to my site 2 days before the penalty hit -- but it's probably just a coincidence (it wasn't a Mountain View IP, rather from Glendale, but then again a reviewer doesn't have to be in Mountain View).