Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Google site: Command, what does it show?
c41lum
11:35 am on Aug 2, 2010 (gmt 0)

We all know that G's site: command is a useful tool for showing how many of our site's pages are listed in the index, but what else does it show?

Does the site: command list the pages G thinks are most important on your site first? If so, and your top pages aren't showing at the top when you do a site: search, does that tell the webmaster there is some kind of penalty or devaluation on those pages?

On my site at the minute, I'm noticing that pages we really do lots of work on are showing really far down the list when I do a site: search, while other poor, low-traffic pages are showing in the top positions.

 

gn_wendy
12:21 pm on Aug 2, 2010 (gmt 0)

We all know that G's site command is a useful tool to show how many pages our sites have listed in the index


No, we all know that it is broken. So very, very, very broken.

Does the site: command show what pages G thinks are important on your site in the first positions?


This is an interesting question. I've tried profiling some sites using the site: operator, but it doesn't seem to be useful unless you combine the operator with a search phrase, in which case you will see which pages do well for specific terms.
IMO that is more useful, since if a page ranks well for a term that is relevant to me, I would rather get a link from there than from other pages.

Andylew
1:49 pm on Aug 2, 2010 (gmt 0)

I don't think site: is broken. A simple site:domain.com search will return the main indexed pages. A trick for the toolkit: site:domain.com -spellingmistake will return all pages Google has for you, including the supplemental index.
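The two query forms described in this trick can be sketched as simple URL construction. This is a minimal illustration only (the function name and `example.com` are placeholders, and nothing here actually fetches results from Google):

```python
from urllib.parse import urlencode

def site_query_url(domain, exclude_term=None):
    """Build a Google web-search URL for a site: query.

    Appending a negated nonsense term (the -spellingmistake trick
    above) forces a different result set that, per this thread, can
    include supplemental-index pages.
    """
    query = f"site:{domain}"
    if exclude_term:
        query += f" -{exclude_term}"
    return "https://www.google.com/search?" + urlencode({"q": query})

print(site_query_url("example.com"))
# https://www.google.com/search?q=site%3Aexample.com
print(site_query_url("example.com", exclude_term="spellingmistake"))
# https://www.google.com/search?q=site%3Aexample.com+-spellingmistake
```

The nonsense term just needs to be a word that appears on no page of the site, so negating it excludes nothing while still changing which result set Google serves.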

peterdaly
1:57 pm on Aug 2, 2010 (gmt 0)

we all know that it is broken. So very, very, very broken.


Wrong. I agree with Andylew. It's not broken. You just need to know how to disable the default filter using the -agkh (or any random characters) option.

It's actually quite useful to be able to view the site: results both ways to compare what's different.

netmeg
2:03 pm on Aug 2, 2010 (gmt 0)

Actually these days I mainly use it to see which urls might have gotten into the index that I want to remove or redirect.

peterdaly
2:03 pm on Aug 2, 2010 (gmt 0)

I suspect the site: command shows pages on your site in terms of descending PageRank value with a duplicate content filter limiting some results, as well as the supplemental index functionality already referenced in this thread.

I've not spent any time trying to validate that assumption.

c41lum
2:11 pm on Aug 2, 2010 (gmt 0)

What I'm trying to understand, and I think it would be helpful for other users as well: does Google show site: command results in order of page strength (quality/importance)? If so, the site: command could be used to see which pages have a devaluation and which pages need attention.

For instance, when I do the site: search it shows my most important page (1000s of external links) sitting on page 40 of the results, and this page has the highest TBPR of all my pages.

peterdaly
3:03 pm on Aug 2, 2010 (gmt 0)

does Google show site: command results in the order of page strength

I think yes. I think it's PageRank order, descending. If that's not exactly it, then it's probably a similar value model based on more than one factor.
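This hypothesis is testable with a rank-correlation check between a page's position in the site: results and whatever strength proxy you have (toolbar PR is used below). A minimal sketch with entirely hypothetical, hand-collected data:

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall rank correlation between two equal-length sequences.

    Returns a value in [-1, 1]: 1 means the orderings agree,
    -1 means one is the reverse of the other. Tied pairs are skipped.
    """
    assert len(xs) == len(ys) and len(xs) > 1
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(xs) * (len(xs) - 1) / 2)

# Hypothetical data: each URL's position in the site: results
# (1 = listed first) and its toolbar PageRank. If site: order really
# were descending PR, tau would be close to -1 here (a lower position
# number pairing with a higher PR).
site_positions = [1, 2, 3, 4, 5, 6]
toolbar_pr     = [5, 4, 4, 0, 3, 1]
print(round(kendall_tau(site_positions, toolbar_pr), 2))
```

A strongly negative tau across a decent sample of URLs would support the descending-PR theory; values near zero would back the "no discernible pattern" view expressed later in this thread.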

c41lum
3:12 pm on Aug 2, 2010 (gmt 0)

I think yes. I think it's PageRank order, descending.


Yeah it would make sense for it to show results in descending page strength.

In which case you should be able to identify pages that have an on-page or duplicate-content issue?

Has anybody used the command for this reason?

skweb
3:20 pm on Aug 2, 2010 (gmt 0)

Andylew, thanks for the hack. I did not know about it and was able to find details that I never could before. More pages indexed than I thought.

gn_wendy
3:22 pm on Aug 2, 2010 (gmt 0)

@Andylew ; @peterdaly and @anyone-who-feels-the-site-operator-works

It works (more or less) depending on what you want to use it for, but when dealing with large sites (or any size of site with more than 1000 pages of indexable AND indexed content) it simply doesn't work.

The -spellingmistake -agkh -[insertRandom] will return different results, and in some cases more correct/accurate results, but it's still very broken.

Your posts got me to reevaluate my stance on the site:- operator and I really gave it my all trying to see if Google might have fixed it after about 12-18 months of documented brokenness.

Like I said, it will do what you want it to most of the time, but if you are using it to see how much of your total website is indexed by Google, it's not going to work.

I tried a site:example.com search for a site I work on. I then tried a bunch of -[random] searches.

I then repeated this for another two sites I work with... and for good measure on two sites I don't work with (but have shared access to).

site:- is still broken for me. I wouldn't trust it as far as I could throw it. I'm not saying it isn't useful. I use it countless times a day. All I'm saying is - as a KPI for your SEO efforts and checking what your indexing is like and all similar intents and purposes - the site:- operator is broken.

Anyone who can show me differently - that would be great!
It used to be one of my favorite (and most powerful tools) for analyzing performance and indexing of deep pages.

c41lum
3:34 pm on Aug 2, 2010 (gmt 0)

No one seems to have a definitive answer to the OP.

which was: does the site: command show results according to what Google thinks is page importance/quality? If it does, then surely this is an easy way to spot pages that have a penalty/devaluation.

randle
3:37 pm on Aug 2, 2010 (gmt 0)

the site:- operator is broken


When you run the site: command, are you saying you have pages that do not show in those results but are in fact indexed and cached in Google (or that particular data center)?

What exactly do you mean by "broken"?

tedster
3:45 pm on Aug 2, 2010 (gmt 0)

I never saw a pattern of Google listing the site: URLs in any particular order - and I really wanted it to be that way. On some sites, in fact, it does look sort of like that. I was just looking at some site: results for a client earlier today and there was no relationship to PR or trust or traffic - the order was pretty borked to my eye.

The one thing you can sort of depend on is that the domain root should be #1 - if it isn't, something is wrong somewhere. But what that may be (your problem or Google's) is not always clear. The only thing that is clear to me is that it is not a healthy sign.

c41lum
4:03 pm on Aug 2, 2010 (gmt 0)

It would make life much easier for webmasters if the site: results were listed in page trust/quality.

In my case our root URL isn't position #1. The page that is #1 is what I would class as a low-quality, very-little-content page with only a small number of inbound backlinks.

randle
4:15 pm on Aug 2, 2010 (gmt 0)

It would make life much easier for webmasters if the site: results were listed in page trust/quality.


That's never, ever going to happen, much to the chagrin of link buyers and sellers everywhere. Google has been consistently increasing the amount and degree of disinformation about all kinds of site and page indicators.

If you are having trouble understanding the ordering of the command's results, that's because Google wants it that way.

netmeg
4:38 pm on Aug 2, 2010 (gmt 0)

My pages in the site: command are listed with no discernible pattern, as mentioned by tedster. (This is easily checked with the SEO Info FF plugin) My home page is on top (unless I've done something stupid - which has happened) and the next one could be a PR0, or a PR4, and I've found PR3s and PR4s on page 4 or 5 of the results.

So I use it to remove urls, check snippets, find mistakes. But I don't put any particular trust in it.

I'm pretty sure Google wants it that way, too.

epmaniac
6:11 pm on Aug 2, 2010 (gmt 0)

The one thing you can sort of depend on is that the domain root should be #1 - if it isn't, something is wrong somewhere. But what that may be (your problem or Google's) is not always clear. The only thing that is clear to me is that it is not a healthy sign.


@tedster and other experts

The site: operator does not return the domain root at #1 for my site... can you please guide me on what I can do to resolve this issue?

pontifex
7:47 pm on Aug 2, 2010 (gmt 0)

Hi folks,

IMHO the site: command does indicate interesting things - besides the fact that it is a bit wacky...


1st: I see a decline in results for my domain starting some days/weeks before the backlink count changes OR a new public PageRank is shown. So I go down in steps of 2,000-5,000 pages for a certain period of time; it stops, the index updates. That happened before MayDay, and I watched it twice before that as well.

It has little to NO impact on the daily unique visits in that time frame of decline.


2nd: the "site: -blahblooptypo" jumps around a bit, but that could IMHO be just a change in datacenter/dataset for my query. It does, however, show the amount of pages I would suspect G to have in the index. So: supplemental is true.


3rd: Like most of the things Google shows (analytics, WMT, keyword search volume) this is IMHO just another indicator to make educated guesses, not a tool of precision.


Some wild guessing:
Google does things in "waves" (no, not the stupid social thing) by building

. the fresh part of the index (updates, blogs, news) - spider
. ranking + applying spam filters
. the core index - spider
. ranking + applying spam filters
. the supplemental index - spider
. ranking + applying spam filters

that in a never-ending cycle, with URLs being moved between these "shards" (or silos if you like) they talked about @Stanford a few years back.

I suspect the "site:" command with its flavors just shows some snapshots out of these "shards", and because the whole dataset is shifting all the time, it is an indicator more than a tool.

2 cents,
P!

peterdaly
7:57 pm on Aug 2, 2010 (gmt 0)

site operator does not return domain root at # 1 for my site... can you please guide me what can i do to resolve this issue

1. Where does the homepage rank?
2. Do the pages that rank ahead of it have a large quantity/quality of incoming links?

epmaniac
8:26 pm on Aug 2, 2010 (gmt 0)

1. Where does the homepage rank?
2. Do the pages that rank ahead of it have a large quantity/quality of incoming links?


1. Homepage ranks at number 106

2. The pages that are ranking ahead are a mix. Many have outside incoming links, but many don't have any.

c41lum
8:53 pm on Aug 2, 2010 (gmt 0)

in my case

1. Homepage is 5th.
2. The pages that rank high have very few inbound links, nowhere near as many as the homepage in quality or quantity.

Plus: my main pages that have lots of links are listed way back in the four hundreds, which I find very strange.

Lorel
11:46 pm on Aug 2, 2010 (gmt 0)

One way to see if Google's site command is listing your best quality pages first is to compare it with a site monitor that lists which pages people visit the most. They should match.
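That comparison can be automated as a simple top-N overlap between the two orderings. A minimal sketch; the URL lists below are hypothetical placeholders, and in practice you would export them from your analytics tool and from a scrape of your own site: results:

```python
def top_n_overlap(site_order, analytics_order, n=5):
    """Fraction of the analytics top-n URLs that also appear in the
    site: top-n. A value near 1.0 would support the 'they should
    match' idea; a low value means the two orderings disagree."""
    return len(set(site_order[:n]) & set(analytics_order[:n])) / n

# Hypothetical orderings.
site_top      = ["/", "/widgets", "/blog", "/about", "/contact", "/old-page"]
analytics_top = ["/", "/blog", "/pricing", "/widgets", "/signup", "/about"]
print(top_n_overlap(site_top, analytics_top))
```

Note this measures set agreement only, not ordering; it is deliberately forgiving, since pages can swap positions within the top N without it mattering much.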

gn_wendy
6:42 am on Aug 3, 2010 (gmt 0)

What exactly do you mean by "broken"?


Definition of broken: the site:- operator returns bogus numbers for "About 1,240,000 results (0.23 seconds)", i.e. it will not return the number of a site's pages in the index (supl. or otherwise).

Using the operator to limit a search query to a domain, sub-domain and/or specific domain directory works superbly. That is to say, if you are using it to find information (as in: show me the page about blue sparkly fluffy widgets on www.example.com), you're golden!


About the broken pages-in-index count - for "example.com" I got these numbers:

About 1,240,000 results (0.23 seconds) - example.com
About 12,900,000 results (0.20 seconds) - example.com -spellingmistake
About 9,400,000 results (0.38 seconds) - example.com -agkh

The actual number of indexed pages for "example.com" is between 3,500,000 and 4,000,000 (verified through other tools and painstakingly extensive research). The total indexable pages on "example.com" sum up to roughly 6,000,000.

An interesting note is that "example.com -spellingmistake" returns pages (URL only, no snippet, no title) blocked by the robots.txt.
This is nothing new - but it does really mess up the numbers, because Google will return the blocked URLs (and how many of them there are) when you use the site:- operator; further obfuscating the results.

Robert Charlton
8:08 am on Aug 3, 2010 (gmt 0)

pontifex - Very thoughtful post... thanks for sharing your observations.


...Google will return the blocked URLs (and how many of them there are) when you use the site:- operator

It's possible for a competitor to find out a lot about you by using the site: operator, which is another reason not to use robots.txt to block pages you don't want referenced. Use meta robots noindex if you don't want URLs referenced.
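A minimal sketch of the difference: the noindex directive must be crawlable to be seen, so the page is left out of robots.txt and carries the directive itself:

```html
<!-- In the page <head>. Do NOT also Disallow this URL in robots.txt,
     or Googlebot never sees the directive and the bare URL can still
     surface in site: results as a snippet-less entry. -->
<meta name="robots" content="noindex">
```

With robots.txt blocking, Google knows the URL exists (from links) but not its content, so it can list the bare URL; with a crawlable noindex, the URL is dropped from the index entirely.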

futureX
4:24 pm on Aug 3, 2010 (gmt 0)

Ha, Google has just made a liar out of me, I was about to post that every site I can remember of mine has had homepage at #1 in the results, but I've just checked a site and the homepage is not even on the first page.

It's even below many pages that have no external links and only one or two internal links, whereas the main page has a handful of externals and all ~700 internal pages link to it.

The main reason I use the "site:" command is to see what has been indexed in small/new sites, I also use it plenty for seeing which images are indexed on a domain, which seems to be accurate.

I assume that in general it will return all indexed pages that are not supplemental; the same site as above regularly drops 60% of its indexed pages from the "site:" command, and it will pick them up again at some other point.

c41lum
6:21 pm on Aug 3, 2010 (gmt 0)

Hi futureX, have you seen a drop in the SERPs? It seems if your homepage isn't position 1 then there might be a problem on the site somewhere.

epmaniac
6:25 pm on Aug 3, 2010 (gmt 0)

What could be the problem with a site whose homepage isn't at position 1?

Is this a sign of a penalty?

tedster
6:35 pm on Aug 3, 2010 (gmt 0)

Yes, it can be a sign of a penalty, or of low trust - or of a bug in Google's back end.

epmaniac
10:31 pm on Aug 3, 2010 (gmt 0)

lightinthebox.com is among the top 1000 sites according to Alexa, and it has been ranking consistently well and is a popular site... what's your take on site:lightinthebox.com, given that its homepage is not ranked at position 1?

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved