Google's 2 indexes & you

Google is alternating between 2 indexes, every 3 days.


Namaste

10:31 am on Jul 5, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am sure you are all seeing this too: every two to three days the index changes, and if a page drops, it drops to the same position it held 3 days earlier.

For example, my result for Buy Widgets is alternating between position 6 and position 11. This has been happening for 2 weeks.

To me, this shows that Google is publishing 2 indexes and is alternating between the two.

What is the reason for this?

Theory 1: SEO Neutralisation. Since January we have seen steps towards this, the most obvious being reciprocal links getting neutralised. Why would Google want to do this? "It's the money, stupid": less predictable results mean more spend on Google AdWords by webmasters!

Theory 2: Continuous Update. In order to publish a new index and allow it to settle across all data centers, 3 days are required. So DeepFreshBot scours the depths of the web continuously and a new index is published every three days. GoogleGuy allows the new index to settle and then pushes the red button, making the new index live. (The best way to test this is to make title tag modifications to deep pages and see if they appear with the new index.)

Theory 3: Both. How do you kill two birds with one red button? Simple: you undertake 2 above, but use two separate algos, one for each index (theory 1). Thus, index A is live for 3 days while index B (with a different algo) is being prepared and published. Then index B goes live and index A goes in for updating, and so on. To make it greater fun, every now and then GG inserts experimental algo C for 3 days.
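The rotation Theory 3 describes can be sketched in a few lines. Everything here is an assumption taken from the post — the index names, the one-algo-per-index idea, and the fixed 3-day period — nothing is confirmed Google behaviour:

```python
from itertools import cycle

# Hypothetical: two indexes, each possibly built with a different algo,
# take turns serving queries while the other one is rebuilt.
indexes = {"A": "algo_1", "B": "algo_2"}

def serve_schedule(days):
    """Yield (day, live_index) pairs for an assumed 3-day rotation."""
    rotation = cycle(indexes)
    live = next(rotation)
    for day in range(days):
        if day and day % 3 == 0:   # the "red button": swap every 3 days
            live = next(rotation)
        yield day, live

schedule = [name for _, name in serve_schedule(9)]
# days 0-2 serve A, days 3-5 serve B, days 6-8 serve A again
```

Under this model a page ranked differently by the two algos would flip between two positions every three days, which matches the Buy Widgets observation above.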

Enjoy!

Mozart

10:51 pm on Jul 5, 2003 (gmt 0)

10+ Year Member



Namaste:

I wrote:

All would be great had I not checked the red widgets SERPs! There my page was as well! I repeat, same page, different SERPS, different display! Both SERPs had the full www.domain.com/dir/filename.html with the red widgets result having no fresh tag and the blue widgets result having the fresh tag of the last spidering! But one with the old title, the other with the new title!

You wrote:

this is actually a very old problem and I have seen this for as long as I have been watching Google. Google often displays old title tags.

Hmmm, well, I would not have been confused if Google always, continuously and consistently displayed either the old or the new title tag (and I don't care where in the rankings; that is a different issue!). But I repeat: I had the two SERPs open in two separate windows, kept refreshing them, and still had in window 1 the title tag being blue widgets while at the same time in window 2 the title tag was red widgets for the exact same page.

So my observation was that this one single page existed seemingly twice in the index, once as the old page and also as the new page with the new title tag... One and the same page just cannot exist twice in the same index (one and the same page meaning identical right down to the www.domain.com/dir/filename.html in both windows)!

You were saying that Google often displays the old title tags. Do you mean only the old, or also at the same time the new title tags, depending on what you searched for?

A thought: Google has in fact quite a few indices. First there are all those different datacenters. But then there is another index, the (deep)freshbot index. This index is added to whatever the other index at that moment may be. Result perhaps: If the page has not changed since the last deep(fresh)bot crawl the possible rankings get added up. So that happens every three or so days and if the old index position versus new index position is drastically different jumps between say #3 and #50 occur. And maybe the page even drops out altogether.

And if the page has changed since the last deep(fresh)bot crawl in such things as the title tag a bug in the algo fails to recognise it as the same page, even with same fully qualified url. And then the one page can actually appear to be twice in the index.

But that of course is just a thought...

Mozart

steveb

10:52 pm on Jul 5, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've never searched -xx2, but always search all nine datacenters. I'm not sure what your point is. Results are all over the place every "three days" (with that three-day figure not being some rigid number you can count on) on the nine datacenters. In addition to the wild fluctuations, fresh crap gets everfluxed in all the time. That is a different thing. The nine datacenters lined up sanely around the 1st or 2nd and now are on drugs again (though lined up), although this time I see fewer goofball results than ten days ago.

mipapage

11:19 pm on Jul 5, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmmm...

We have not seen any pattern to the serps we watch.

fresh crap gets everfluxed in all the time.

Subjective, but true ;-]

Our observations:

When things settled with Esmerelda,
(1) our missing index page returned. Then some fresh-dates happened,
(2) which coincided with our index page going missing (again) from the serps (though it was there with a fresh tag if I searched with site:www.mysite.com -iug;iug) and
(3) fresh *crap* was everfluxed (crap that wasn't there when Esmerelda settled; I'm thinking this stuff was removed by a spam filter then returned for whatever reason with the fresh listings).

Since that equilibrium-rattling fresh tag mentioned above, I have seen the same serps, no 3 day cycle at all.

g1smd

11:20 pm on Jul 5, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> I'm not sure your point. Results are all over the place every "three days" <<

My point was that I wanted to know what results you were talking about -- how, and where, you searched, as that does make a difference.

mfishy

1:20 am on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



g1smd,

When pages disappear it is across all datacenters. Sometimes they will appear on one or two, then all of the nine, then none.

The strange part about it is that almost every page I follow where this happens goes from a top spot on all or most of the datacenters to out of the top 500, then reappears right back at #1.

A page that was missing on all datacenters as of Friday has just shown up #1 on cw and va as of tonight. If it follows the pattern it will be back on most by tomorrow.

BTW, this is a widespread issue. We checked 50 keywords from different industries a week ago (500 total pages: the top 10 for each). As of Wednesday, 44 of the 500 pages were missing from not only the top 10 but the top 50. As of today, 27 are back in the top 10. We are unrelated to most of the pages we researched.

steveb

1:48 am on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Perhaps an alternation is close. I see that when I hover my cursor over the category link on the toolbar, a category name created for the first time in dmoz one month ago shows up as the Google Directory category for that site... even though the Google Directory still has no such category/page.

I was about to suggest this is evidence of an imminent Directory update... until I checked the other sites in that category. None show this category in the toolbar, although some show this new category as a backlink.

Namaste

7:48 am on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have been vigorously monitoring the sites above me to see what factors are getting them ahead during the cycle in which I am low. My finding so far is that those sites have a much higher keyword density than my site, and that Google is giving less importance to PR. Can anyone else check these factors and report?

Discounting PR would explain the widely fluctuating results. You have built your rank heavily on PR, then suddenly you are no. 100-500. Check the sites above you for PR and on-page factors.

Powdork

8:28 am on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Mipapage,
What you described is exactly what has happened to me. Once on May 16 and again on June 16. Those were the first two fresh tags my site had gotten, though visited regularly by freshbot. Each time it lasted for three days. On May 19 I came back at #3 up from 5. On June 18, I came back at #1.
Here are what I think may have been contributing factors (or at least interesting points):
- Unusually high keyword density for the main keyphrases: 44.4% for the main three word phrase, 32% for the main two word phrase (according to the SEW density analyzer, title on, metas off)
- The company name is the two word phrase.com
- The url is the two word phrase with no hyphen.
- The three word phrase is 'Lake two word phrase'
- Therefore, the vast majority of incoming anchor is the two word phrase, with the three word phrase making up the difference. That means close to 100% of the incoming anchor at least includes the two word phrase, which is listed twice in the title (Three word phrase at two word phrase.com).
Unlike some others, my allinanchor did suffer greatly during these periods.

I partially subscribe to the filter theory. The two fresh tags and resulting drops were the result of a visit from a bot with a particular filtering agenda. Seeing the results of the crawl, the engineers turned the 'knobs' based on their feeling for how the filters worked and on input from regular folks like us. I think mysite.com could easily fall victim to any overzealous overoptimisation filter, but would easily survive any rigorous hand check.
I have received fresh tags since then with no ill effects. :)

Jeroen van de Wiel

8:54 am on Jul 6, 2003 (gmt 0)

10+ Year Member



Marval: "The other noted changes are that the directory (I think it's been noted here before) doesn't change with the change in SERPs. As has been pointed out before, the directory seems to be very old and has not had a DMOZ RDF update in some time (or has reverted to a much older directory)."

One of the pages I regularly monitor has dropped, probably because of old dmoz data. Even stranger: if I refresh the Google directory it's listed in, once every ten times I try I get a Google directory / dmoz listing from months ago (one listing instead of six listings in that directory). This might prove your theory right.

zafile

11:07 am on Jul 6, 2003 (gmt 0)



Powdork, you should take into account the following events in terms of your keyword density:

October 22, 2002 - Google sued over site ranking [news.com.com...]

November 13, 2002 - SearchKing Google Rank Restored [thewhir.com...]

January 10, 2003 - Google counters search-fix lawsuit [news.com.com...]

May 27, 2003 - Judge dismisses suit against Google [news.com.com...]

As the last link tells you, SearchKing lost its claim against Google. U.S. District Court Judge Vicki Miles-LaGrange ruled:

"PageRanks are opinions--opinions of the significance of particular Web sites as they correspond to a search query." "The court simply finds there is no conceivable way to prove that the relative significance assigned to a given Web site is false." "Accordingly, the court concludes Google's PageRanks are entitled to full constitutional protection."

The page [pradnetwork.com...] written by Bob Massa owner of SearchKing encourages Webmasters to "make sure your keyword phrase is about 5 to 7% of your body text."

Because Google now has full legal protection, you can be certain that its new algorithms will target pages with keyword phrases set at 5 to 7% of the BODY TEXT, not anchor text.

Recently, I had my keyword phrase word_1 word_2 word_3 word_4 optimized over 20% total.

Worried about the court's resolution of May 27, I brought down density as follows:

word_1 5.23%
word_2 5.23%
word_3 4.65%
word_4 4.65%

I was hoping the 19.76% optimized page would pass the fresh tags of 3 days ago. I was upset when the page vanished despite my efforts.

Yesterday, I brought down density a little lower to a total of 18.18%:

word_1 4.69%
word_2 4.69%
word_3 4.40%
word_4 4.40%

I must clarify that my competitors have pages optimized as follows: 19.16%, 18.44%, 17.68%, 17.66%, 9.65% and 7.31% (total). Therefore, with 18.18% my page is somewhere in the middle.

All of my competitors remain in the top ten but my former 19.76% page vanished 3 days ago. The only difference is my page is 7 or 8 months old. The pages of my competitors are 2 to 5 years old.

Note that the SearchKing issue began on October 22, 2002. I bet you most sites with the index page problems made Google's Top Ten after October 22, 2002 ...
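For reference, density figures like those quoted above are usually computed as keyword occurrences divided by total body-text words. A minimal sketch — actual tools such as the SEW analyzer may count words, titles, and metas differently:

```python
def keyword_density(body_text, keyword):
    """Percent of body-text words matching the keyword (case-insensitive)."""
    words = body_text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w == keyword.lower())
    return 100.0 * hits / len(words)

# Toy page: 2 of 20 words are the keyword -> 10% density.
page = " ".join(["widgets"] * 2 + ["filler"] * 18)
density = keyword_density(page, "widgets")  # 10.0
```

Multi-word phrase density (as in the percentages above) would count phrase occurrences against total words in the same way.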

[edited by: zafile at 11:54 am (utc) on July 6, 2003]


sachac

11:09 am on Jul 6, 2003 (gmt 0)

10+ Year Member



Namaste

You're right. The "new" sites that are now appearing ahead of me have awesome keyword density. One is deliberately so and very spammy. It seems strange that Google would view this as an improvement in their new algo, so this must be a glitch. Remember, GG had stated previously that keyword density was not very important to your keyword ranking. That made sense to me then and still does.

Also, hyphenated keywords in the domain seem to have taken on greater significance.

bekyed

11:19 am on Jul 6, 2003 (gmt 0)

10+ Year Member



Zafile:

Absolutely right.

We too have been sitting at no. 15 for a popular search term; we added more key-phrases to the page and anchor text and poof, we are gone completely.
There is no doubt that Google is punishing sites with too many keywords together, eg: keyword1 keyword2 keyword3

We have now separated the keywords on the page and reduced the density to see what will happen after the fresh bot.

I will let you know my results.

Bek

zeus

11:34 am on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Keyword density, other SEO stuff: I don't think that has anything to do with it.

Every time you are gone from your top rankings you have fallen many pages down, and then you return.
I think it has something to do with the fresh tags and maybe some limited space at Google. Why else would they want to lose good sites from the top rankings? THEY have put you there, so why would they have you gone sometimes? There must be a problem, every time they give fresh tags, some tech/software problem.

zeus

zafile

11:36 am on Jul 6, 2003 (gmt 0)



I believe the main issue with Google is body text and not anchor text.

Remember that Sergey Brin and Lawrence Page have always encouraged using your main keywords in anchor text:

"The text of links is treated in a special way in our search engine. Most search engines associate the text of a link with the page that the link is on. In addition, we associate it with the page the link points to. This has several advantages. First, anchors often provide more accurate descriptions of web pages than the pages themselves. Second, anchors may exist for documents which cannot be indexed by a text-based search engine, such as images, programs, and databases. This makes it possible to return web pages which have not actually been crawled. Note that pages that have not been crawled can cause problems, since they are never checked for validity before being returned to the user. In this case, the search engine can even return a page that never actually existed, but had hyperlinks pointing to it. However, it is possible to sort the results, so that this particular problem rarely happens."

"This idea of propagating anchor text to the page it refers to was implemented in the World Wide Web Worm [McBryan 94] especially because it helps search non-text information, and expands the search coverage with fewer downloaded documents. We use anchor propagation mostly because anchor text can help provide better quality results. Using anchor text efficiently is technically difficult because of the large amounts of data which must be processed. In our current crawl of 24 million pages, we had over 259 million anchors which we indexed." [www7.scu.edu.au...]

Now, if you are abusing the number of times you use your keyword phrase in your anchors, you might be experiencing problems. On the other hand, if you use your keyword phrase in anchor text responsibly, you should be OK. What about your body text? ...
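The anchor-propagation idea quoted above — crediting link text to the page the link points to, not just the page it sits on — can be sketched as a tiny inverted index. Site names here are made up:

```python
from collections import defaultdict

# Each link: (source_page, target_page, anchor_text)
links = [
    ("siteA.com/reviews", "widgetshop.com", "red widgets"),
    ("siteB.com/links",   "widgetshop.com", "buy red widgets"),
    ("siteC.com/blog",    "gadgetshop.com", "blue gadgets"),
]

# Associate each anchor word with the page the link points TO.
anchor_index = defaultdict(set)
for _src, target, text in links:
    for word in text.lower().split():
        anchor_index[word].add(target)

# A search for "widgets" can now return widgetshop.com even if that
# page was never crawled — exactly the property the paper describes.
results = sorted(anchor_index["widgets"])
```

This is also why heavily repeated anchor phrases are easy for an engine to spot: the same word keeps accumulating against the same target.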

claus

12:36 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Mozart:

>> this one single page existed seemingly two times in the index

Again, I confirm this: a new and an old version of the exact same page coexist in the index. It is not the case that just the old version is being shown; both versions are there.

In my case, one shows up for "gadgets widgets" and the other shows up for "widgets gadgets". Searches with any of these two, as well as the "and" word yield weird results.

g1smd:

I just checked, these results are consistently the same across all 9*2 datacenters (including the -xx2) as well as the www-search. No differences at all, that is.

/claus

mil2k

2:08 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Has GoogleGuy or anyone else at Google said that reciprocal linking is not a good thing and/or that we shouldn't do it?

I had seen a PowerPoint presentation about experiments done by Krishna Bharat of Google (of Google News fame), and the experiments suggested 30% (I think!) improved results when discounting reciprocal linking. Anyway, just wanted to say it's not impossible. One of the important things I learnt here is that today's brilliant SEO may be tomorrow's spam. :)
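Discounting reciprocal links would presumably start with detecting them. A minimal sketch of finding A↔B pairs in a directed link graph — site names are invented, and nothing here reflects Google's actual implementation:

```python
# Directed link graph: site -> set of sites it links to.
graph = {
    "a.com": {"b.com", "c.com"},
    "b.com": {"a.com"},          # a.com <-> b.com is reciprocal
    "c.com": {"d.com"},
    "d.com": set(),
}

def reciprocal_pairs(graph):
    """Return the set of unordered {A, B} pairs that link to each other."""
    pairs = set()
    for src, targets in graph.items():
        for dst in targets:
            if src in graph.get(dst, set()):
                pairs.add(frozenset((src, dst)))
    return pairs

pairs = reciprocal_pairs(graph)
# A scoring pass could then down-weight any link whose endpoints form
# a pair in `pairs`, which is the crude form of "discounting".
```

As rfgdxm1 points out below, the hard part is not detection but deciding when a reciprocal link is natural rather than manipulative.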

Tropical Island

2:46 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am one of those suffering a missing index page for our two main regional search terms for one of our sites (not the one in our profile). The index page is not missing for other search terms, just the two high-traffic, relevant terms. Our FAQ page and an activities page are showing around positions 60-70. The index page had been showing up on the other data centres; however, for the last few days it has not.

My first inclination is to start making changes. If in fact this is not a glitch - what do I change? Cut back on the search terms in the body copy? Change the description tag to eliminate the search terms? I don't think so. When you look at the sites with high rankings that's exactly what they have.

I see many things that are not current. A competitor still has backlinks showing that were eliminated two months ago, another site is still showing PR0 when it has existing backlinks, many cache pages are still April & May dates.

I firmly believe that there are still glitches in the system that they are trying to work out, and that in the end our site will return to its proper (IMHO) position as one of the best information sites on our area.

rfgdxm1

2:46 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>I had seen a Powerpoint Presentation about experiments done by Krishna Bharat of Google (Google News Fame) and that experiments suggested 30% (I think!) improved results when discounting reciprocal linking. Anyways just wanted to say it's not impossible. One of the important things I leant here is today's Brilliant SEO maybe tomorrow's Spam.

The problem with this idea is that it often isn't true for niche topics. Consider the case of websites dedicated to some 1970s bubblegum rock band that was only slightly popular. With a topic as narrow as this, there may be only 6 sites, quite possibly even fewer, devoted to it. And it is quite likely most of the sites will link to the others. On a topic like this, there's a good chance the webmasters are in e-mail contact, exchanging notes about the band and such. Thus, if Google wants to maintain relevance, they have to find some way for the algo to take situations like this into account.

Clark

3:58 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So that means that the 4 links at the bottom of this page are discounted? The "newer" "older" "this forum" "global" links all point to pages that point back to each other.

annej

4:32 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



improved results when discounting reciprocal linking

At what cost? So I am supposed to be afraid to link to relevant sites for fear that they have linked back to me. It will be a sad day if SEs come to this.

JoeHouse

4:54 pm on Jul 6, 2003 (gmt 0)

10+ Year Member



Hello Everyone

Question regarding new website and indexing. I have a new website that just got indexed in June 2003.

I was wondering: with every index that comes and goes after the initial indexing, can I expect more traffic to my website?

I was told by professionals in the business that the longer your site is on the internet, the more traffic and links you will get, and then to expect a max-out in about 6 months to 1 year.

Would someone be so kind as to elaborate on this subject? I need to be educated on this so I know what direction I need to go in.

Someone's utmost attention on this subject would be very much appreciated. Happy belated 4th everyone. Thanks!

rfgdxm1

5:35 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>At what cost? So I am supposed to be afraid to link to relevant sites for fear that they have linked back to me. It will be a sad day if SEs come to this.

Interesting way of analyzing this. Many have pointed out that the problem with PR is that it encourages reciprocal linking between unrelated sites, because the webmasters of both figure it'll give them a boost in the SERPs. However, when sites *are* related, reciprocal links quite often happen naturally. If linking isn't being done to manipulate SEs, then site A will only link to site B if there is something relevant on site B for their users. However, if this is the case, then it probably means that it makes sense for site B to link to site A.

Namaste

5:43 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



if you are abusing the number of times in which you use your keyword phrase in your anchor

this is under investigation, and remains a strong possibility... it fits in with the whole SEO neutralisation business

Here are what i think may have been contributing factors. (or at least interesting points.)
- Unusually high keyword density for the main keyphrases
44.4% for main three word phrase
32% for main two word phrase (according to sew density analyzer,
title on metas off)
- The company name is the two word phrase.com
- The url is the two word phrase with no hyphen.
- The three word phrase is 'Lake two word phrase'
- Therefore, the vast majority of incoming anchor is the two word phrase with the three word phrase making up the difference. That means close to 100% of the incoming anchor at least includes the two word phrase, which is listed twice in the title (Three word phrase at two word phrase.com).

this is exactly what I am seeing too, and I think this is what is differentiating the two indexes. But I am also seeing it with hyphens in the URL, which is very disturbing. I am seeing a whole bunch of spammy sites above me because they have done this "red-widgets-texas.com" thing; they have terrible pages loaded with keywords, low PR and various other dirty tricks.

Do you mean only the old, or also at the same time the new title tags, depending on what you searched for?

yes, this is an old Google bug.

Marval

6:10 pm on Jul 6, 2003 (gmt 0)

10+ Year Member



Back at the beginning of this thread, I reported that some 10% of the SERPs for a famous keyword had been dropped. That did propagate to all datacenters; however, now there has been another drop on a few datacenters, with about another 8% dropped. This is not a small amount when you consider that at the beginning of this week Google had 202M pages listed, Wednesday or so it dropped to 180M, and as of today it has dropped to 168M. That's 34 million pages gone... were they spam? I don't really know, as I know only a few of the sites, and they are still in there, bouncing all over the place.

The second observation is that the 3 day cycle has been extended for the holiday weekend; it has become a 5 day placement this time, although the other set of results seems to be on -fi, where it's been the whole time.

steveb

9:57 pm on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"There is no doubt that google is punishing sites with too many keywords together eg: keyword1 keyword2 keyword3"

There is no doubt you have this backwards.

The current batch of results have a load of crap sites that are nothing but disposable keyword insta-sites. Unfortunately this is the way to "beat" Google right now: instant linksmanager garbage links with targeted anchor text; keywords strung together on a page without any context or regard for coherent sentences; keyword-hyphen-domain.

Each relatively "good" bit of data/results/serps that comes out (which would be nice to see now...) tends to send these crap sites, if not to oblivion, at least below the first couple pages of results... which is where in competitive fields any brand new site should be since there is nothing to measure its authority.

Namaste

8:06 am on Jul 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



yep, no way to beat google. But they're their own worst enemy: junk results, no freshness, instability, etc. etc.

How long before users start noticing and start shifting to other search engines?

Napoleon

9:12 am on Jul 7, 2003 (gmt 0)



>> yep, no way to beat google. <<

Yes there is.... but not the content-content-content and good link route followed thus far.

>> But they're their own worst enemy <<

Which follows nicely, because by closing off the traditional 'social contract' route above, they would drive webmasters in another direction, which wouldn't be good for Google (or indeed the web) longer term.

If this actually is the new direction for Google, I believe they have made a real mess of it.... FAR too much collateral damage. However, to be fair, there are probably too many balls still in the air to make a firm judgment just yet.

On balance, I don't believe 2 indices or the ridiculous instability between centers can really be what they want going forward. There's still time for everything to settle as one and the excessive casualties to return.

If this doesn't happen, however, the discontent and confusion presents an excellent platform for the forthcoming MSN and Yahoo offerings to make serious inroads.

zafile

9:51 am on Jul 7, 2003 (gmt 0)



"If this actually is the new direction for Google, I believe they have made a real mess of it..."

I believe the real mess was created in part by the infamous SearchKing lawsuit. It's been only 6 weeks since that mess was clarified by the US legal system.

Prior to May 27, 2003 anyone was entitled to mess around with Google's search results. After May 27, Google is fully entitled to clean its index of fraudulent Web sites.

How long does it take to clean the Google databases, full of information on 3,083,324,652 web pages? It's obvious that as of today Google hasn't accomplished the clean-up task.

Google's new direction might be related as well to a Web search patent it obtained in late February 2003. The patent deals with "an improved search engine that refines a document's relevance score based on interconnectivity of the document within a set of relevant documents."

I hope Google gets its clean-up job done well. But I also hope Google does it soon!

Key events and dates:

October 22, 2002 - Google sued over site ranking [news.com.com...]

November 13, 2002 - SearchKing Google Rank Restored [thewhir.com...]

January 10, 2003 - Google counters search-fix lawsuit [news.com.com...]

February 26, 2003 - Google lands Web search patent [news.com.com...]

May 27, 2003 - Judge dismisses suit against Google [news.com.com...]

claus

7:28 pm on Jul 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



zafile, thanks. Good links :)

February 26, 2003 - Google lands Web search patent [news.com.com...]

The patent linked to by the news.com article is:

6,529,903 Methods and apparatus for using a modified index to provide search results in response to an ambiguous search query

- it is an interesting patent that applies pattern matching to query strings (similar to what is found in modern cellphones for easy entry of SMS messages). It even expands this notion to encompass "querystrings" of an audiovisual nature.

However, the right patent is this one:

6,526,440 Ranking search results by reranking the results based on local inter-connectivity

You can find the details at the US Patent & Trademark Office, Patent Full-Text and Image Database [patft.uspto.gov]

- I've been going through this one, and it is important for SEOs. Right now I'm writing a post about it; I hope it will become a new thread, as it is too off topic for this one, even though the topic headline is actually quite close to the subject matter.

/claus
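For anyone who wants the gist in the meantime: the core idea of 6,526,440 is to re-rank an initial result set by how heavily the top documents link to each other. A rough sketch of that idea only — this is not the patent's actual formula, and the page names and scores are invented:

```python
def rerank_by_local_links(base_scores, links):
    """base_scores: page -> initial relevance score.
    links: set of (source, target) pairs among the result set.
    Re-rank by boosting pages that collect links from OTHER pages
    in the same result set (the "local inter-connectivity" idea)."""
    local = {page: 0 for page in base_scores}
    for src, dst in links:
        if src in local and dst in local and src != dst:
            local[dst] += 1          # a vote from within the result set
    combined = {p: base_scores[p] * (1 + local[p]) for p in base_scores}
    return sorted(combined, key=combined.get, reverse=True)

base = {"p1": 1.0, "p2": 0.9, "p3": 0.8}
links = {("p1", "p3"), ("p2", "p3")}   # p3 is well connected locally
order = rerank_by_local_links(base, links)
# p3 overtakes p1 and p2 despite the lower base score
```

The multiplicative combination here is just one plausible choice; the point is only that links among the top results, not global PR, drive the final ordering.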
