
Google SEO News and Discussion Forum

:-( : Update...false alarm Sept 2005
What *is* an Update?
straticus · msg:771047 · 8:06 am on Sep 4, 2005 (gmt 0)

Continued from:
[webmasterworld.com...]



It seems the backlink update has begun; not surprising after the heavy spidering lately. Good luck, everyone!

 

Rick_M · msg:771407 · 5:36 pm on Sep 9, 2005 (gmt 0)

Not sure this really fits in here, but I was thinking about the large increase in pages showing in the index.

What if Google is counting the same URL more than once - if it is different at different times? The old page would be in the supplemental index, and the new page in the current index. It would be a combination of Google and archive.org. I am guessing Google has the ability to do this, and there would be definite advantages to doing this.

There has been speculation that time factors were introduced with one of the updates (was it Florida?) - and the Sandbox started sometime after that.

It is one factor I'd be interested in looking at to rank websites on quality. The best way to calculate the development of a site over time would require snapshots of the web at different points in time.

There may also be relevant content that was on a site last year but has since been removed. Or a site may have linked to another site from its main page last month, but now links to it from an archive page this month - should the PageRank from the original link still carry some weight? You could get a very elaborate formula for ranking sites if you added a time dimension.

Of course, while the formula may become more elaborate, that doesn't necessarily equate to better search results. Since it may help filter artificial linking schemes and duplicate content issues, though, I think it has potential. If I'm thinking of it, I'm sure someone at Google has considered it at some point. I wouldn't be surprised if it's actually been discussed here at WW before - sorry if it has, or if it is of no interest. I just think it's important to consider all possibilities when trying to understand changes in the SERPs.
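A minimal sketch of the time dimension Rick_M describes, in Python. Everything here is an assumption for illustration - the half-life constant and the exponential decay are invented, not anything Google has published:

import time

HALF_LIFE_DAYS = 180.0  # assumed decay half-life; purely illustrative

def link_weight(base_weight, last_seen_epoch, now=None):
    # Exponentially decay a link's weight by how many days ago
    # it was last seen on the linking page.
    now = time.time() if now is None else now
    age_days = max(0.0, (now - last_seen_epoch) / 86400.0)
    return base_weight * 0.5 ** (age_days / HALF_LIFE_DAYS)

def time_aware_score(links):
    # links: iterable of (base_weight, last_seen_epoch) pairs.
    return sum(link_weight(w, seen) for w, seen in links)

# A link seen today counts fully; one last seen a year ago
# (e.g. moved to an archive page) still counts about 25%.
now = time.time()
print(time_aware_score([(1.0, now), (1.0, now - 365 * 86400)]))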

Murdoch · msg:771408 · 5:54 pm on Sep 9, 2005 (gmt 0)

One thing we have just done is go through a rebranding exercise. That entailed visiting every page and just tweaking the wording slightly and replacing a few images here and there. I wonder if that could be seen as being "gaming Google", which I was reading about here.

Henry, did you do all of this tweaking over the course of one or two days? If that's the case then Google may have put you at the back of the queue while it waits to re-index all of your pages to ensure that you are not actually trying to "game" it. This is why it is generally advised on this forum to do EVERYTHING to your website slowly but constantly. A large spike in any factor throws up a flag. If it's a .tv domain based in the UK it may even take longer, as (granted, this is just an assumption here) the US-based .com domains get first dibs on reinclusion.

HenryWills · msg:771409 · 8:02 pm on Sep 9, 2005 (gmt 0)

Hi Murdoch, the update was done in one blast, as it took me a couple of days to go through the site and alter it. There were new logos, phone number images, etc. that were different enough to cause layout issues. I generally work on a development server anyway and then update in one hit.

The problem, I guess, is that no one was checking just before the update, so no one is really sure when the domain dropped off Google. The only change WE made recently was this, so it triggered a reaction from our client that something must be wrong with the site.

I have advised them to hold tight for a week (or two) and hope that it reappears. The above example of the .tv TV channel website is ominous and does indicate that incorrect assumptions are being made concerning what is a UK site.

steveb · msg:771410 · 9:26 pm on Sep 9, 2005 (gmt 0)

"the large increase in pages"

It's not a large increase in pages, drivel from both Yahoo and Google notwithstanding.

With Google it's a large increase in URLs and dead/nonexistent Supplemental listings.

While basically harmless, the "pages" propaganda put out by the engines should not be repeated. Increased result numbers do not necessarily mean increased pages indexed. In some cases they will, but Google's database is overflowing with two different types of things that are very definitely not pages.

Kirby · msg:771411 · 9:44 pm on Sep 9, 2005 (gmt 0)

Google has either added a significant number of new pages to its index or significantly changed the way that it reports counts.

Google needs to define what it considers to be a page.

HenryWills · msg:771412 · 10:22 pm on Sep 9, 2005 (gmt 0)

RESULT! (or on the way to one)

I think I got a "human" reply from Google. They say that "Our crawlers identify the country that corresponds to a site by factors such as the physical location at which the site is hosted, the site's IP address, and its domain restrict."

My weekend just got a whole lot better, because they then say "If you feel that we're incorrectly detecting the location of this site, please send us the site's IP address and the physical location at which it's hosted."

Done that, now let's see what happens.

joeduck · msg:771413 · 10:36 pm on Sep 9, 2005 (gmt 0)

Google needs to define what it considers to be a page.

Yes indeed. I just found thousands of "pages" from our site indexed at Google. They are simply dynamic URLs, and the 'page' is almost empty - the result of an events search when we had nothing in the database. I'm thinking this may have triggered a duplicate content filter on our entire site, and I'll be trying to remove these from the index.
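One hedged way to keep such near-empty result pages out of the index is to emit a robots noindex tag whenever the query comes back empty. A minimal Python sketch; the function and markup are illustrative, not joeduck's actual code:

NOINDEX = '<meta name="robots" content="noindex,follow">'

def results_page(events):
    # Empty search: carry a noindex tag so spiders skip the page.
    head = NOINDEX if not events else ""
    rows = "".join("<li>%s</li>" % e for e in events)
    return "<html><head>%s</head><body><ul>%s</ul></body></html>" % (head, rows)

print(results_page([]))              # empty result: page is noindexed
print(results_page(["Widget expo"])) # real results: indexable as usual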

reseller · msg:771414 · 6:39 am on Sep 10, 2005 (gmt 0)

Back to business..

These two DCs were very famous during Allegra and Bourbon.

They now show new results for my test keyphrases and BLs.

66.102.9.99
66.102.9.104

Call it everflux, everchanging, rotating algos, reshuffling, update, "constant state of low-level changes" or whatever you like. Who knows for sure these days what things should be called ;-)

texasville · msg:771415 · 7:11 am on Sep 10, 2005 (gmt 0)

>>>At the same time, as G gets ever better at identifying spam,<<<<

If anything, they are getting worse. I am convinced from what I see that they think they wiped out all the "old" ways of spamming and their algos don't detect them anymore. I am seeing "hidden text", bogus "sitemaps" and old blackhat junk prevailing in my sector. I reported a couple of severe ones over 4 months ago... nothing done. I am convinced they are only chasing "new" blackhat, and I am about to prove it. I am buying an old domain and I am going to tweak it to the max. I bet I can take it to the top. Not for any reason but to prove my theory.

phantombookman · msg:771416 · 8:17 am on Sep 10, 2005 (gmt 0)

One of my sites has 1,250 actual pages; it is a vanilla .htm site, no active pages etc.
Six months ago Google listed 3,250 pages for the site.
Now it says over 10,000.

When I clicked through them all to see what the extra pages were, the results actually finished around the 1,200 mark.

As this has only happened on one site, and all of mine are built the same way, it is a puzzler.
Unless of course it is simply Google inflating the size of its database.

reseller · msg:771417 · 8:40 am on Sep 10, 2005 (gmt 0)

phantombookman

>>Unless of course it is simply Google inflating the size of its database <<

And that's what I call a real UPDATE ;-)

Google... You Rock!

Vimes · msg:771418 · 8:53 am on Sep 10, 2005 (gmt 0)

66.102.9.99
66.102.9.104

They look like the old BL totals.
SERPs look very similar as well.

Vimes.

lammert · msg:771419 · 9:58 am on Sep 10, 2005 (gmt 0)

I have found one very interesting site with inflated numbers. This site reports 183,000 pages with the site:www.example.com search, but only 2 pages are visible in the SERPs. Even when the &filter=0 parameter is added, only two pages are reported.

This specific site tried to increase its rankings in the past by adding a lot of pages with static SERPs. I wouldn't call them a scraper, because they are a search engine themselves and have existed longer than Google, but some months ago Google wiped all their static pages from its index. It now seems that Google is using a different approach. Instead of deleting the URLs from the index, it now counts them in the index but doesn't show them in the SERPs. In this way the URL count of 183,000 is valid, but those URLs are not shown in the SERPs because of quality issues.

I think it is a change in Google's approach to sites it does and doesn't want in the SERPs. In the early days the only way it had to punish spam sites was to delete them from the SERPs. In recent months Google has shifted its approach to putting filters in place: not deleting sites, but making them unfindable. This is helped by applying link-weighting algorithms to these pages (which are Google's first line of defense against unnatural links, according to Matt Cutts' blog) to devalue the links on these pages so they won't influence the rankings of other sites too much.

It could be a way to increase the number of indexed URLs to get ahead of Yahoo again, without any decrease in search quality. It is the same approach that Google uses with the URL removal tool. The pages are present in the index and are therefore counted, but there is no way the normal search engine user will see them on the result pages.

Reloading so many pages into the index might influence rankings somewhat, as those pages may still distribute some reputation (PR, anchor text) to other pages. That causes a greater everflux than normal, and is what triggered the pushing of the Gilligan-update false-alarm button.

If my hypothesis is correct that Google is reloading banned sites and URLs into its index without displaying them, this should be visible to some of the members here who reported that one of their sites was removed from Google. Is anyone else out there seeing a large number of URLs in the site: search for their site, but few or no actual listings in the SERPs?

g1smd · msg:771420 · 11:51 am on Sep 10, 2005 (gmt 0)

There was a massive inflation of "reported" numbers in the SERPs around this time last year. It lasted many months, and was only corrected a few months ago.

The numbers re-inflated again just a few days ago. Yes, they are including "all known URLs", even pages that are flagged in their database as "404", "duplicate", "noindex", "excluded", and "hidden".

kamikaze Optimizer · msg:771421 · 1:46 pm on Sep 10, 2005 (gmt 0)

There was a massive inflation of "reported" numbers in the SERPs around this time last year. It lasted many months, and was only corrected a few months ago.
The numbers re-inflated again just a few days ago. Yes, they are including "all known URLs", even pages that are flagged in their database as "404", "duplicate", "noindex", "excluded", and "hidden".

Yep - it's the same "burying Yahoo" numbers game.

kamikaze Optimizer · msg:771422 · 4:06 pm on Sep 10, 2005 (gmt 0)

Googlebot is going crazy on my site right now and has been all morning. In fact, had I not put Googlebot's IPs into my protected IP range file, it would have been auto-blocked as a flood attack.

Here is a sample:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

IP: 66.249.65.77

/modules.php?name=Forums&file=viewtopic&p=20397&highlight= 2005-09-10 @ 11:58:31
/modules.php?name=Forums&file=viewtopic&p=29908&highlight= 2005-09-10 @ 11:58:31
/modules.php?name=Forums&file=viewtopic&p=30058&highlight= 2005-09-10 @ 11:58:30
/modules.php?name=Forums&file=viewtopic&p=14313&highlight= 2005-09-10 @ 11:58:30
/modules.php?name=Forums&file=viewtopic&p=17368&highlight= 2005-09-10 @ 11:58:29
/modules.php?name=Forums&file=viewtopic&p=28885&highlight= 2005-09-10 @ 11:58:28
/modules.php?name=Forums&file=viewtopic&p=23837&highlight= 2005-09-10 @ 11:58:27
/modules.php?name=Forums&file=viewtopic&p=20239&highlight= 2005-09-10 @ 11:58:27
/modules.php?name=Forums&file=viewtopic&p=19339&highlight= 2005-09-10 @ 11:58:26
/modules.php?name=Forums&file=viewtopic&p=13368&highlight= 2005-09-10 @ 11:58:26
/modules.php?name=Forums&file=viewtopic&p=18895&highlight= 2005-09-10 @ 11:58:25
/modules.php?name=Forums&file=viewtopic&p=24211&highlight= 2005-09-10 @ 11:58:24
/modules.php?name=Forums&file=viewtopic&p=15915&highlight= 2005-09-10 @ 11:58:24
/modules.php?name=Forums&file=viewtopic&p=39550&highlight= 2005-09-10 @ 11:58:23
/modules.php?name=Forums&file=viewtopic&p=13604&highlight= 2005-09-10 @ 11:58:23
++++++++++++++++++++++++++++++++++++++++++++++

Anyone else seeing this?
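For what it's worth, an alternative to hand-maintaining a protected IP range is the reverse-then-forward DNS check commonly used to verify Googlebot. A minimal Python sketch; the hook into the flood-protection code is left as an assumption:

import socket

def is_real_googlebot(ip):
    # The PTR record must end in googlebot.com or google.com, and the
    # name must resolve back to the same IP (forward confirmation).
    try:
        host = socket.gethostbyaddr(ip)[0]
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        return ip in socket.gethostbyname_ex(host)[2]
    except (socket.herror, socket.gaierror):
        return False

# Exempt verified crawlers before the flood-attack auto-block fires.
print(is_real_googlebot("66.249.65.77"))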

reseller · msg:771423 · 4:14 pm on Sep 10, 2005 (gmt 0)

kamikaze Optimizer

Have you made changes to several of your pages recently?

kamikaze Optimizer · msg:771424 · 4:19 pm on Sep 10, 2005 (gmt 0)

Yes, but I always have; that would not be a new change.

The index page contains news, which is updated several times a day.

The forums are constantly updating, every minute.

walkman · msg:771425 · 5:18 pm on Sep 10, 2005 (gmt 0)

GoogleBot is crawling my site heavily too. I'm not complaining - just observing :)

g1smd · msg:771426 · 5:28 pm on Sep 10, 2005 (gmt 0)

You could simplify the URLs by which your pages are accessed quite a lot. There are parameters in those strings that are not needed.

Additionally, should some other page carry those long URLs with the parameters in a different order, that will be duplicate content and can cause all sorts of problems.
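A minimal sketch of the sort of clean-up g1smd suggests: drop the parameters that don't change the page (highlight= in the crawl sample above) and sort the rest, so parameter order can never mint a second URL for the same content. The NEEDED set is an assumption for illustration:

from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

NEEDED = {"name", "file", "p"}  # assumed: the params that pick the page

def canonical(url):
    # Keep only needed params, sorted, so ?a=1&b=2 and ?b=2&a=1 collapse.
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in NEEDED)
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(canonical("/modules.php?name=Forums&file=viewtopic&p=20397&highlight="))
# -> /modules.php?file=viewtopic&name=Forums&p=20397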

fischermx · msg:771427 · 7:50 pm on Sep 10, 2005 (gmt 0)

If this isn't an update, why did a website that had been unindexed, uncached, and without backlinks for 6 months become visible and start ranking for some terms?
Isn't including new websites an "update"?

nickied · msg:771428 · 7:56 pm on Sep 10, 2005 (gmt 0)

phantombookman:

One of my sites has 1,250 actual pages; it is a vanilla .htm site, no active pages etc.
Six months ago Google listed 3,250 pages for the site.
Now it says over 10,000.

When I clicked through them all to see what the extra pages were, the results actually finished around the 1,200 mark.

As this has only happened on one site, and all of mine are built the same way, it is a puzzler.
Unless of course it is simply Google inflating the size of its database.

Small site here. Very heavy spidering before the recent "update" and currently. Page numbers here are also grossly inflated.

Using the yourcache tool in mid-June, I had around 1,000+ pages, including those with non-www problems. g1smd made good suggestions to get rid of them with 301s. sitemap.xml also helped, and I lost my URL-only listings. Page numbers returned to about right.

Beginning of July: non-www gone, pages tripled to around 3,600.

September 7: nearly tripled again to +/- 10k.

Looking for the cause, as phantombookman did, I find about the right number of pages. However, many are back to URL-only, with 2 different cache dates from months ago.

The cache dates from months ago include many odd pages which were spidered, such as offset=-375 (that's a minus), where offset=5 would be my second page of 5 widgets. (Links were all checked here and are correct.) Googlebot kept requesting non-existent pages, and these are now in the 10k count. Many good pages are back to URL-only.
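One defensive fix for the offset=-375 requests nickied describes is to refuse impossible offsets with a 404 rather than render a page for them. A minimal Python sketch under that assumption; the handler shape is hypothetical:

PAGE_SIZE = 5  # nickied's pages show 5 widgets each

def page_for(offset_param, total_items):
    # Serve only offsets that are non-negative multiples of the page
    # size and inside the catalogue; everything else gets a 404.
    try:
        offset = int(offset_param)
    except (TypeError, ValueError):
        return 404, None
    if offset < 0 or offset % PAGE_SIZE != 0 or offset >= total_items:
        return 404, None
    return 200, list(range(offset, min(offset + PAGE_SIZE, total_items)))

print(page_for("-375", 42))  # (404, None): nothing for Googlebot to index
print(page_for("5", 42))     # (200, [5, 6, 7, 8, 9]): second page of widgets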

lammert · msg:771429 · 8:02 pm on Sep 10, 2005 (gmt 0)

Isn't including new websites an "update"?

No, new pages and sites are included all the time, and existing sites in the index may rise or fall in the SERPs based on newly recognized links and other data. This is everflux.

Google does not consider a change of the data inside the index to be an update, even if that change causes significant SERP movement in some niches. They define an update as the situation where - even when all data in the index remains constant - the SERPs change because of algorithmic changes. For us outside the Googleplex it is sometimes difficult to recognize what IS an update and what isn't.

g1smd · msg:771430 · 8:10 pm on Sep 10, 2005 (gmt 0)

Whoa! This is different....

That site I keep mentioning, which had the 301 redirects added in March and the listings sorted out within a few weeks (refer back to earlier threads for more information), has been listed only as non-www for quite a while now.

Several days ago the listings changed again.

When you do a site:domain.com search you get to see all 120 non-www content pages with full title and description listings (as well as a load of URL-only non-www listings that robots.txt should be excluding - these are "admin" pages that surfers do not need to see in the SERPs at all, but they did recently re-appear).

When you do a site:www.domain.com search you get to see 70 www pages with full title and description. These are the www versions of the "admin" pages that robots.txt should be excluding (and which were in fact out of the index for 90 days from April to July after I used the robots.txt Google Removal Tool on them in late April).

This is the new bit: those 70 www pages that show up in a site:www.domain.com search do not show up when you do a site:domain.com search! How does that work?

There is some sort of disconnect between www and non-www that wasn't there before.

g1smd · msg:771431 · 8:25 pm on Sep 10, 2005 (gmt 0)

The 301 redirect is from www to non-www on this site.

Last week, site:www.domain.com showed 90 URL-only www entries, and these were also included in the 430 site:domain.com listings (as 340 non-www and 90 www pages). Of the 340 non-www pages, 120 were content pages fully indexed with title and description; the rest (about 220) are site "admin" pages that are excluded by robots.txt and showed up as URL-only entries.

The 70 www pages that show up in a site:www.domain.com search now are different pages from the 90 above last week, and the 70 www pages now have full title and description (the 90 were URL-only). The site:domain.com search shows 340 entries, none of which are www pages.
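For readers following along, the www-to-non-www 301 g1smd keeps referring to can be expressed as a tiny WSGI middleware. This is a sketch only - his site almost certainly used server configuration such as mod_rewrite, and query strings are ignored here for brevity:

def redirect_www(app):
    # 301 any www.host request to the bare-host equivalent so only
    # one canonical hostname ever serves content.
    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "")
        if host.startswith("www."):
            location = "http://%s%s" % (host[4:], environ.get("PATH_INFO", "/"))
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        return app(environ, start_response)
    return middleware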

nsqlg · msg:771432 · 8:34 pm on Sep 10, 2005 (gmt 0)

IMHO Google's duplicate filter is going crazy.

Google master-techs, the index urgently needs cleaning!

1. 404/410 pages: to the trash!
2. Pages denied by noindex/robots.txt: if the OWNER denies them, forget their content.

Is it "do no evil" to penalize legitimate sites because Google keeps old pages for years?

lammert · msg:771433 · 10:30 pm on Sep 10, 2005 (gmt 0)

There is some sort of disconnect between www and non-www that wasn't there before.

Maybe this is the light at the end of the 301/302 tunnel. Google has had a canonical URL problem for a long time now. If they have found a good algorithmic solution, it might be that they first need to delete all current canonical relations and then rebuild the canonical URL database from scratch with the new algorithm. Your discovery of the missing link between the www and non-www versions might be a sign that they have started to rebuild that database.

The disconnect between www, non-www and other canonical URLs might also be the reason for the sudden increase in reported page counts and the greater-than-normal everflux. Until now, Google probably counted www and non-www versions as one page, whereas it now treats them as separate pages. If they are currently rebuilding the canonical URL database, the page numbers should decrease to normal within the coming days.

Keeping my fingers crossed...

cws3di · msg:771434 · 11:47 pm on Sep 10, 2005 (gmt 0)

Special treatment for www only?

If it were true that G is separating www from non-www (I don't see this at the moment), then wouldn't it logically follow to separate all sub-domains?

www.example.com
abc.example.com
xyz.example.com

Or are we all just hung up on the www situation because it may be considered "duplicate" content?

Has anybody else had trash show up in their site: searches containing //? That is also perfectly legitimate "duplicate" content.

www.example.com//page1.html is always exactly the same as www.example.com/page1.html
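A hedged sketch of one way to fold those doubled-slash variants back onto the canonical path before they are ever served (illustrative Python, not anyone's production code):

import re

def collapse_slashes(path):
    # //page1.html and /page1.html serve identical content, so 301
    # any multi-slash path to its single-slash form.
    canonical = re.sub(r"/{2,}", "/", path)
    return None if canonical == path else canonical

print(collapse_slashes("//page1.html"))  # "/page1.html": issue a 301 here
print(collapse_slashes("/page1.html"))   # None: already canonical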

Hope I didn't start a panic here...

Some days I just shake my head and want to retire. But I'm too poor, because I do legitimate sites for little brick-n-mortar mom-n-pop businesses.

kamikaze Optimizer · msg:771435 · 2:30 am on Sep 11, 2005 (gmt 0)

You could simplify the URLs by which your pages are accessed quite a lot. There are parameters in those strings that are not needed.
Additionally, should some other page carry those long URLs with the parameters in a different order, that will be duplicate content and can cause all sorts of problems.

Thanks g1smd.

Those pages are actually *.html pages with my rewrite mod.

My "bots" file reports the true url, but that is not how Google Bot or any other user agent see's the site.

needinfo · msg:771436 · 12:52 pm on Sep 11, 2005 (gmt 0)

I'm not saying this is an update at all, but something is definitely happening to some of my sites in the travel sector. Some are affected in different ways on different DCs. For example, I have:

1. some out of the index altogether.
2. some with no title or page description shown.
3. some which have dropped from the top page to page 4 or worse.

Is anyone else seeing their sites go out of the index then come back? Is anyone else seeing descriptions and page titles being dropped? And is anybody else seeing a drop in placement in roughly the same order?

reseller · msg:771437 · 1:17 pm on Sep 11, 2005 (gmt 0)

needinfo

>>I'm not saying this is an update at all but something is definately happening to some of my sites from the travel sector. Some are effected different ways on different DCs, for example I have :.....<<

As far as I could read on different threads, travel sector hasnīt been affected by the movements on the DCs recently. Maybe its happening now.
