
Google Update Bourbon Part 3

     
8:35 pm on May 27, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Dec 18, 2004
posts:321
votes: 0


Continued From:

[webmasterworld.com...]



My whole site has a new cache date of May 25th. Maybe once these other sites around me get recached, I won't hold such an honorable top position. But at least Google has found my pages worthy to sit in the Search again. :) It seems strange to look at the stats and see Google in there, after 6 months of just seeing Yahoo and MSN referrals.

My website has plenty of outbound links, but they are on relevant pages. The problem my site has always had was a lack of "inbound links." I got tired of searching for people to link to me (with all the spammy sites around) and gave up. So my pages have acquired some links naturally, I guess (and I'll bet I still don't have more than 30 inbound links for the whole site). I still have a PR4, which I've had since it disappeared in Nov.

[edited by: Brett_Tabke at 8:54 pm (utc) on May 27, 2005]

4:10 pm on June 1, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 21, 2005
posts:2264
votes: 0


Adsense is an expense to Google. It is the Adwords advertisers that generate the jackpot for everybody to take their cut.

That's what I used to think. Lately I've started to realise just how vital the Adsense network is. Less than 5% of the time people reach their destination through search. The rest is through links... and tapping into that market is the only way Google can massively increase the eyeball count. Advertisers can go - others will take their place - it's the publishers who are key. I speak as both an advertiser and publisher.

But that is OT, sorry.

Dayo_UK

4:10 pm on June 1, 2005 (gmt 0)

Inactive Member
Account Expired

 
 


EFV

Well yes - hence best case scenario. I suppose there could be a legitimate reason for the use of the sub-domains. Just an example of how I wanted to underline a theory as it relates to non-www and www ;)

But going back to your case and mine: remember, www is just a sub-domain, and Google obviously did not think they were the same site, so we had duplicate content on the www and the non-www according to G's algos.

Anyway - hopefully changes are afoot to combat canonical url probs.

Also - in light of GGs comment - your eye of the storm theory might be correct.

4:32 pm on June 1, 2005 (gmt 0)

New User

10+ Year Member

joined:June 6, 2003
posts:25
votes: 0


I've managed to spot a change in the algo which may be playing a part in some of the irrelevant results talked about here.
For 2-word search terms, pages which relate to just one of those terms but merely contain the other term somewhere are beating pages which relate to both terms (ignoring all other factors).

So, for example, if there is a page about a film star with lots of backlinks and a high PR, it will beat a page about another topic containing one of the terms in his name which has fewer backlinks and lower PR.

I noticed this because I have been top-ranking purely on content, PR and internal links for some uncompetitive 2-word terms (term1 term2) where term1 is constant. I've kept all the positions where term2 does not really mean anything in its own right, but have dropped considerably in cases where term2 has another meaning.

It looks like Google is giving more weight to the individual rankings of terms in multi-term phrases.

MyWifeSays:

Yes!

This is exactly what I am seeing.

Now if only Google will fix it.
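
To make that hypothesis concrete, here is a toy scoring sketch (invented function and weights, not Google's actual algorithm): if merely containing a query term scores the same as being about it, authority decides the tie, so the high-PR off-topic page wins.

# Toy model of the hypothesis above -- NOT Google's real scoring.
# If containing a query term at all counts the same as being about it,
# authority (PR) decides, and the off-topic high-PR page wins.
def toy_score(page_terms, query_terms, pagerank, pr_weight=10.0):
    matched = sum(1 for t in query_terms if t in page_terms)
    return matched + pr_weight * pagerank if matched else 0.0

query = ["term1", "term2"]
# High-PR page about term2's other meaning, merely containing term1:
off_topic = toy_score({"term1", "term2", "filmstar"}, query, pagerank=0.8)
# Lower-PR page genuinely about "term1 term2":
on_topic = toy_score({"term1", "term2"}, query, pagerank=0.2)
print(off_topic > on_topic)  # True: authority outweighs full relevance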

4:44 pm on June 1, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 8, 2003
posts:397
votes: 0


> Surely you wouldn't argue that the site owner wasn't trying to spam the index?

Here's a legitimate reason why multiple subdomains don't necessarily constitute a spamming attempt - cobrands. Back in 99-00, we used to private label our service out to other sites, resulting in identical sites under different brands.

All of these cobrands "died on the vine" and we ended up switching to a destination site model in 2001.

What we've learned during Bourbon is that some of the pages from these cobrands are still out there and are outranking the primary domain's pages. We've only just 301'd them back to the primary domain.

My understanding of the duplicate content penalty has always been that it's not a penalty at all - Google simply ignores page duplicates, and guesses at the primary page for indexing purposes. Our experience would reinforce this.

That's not to say that Google hasn't now decided to penalize all versions of identical pages.
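
For what it's worth, here is a toy sketch of that "ignore duplicates, guess a primary" behavior (an assumed mechanism for illustration, not Google's documented method):

# Toy duplicate handling as described above: identical bodies collapse
# to one guessed "primary" URL. Hypothetical, for illustration only.
import hashlib

def pick_primaries(pages):  # pages: list of (url, html) tuples
    primary = {}
    for url, html in pages:
        digest = hashlib.md5(html.encode("utf-8")).hexdigest()
        primary.setdefault(digest, url)  # first URL seen becomes the guess
    return set(primary.values())

cobrands = [("http://brand-a.example/page", "<html>same content</html>"),
            ("http://brand-b.example/page", "<html>same content</html>")]
print(pick_primaries(cobrands))  # only one cobrand URL survives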

4:49 pm on June 1, 2005 (gmt 0)

New User

10+ Year Member

joined:Nov 21, 2004
posts:20
votes: 0


Thanks GoogleGuy! I bet your thread is going to be the most-read thread at WebmasterWorld.

I'm still trying to figure out what to ask you guys at WebmasterWorld in New Orleans. I have a lot of questions, and this update will be among them.

4:53 pm on June 1, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 8, 2004
posts:527
votes: 0


For 2-word search terms, pages which relate to just one of those terms but merely contain the other term somewhere are beating pages which relate to both terms (ignoring all other factors).

Know what would be neat? Could we test this with a 3-word search query and a 4-word search query? Sticky me for instructions; I'll crunch some numbers for my familiar searches and report back!

5:05 pm on June 1, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 8, 2005
posts:146
votes: 0


On the www vs. non-www topic... I can guarantee 99% of webmasters do not know about this issue, so roughly 99% of websites out there should be having trouble with it. It is also the standard setup at 99% of hosting companies that the www (www.###.com) and non-www (###.com) hostnames resolve to the same IP address. So why should Google have problems with this? I do not see discussions on this for MSN, Yahoo, or Ask. Why is Google special in this case?

For our site, the Google page count is incredibly inaccurate. The site: command says we have five times more pages than we actually have. With Yahoo, MSN, and Ask, all are right on target. Even when I have written Google, I get the canned response - "Number of pages indexed can fluctuate and rankings can change."

Once and for all, GoogleGuy should come out and say whether this is an issue or not. If it is, people can take measures to fix it. If not, we can move on to something new.

Something must be very flawed with Google if I have to constantly worry about www vs non-www, absolute URLs, 301 vs 302 redirects, putting a trailing slash on URLs, etc...

Just my thoughts.

5:25 pm on June 1, 2005 (gmt 0)

Full Member

joined:Jan 12, 2004
posts:334
votes: 0


Clint
You may wish to view the following 3 threads concerning 301:

[webmasterworld.com...]

[webmasterworld.com...]

[webmasterworld.com...]


Thanks, I got the www vs non-www issue and 301 redirects resolved. If anyone needs it to fix the same issue, this is what worked for me in cPanel's .htaccess file:

RewriteCond %{HTTP_HOST} !^www\.domain\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*) http://www.domain.com/$1 [L,R=301]

That needs to go under your "RewriteEngine on" line if you already have one. If not, "RewriteEngine on" needs to be placed above it. This will direct non-www requests to the www domain, and a header check shows the response as "HTTP/1.1 301 Moved Permanently". So this should fix any duplicate-content issue between the non-www and www versions of the site.
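
If you want to confirm the redirect is behaving, here is a minimal check (a sketch in Python; "domain.com" is the same placeholder as in the rules above):

# Minimal sanity check: the non-www host should answer 301 with a
# Location header pointing at the www host. http.client does not
# follow redirects, so the raw status line is visible.
import http.client

conn = http.client.HTTPConnection("domain.com")  # placeholder domain
conn.request("HEAD", "/some-page.html")
resp = conn.getresponse()
print(resp.status, resp.reason)       # expect: 301 Moved Permanently
print(resp.getheader("Location"))     # expect: http://www.domain.com/some-page.html
conn.close()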

5:42 pm on June 1, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 19, 2005
posts:367
votes: 0


oddsod,

Advertisers can go - others will take their place - it's the publishers who are key.

I see your point. To remain on topic...

As the algos change during Bourbon, sometimes I see scraper directories with AdSense dominating, and at other times I see clean serps without them. The primary focus of Bourbon seems to be sorting out true AdSense publishers from these scraper directories.

It defeats the purpose of using Google if you search for "widgets" and get an index of directories cataloging the sites that sell those widgets. It gives the impression that Google is abdicating its responsibility and its purpose for existence as a search engine.

On the other hand... an AdSense publisher that writes product reviews for widgets, with AdSense ads pointing to widget suppliers, would be of great benefit to the searcher. Unfortunately, scraper directories are crowding out the true AdSense publishers and widget suppliers alike.

The dilemma for Google is how they are going to tell the good AdSense publishers from the bad. It seems that with Bourbon there has been a lot of collateral damage.

I am one of those widget suppliers and tired of seeing my keyword serps with so much of this "pollution"... not to mention my original content being scraped.

6:07 pm on June 1, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 3, 2004
posts:59
votes: 0


I am one of those widget suppliers and tired of seeing my keyword serps with so much of this "pollution"... not to mention my original content being scraped.

I sort of envy you - I would love it if my site were being scraped. :(

Right now my site has dropped mostly 40-50 places, and that is for unique content which can be found only on my site - every other webpage that has a link to me got pushed in front of me!? :o

And if that wasn't enough, if I try to do a search for my domain name ("domain.tld") it is lost at the bottom of page 6, as if someone at Google really hates me.

A few pages are still amongst the first ten, but only because my site is the only one on the net which contains that information; even here, Google tries to put anything that could possibly be related in front of my website. :(

RS_200_gto

6:12 pm on June 1, 2005 (gmt 0)

Inactive Member
Account Expired

 
 


Google's stock is up as our sites and traffic go down!
6:15 pm on June 1, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 19, 2003
posts:804
votes: 0


<message>

oldpro, I trust you got my sticky. Was the information helpful?

<rant>

As for what/who is on first, we still have about a week more to this dance.

<soapbox>

It is the advertisers who are footing the bill, it is the publishers who are providing the information that a potential buyer reads while deciding whether item 1 or item 2 meets his needs, and it is Google who is matching the page content to the advertisers.

In this situation all parties derive a portion of the $ stream: the advertiser from the buyer, the publisher from Google, Google from the advertiser. The buyer doesn't get a portion of the $ stream, but has been able to evaluate various items and then make an informed purchase.

</soapbox>

If, however, the searcher can't find information other than just an ad, he'll go elsewhere.

</rant>

So we have our answer at last: the fat lady is still in the building, warming up for the closing act. In lava speak, it ain't crusted over yet.

Awaiting GoogleGuy's very own thread.

</message>

6:22 pm on June 1, 2005 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


We'll see soon, but if Google is introducing more "improvements", doesn't that mean that more sites will be caught by the new "improvements"?
Now, if they're changing or replacing the current filters...

[edited by: walkman at 6:24 pm (utc) on June 1, 2005]

6:22 pm on June 1, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 19, 2005
posts:367
votes: 0


bear... I now consider you a friend...

Yes, I did get your sticky and it was very helpful... it put my fears to rest.

As for the duplicate content issue from page to page...

I am not sure how I can get around this, as my website deals with one thing. I guess the only thing that counts is whether my customers find it helpful, and hopefully Google will not turn the filter up too high on me.

As they say across the pond...

Cheers

6:54 pm on June 1, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Feb 6, 2005
posts:1858
votes: 106


oldpro

>The dilemma for Google is how they are going to tell the good AdSense publishers from the bad. It seems that with Bourbon there has been a lot of collateral damage.<

It isn't as difficult as some might think.

- First, define what a "bad" AdSense publisher is.

- Find examples of bad AdSense publishers and study their sites.

- Write killer algos to remove the garbage from the index.

If Google has a problem with the first two points, they can just come here and ask for feedback. Most of us AdSense publishers are willing to help.

BTW, does any of you have a detailed definition of a "bad" AdSense publisher?

7:05 pm on June 1, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 8, 2003
posts:116
votes: 0


Been thinking (well, wishful thinking actually) about GG's post about the update only being 4/7ths done.

I wonder if current PR data still needs to be brought in?

Two things make me think this is a possibility:

1) 2 of my sites which last month made it into the index on the back of new links have dropped in ranking on uncompetitive terms.

2) I remember comments made by GG about one way they test their new algos. They work behind the scenes on a fixed environment, and when it comes to releasing the algo they release this environment together with the new algo to see if it works in the real world. I remember him saying on an earlier update that the PR data was old and they would bring it in once they were happy with the new algo.

Here's hoping.

7:09 pm on June 1, 2005 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


>> I remember him saying on an earlier update that the PR data was old and they would bring it in once they were happy with the new algo

I doubt that the huge dropouts on this update are related to gaining or losing a PR point.

7:11 pm on June 1, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Aug 27, 2003
posts:570
votes: 0


Ummm, get rid of bad AdSense sites by not allowing them in the first place! Maybe use the media bot as a way to flag a site for inspection.
7:28 pm on June 1, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 19, 2005
posts:367
votes: 0


My definition of a "bad AdSense publisher":

One that targets an AdWords advertiser or group of advertisers... scrapes their content without adding any original or useful supplemental information... the page is nothing but a pseudo directory optimized for the target keywords... pollutes the serps and acts as nothing but a glorified doorway to the advertisers.

Although not defined as such, it has the same effect as hijacking. It serves no purpose but to butt in and get a piece of the action.

Then there are the "dark side" scraper directories with AdSense that actually do 302 hijacks to steal your PR.

These are the first that I think Google should give the death penalty to.

7:35 pm on June 1, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 8, 2003
posts:116
votes: 0


walkman,

I agree, I didn't mean it explained everything.

My earlier theory might explain some of the major dropouts.

That is, when considering on-page content, Google now seems to be happy with a single on-page occurrence of each term in a multi-term search phrase, whereas before, multiple occurrences of individual terms and their proximity were more important.

This results in far more pages appearing in the initial result set for a phrase, allowing irrelevant but high-ranking (in terms of PR, backlinks and site rank) pages to compete with your more relevant page in subsequent stages of ranking.

I have a vague feeling of deja vu here. Wasn't there an earlier update where we saw a lot of irrelevance from Google? People were talking about "over-optimisation penalties". Didn't GG say there was still more data to be added, and then after a few weeks things improved?

7:45 pm on June 1, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 8, 2005
posts:146
votes: 0


It is easy to identify bad AdSense publishers... A very simple plan:

- Premise: The AdSense database knows every web page that ads are on. Let's just say ads are placed on 50 million pages.

- Hire 500 temp workers. These do not need to be highly skilled workers. Provide them with templates of what a scraper site or other bad site looks like.

- Write a program that pops up one URL at a time in front of the worker. The worker compares it to the bad templates. If it looks bad, they mark it for a second review. If it does not pass the second review, the site is thrown out of the AdSense program and flagged for removal from the main Google index.

- On average, each worker can look at 100 pages per hour. With 500 workers, that is 50,000 pages per hour, or about 400,000 pages per 8-hour day. Run a 24-hour operation and it's 1.2 million pages per 24-hour period. This means you would complete a full review of all AdSense pages within about 1.5 months (a quick sanity check of these numbers appears below).

The following issues would be addressed:
- It would reward good publishers. Ad inventory going to bad sites would now be available for good sites, possibly increasing CPM.
- It would increase the credibility of the overall AdSense program. They should have been pickier from the start, and should not have allowed the rule that you can use your publisher ID on any site.
- AdWords advertisers would feel good that their ads are not being pushed to bad sites.
- It would help the economy by putting 500 people to work. Google says it gets 1 million resumes a day; I am sure they would not find it hard to find 500 people to do this. They could even do it from home.

Sometimes technical solutions don't cut it.
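
For what it's worth, a quick sanity check of the arithmetic above (a Python sketch; all figures are the post's assumptions, and as a later reply notes, a true 24-hour rotation of 500 people means only about 167 per shift):

# Back-of-the-envelope check of the review plan above.
# All figures are the post's assumptions, not real AdSense data.
TOTAL_PAGES = 50_000_000       # pages carrying AdSense ads (premise)
WORKERS = 500                  # reviewers on duty at once
PAGES_PER_WORKER_HOUR = 100

per_hour = WORKERS * PAGES_PER_WORKER_HOUR  # 50,000 pages/hour
per_8h_day = per_hour * 8                   # ~400,000 pages/day
per_24h = per_hour * 24                     # 1.2 million pages/day
days = TOTAL_PAGES / per_24h                # ~42 days, about 1.5 months
print(per_hour, per_8h_day, per_24h, round(days, 1))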

7:46 pm on June 1, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 19, 2005
posts:367
votes: 0


I am looking at a "bad AdSense publisher" right now that is occupying a top-five position in my keyword serps...

It is a directory of sorts that has an index of popular search categories. Basically, it is structured like this: click on a category and it brings up a list of popular websites in that category. The page templates are all the same... ads by Goooooogle on the right side, and at the bottom of every page in every category are links to online poker, via@ra, matchmaking, etc. At the top of each category listing is content that a 6th grader could tell has been scraped from the websites listed.

This garbage has no redeeming value. It is obviously written for AdSense and PPC for gaming, drugs and dating sites.

Digging a little further... this same website is in most of the top 10 serps for the targeted keywords in each category listing.

7:49 pm on June 1, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 18, 2003
posts:191
votes: 0


wiseapple,

I doubt each of the workers will work 24 hours per day. With three shifts, only 500/3, or about 167, workers can cover each shift.

7:57 pm on June 1, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 5, 2002
posts:1382
votes: 0


>..pollutes the serps and acts as nothing but a glorified doorway to the advertisers.

Isn't that what the advertiser wants?

7:58 pm on June 1, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 22, 2003
posts:118
votes: 0


You only need "the workers" to examine and dump the scraper sites in the top 10.....

Once every 60-90 days.

7:58 pm on June 1, 2005 (gmt 0)

New User

10+ Year Member

joined:June 6, 2003
posts:25
votes: 0


Google doesn't need to hire all of those people -- they could just start reading their spam reports...

[google.com...]

[Admins: Please delete the Google URL if it violates site policy.]

8:03 pm on June 1, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 19, 2005
posts:367
votes: 0


Mhes,

Isn't that what the advertiser wants?

Good point, but the organic serps in theory should be what the searcher needs or wants.

8:10 pm on June 1, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 5, 2002
posts:1382
votes: 0


>but the organic serps in theory should be what the searcher needs or wants.

But if Google sees people clicking the scrapers in the organic results and then clicking the AdSense on a scraper site... the evidence is that the user is finding the site they want, AND Google is making money indirectly from the organic results.

Win-win.

8:12 pm on June 1, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member billys is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 1, 2004
posts:3181
votes: 0


It's funny that GoogleGuy mentioned that we should stop worrying about ranking right now. After 12 months in the sandbox, I've all but given up on Google. I mean, how sad is this - that I would check my Google rankings each day hoping I would get out of the sandbox.

For the last month I got a total of about 30 referrals from Google, mostly on exact phrases (in quotes). Often I am the only result. The only phrase I now check is my name (somewhat unique, and no one would optimize for it, not even me). I rank ~450, although a couple of DCs had me at 780+.

I think they will eventually figure out that the site is legitimate. I can't imagine my site has a penalty. Several members here were nice enough to provide me with good comments. None came back with any warnings (they shouldn't have, since it's white hat).

If it's due to links, then I'm doomed. Scrapers love my site because it does well in Yahoo.

8:26 pm on June 1, 2005 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


As far as AdSense: they can flag via the algo the ones that match, let's say, 95% of a scraper site's characteristics, and e-mail the webmaster with a one-week notice of suspension (unless you e-mail and clear it with them).

They can also put some sites under a "suspicious" category and check them manually. No need to check every page of a site. If you run a site: command, you can see what's hiding in there. Domain.com/high-paying-keyword/high-paying-keyword.htm tends to raise a few alarm bells.

This way the serps are cleaned up and no innocent site gets caught (because you have a week to e-mail them and state your case). To avoid a backlog, they can issue warnings in stages: by IP range, category, country, etc.
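
As a rough illustration of that site: command heuristic, here is a sketch (the keyword list and rule are invented for illustration; nothing Google has confirmed):

# Hypothetical heuristic: flag URLs whose path repeats a money-keyword
# segment, e.g. domain.com/kw/kw.htm. Purely illustrative.
from urllib.parse import urlparse

HIGH_PAYING = {"insurance", "mortgage", "poker"}  # invented example list

def looks_suspicious(url):
    segments = [s.rsplit(".", 1)[0]  # strip .htm/.html extensions
                for s in urlparse(url).path.strip("/").split("/") if s]
    repeated = len(segments) > len(set(segments))  # a path segment repeats
    moneyed = any(kw in seg for seg in segments for kw in HIGH_PAYING)
    return repeated and moneyed

print(looks_suspicious("http://domain.com/cheap-insurance/cheap-insurance.htm"))  # True
print(looks_suspicious("http://domain.com/articles/widget-care.htm"))             # False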

This 789 message thread spans 27 pages.