
Google News Archive Forum

    
Google Condones Cloaking
Dust off those IP lists.....
WebGuerrilla




msg:211942
 5:48 pm on May 28, 2004 (gmt 0)


[searchenginewatch.com...]

 

BigDave




msg:211943
 6:45 pm on May 28, 2004 (gmt 0)

Google has always condoned certain cloaking. In fact, they recommend it in regards to session IDs.

The way I look at it is that Google doesn't have to be fair. It does not have to apply the same rules to me that it does to NPR.

It's their index and they can do with it as they please. I have similar favoritism policies. I am quite willing to get up and get a drink for a pretty young thing, but hairy legged old boys can wait till I am getting up to get one for myself or go get it themselves.

To tell you the truth, I would rather get the transcripts from NPR than the audio most of the time, so I wish they would offer both.

shady




msg:211944
 6:54 pm on May 28, 2004 (gmt 0)

This is a very valid and fair reason for cloaking.

"Cloaking" becomes a "dirty word" only when misused by those who are abusing and defacing our lovely search engines with their rubbish!

quotations




msg:211945
 7:01 pm on May 28, 2004 (gmt 0)

Most of the cloaked .edu sites are allowed through as well.

junai3




msg:211946
 8:15 pm on May 28, 2004 (gmt 0)

If you do something with a website, and there is good reason for it, then Google most likely won't penalize you. In the case of NPR, their cloaking has valid reasoning behind it.

Using cloaking strictly to trick search engines, on the other hand, can get you in big trouble.

john_k




msg:211947
 8:27 pm on May 28, 2004 (gmt 0)

Since the content is actually there, available to the web visitor, is it even cloaking? I mean they aren't pretending to have content that's not there. And they aren't providing content to Google that is in no way related to what is on their site.

This seems to be more of a "translation" or conversion.

For other sites that have audio files, but that can't get the attention of google, it seems that it would be smart (and honest) to include a link to a transcript of the audio. That would be useful to both google and your visitors.

itisgene




msg:211948
 9:12 pm on May 28, 2004 (gmt 0)


"Google has always condoned certain cloaking. In fact, they recommend it in regards to session IDs. "

Hi, BigDave,
Do you happen to have a bookmark for the kinds of cloaking that Google has OK'd?

We are considering detecting the User Agent to deal with the session ID issue. We are giving session IDs to all visitors now, and I believe it is hurting our SE-friendliness.

Can you point me to the posting?

makemetop




msg:211949
 9:12 pm on May 28, 2004 (gmt 0)

Dust off the IP lists?

Some of us never put them away ;)

Youch - shouldn't have said that - too many drinks on a Friday!

BigDave




msg:211950
 9:53 pm on May 28, 2004 (gmt 0)

[google.com...]

Second bullet point below Technical Guidelines is

Allow search bots to crawl your sites without session ID's or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page

The way that some people talk, you would think that dynamic content = cloaking. As someone recently pointed out, Google serves me different HTML code than they do to someone who uses IE as their browser. The real content is the same, but the code they output is different based on my UA.

Go ahead and do some minor things, just don't go overboard with your cloaking. Or if you do go overboard, don't cry about it if you get dinged for it.
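
For anyone wondering what the session ID case looks like in practice, here is a rough sketch in Python. It is only an illustration, not anything Google has published: the parameter names in SESSION_PARAMS, the UA tokens, and the function names are all made up for the example, and a real site would use whatever its own session handling calls them. The page content stays the same for everyone; only the tracking argument is left off the links a bot sees.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed names for illustration only; use your site's real session parameters.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}
# Assumed substrings of crawler User-Agent strings (lowercase).
BOT_UA_TOKENS = ("googlebot", "slurp")


def is_search_bot(user_agent: str) -> bool:
    """Rough UA check; a User-Agent header can be spoofed, so this is a hint only."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_UA_TOKENS)


def strip_session_params(url: str) -> str:
    """Return the URL with any session-tracking query arguments removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def link_for(url: str, user_agent: str) -> str:
    """Bots get the clean URL; everyone else keeps the session ID."""
    return strip_session_params(url) if is_search_bot(user_agent) else url
```

Calling link_for() on every internal link while rendering a page would give bots one stable URL per page, which is the duplicate-URL problem the guideline describes.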

dhatz




msg:211951
 11:48 pm on May 28, 2004 (gmt 0)

What's the best/recommended way to avoid indexing of HTML ads? Since the ads are at the top of the page, it also messes with my snippet in the SERPs

Would "cloaking" via SSI includes be OK in this case (it's certainly for the benefit of the user, as it helps the SE index only the RELEVANT content)?

anallawalla




msg:211952
 2:12 pm on May 30, 2004 (gmt 0)

Would "cloaking" via SSI includes be OK in this case

If you meant the ads are in the include file, then no. The robot will see them because of the SSI. If you used Javascript for the ads, the robot might not.

webnewton




msg:211953
 12:32 pm on May 31, 2004 (gmt 0)

I still remember the "JEW" listing case, where the site at the top of the SERPs was anti-Jewish. Google refused to change the SERPs, saying that they can't make manual changes and that the whole process is automated.

Why then this kind of preference for some, when others can't use the same technique?

dhatz




msg:211954
 2:30 pm on May 31, 2004 (gmt 0)

If you meant the ads are in the include file, then no. The robot will see them because of the SSI. If you used Javascript for the ads, the robot might not.

Just to clarify, I don't care about "PR leak", "hide the links", just HIDE some of the text.

My pages have 5-25 kbytes of text in Greek and a couple of lines of ads in English, yet the snippets show the English text...

I meant do customised content delivery per user-agent (if it's googlebot or slurp, strip off the ads), or check for both UA+IP.

Google comes with different UAs from time to time, we all know that (I notice it in my logs). I don't want to do UA+IP, because it'd seem as if I wanted to hide something "suspicious", which I'm not. Just UA. Which, on the other hand, could be interpreted as a sloppy cloaking attempt.

What a mess!
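
To make the UA-only idea concrete, here is a minimal Python sketch of stripping an ad block when the User-Agent looks like a crawler. Everything in it is an assumption for illustration: the HTML comment markers, the UA tokens, and the function name are invented, and nothing here says Google would accept it. It is still different content per user agent, which is exactly what the thread is arguing about.

```python
import re

# Hypothetical markers that the page template would wrap around the ad block.
AD_BLOCK = re.compile(r"<!-- ads:start -->.*?<!-- ads:end -->", re.DOTALL)
# Assumed substrings of crawler User-Agent strings (lowercase).
CRAWLER_TOKENS = ("googlebot", "slurp")


def render_for(user_agent: str, html: str) -> str:
    """Return the page without the marked ad block if the UA looks like a crawler."""
    ua = user_agent.lower()
    if any(token in ua for token in CRAWLER_TOKENS):
        return AD_BLOCK.sub("", html)
    return html
```

Since the check looks only at the User-Agent header, anyone can see the bot version by sending a crawler UA string. That matches the "not hiding anything suspicious" goal, but it is also why it can look like a sloppy cloaking attempt.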

anallawalla




msg:211955
 10:55 pm on May 31, 2004 (gmt 0)

Just to clarify, I don't care about "PR leak", "hide the links", just HIDE some of the text.

That's what I meant - with SSI, Googlebot will see the ads. Nothing to do with PR leak. Here, SSI is just a convenient way to manage the code for ads, i.e. not have to plant it in each page but call it with an include. IOW, I don't see it as cloaking.

But if you can serve a different page minus the SSI call to the ads, based on IP addresses, that's cloaking.

To answer the question about not getting the ads in the snippets, you can use clever positioning of tables so that the top row is below the main body of the visible page. You can also use CSS to do it. Ask in the CSS forum.

mbauser2




msg:211956
 5:54 am on Jun 1, 2004 (gmt 0)

For God's sake, did anybody read the original article?

Not Danny Sullivan's fear-mongering gibberish, but the news.com story [news.com.com] he's interpreting? Here's the key quote:

. And the company's strategy is working so far: In recent weeks, NPR audio has begun regularly appearing on the index pages of Google News and Yahoo News, and clips also crop up when people search for news-related keywords, such as "Abu Ghraib," the name of the notorious prison in Iraq.

NPR isn't feeding transcripts to Google, it's feeding them to Google News.

Google News has always played by different rules than Google Search. Google hand-picks what news providers to spider, and lets them serve spider-friendly pages to Googlebot. (How did you guys think Googlebot was spidering all those subscription-required news sites?)

You guys want to cloak? Convince Google that you're a dependable news organization.

ukgimp




msg:211957
 8:22 am on Jun 1, 2004 (gmt 0)

>>If you do something with a website, and there is good
>>reason for it - then google
>>most likely won't penalize you

I need more money. Is that good enough :)

dannysullivan




msg:211958
 11:17 am on Jun 1, 2004 (gmt 0)

For God's sake, did anybody read the original article?
Not Danny Sullivan's fear-mongering gibberish, but the news.com story he's interpreting?

Sorry you found my story to be gibberish. I did read the original News.com story. I was amazed that the cloaking aspect wasn't addressed, given the high publicity surrounding WhenU being booted for the same thing only two weeks before.

My goal wasn't to fear-monger. It was to point out that Google is allowing NPR to do something it explicitly warns not to do on its web site.

That's a substantial change of position. There are plenty of people in the same situation as NPR with quality content that would like to do custom delivery as well. At the very least, they need to be alerted to the fact that Google now seems to allow this -- assuming you know the right people at Google.

NPR isn't feeding transcripts to Google, it's feeding them to Google News.

Not so. The longer version of my article for our members detailed how there were over 230 examples of cloaked transcripts showing up in Google web search. Have a look yourself: [google.com...]

Google hand-picks what news providers to spider, and lets them serve spider-friendly pages to Googlebot. (How did you guys think Googlebot was spidering all those subscription-required news sites?)

I agree entirely that they've set up mechanisms to get into registration areas. However, they've not given any indication of spidering content from those areas that is substantially different than what a user sees.

Also, Google may have allowed true cloaking of other sites before NPR. I'm not saying this is necessarily the first case. It's just the most public and clear-cut that I've ever heard of being allowed.

Let me clarify a few other things that have been raised:

Google has always condoned certain cloaking. In fact, they recommend it in regards to session IDs.

Google wouldn't consider that cloaking. In fact, in my most recent article, I make reference at the end to another one I did from last year that talks about the big debate on how to even define cloaking.

Those who want to cloak have used things like the session ID argument or country-IP targeting to say "Google does it too." Google itself most definitely does not consider such things to be cloaking, when I've talked with them.

This is a very valid and fair reason for cloaking.
"Cloaking" becomes a "dirty word" only when misused by those who are abusing and defacing our lovely search engines with their rubbish!

I agree. There's no question that letting NPR do this has a real advantage to the searcher. Part of my original article from last year was to say how the argument over "cloaking" period was absurd.

Cloaking is just a content delivery system. To define cloaking itself as spam opens up all sorts of problems. Instead, the focus should be on the actual content delivered. Is the content misleading or spam? That's the bigger issue.

Indeed, WhenU really seems to have gotten booted because of misleading content. It was cloaked, but if the content hadn't been misleading, they might still have been allowed. I'm not suggesting that NPR is being misleading. They just enjoy a benefit that other sites would like to have. Whether Google wants to expand that remains up to Google.

It's their index and they can do with it as they please.

I agree. But it is an issue to be telling the world "don't cloak" and yet have at least one major cloaking arrangement running behind the scenes. It would be better to say, "don't cloak, unless we've given you explicit approval."

This seems to be more of a "translation" or conversion.

No, it's cloaking. Google is seeing a full transcript of the audio content, a large textual transcript that exists on a regular HTML page that anyone could view. It indexes that transcript but instead sends you to a completely different page that has only the audio content and the ability to buy the text transcript.

Why aren't you sent to the page that has the original text transcript? Chances are, because that would hurt NPR's sales of these transcripts. But technically, there's no reason why these couldn't be shown.

Let me end with a few last points.

+ The point of the article wasn't to say go out and cloak on Google or elsewhere. It's simply pointing out that Google is now making a major exception to its cloaking rule.

+ I've long said that cloaking should not in and of itself equal spam.

+ Using cloaking does not suddenly mean you'll get better rankings on search engines. There's nothing magic about cloaking content. But in some situations, people may wish they could do this to better serve the dual needs of spiders and humans.

+ Using cloaking to hide low-quality content isn't advised and may get you banned.

+ I still wouldn't recommend cloaking against Google unless they've given you explicit approval.

quantobasta




msg:211959
 1:49 pm on Jun 1, 2004 (gmt 0)

I don't think Google is very good at detecting cloaking. There is a German *.de domain with about 170,000 pages in the SERPs, all of them cloaked. If the UA is a bot's, a page with links to Amazon products is returned, but if the UA is a browser's, the request is redirected to Amazon. This is cloaking abuse at its worst, yet Google has yet to ban the domain.
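
For what it's worth, UA-based cloaking like this is easy to spot from the outside. Here is a rough Python sketch of the check: fetch the same URL once with a bot User-Agent and once with a browser one, then compare where each request ended up and how similar the bodies are. The names, the 0.8 similarity threshold, and the UA strings you pass in are my own assumptions for the example, and it says nothing about how Google actually detects cloaking.

```python
import urllib.request
from difflib import SequenceMatcher
from typing import NamedTuple


class Snapshot(NamedTuple):
    final_url: str  # URL after any redirects were followed
    body: str       # decoded response body


def fetch(url: str, user_agent: str) -> Snapshot:
    """Fetch a URL while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8", errors="replace")
        return Snapshot(response.geturl(), body)


def looks_cloaked(as_bot: Snapshot, as_browser: Snapshot,
                  min_similarity: float = 0.8) -> bool:
    """Flag a page if the two fetches landed on different URLs or differ a lot."""
    if as_bot.final_url != as_browser.final_url:
        return True  # e.g. browsers are redirected elsewhere, bots are not
    ratio = SequenceMatcher(None, as_bot.body, as_browser.body).ratio()
    return ratio < min_similarity  # threshold is an arbitrary assumption
```

You would call fetch() twice on the same URL, once per UA string, and pass both results to looks_cloaked(). This only catches cloaking keyed on the User-Agent; a site that cloaks by IP address will serve both test requests the same page.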

ciml




msg:211960
 2:22 pm on Jun 1, 2004 (gmt 0)

Thanks for the clarification, dannysullivan. I can't say I'd disagree with any of your points, but cloaking (of various types) is very common. Especially if we include the likes of Javascript cloaking, geotargeting, etc.

> cloaking should not in and of itself equal spam

Quite. I think things became confusing when people inside search engines started to decide that certain types of cloaking were not cloaking.

It would be easier if they'd just define their view of 'good cloaking' and 'bad cloaking', but a search engine will never want to be too clear about where they draw the line for index quality.

BigDave




msg:211961
 4:13 pm on Jun 1, 2004 (gmt 0)

I think that it would be a bad policy for Google to try to suggest what is "good cloaking" or "bad cloaking" or suggest that you can somehow "get permission" for the good cloaking.

The public advice to webmasters should remain "don't cloak", just because there are certain people around this forum that would assume that their reason would be Good Cloaking when no cloaking is really necessary.

Even if you have an audio piece too, do you have to put up a transcript, or is a synopsis good enough (more SEO can be done on a synopsis)? Are you allowed to bold words that were not emphasized in the interview?

What I do wish that they would do is add a statement somewhere that their goal with these rules is to guard the quality of their index, and at their sole discretion, they retain the right to consider other factors and ignore some of these if they believe that it is in their users' best interests.

Marcia




msg:211962
 7:22 am on Jun 13, 2004 (gmt 0)

dannysullivan
>>Whenu

Exactly. And comparing National Public Radio to Whenu is kind of like comparing the Google or Yahoo toolbar to the IBIS toolbar [google.com].

Interesting analogy brought up, thanks. It kind of broadens the scope of the rationale and perspective on the issue and attitudes toward cloaking.
