It isn't anything misleading and bots will see the same content as a visitor would from the country they originate from... Do you think this would be considered cloaking by the major SE's?
Better to use "clickable flags" or somesuch which gives the choice to your incoming reader/customer ...
But this kind of selective "feed" from a search engine point of view won't get you any penalty at all ...
They are doing it themselves with geotargeting..
We implemented 5 language flags plus we give the content for guests and SEs according to their browser language - Google seems to have always English ... If somebody clicks a flag it will be written into the cookie and he/she will get the right language.
Can it be that Google sees some cloaking - or do you think the new pages are considered as SPAM?
[edited by: volatilegx at 4:55 pm (utc) on July 23, 2004]
[edit reason] snipped URL [/edit]
I beg "viking food" is SPAM (?) I thought so too when our figures dropped every day since we introduced the 38.000 american radio models (with few data). Therefore we started now to add about 50 lines of randomly gathered models (each only one line) as examples - later (first a test) they become links to these model pages and back. It can't be worse than now because G is throwing out all model pages ...
Biscuit sec: the best (to me) are the "Basler Läckerli". You deserve a kilo of them ...
Do you speak French? Thanks anyway.
Ernest (HB9RXQ = Switzerland)
There's no doubt here at all. If you serve different content to different users/browsers it is cloaking, as that is the very definition of cloaking.
However, you shouldn't worry about that part. Instead, you should worry about this issue: "is it bad practice and possibly harmful?"
I'll tell you about that exact issue at the bottom of this post; i have to comment upon RMorg's post first, as i think you might have overlooked it.
By the way: Welcome to WebmasterWorld RMorg :)
>> we give the content for guests and SEs according to their browser language
RMorg is right on the spot here. It is possible to cloak based on the browser's language setting, which is sent as the HTTP "Accept-Language" request header - it will be a set of letters like, say, "en-US", "fr-FR" and similar. The browser sends this data to the server when a page is requested, so it should not be a major task. IMHO, this is much better than IP-address based content serving, as IPs are not very country specific (there are many false positives).
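To make that concrete, here is a minimal Python sketch of picking a language from the "Accept-Language" header. The function name, the list of five languages and the English default are assumptions for illustration only, not how RMorg's site actually does it.

```python
# Hypothetical set of languages the site offers (assumption for this sketch).
SUPPORTED = ("en", "de", "fr", "it", "es")
DEFAULT = "en"


def pick_language(accept_language, supported=SUPPORTED, default=DEFAULT):
    """Return the best supported language for an Accept-Language header value."""
    candidates = []
    for position, item in enumerate((accept_language or "").split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not wanted"
        primary = tag.split("-")[0]  # "fr-FR" -> "fr"
        if quality > 0 and primary in supported:
            # Sort by highest quality first, then by order in the header.
            candidates.append((-quality, position, primary))
    if not candidates:
        return default
    return min(candidates)[2]


print(pick_language("fr-FR,fr;q=0.9,en;q=0.8"))
print(pick_language(""))
```

A client that sends no language header at all simply falls through to the default, which would explain a spider always getting the English pages.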
>> If somebody clicks a flag it will be written into the cookie
Again, RMorg knows what to do ;)
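A rough sketch of that flag-and-cookie logic in Python, using the standard http.cookies module. The cookie name "lang", the language set and the helper names are made up for the example; the only thing taken from RMorg's description is the order of precedence (flag click first, browser language second).

```python
from http.cookies import SimpleCookie

SUPPORTED = {"en", "de", "fr", "it", "es"}  # hypothetical language set
COOKIE_NAME = "lang"                         # hypothetical cookie name


def language_from_cookie(cookie_header):
    """Return the language stored by an earlier flag click, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in SUPPORTED:
        return morsel.value
    return None


def choose_language(cookie_header, browser_language, default="en"):
    """Flag click (cookie) wins, then the browser language, then the default."""
    chosen = language_from_cookie(cookie_header)
    if chosen:
        return chosen
    primary = (browser_language or "").split("-")[0].lower()
    return primary if primary in SUPPORTED else default


def flag_click_header(language):
    """Build the Set-Cookie value to send when a visitor clicks a flag."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = language
    cookie[COOKIE_NAME]["path"] = "/"
    return cookie[COOKIE_NAME].OutputString()


print(choose_language("lang=fr", "en-US"))
print(choose_language("", "de-CH"))
print(flag_click_header("fr"))
```

Spiders generally do not send cookies back, so in this sketch they only ever see the browser-language or default branch.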
>> Google seems to have always English
So, Google will get one set of pages, in one language. No problem. Unless you really want Google to index the other languages as well. In that case, go for country-specific domains or subdomains, don't put it on the same domain.
>> It isn't anything misleading and bots will see the same content as a visitor would from the country they originate from
Now, read this carefully. It's all from Google's webmaster guidelines [google.com]. I'll walk you through it:
- Allow search bots to crawl your sites without session ID's or arguments that track their path through the site.
- If your company buys a content management system, make sure that the system can export your content so that search engine spiders can crawl your site
- Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you'd feel comfortable explaining what you've done to a website that competes with you. Another useful test is to ask, "Does this help my users? Would I do this if search engines didn't exist?"
- Don't employ cloaking or sneaky redirects.
- Don't create multiple pages, subdomains, or domains with substantially duplicate content.
1) Ask yourself: "How would i do that?". The answer is: By serving the SE bots special pages that do not have Session IDs. This is cloaking, and hence one special case of cloaking is encouraged by a SE.
2) Ask yourself the same question as above. The answer is the same.
3) Those are a few very essential questions for you. This is the major "bad practice" definition. Translate to: "if it's done only for SE value, it is potentially harmful, but it is not bad practice if it's done for user value only". Cloaking by language (for users only, eg. like RMorg wrote), i believe, passes all these tests.
4) Now, that's a tough one: "Don't employ cloaking..." You're probably already discouraged here, but, AFAIK, that's also the intention. Do yourself a favour and put extra emphasis on the two following words (people tend to overlook those): "or sneaky". Now, the message is somewhat more balanced, right?
This is actually about sneaky redirects which is one particular type of cloaking, ie. a subset of cloaking techniques, not the whole category. Now, ask yourself: "is (1), (2), or (3) above sneaky?" No, they're not, but still we're talking about cloaking... so, not all cloaking is sneaky, huh?
So, now we've established an important fact about cloaking, ie. that not all cloaking is bad for users. We've also established the fact that not all cloaking is bad for search engines.
So, logically, it follows from these two facts that not all cloaking is bad.
This is why i included (5) above. I knew that even though i could tell you this, you might still be sceptical. So, let's take a look at it:
5. Don't create multiple pages, subdomains, or domains with substantially duplicate content.
5) Now, isn't that last piece of advice totally against the advice i gave above: "go for country-specific domains or subdomains, don't put it on the same domain"?
The key to the understanding here is substantially duplicate content, especially the words duplicate and content as in the literal sense. Now, recall what Leosghost wrote, as that's actually quite illustrative:
one would say "biscuit sec", not "cookies", unless they are from Lu
In particular, ask yourself what's really so duplicate about "biscuit sec" and "cookies". Sure, the terms can describe the same physical object in the real world. But: They are not the same terms. You should take this quite literally, as a SE spider does not "understand" anything, it just reads text. In this case:
1) "biscuit sec" is two words and 11 characters (including whitespace)
2) "cookies" is one word and 7 characters (no whitespace)
3) common letters are "i", "s", "c", "e"
There's simply no way you can convince me that these two concepts are in fact the same word. Simply put, they might mean the same in different languages, but they are not the same.
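If you want to check that kind of literal comparison yourself, a few lines of Python will do it. This is only an illustration of "a spider just reads text"; the helper names are made up and it says nothing about how any search engine actually compares pages.

```python
def describe(term):
    """Count whitespace-separated words and characters (spaces included)."""
    return {"words": len(term.split()), "characters": len(term)}


def common_letters(a, b):
    """Letters that occur in both terms, ignoring case and whitespace."""
    letters_a = {ch for ch in a.lower() if ch.isalpha()}
    letters_b = {ch for ch in b.lower() if ch.isalpha()}
    return sorted(letters_a & letters_b)


print(describe("biscuit sec"))                    # {'words': 2, 'characters': 11}
print(describe("cookies"))                        # {'words': 1, 'characters': 7}
print(common_letters("biscuit sec", "cookies"))   # ['c', 'e', 'i', 's']
print("biscuit sec" == "cookies")                 # False
```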
Accordingly, a text in English is simply not the same text when translated to French, even though the subject matter is the same. Now, reconsider what i wrote above:
Google will get one set of pages, in one language. No problem.
Nothing sneaky about that. In fact, if you have separate language content on separate domains or subdomains, Google will find it easier to serve up the right search results pages for the right users.
As you probably know, Google does display different content depending on your language/country (what was the word for that, again?), so your English pages should come up for English users and your French pages should be listed for French users. Having different language content on different (sub)domains makes this much easier for everyone.
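For the subdomain approach, the language can be read straight from the host name, so every language version lives at its own crawlable address and needs no cookie or header at all. A minimal Python sketch, assuming made-up subdomains like "fr.example.com" and a hypothetical set of languages:

```python
from urllib.parse import urlsplit

SUPPORTED = {"en", "de", "fr", "it", "es"}  # hypothetical language set


def language_from_host(url, default="en"):
    """Return the language encoded in the first label of the host name."""
    host = (urlsplit(url).hostname or "").lower()
    first_label = host.split(".")[0]
    return first_label if first_label in SUPPORTED else default


print(language_from_host("http://fr.example.com/radios/89123"))
print(language_from_host("http://www.example.com/radios/89123"))
```

Because the URL alone decides the language, a spider and a human visitor requesting the same address always get the same page.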
[edited by: claus at 1:17 pm (utc) on July 24, 2004]
Spot on ...And you even got my "illustration" ...At the time I realised after posting that it might be a touch obscure ( contrary to received opinion around here ..I can do "subtle"..just forgot to flag it as such there ;)).
I have a site for which everything is on one server in the USA ..it's a dotcom ( to have the dotFR would mean laying out $7000 per year ..because only French companies can have the FR and that's what the minimum is for a French LTD ( sarl ) company yearly in taxes etc )..
All pages are in two versions ...English ..right down to the file paths , alt tags etc ....and then the same thing in French on another page ...
Only two pages link to their corresponding "other language page" directly ...these are the "index" and its translation ...
All other pages link to their own index and interlink via the navigation with a site map link at the foot of each page ( English and French site maps are not on the same page )...
Text length ( as explained by Claus ) is different as has to be the case because some ways of speaking have to change according to linguistic reference areas and language structure , culture etc ..( which is why babel fish and the like can't cut it..ever! )
Images are also not quite the same throughout the site as some cultures are attracted more by certain kinds of image than others ...or I have changed the order in which they are accessed in page sequence and navigation .Although the pool of images is the same
Every page does give you the possibility to go to the "start" page of the other language ...
Google considers them ( to read the serps ) as virtually two distinct sites ..in that there is no "duplicate page" penalty anywhere ..and each page ( which is optimised for its target keyword or two keywords ..albeit spammily ..hey it works and has done for a long long time! plus it's an "images" site so very little text and not much scope to play with similis ) is presented in the correct language in response to a search from Google via that particular language "g" page ...
It also ( up to now ) has them all on first page and most at #1 for what they were intended to catch ...
In both French and English serps in Google , Yahoo , MSN , alta' , lycos etc it works just fine ...
In many cases I have the "indented" #1 plus #2 slots on google and also yahoo , Msn ,alta', L etc this way ...
so I think it's safe to say that this form of "cloaking " is acceptable and as Claus has shown why ..even encouraged and rewarded by search engines ...
BTW Ernest.."GG" ..in this place means "google guy" ..an ( absolutely not )PR man /woman ( don't know never met ) who posts here and drops cryptic hints about such stuff ..except when there's an IPO pending or "jaws" is running well ..;)
I'm really relieved. Also it is not necessary that we feed other languages to Google because everybody gets it in his/her language by the browser setting AND can change by flag clicking.
The only drawback we have is that the pictures are the same but surely that is tolerated (but if there are many some day ...?). Another drawback is that I could not yet translate everything to other languages, leaving much in English also for F, I and E (Spanish) - but SEs don't see that - it is only for humans. But well - don't they? That reminds me that we should perhaps change in a way that the flags are not clickable by the SE's - only on the HP. There could still be an error - I think now.
But surely we have a punishment by Google. And I think now it's the 72.000 model pages - precisely the 38.000 pages for the US models with up to now little information.
I think it is forbidden to post an example address here. Therefore I'll send you one by e-mail.
One test, "Would I do this if search engines didn't exist?", we can't meet completely: we now put the model name in an h1 and repeat it in an h3. Plus, new: we give an additional list of about 45 randomly fetched names of other radio models, displayed below the content of the particular radio description. This is to reduce the percentage of identical content relative to the individual content. There are many titles etc. which must be the same, but we could already delete the ones which show no content.
About our cookies not being from Lu (what is meant by Lu?). In 5. Claus did a nice example for everyone and I think we all could learn a lesson. We have to see "duplicate" in words, not meanings.
So it is SPAM that matters here. Since my profile shows the URL for RMorg I can write: please look up some model pages by typing 123 into the model search. The first model is "89123" and it has many words which are repeated, like "Principle", "Power type and voltage", "Loudspeaker", "Valves/Tubes", "Source of data" etc. All the 72.000 pages will show this ... We now try to get more variable content by adding "Random examples of other models". As long as one cannot click these lines, this is mainly for the SE's - neglecting the question "Would you do it if there were no SE's?" ... I don't dare to create these lines as internal links including internal backlinks because only 1 % of the model pages are left - getting 99% links into the empty space.
But we lose most model pages just now as you can see by Google: info:URL then clicking "link to" (especially page 3 onwards). Also some forum pages vanish - very sad!
SEO and them ( with the exception of some such as martin hagstrom who post in both places ) just do not mix ...
That's not quite right - Craig from Google groups posts here as cbpayne - and I know another regular Google group'er who lurks around here.
CIML, Patrick Taylor, anallawalla and GoogleGuy are also oldtime Google group'ers.
you go there you'll just get confused and given bum steers ..
O.k. I'll steer my bum back to the Google forum :)
Now I'm gonna shut up and read what Claus said ..
And a wonderful post it was. I think Claus made a good point against cloaking - namely that if you have a site in 5 languages you want to attract visitors who speak those languages - so it doesn't help to always serve English to Google.
Model names are always the same in every language for a given radio model - often being a number combination. When a user gets the content he gets it in his language unless he clicks "Cached" content. He is fine. Same for the forum where one sees every language but can soon select his language (session) like members already hide languages at wish.
Would you have done it differently?
This solution here is "Cloaking by language flags" and no offence to SEs because content is different - I have learned by this thread.
BTW I'm sure google guy is in many places ...but he makes more sense here ..as does all the advice in general ..
any fool can SEO for a 4 word phrase and hit page one on google provided they can spell and it's not mega competitive like real estate or pron .
plus his ( RMorg's ) site is by its design very thin on text ....and has some of the longest meta keywords/description tags strings I ever did see ....ouch!
Rmorg ...More than around 60 to 70 characters including white space is overkill even for "all the web" ..
And the description should be different for each page ...
Without getting into specifics on your site on this forum ( cos it's not for reviewing individual sites )..your content on each page is I think too similar across the language spread ....
I would be very very surprised if you are penalised for cloaking as such ..I think more probable is that you are considered by google at least to be serving the same page thousands of times under different names ( sort of the antithesis of cloaking but so far around the circle from it as to be almost cloaking ..if you see what I mean )...
the variables on your pages are just not enough to make a difference IMO ...particularly the fact that all the non visible stuff is 99.9% identical ...and the visible stuff doesn't change very much ( maybe not enough IMO )either ...
sorry if this is a bit too much like a site review ..mods admin .. snip as you think fit
I cleared the question on possible Cloaking and got the knowledge of having to change the tags - which will take me some time to really understand where I can do what - also according to the limited possibilities because of existing programs.
Since there are many problems I will have to sort out the biggest obstacles for Google first - and I think this is SPAM. Incoming backlinks are OK but I will have to work permanently ...
Please don't think I don't accept your advice. And I am most grateful that you even had a look into some pages. That is exceptional.
I think the meta I can change after sorting out SPAM since Google does not consider this as a big item (or keywords not at all). But yes, I believe they are adding to same content.
I was told that the heavy design does not matter - but you tell me another story. Which invisible stuff do you address? I think it is an easy one if it is alt tags etc. but very tragic if it is graphics for the design. :-(
I have now posted the question of "SPAM and Page Rank" in the board: Search Engine Promotion.
Graphics do not matter to the search engines, as they do not see them. All the search engines see is the text.
They prefer the text that is visible on the page, but if you do not have very much visible text on the page they will start looking at meta tags too.
So, if your meta tags hold a large proportion of the meaningful words of every page - and these meta tags are similar (or identical), then your pages could easily be seen as duplicate pages, even though the meta tags are not visible to a human reader.
Btw. 300-500 chars seems to me like far too many. If you have that much text you want to use for describing the content of the page, i would advise you to put that text on the visible part of the page instead, and please don't have identical text on all pages.
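To see why identical meta tags can swamp a small amount of visible text, here is a toy Python calculation using plain word overlap (Jaccard similarity). It is only a sketch: the page texts and the second model number are invented, and nobody outside the search engines knows what measure they really use.

```python
def words(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Shared words divided by all distinct words in the two texts."""
    wa, wb = words(a), words(b)
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


# Invented example: the same long meta text on every page, very little visible text.
SHARED_META = ("antique radio catalogue valves tubes schematics collectors "
               "museum vintage wireless receiver history")
page_a = {"visible": "model 89123 principle superhet loudspeaker", "meta": SHARED_META}
page_b = {"visible": "model 70456 principle regenerative loudspeaker", "meta": SHARED_META}

visible_only = jaccard(page_a["visible"], page_b["visible"])
with_meta = jaccard(page_a["visible"] + " " + page_a["meta"],
                    page_b["visible"] + " " + page_b["meta"])

print(f"visible text only: {visible_only:.2f}")
print(f"visible + meta:    {with_meta:.2f}")
```

With these made-up pages the overlap rises sharply once the shared meta words are counted, which is the effect described above: the more the identical hidden text outweighs the unique visible text, the more the pages look alike.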
If you found the 300 to 500 chars on the HP (search page) this would not matter since this one is unique. I think I have to see for "Countries", "Makers" and Model pages. And there it seems pretty good (?).
I hope you see now very varying text on the model pages. I hope it is a wise solution.
can't do anything about that.
But here is how I work around it: I have areas on my pages that show "items of interest to you", which using a combination of geo location & cookie tracking (repeats) show info & products that we consider are targeted. The user then has an option to dive into the "paris" section.
Can you explain your "big NO" by looking at the example <snipped>?
[edited by: volatilegx at 4:00 pm (utc) on Aug. 9, 2004]
[edit reason] no specifics please [/edit]
On the site, you know there are language sections that aren't appearing across all the search engines and that's going to have more to do with the 'clever' cloaking being hung off the root to switch languages without changing the domain than anything else (domain.com/language/ solution)
If you get too busy with the technology the spiders won't follow.
As is clear.
Cloaks on you mate ;)
Let's make it clear ..
Different words ( which is what different language content is ) on different pages ..are seen by all spiders as DIFFERENT PAGES with different content ....
Your problem is that your pages had maybe 10 times more meta text than the dozen or so words you had per page in any language ...so even though Google doesn't normally take one's meta text into account ...in your case it did ( and with reason ) because it far outweighed the "visible" text ..plus the visible text looked the same ( model numbers etc and even some buttons and forms written in the same language ( English ) on non English pages ) ...
BTW ..I hope you didn't pay for the site ..and certainly not for any optimisation ..I couldn't have done a better SEO sabotage if I'd tried ...
( mods you can cut that if you want but it isn't an attack on Ernest ...the poor guy had no idea ..he's here looking for help ..Brett apparently thinks this is relevant to others ...it will be ..only if people who don't know what they are talking about and have no experience with the issue don't give weird advice ...rant over ..sorry )....
Page content must be different ..if it's too similar then the engines will look at other factors ( such as metas ) purely to identify possible duplicates ..if the meta data is the same for each page ..and so are the alt tags ..image tags ..hrefs etc ..then the pages will be flagged duplicate or spam ....
this is what happened to Ernest ...prophet is asking basically the same thing so as to avoid this ..
again sorry to make this a "site review" ...but it's almost impossible not to in the face of some of the "advice" here ...
The algo ( any algo ) isn't one set of rules ...this "taking into account of meta tags etc " happens when pages look like "maybe dupes" ...or "maybe spam" ..or "maybe whatever"...
observe what it does and who it does what to ..then think ..then apply what you observed ...and remember they might just add in some variable while you're doing this ....Fun isn't it ...;)
No, I just optimise the outcome.
It's more about unleashing the potential of a website fully.
Never mind and ;) Have fun
Indeed: since we have an acceptable PR5 it is fun to now learn and optimise bit for bit to have again PR6.
By the way I could show you even now pages that recommend to fill up more than 500 characters into "key words" run by professional SEOs (not in the USA). They still do sabotage to their clients and think they do their best ... ;-)
I'm quite happy having learnt something by the forums but it takes much time to change such a huge dynamic site.
I hope you have a good time when being busy and I'm always glad when you answer - no matter how "late".
If my site would have to make money or to boost up a trade I would naturally hire a professional again, now knowing better who is and who is not. I also think because ours is a non profit page and a quite unusual one to study results (which I could display here) in an open manner it would have been nice that we could get open discussions on the object - but I understand other politics here, knowing I'm a guest in a house with certain rules which in principle is always good.