How does Google read hardcoded text - like &amp;?

Google, special characters and foreign languages.

         

ogumi

12:25 pm on Jan 18, 2004 (gmt 0)

10+ Year Member



I have to write text in my language with some special symbols which must be hardcoded.
Will google read and index my site correctly if I write for example:
Google?
(output in browser would be google)

troels nybo nielsen

12:53 pm on Jan 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld, ogumi.

I have tons of å, æ and ø on my websites. Some search engines have problems with them but not Google.

CygnusX1

1:44 pm on Jan 18, 2004 (gmt 0)

10+ Year Member



All programs that make websites bloat the code up to a point; what differs is the degree to which each does it. For example, Front Page is kind of bad at it. Dream Weaver is probably better at not code bloating than any other, but that's not to say that it doesn't do it either. Dream Weaver has a habit of taking the "&" symbol that you want on your page and using "&amp;" when you don't need it. This is an important issue if it happens in the title of a page. A title tag, which is a valuable part of each page, is only allowed to be so long or it may be considered spamming. I guarantee you that all of the code-bloating characters are being counted. Remember that just because a program makes this code doesn't mean you have to use it, although sometimes you do.

On our website we have to go back and take out the code bloating by hand whenever we make up a new webpage. The only program that doesn't automatically code bloat is the "Human". I suggest you take out the code bloating by hand after you make up each page.

In saying that, you do need to know a little about writing code by hand or it will change the look of your webpage.

ogumi

2:44 pm on Jan 18, 2004 (gmt 0)

10+ Year Member



Thanks troels nybo nielsen!

I code HTML manually using text editors (HTML kit or Dreamweaver).
Then I use a freeware proggy http*//www.orbit.org/replace/
to replace all conflicting chars like čćšČĆŠ in all files at once. I have serious problems doing it with a PHP script (not much of a programmer :-).

I'll submit my page to the search engines shortly and I'll report the results as soon as I get crawled.
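For anyone who would rather script the replacement than use a separate tool, here is a minimal sketch in Python of the idea described above: turning characters like čćšČĆŠ into numeric character references. The function name `to_char_refs` is made up for illustration, and it assumes the text has already been read into a string with the correct source encoding.

```python
def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference.

    ASCII characters (including existing markup) are left untouched.
    """
    # "xmlcharrefreplace" is a standard Python codec error handler that
    # emits &#NNN; for any character the target encoding cannot represent.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = "čćšČĆŠ"
print(to_char_refs(sample))  # &#269;&#263;&#353;&#268;&#262;&#352;
```

To process many files at once, the same function could be applied to the contents of each file in a loop; which encoding to read the files with depends on how they were saved, so that part is left out here.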

[edited by: pageoneresults at 3:53 pm (utc) on Jan. 18, 2004]
[edit reason] Delinked URI [/edit]

AthlonInside

5:34 pm on Jan 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If I hard code a normal alphabet, such as BIG to (****somethingxxx), will google return the page for the search of BIG?

troels nybo nielsen

7:00 pm on Jan 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



BIG says BIG

I would certainly expect Google to interpret it correctly.
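As a rough illustration of why a hardcoded word can still come out as plain text, here is a small Python sketch of how an HTML-aware parser decodes numeric character references. The encoded string is an assumed example (B, I and G written as decimal references); it shows standard HTML decoding only and says nothing about how Google's indexer actually works.

```python
import html

# Assumed example: the letters B, I, G written as decimal character references.
encoded = "&#66;&#73;&#71;"

# html.unescape applies the standard HTML rules for character references.
decoded = html.unescape(encoded)
print(decoded)  # BIG
```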

> a normal alphabet

Not quite sure what a "normal" alphabet is. Many of the problems with html come from the fact that for reasons unknown to me it is designed around the English alphabet which is a very limited alphabet.

rfgdxm1

12:03 am on Jan 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Not quite sure what a "normal" alphabet is. Many of the problems with html come from the fact that for reasons unknown to me it is designed around the English alphabet which is a very limited alphabet.

The inventor was English speaking. Also, at that time, English was by far the most dominant language used on the Internet.

Krapulator

5:38 am on Jan 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>Dream Weaver has a habit of making the "&" symbol that you want on your page and using the "&amp;" when you don't need it. This is an important issue if it happens in the title of a page.

This is not code bloat - your code is not valid if you use & in your title instead of &amp;

ciml

11:18 am on Jan 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



& is fine if it's followed by whitespace, otherwise &amp; is necessary.

troels nybo nielsen:
> BIG says BIG
> I would certainly expect Google to interpret it correctly.

Thanks for that, I'd never thought about it.

troels nybo nielsen

12:46 pm on Jan 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> I'd never thought about it

Sometimes an advantage may be a handicap and a handicap may be an advantage. Trying for some years to create websites at a reasonably decent quality level in a "small" language on a web dominated by the English language forces a webmaster to awareness of certain problems and their solutions. Problems (and solutions) that may be less obvious for webmasters writing in English but still have some relevance for them too.

Perhaps it's too "easy" to create websites in English?

ronin

1:50 am on Jan 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The inventor was English speaking.

Yes, yes, I'm English speaking too, but that doesn't make the version of the Latin alphabet used by native English speakers any more 'normal', as Athlon describes it, than the versions used by Danish or Norwegian speakers (for example).

Troels is quite right. Compared to most other oral languages which have adopted the Latin alphabet to express themselves in written form, written English is one of the most limited in terms of the number of symbols it uses. It is unusual and frustrating that we only use five vowel symbols to illustrate over twenty vowel sounds whereas most languages written in Latin use more symbols to describe fewer sounds.

To answer the original question, I use letters like ü all the time and Google doesn't have a problem...

Krapulator

5:38 am on Jan 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> & is fine if it's followed by whitespace, otherwise &amp; is necessary.

If you want to validate you must use &amp; regardless of whether it's followed by white space or not.
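A short Python sketch of the escaping being discussed: the source of the page carries "&amp;", but the text a parser sees after decoding is the plain "&". The title text is a made-up example, and the length comparison only shows the difference in the raw source, not how any search engine counts characters.

```python
import html

title = "Fish & Chips"  # hypothetical title text as the reader should see it

# Escape for use inside HTML source; quote=False leaves quotes alone.
escaped = html.escape(title, quote=False)
print(escaped)                     # Fish &amp; Chips

# The raw source is four characters longer ("amp;"), but decoding
# gives back exactly the original text.
print(len(title), len(escaped))    # 12 16
print(html.unescape(escaped) == title)  # True
```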

GoogleGuy

7:19 am on Jan 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ya got me. If you do an experiment and figure anything useful out, let us know. :)

CygnusX1

11:53 am on Jan 20, 2004 (gmt 0)

10+ Year Member



I can tell you what we found on the "&amp;". We found that the "&amp;" was being shown in our title on 3 different internal pages on our website. We looked at our listings in Google and were getting around 10th place under certain keyword phrases. When we removed the "amp;" and left just the "&", our listings jumped by 4 to 5 places under those same keyword phrases. This jump happened on each of the 3 occasions that we found the "&amp;" in a title of a page. Google is counting every character in not only the title, but on the whole page.

I tried to point out earlier that when a program uses a code and other programs use a shorter code, then in my mind that longer code is called code bloating. I read somewhere in this thread that I was wrong, but I don't think I am.

Chris_D

12:01 pm on Jan 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Search for BIG

Results 1 - 10 of about 129,000,000

and big is highlighted everywhere.....

Chris_D
Sydney Australia

Jim_Wilson

2:20 pm on Jan 20, 2004 (gmt 0)



Does anybody know if Google will read unicode as unicode or convert it back to ASCII?

AthlonInside

9:30 pm on Jan 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually I thought of encoding text that I don't want Google to see. But now it looks like Google is too smart!