Sloppy anchors and Google

Can Google follow my awful code?

vincevincevince

8:38 pm on Apr 10, 2003 (gmt 0)

I know that the standard for the <A> tag is to use double quotes, <A HREF="foo.bar">, but as many will know, PHP has an annoying [understandable] habit of needing " to be escaped as \" or is it /" [forgot]... so out of laziness I resorted to using:

<A HREF='foo.bar'> or just:
<A HREF=foo.bar>... all of which work just fine in Zilla and IE... but how about Google?

And how about <IMG SRC='foo.bar'><IMG SRC=foo.bar>?

And... it seems that the provided `link to me` text was in the form <A HREF=http://www.wigetfinder.com>... Have I spent months getting links which Google will not follow?

Set me straight, I beg of you! :D

Mohamed_E

8:54 pm on Apr 10, 2003 (gmt 0)

vince,

Google was designed to have a very forgiving parser, because the internet is full of code that does not validate :) If a browser can display it, Googlebot will most likely understand it. Here is what the designers of Google have to say about the code it is able to deal with (see [www7.scu.edu.au...] Section 4.4):

Any parser which is designed to run on the entire Web must handle a huge array of possible errors. These range from typos in HTML tags to kilobytes of zeros in the middle of a tag, non-ASCII characters, HTML tags nested hundreds deep, and a great variety of other errors that challenge anyone's imagination to come up with equally creative ones.

Using single rather than double quotes should be a very minor problem.

g1smd

9:48 pm on Apr 10, 2003 (gmt 0)

You do need to enclose attribute values in quotes if the value contains anything other than letters, digits, hyphens, periods, underscores, and colons. So, any URL containing a slash, all "50%" sizes, and all "#FFFFFF" colour codes should be "quoted" for starters.

It is recommended in HTML 4.01 to quote all attributes.

In XHTML this then becomes a requirement.
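A minimal sketch of that quoting rule, assuming the character set HTML 4.01 allows in unquoted attribute values (letters, digits, hyphens, periods, underscores, and colons). The function name `may_omit_quotes` is made up for illustration, and this checks validity only; it says nothing about what any particular spider accepts.

```python
import re

# Characters HTML 4.01 permits in an unquoted attribute value:
# letters, digits, hyphens, periods, underscores, and colons.
UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")


def may_omit_quotes(value):
    """Return True if the value could validly be written without quotes."""
    return UNQUOTED_OK.fullmatch(value) is not None


for value in ["foo.bar", "http://www.widgetfinder.com", "50%", "#FFFFFF"]:
    verdict = "quotes optional" if may_omit_quotes(value) else "must be quoted"
    print(value, "->", verdict)
```

By this check it is the slashes in a full URL, the percent sign, and the hash that force the quotes.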

Escaping the quotes is done by \" if you need it.

I would also suggest running your code through [validator.w3.org...] to find any typos and errors.

Is this other tool [gritechnologies.com] useful as well?

vincevincevince

2:37 pm on Apr 11, 2003 (gmt 0)

Thanks for your advice, but neither of you actually gave me a yes or a no as to whether Google needs " ", if ' ' is good enough, or if I can get away without any quotes at all.

I have limited experience in parser creation, but I'm sure that Google must have a rule for finding the URL from the <A> tag? I.e., find the first " after HREF= and then take all the text until it reaches the next "? Or will it `intelligently` recognise the URL format?

All I want is a [spiders] or [does not spider] for:
a: <a href='http://www.widgetfinder.com'>Widgets</a>
b: <a href=http://www.widgetfinder.com>Widgets</a>
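For what it is worth, a tolerant parser does not have to hunt for the first " after HREF=. Here is a rough sketch using the HTML parser from Python's standard library, which is only a stand-in: nobody outside Google knows how Googlebot actually parses pages. The names `LinkCollector` and `extract_links` are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href/src values, however the attribute was quoted."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as (name, value) pairs; tag and attribute names are
        # lowercased and any surrounding quotes are already stripped.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append((tag, value))


def extract_links(html):
    """Return a list of (tag, url) pairs found in the markup."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


samples = [
    '<a href="http://www.widgetfinder.com">Widgets</a>',
    "<a href='http://www.widgetfinder.com'>Widgets</a>",
    "<a href=http://www.widgetfinder.com>Widgets</a>",
]
for sample in samples:
    print(extract_links(sample))
```

With this particular parser, all three forms give the same link. That shows a forgiving parser can cope with a and b; it does not prove that any given spider does.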

Mohamed_E

2:40 pm on Apr 11, 2003 (gmt 0)

> thanks for your advice, but neither of you actually gave me a yes or a no ...

Since neither of us works for Google that shows that we are honest :)

There are, of course, a few "facts" that we all "know" are true, but please note the quotes. The vast majority of what you read here is opinion, whether or not the writer states so explicitly. Read, think, use your own judgement, and make your own decision.

g1smd

6:12 pm on Apr 11, 2003 (gmt 0)

Example a is probably OK, but only a Google employee could really tell you the answer to that one. I would be willing to risk formatting it like that if I had to.

Example b is NOT valid HTML, as per my last post above (the URL contains slashes, which are not among the characters allowed in an unquoted value), so you should not ever use that. Whether or not spiders can follow it is beside the point: some browsers may choke on it.

If you write well-formed code, then all spiders should be able to follow it. You can check if your code is well-formed by using the validator tool I mentioned, and the spider view tool I linked to. The answer is to not write sloppy code!

vincevincevince

6:21 pm on Apr 11, 2003 (gmt 0)

g1smd, thanks for that :) I guess I'm just lazy.

Anyone got a spider simulator which lets me change the user agent to Googlebot, btw?
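Failing a ready-made tool, a few lines of script can request a page with a chosen User-Agent header. This is a sketch only, using Python's standard library; the user agent string below is an assumption based on the commonly quoted Googlebot string, and `build_request` and `fetch` are made-up names. It shows what your server sends to something claiming to be Googlebot, not how Google interprets the page.

```python
import urllib.request

# Assumed value: a commonly quoted Googlebot user agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url, user_agent=GOOGLEBOT_UA):
    """Build a request object carrying the chosen User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Download the page as text, identifying as the given user agent."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch("http://www.widgetfinder.com")` would return the HTML the server hands to that user agent, which you can then run through a link extractor of your choice.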