<A HREF='foo.bar'> or just:
<A HREF=foo.bar>... all of which work just fine in Zilla and IE... but how about google?
And how about <IMG SRC='foo.bar'><IMG SRC=foo.bar>?
And... it seems that the provided `link to me` text was in the form <A HREF=http://www.wigetfinder.com>... Have I spent months getting links which Google will not follow?
Set me straight, I beg of you! :D
Google was designed to have a very forgiving parser, because the internet is full of code that does not validate :) If a browser can display it, Googlebot will most likely understand it. Here is what the designers of Google have to say about the code it is able to deal with (see [www7.scu.edu.au...] Section 4.4):
Any parser which is designed to run on the entire Web must handle a huge array of possible errors. These range from typos in HTML tags to kilobytes of zeros in the middle of a tag, non-ASCII characters, HTML tags nested hundreds deep, and a great variety of other errors that challenge anyone's imagination to come up with equally creative ones.
Using single rather than double quotes should be a very minor problem.
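To get a feel for what a forgiving parser does with the three quoting styles, here is a minimal sketch using the tolerant HTML parser from Python's standard library. It is only an illustration and says nothing definite about Googlebot's own parser; the names LinkCollector and extract_urls are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags and src values from <img> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling us.
        wanted = {"a": "href", "img": "src"}.get(tag)
        for name, value in attrs:
            if name == wanted and value is not None:
                self.urls.append(value)


def extract_urls(markup):
    """Return every link or image URL found in the given markup."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.urls


samples = [
    '<A HREF="foo.bar">double quotes</A>',
    "<A HREF='foo.bar'>single quotes</A>",
    "<A HREF=foo.bar>no quotes</A>",
    "<IMG SRC='foo.bar'><IMG SRC=foo.bar>",
    "<a href=http://www.widgetfinder.com>Widgets</a>",
]
for markup in samples:
    print(markup, "->", extract_urls(markup))
```

With this particular parser all of the forms yield the same URL, which is the kind of tolerance the quote above describes.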
HTML 4.01 recommends quoting all attribute values.
In XHTML this then becomes a requirement.
If you need a literal double quote inside a double-quoted attribute value, write it as &quot; (HTML does not recognise the backslash escape \").
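In HTML itself, a double quote inside a double-quoted attribute value is normally written as the character reference &quot;. A small sketch of producing that with Python's standard html.escape; the helper name quoted_attr is hypothetical and exists only for this example.

```python
from html import escape


def quoted_attr(name, value):
    """Render name="value" with HTML-special characters in value escaped."""
    # quote=True also converts " to &quot; and ' to &#x27;
    return f'{name}="{escape(value, quote=True)}"'


print(quoted_attr("title", 'The "best" widgets'))
```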
I would also suggest running your code through [validator.w3.org...] to find any typos and errors.
Is this other tool [gritechnologies.com] useful as well?
I have limited experience in parser creation, but I'm sure that Google must have a rule for finding the URL in the <A> tag? I.e., find the first " after HREF= and then take all the text until it reaches the next "? Or will it `intelligently` recognise the URL format?
All I want is a [spiders] or [does not spider] for:
a: <a href='http://www.widgetfinder.com'>Widgets</a>
b: <a href=http://www.widgetfinder.com>Widgets</a>
Since neither of us works for Google that shows that we are honest :)
There are, of course, a few "facts" that we all "know" are true, but please note the quotes. The vast majority of what you read here is opinion, whether or not the writer states so explicitly. Read, think, use your own judgement, and make your own decision.
Example b is NOT even valid HTML 4.01, as per my last post above: a URL has to be quoted because it contains a slash, and an unquoted attribute value may only contain letters, digits, hyphens, periods, underscores, and colons. So you should not ever use that form. Whether or not spiders can follow it is irrelevant; some browsers may choke on it.
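For reference, HTML 4.01 lets an attribute value go unquoted only if it consists of letters, digits, hyphens, periods, underscores, and colons. A rough sketch of that check follows; the function name may_be_unquoted is an assumption for illustration, and a real validator does far more than this.

```python
import re

# Characters HTML 4.01 permits in an unquoted attribute value:
# letters, digits, hyphens, periods, underscores, and colons.
UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")


def may_be_unquoted(value):
    """Return True if value may legally appear without quotes in HTML 4.01."""
    return UNQUOTED_OK.fullmatch(value) is not None


for value in ["foo.bar", "http://www.widgetfinder.com"]:
    verdict = "may be unquoted" if may_be_unquoted(value) else "must be quoted"
    print(value, "->", verdict)
```

By this rule foo.bar passes, while the full URL fails because of its slashes.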
If you write well-formed code, then all spiders should be able to follow it. You can check if your code is well-formed by using the validator tool I mentioned, and the spider view tool I linked to. The answer is to not write sloppy code!