Forum Moderators: Robert Charlton & goodroi
My big concern is that for some articles both the description and keywords (and perhaps even the title) will contain the same keywords, so that, together with the content of the article, the page might be flagged by G for repeating the keywords too often (keyword stuffing).
Has anyone done any research in this field?
1. title
2. meta description
3. meta keywords
I've been told to not include keywords unless they really do appear on the page itself. I would guess you could use other words to describe the page, though.
Repeating keywords too often seems spammy to me, but even if it is today, it might not be tomorrow, the way things change these days...
That's what I'm doing now with meta descriptions. Google never used to care about them at all; now they're enough to throw you into the supplemental index.
It's a pain in the neck, but it doesn't take that much longer to just do them when you build the page.
Meaning not the navigation but the meat of the document.
You will find that engines like Yahoo and MSN like this type of matching. Google does not really care, but it may just be enough of a score alteration to move you up a notch.
I'm not an authority, but I thought one purpose (historically) of the meta keywords list was to include terms that are equivalent to the terms used on your page but not actually found there. In this case the list helps the search engine spider get a better understanding of which keywords to associate with your page in its index.
Say you had a page describing features of electric miter saws. Another name used for these saws (somewhat incorrectly) is "chop saws." You might put the words "chop saws" in the keywords list even though those words are not found in the text of your page.
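For that miter-saw page, the synonym could sit in the keywords tag like this (the terms here are just for illustration):
<meta name="keywords" content="electric miter saws, miter saw features, chop saws">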
Historically, this synonym and misspelling business was a strange tangle. I think it was AltaVista that recommended it while Inktomi said they would penalize for it, both at the same time. But then Yahoo acquired both those properties, and that was that. They settled on the AV way.
Nothing is sure but change, and that's especially true in search. So I still fold in a meta keywords tag on most sites. I use it as a kind of memo pad for a small set of keywords that are truly specific to the page - words I hope the url can rank for. It's a good practice -- and some directories even require it. But Google currently doesn't care.
Your document should begin with a DOCTYPE declaration (this tells the browser what sort of HTML is in the file), followed by the <html> and <head> tags:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
.
For your page to actually be valid you MUST declare the character encoding used for the page (this lets the browser know whether to use Latin A-to-Z letters, or Chinese, Japanese, Thai, or Arabic script, or some other character set), with something like:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
There are other encodings too, such as UTF-8.
.
It is also a good idea to declare what human language the page is in, using:
<meta http-equiv="Content-Language" content="EN-GB">
The language and country codes come from ISO 639 and ISO 3166. This is useful for online translation tools as well. Change the "en" and "gb" to whatever language and country you need.
.
You need a <title> element for the page:
<title> Your Title Here </title>
This is displayed at the top of the browser window, and stored as the name of the bookmark if someone bookmarks the page URL in their browser. Most importantly, it is the <title> tag that is indexed and displayed by search engines on the search results pages (SERPs).
.
You need the meta description tag, as this is very important for search engines, and it is useful but not vital to have a meta keywords tag:
<meta name="Description" content=" Your Description Here. ">
<meta name="Keywords" content=" your, keyword, list, here ">
.
Most search engines do obey the robots meta tag. The default robots action is index, follow (index the page, follow all outbound links), so if you want something else (one of the other three combinations) then add the robots tag to the page in question. If you want to exclude whole directories then use the robots.txt file instead of marking every HTML file with the tag.
<meta name="robots" content="noindex,follow">
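To exclude a whole directory instead, a robots.txt file in the site root does it; for example (the directory name here is just a placeholder):
User-agent: *
Disallow: /private/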
.
The last parts of your header should have your links to external style sheets and external javascript files:
Use this if the stylesheet is for all browsers:
<link type="text/css" rel="stylesheet" href="/path/file.css">
Use this for a style sheet that you want to hide from older browsers, as some older browsers can crash when they encounter CSS:
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"> @import url(/path/file.css); </style>
Use this for the javascript:
<script type="text/javascript" src="/path/file.js"></script>
.
End the header with this:
</head>
<body>
and then continue with the body page code.
It is as simple as that.
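Putting the pieces above together, a minimal header looks like this (the title, description, keywords, and paths are placeholders):
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta http-equiv="Content-Language" content="en-gb">
<title> Your Title Here </title>
<meta name="description" content=" Your Description Here. ">
<meta name="keywords" content=" your, keyword, list, here ">
<link type="text/css" rel="stylesheet" href="/path/file.css">
<script type="text/javascript" src="/path/file.js"></script>
</head>
<body>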
But if I got it right, he says that he might be penalised for duplication:
My big concern is that for some articles both the description and keywords (and perhaps even the title) will contain the same keywords
I have the same issue with one of my sites, which is news-related. If you check Reuters, they use a single set of keywords and a single description site-wide on their articles, describing their business rather than the article in question.
I think it does cause duplicate-content problems. I am constantly having a go at the journalists to get them to write a unique description and keyword set for each article they put on the site. But since we are talking about people who can't see the importance of this, they keep pasting the opening paragraph into the description field, and because Firefox remembers the keywords field, they change one keyword and leave the rest as they were on an article they did a month ago (for example).
Maybe we need to add another meta tag like <meta name="meta-editor" content="A journalist entered these words">... This is a tough one, as it involves humans and not code.