Forum Moderators: mack
What is this tag?
<meta name="googlebot" content="index, follow">
A reference to the area on Google where the above tag is described and suggested would be appreciated.
In addition to the above, why do people utilize this Robots META Tag?
<meta name="robots" content="index, follow"> or <meta name="robots" content="all">
[google.com...]
Well, there's the name="googlebot" portion. I've only seen it listed on Google as a method to disallow Googlebot from crawling or indexing. As to the "index, follow": possibly over-zealous webmasters who are hoping that explicitly allowing Googlebot will improve their ranking or time between crawls?
>> Possibly over-zealous webmasters who are hoping that explicitly allowing Googlebot will improve their ranking or time between crawls? <<
Now we are getting somewhere. There is a whole new breed of SEOs coming into the marketplace that seem to want to create their own set of erroneous metadata. The index, follow for Googlebot is one of them. The myth now begins. I've seen over 100 instances of that tag in the last 5 days, all of them from a particular region of the world. This happens when someone misinterprets the guidelines and then decides to insert another robots-term that they think will have an influence on Googlebot.
You watch, a year from now, many newcomers will have that piece of metadata in their <head></head>s.
<meta name="googlebot" content="index, follow"> or this one for that matter...
<meta name="robots" content="index, follow">
I've searched, researched, dug holes, knocked down walls and I still can't find any authoritative references that suggest the use of index, follow in a META Robots Tag.
If I remember correctly, some time ago Inktomi's default behavior was to index, nofollow or something odd like that. I think they did suggest the use of the index, follow directive because of that issue. I never had any issues with Ink, so I never thought twice about using it.
Does either of these tags have any influence over the spider's behavior?
0 noindex,nofollow
1 noindex,follow
2 index,nofollow
3 index,follow
So, even though the last option doesn't actually accomplish anything (except yes, for Ink), it is there as a placeholder on the page in case a future change is needed (think campaigns). That's why I mentioned SSI and a central administration function for indexing control.
Jim
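A minimal sketch of the table lookup described in the post above, assuming each page stores one of the integers 0 to 3. The names `ROBOTS_TERMS` and `robots_meta_tag` are hypothetical and only illustrate the idea; they are not taken from any real CMS.

```python
# Hypothetical lookup table: the integer stored with each page selects
# the robots-terms string, mirroring the 0-3 list in the post.
ROBOTS_TERMS = {
    0: "noindex,nofollow",
    1: "noindex,follow",
    2: "index,nofollow",
    3: "index,follow",
}


def robots_meta_tag(index, name="robots"):
    """Return the META tag markup for a page's stored index value."""
    if index not in ROBOTS_TERMS:
        raise ValueError(f"unknown robots index: {index}")
    return f'<meta name="{name}" content="{ROBOTS_TERMS[index]}">'


print(robots_meta_tag(3))
print(robots_meta_tag(0, name="googlebot"))
```

With this layout, switching a page from indexed to blocked is a one-field change in the table, which is the "placeholder" benefit the post mentions.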
As long as it's not prefaced by "stink" or "jail", then I'll usually take it. :)
>> sites may be database-driven, and have an associated index into a table used to populate the page <<
Certainly a legitimate and easy-to-manage method. I can see its uses. But I've seen this show up on a few SEO forums & sites as a method for improving rankings. pageoneresults' prediction that it will become an SEO urban myth is already coming true.
jd, thanks for the explanation above. It makes sense but I've never seen an implementation like that.
Okay, I've put together some information to hopefully stop this one from becoming another element for the META tag generators out there. Most of this is extracted from Google's Webcrawler [google.com] information page with some additional information from The HTML Authors Guide to the Robots META Tag [robotstxt.org].
Googlebot Robots META Tag
The Robots META Tag for Googlebot is meant to provide users who cannot upload or control the /robots.txt file at their websites with a last chance to keep their content out of Google's indexes and services. The "robots" tag is obeyed by many different web robots. If you'd like to specify indexing restrictions just for googlebot, you may use "googlebot" in place of "robots".
<meta name="googlebot" content="robots-terms">
<meta name="robots" content="robots-terms">
Googlebot obeys the noindex, nofollow, and noarchive Robots META Tag. If you place the tag in the head of your HTML/XHTML document, you can cause Google to not index, not follow, and/or not archive particular documents on your site.
The content="robots-terms" is a comma-separated list used in the Robots META Tag for Google that may contain one or more of the following keywords without regard to case: noindex, nofollow and/or noarchive.
noindex
Document will not be indexed by Googlebot.
nofollow
Internal and external links in the document will not be followed by Googlebot.
noarchive
Google will not archive a copy of the document (Google's Cached Page).
If this Robots META Tag is missing, or if there is no content, or the robot terms are not specified, then the robot terms will be assumed to be "index, follow" (i.e. "all"), which is the default indexing behavior for most search engine spiders.
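The default rule above can be sketched as a small resolver. This is an illustration only, not Googlebot's actual logic: the function name `effective_robots_terms` and the flag names are hypothetical, and it handles just the terms mentioned in this thread.

```python
def effective_robots_terms(content=None):
    """Resolve a robots-terms string into index/follow/archive flags.

    A missing tag (None) or empty content falls back to the defaults:
    index and follow, with archiving allowed.
    """
    flags = {"index": True, "follow": True, "archive": True}
    if not content:
        return flags
    for term in content.split(","):
        term = term.strip().lower()  # terms are matched without regard to case
        if term == "all":
            flags["index"] = flags["follow"] = True
        elif term.startswith("no") and term[2:] in flags:
            flags[term[2:]] = False  # noindex, nofollow, noarchive
        elif term in flags:
            flags[term] = True       # explicit index / follow
    return flags


# An explicit "index, follow" resolves to exactly the same flags as no tag.
print(effective_robots_terms() == effective_robots_terms("index, follow"))
print(effective_robots_terms("NOARCHIVE, nofollow"))
```

The first comparison is the point of the thread: spelling out the defaults changes nothing.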
Examples of the Googlebot Robots META Tag
The tags to include and their effects are:
The robots term of noindex will produce the following effect: Googlebot will retrieve the document, but it will not index the document.
<meta name="googlebot" content="noindex">
The robots term of nofollow will produce the following effect: Googlebot will not follow any links that are present on the page to other documents.
<meta name="googlebot" content="nofollow">
The robots term of noarchive will produce the following effect: Google will not provide an archive copy of the document. Google maintains a cache of all the documents that it fetches, to permit users to access the content that was indexed (in the event that the original host of the content is inaccessible, or the content has changed). If you do not wish Google to archive a document from your site, you can place this tag in the head of the document.
<meta name="googlebot" content="noarchive">
You can also combine any or all of the above robots-terms into a single Robots META Tag for Google. For example:
<meta name="googlebot" content="noarchive, nofollow">
Misinterpretation of the Standards
Googlebot's default indexing behavior is to index, follow or all. The Robots META Tag below is neither required nor suggested in the Google guidelines, which clearly state that the use of the Robots META Tag is for restricting the indexing of content.
<meta name="googlebot" content="index, follow">
Utilizing erroneous metadata elements like the example shown above may not present a professional image to your peers and potential clients. It also adds weight to your pages that is not required. Inserting the additional code shifts the text-to-HTML ratio of your documents.
[edited by: pageoneresults at 2:21 am (utc) on July 16, 2004]
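For anyone who wants to audit their own pages for tags that only restate the defaults, here is a rough sketch using Python's standard `html.parser` module. The class name `RedundantRobotsFinder` and the helper `find_redundant` are made up for illustration, and the check assumes that "index", "follow" and "all" are the only default-restating terms.

```python
from html.parser import HTMLParser


class RedundantRobotsFinder(HTMLParser):
    """Collect robots/googlebot META tags that only restate the defaults."""

    DEFAULT_TERMS = {"index", "follow", "all"}

    def __init__(self):
        super().__init__()
        self.redundant = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() not in ("robots", "googlebot"):
            return
        content = attr.get("content", "")
        terms = {t.strip().lower() for t in content.split(",") if t.strip()}
        # Redundant when every term present is already the default behavior.
        if terms and terms <= self.DEFAULT_TERMS:
            self.redundant.append((attr["name"], content))


def find_redundant(html):
    """Return (name, content) pairs for default-only robots META tags."""
    finder = RedundantRobotsFinder()
    finder.feed(html)
    finder.close()
    return finder.redundant


page = """<html><head>
<meta name="googlebot" content="index, follow">
<meta name="robots" content="noarchive">
</head><body></body></html>"""
print(find_redundant(page))
```

A tag such as content="noarchive" is left alone, since it actually restricts something.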
You hear every once in a while of (prominent) websites being hacked and slogans like "Kilroy was here" inserted. Well, the hacker gets the honour and the "damage" is easily fixed.
Sometimes, hackers destroy or erase the content of the page. Also easily fixed if the backup works.
If I really want to HURT somebody, I'd make the break-in very silent, very low profile. And all I'd do was adjust the robots.txt to lock out all the spiders. I would even leave the top part of the robots.txt which usually allows all spiders, then insert 100 blank lines, and then lock out googlebot, inktomi & co.
I think it will take the average webmaster VERY long to find out about this, but in the meantime all his pages are slowly getting unindexed.
So, everybody here's checking their robots.txt now?
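A quick way to do that check is to run the file through Python's standard `urllib.robotparser` instead of eyeballing it. This is a sketch only: the helper name `blocked_agents`, the sample file and the agent name "ExampleBot" are made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def blocked_agents(robots_txt, agents, path="/"):
    """Return the user agents that may not fetch `path` under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


# The scenario from the post: an allow-all record at the top,
# 100 blank lines, then a record that locks out Googlebot.
tampered = (
    "User-agent: *\nDisallow:\n"
    + "\n" * 100
    + "User-agent: Googlebot\nDisallow: /\n"
)
print(blocked_agents(tampered, ["Googlebot", "ExampleBot"]))
```

Checking per user agent catches a lock-out hidden far below the visible top of the file.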
I don't get it... what is the "revisit-after" tag?
Revist-After META Tag - The Myth Continues in 2004 [webmasterworld.com]
<meta name="googlebot" content="robots-terms">
Uh, oh! A new meta myth is born!
You know all those folks who rarely read beyond the first paragraph of an article, and even then only every other word?
"What does that do?"
"I dunno. I found it on Webmaster World. It's supposed to improve your ranking in Google."
I wonder how long it's going to take for this "great new tag" to start showing up. :)
>> Yeah, and don't forget your all-important "Revisit-after" tag! ;) <<
Ah-ha, it's nice to know someone read that one. ;)
Yes, but it was your META Tags [webmasterworld.com] thread back in January, 2003 that I read, and posted about it [webmasterworld.com], too (msg #8, item #1).
That tag's been dead for a long time.
With the Web as large as it is now, and 7 million new pages every day, search engines spider your site when they decide to, and when they can get around to it. Thinking you can "tell" Google to re-spider a page every day with a meta tag is, um, delusional... Lisa's JediMindTrick tag will work just as effectively. Getting your pages to PR5 or above works a lot better. :)
Jim