|Is there such thing as html markup to tell googlebot what to crawl and what not to?|
| 4:22 am on Mar 28, 2006 (gmt 0)|
I wonder if there is such a thing as document markup you can put in your HTML file to tell Googlebot what part of a file to crawl and what part to ignore.
| 10:56 am on Mar 28, 2006 (gmt 0)|
(though there is some code that tells Google which part of a page it should use to judge what AdSense adverts to display)
| 11:07 am on Mar 28, 2006 (gmt 0)|
Yes there is: GoogleBot will crawl everything from the <HTML> to the </HTML> :-)
| 1:22 pm on Mar 28, 2006 (gmt 0)|
> is there such thing as html markup to tell googlebot what to crawl
No. Though I always get a chuckle when I see:
'robots revisit' or 'robots all' in the metas.
> and what not to?
robots.txt can help exclude 'good' robots. Search this site for robots.txt; you should end up at forum 93.
If you look at Brett's robots.txt, about 20 lines down you'll see:
Be very careful with this; Brett has chosen to exclude everything. Read up on it before implementing.
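If you want to check what a set of rules actually blocks before you upload it, here is a rough sketch using Python's standard urllib.robotparser. Both rule sets below are made up for illustration (they are not copied from Brett's file), and example.com is just a placeholder. Note that robots.txt works on whole URLs, so it cannot hide one part of a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule sets for illustration only; neither is a real site's file.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

BLOCK_ONE_DIRECTORY = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a rule-obeying robot may fetch url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for name, rules in [("block everything", BLOCK_EVERYTHING),
                        ("block one directory", BLOCK_ONE_DIRECTORY)]:
        for path in ("/index.html", "/private/notes.html"):
            url = "https://example.com" + path
            print(name, path, allowed(rules, "Googlebot", url))
```

The blanket rule shuts out every compliant robot from every URL, which is why it pays to test a narrower rule like the second one first.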
| 3:01 pm on Mar 28, 2006 (gmt 0)|
> sadly not.
Though it is a shame. I have one page in particular that looks as if it is keyword-stuffed in the open text - I'd love a <noindex> </noindex> markup tag.
What about a server-side include for the stuff that's not to be indexed?
| 3:08 pm on Mar 28, 2006 (gmt 0)|
SSI is inserted into the document before it leaves the server. The bot has no idea how the page was generated, so it all gets indexed, SSI or not.
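To make that concrete, here is a toy Python sketch of what the server does with an include directive. It is only a simulation: the fragment store, file name, and page source are invented for the example, and a real server such as Apache handles far more than this one directive.

```python
import re

# Hypothetical fragment store standing in for include files on the server.
FRAGMENTS = {"/keywords.html": "<p>widget widget widget</p>"}

# Matches the common SSI form: <!--#include virtual="/path" -->
INCLUDE_RE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(source: str, fragments: dict[str, str]) -> str:
    """Replace each include directive with the fragment it names."""
    return INCLUDE_RE.sub(lambda m: fragments.get(m.group(1), ""), source)


# Hypothetical page as it sits on disk, before the server processes it.
PAGE_SOURCE = (
    "<html><body><h1>Widgets</h1>"
    '<!--#include virtual="/keywords.html" -->'
    "</body></html>"
)

# This is the only version a browser or a bot ever receives.
served = expand_includes(PAGE_SOURCE, FRAGMENTS)

if __name__ == "__main__":
    print(served)
```

The directive is gone from the served page and the included text sits in the HTML like any other content, so the bot has nothing to tell it apart by.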
| 5:29 pm on Mar 28, 2006 (gmt 0)|