Is there such a thing as HTML markup to tell Googlebot what to crawl and what not to?

4:22 am on Mar 28, 2006 (gmt 0)

Full Member

10+ Year Member

joined:Apr 19, 2003
posts:282
votes: 0


I wonder if there is such a thing as document markup you can put in your HTML file to tell Googlebot which parts of the file to crawl and which parts to ignore.
10:56 am on Mar 28, 2006 (gmt 0)

New User

5+ Year Member

joined:Mar 9, 2006
posts:29
votes: 0


Sadly not.

(Though there is some code that tells Google which part of a page it should use when deciding which AdSense adverts to display.)

11:07 am on Mar 28, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Feb 21, 2005
posts:553
votes: 0


Yes there is: GoogleBot will crawl everything from the <HTML> to the </HTML> :-)
1:22 pm on Mar 28, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 31, 2004
posts:43
votes: 0


newbies:

is there such thing as html markup to tell googlebot what to crawl

No, though I always get a chuckle when I see:

'robots revisit' or 'robots all' in the metas.

and what not to?

robots.txt can help exclude 'good' robots. Search this site for robots.txt; you should end up at forum 93.

If you look at Brett's robots.txt, about 20 lines down you'll see:

User-agent: *
Disallow: /

Be very careful with this; Brett has chosen to exclude everything. Read up on it before implementing.
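
To see what a rule like that does before putting it live, here is a rough sketch using Python's standard urllib.robotparser. The example.com URLs, the "/private/" rule and the allowed() helper are made up for illustration; only the "Disallow: /" block comes from the post above.

```python
from urllib.robotparser import RobotFileParser

# The block quoted above: every compliant robot is shut out of the whole site.
BLOCK_ALL = """User-agent: *
Disallow: /
"""

# Hypothetical narrower rule, shown only for contrast.
BLOCK_ONE_DIR = """User-agent: *
Disallow: /private/
"""

def allowed(rules, agent, url):
    """Return True if a well-behaved robot named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)

if __name__ == "__main__":
    for name, rules in (("block all", BLOCK_ALL), ("block one dir", BLOCK_ONE_DIR)):
        for path in ("/index.html", "/private/notes.html"):
            url = "http://example.com" + path
            print(name, path, allowed(rules, "Googlebot", url))
```

Note that robots.txt works on whole URLs, not on parts of a page, and only robots that choose to obey it are affected.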

3:01 pm on Mar 28, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 11, 2002
posts:140
votes: 0


> sadly not.

Though it is a shame. I have one page in particular that looks as if it is keyword-stuffed in the open text - I'd love a <noindex> </noindex> markup tag.

What about a server-side include for the stuff that's not to be indexed?

3:08 pm on Mar 28, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


SSI is inserted into the document before it leaves the server. The bot has no idea how the page was generated, so it all gets indexed, SSI or not.
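
A toy illustration of that point, assuming a simplified include directive and an in-memory dictionary in place of real files (expand_includes and the file name /keywords.html are hypothetical, and this is not how any particular server implements SSI):

```python
import re

# Simplified pattern for an SSI include directive.
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')

def expand_includes(page, fragments):
    """Swap each include directive for the fragment text, roughly as a server would."""
    return INCLUDE.sub(lambda m: fragments[m.group(1)], page)

page = ('<html><body><p>Main text.</p>'
        '<!--#include virtual="/keywords.html" --></body></html>')
fragments = {"/keywords.html": "<p>widgets, gadgets, gizmos</p>"}

# This is all the bot ever receives: one flat HTML document.
served = expand_includes(page, fragments)

if __name__ == "__main__":
    print(served)
```

The served string contains the included text and no trace of the directive, so there is nothing left for a bot to treat differently.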
5:29 pm on Mar 28, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:June 6, 2005
posts:524
votes: 1


You can use JavaScript to write that section of the text. Since the spiders don't have JavaScript activated, that part will be 'invisible' to them.
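
A small sketch of why that works, assuming a spider that parses the HTML but never runs scripts, as described above. The sample page and the VisibleText / visible_text names are invented for the example; the parser is Python's standard html.parser.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collects the text seen by a parser that does not execute JavaScript."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script source is skipped; it is never run, so its output never exists.
        if not self.in_script and data.strip():
            self.chunks.append(data.strip())

PAGE = """<html><body>
<p>Indexable paragraph.</p>
<script type="text/javascript">
document.write("Text written by script.");
</script>
</body></html>"""

def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks

if __name__ == "__main__":
    print(visible_text(PAGE))
```

Only the plain paragraph is collected; the sentence inside document.write would appear on the page only for a client that actually runs the script.
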