On my website, I have about 5 pages that can be found only through an intranet site search. There is no link path to these pages. I could create a sitemap file if I wanted to. However, it would certainly be easier if I could just put these statements into my robots.txt:
allow: http://www.example.com/mypage1.html
allow: http://www.example.com/mypage2.html
allow: http://www.example.com/mypage3.html
Will this work?
[edited by: engine at 4:40 pm (utc) on Sep. 11, 2007]
Unfortunately, no. It's the robots exclusion protocol [robotstxt.org]: it only tells robots what to stay away from, so you can't use a robots.txt file as a pointer to content you want indexed.
A sitemap is your only option other than links, although personally I would link to the files from somewhere relevant.
That said, with all the adaptations to robots exclusion recently, maybe you just need to wait for a while ;)
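For what it's worth, a sitemap for a handful of pages is not much work. Here is a minimal sketch in Python that writes one for the three example URLs from the question. The function name build_sitemap and the PAGES list are just illustrative names, and the namespace is the one published in the sitemaps.org protocol; treat it as a starting point rather than a finished tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# The example URLs from the question above.
PAGES = [
    "http://www.example.com/mypage1.html",
    "http://www.example.com/mypage2.html",
    "http://www.example.com/mypage3.html",
]


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Print the document; redirect to a file such as sitemap.xml to save it.
    print(build_sitemap(PAGES))
```

If you save the output as sitemap.xml in the site root (the filename is an assumption here), the sitemaps.org protocol also lets you announce it from robots.txt with a line like "Sitemap: http://www.example.com/sitemap.xml". That is about as close as robots.txt gets to pointing at content, though support depends on the search engine.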
However, on a complex site where I need to document what IS allowed, I add those entries to robots.txt as comments:
Disallow: /forum/forumdisplay.php?page
Disallow: /forum/forumdisplay.php?sort
Disallow: /forum/forumdisplay.php?order
Disallow: /forum/forumdisplay.php?pp
Disallow: /forum/forumdisplay.php?daysprune
Disallow: /forum/forumdisplay.php?do
#  Allow: /forum/forumdisplay.php?f=
#  Allow: /forum/forumdisplay.php
Disallow: /forum/showthread.php?mode
Disallow: /forum/showthread.php?goto
Disallow: /forum/showthread.php?post
Disallow: /forum/showthread.php?page
Disallow: /forum/showthread.php?pp
Disallow: /forum/showthread.php?p
#  Allow: /forum/showthread.php?t=
#  Allow: /forum/showthread.php
Disallow: /forum/profile.php?do
#  Allow: /forum/profile.php
#  Allow: /forum/external.php
#  Allow: /forum/showprofile.php
#  Allow: /forum/announcement.php
#  Allow: /forum/faq.php
With TWO spaces after the #, the commented URLs line up in the same column as the Disallow URLs, which makes the file easier to read.
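If you want to confirm that the commented "Allow" lines are purely documentation and that the Disallow lines match by prefix, you can run the rules through Python's standard urllib.robotparser module. This is only a rough check using a shortened copy of the rules above. The "User-agent: *" line is an assumption, since the excerpt does not show which robots the rules apply to, and build_parser and is_allowed are made-up helper names. Real crawlers may interpret the file somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules above. "User-agent: *" is an assumption:
# the posted excerpt does not show which robots the rules apply to.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /forum/forumdisplay.php?page",
    "Disallow: /forum/forumdisplay.php?sort",
    "#  Allow: /forum/forumdisplay.php?f=",
    "#  Allow: /forum/forumdisplay.php",
    "Disallow: /forum/showthread.php?mode",
    "Disallow: /forum/showthread.php?p",
    "#  Allow: /forum/showthread.php?t=",
]


def build_parser(lines):
    """Parse robots.txt lines; comment lines are skipped by the parser."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def is_allowed(parser, url, agent="*"):
    """Return True if the given user agent may fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_LINES)
    for url in [
        "http://www.example.com/forum/forumdisplay.php?page=2",
        "http://www.example.com/forum/forumdisplay.php?f=3",
        "http://www.example.com/forum/showthread.php?post=5",
        "http://www.example.com/forum/showthread.php?t=10",
    ]:
        print(url, "allowed" if is_allowed(parser, url) else "blocked")
```

Note that the rules are prefix matches, so "Disallow: /forum/showthread.php?p" on its own already covers ?page, ?pp and ?post. The ?f= and ?t= URLs get through only because nothing disallows them, not because of the commented lines.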