So, to the problem:
At the moment we're working on some optimised information pages for a website that uses a CMS, and very few of its pages are being indexed. Current CMS-based pages have URLs like so: /dir/dir2/page.ext?id=number. Now theoretically, these should be indexed, at least by Google. They're not. I suspect this is due to the redirects they have in place at the index of various areas: typing in domain.com/dir re-routes me to a page with the aforementioned directory structure. I think this is done through some mystical Apache .htaccess lark, rather than meta tags. That'd be fine, if either were in the SE databases.
Anyway, the long and short of it is we're looking to insert our pages alongside the CMS, which in reality means needing at least one link from the CMS pages; but these aren't indexed. If I were to use mod_rewrite or some other fabled ISAPI filter to change the dir structure to, say, /page/id.ext, the pages would be much more likely to be spidered, right? This is the logic I'm working on, anyway. If we can then get a link from a newly indexed page, then all's fine and dandy, indeed?
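To make the idea concrete, here is a rough sketch of the mapping I have in mind, written in Python purely to illustrate the pattern. The function name, the `/dir/dir2` base and the `.ext` extension are all placeholders of mine, and on a real server this would be expressed as a rewrite rule rather than application code.

```python
import re

# Hypothetical "clean" URL shape shown to spiders: /<page>/<id>.ext
CLEAN_URL = re.compile(r"/(?P<page>[A-Za-z0-9_-]+)/(?P<id>\d+)\.ext")

def to_cms_url(path, base="/dir/dir2"):
    """Map a clean path such as /page/42.ext to the CMS form
    /dir/dir2/page.ext?id=42. Returns None if the path does not fit
    the clean pattern, so other requests pass through untouched."""
    m = CLEAN_URL.fullmatch(path)
    if m is None:
        return None
    return f"{base}/{m.group('page')}.ext?id={m.group('id')}"
```

So a request for /page/42.ext would be served internally from /dir/dir2/page.ext?id=42, while the spider only ever sees the parameter-free address.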
The other options were cloaking and off-site hosting, neither of which was as preferable as trying to make the entire site spiderable. I realise this question has undoubtedly come up before, but I must desire attention or something, as I thought it worth a post. I'm just kind of seeking confirmation and guidance before I suggest a plan of action, really.
Thanks muchly.
Addendum:
If they're not running Apache, and are running IIS or something else, is there a similar tool/filter/thing available?
If you haven't tried yet, do a site search [searchengineworld.com] for mod_rewrite. There are lots of good threads, including the IIS issue.
I shall run through the threads when I get a moment, thanks.