I have a dynamically generated site. All URLs are turned into search-engine-friendly URLs, like this:
[mysite.com...]
The folders before the /addfav folder could be ANYTHING, as the site is dynamically generated. The /addfav folder always appears 5 levels deep in the folder structure.
What I am after is for the SEs not to index the /addfav folder or anything below it.
From what I understand, wildcards cannot be used to specify subdirectories, only user-agents... so something like this would not work, correct?
User-agent: *
Disallow: /*/*/*/*/addfav/
What should my robots.txt look like to accomplish this?
Thanks in advance,
Dan
As you say that the site is generated, would you be able to add a
<meta name="robots" content="noindex"> to the pages you want to exclude? I reckon that would be a simpler and safer strategy.
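For example, if your generator can inject it into the <head> of just those pages (a sketch; the title is a placeholder, and the template mechanics are whatever your system uses):

<head>
  <title>Add to favorites</title>
  <!-- tells compliant crawlers to leave this page out of their index -->
  <meta name="robots" content="noindex">
</head>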
There is not even any content on the /addfav "page" for the SEs to index, but looking at the SERPs for my domain, many of these "pages" are in the index.
Is my request even possible with robots.txt or am I stuck with these worthless pages being indexed by the SEs?
Thanks,
Dan
As you are using Javascript to add to favorites in IE, could you make the href a direct link to the article, with an onclick event to handle the bookmarking?
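Something along these lines, for instance. window.external.AddFavorite() is the standard IE call for this; the URL and link text are just placeholders:

<a href="/articles/example-article/"
   title="Example Article"
   onclick="window.external.AddFavorite(this.href, this.title); return false;">Add to favorites</a>

The "return false" stops the browser from following the link when it is clicked, so users get the favorites dialog while spiders just see an ordinary link to the article itself, with no separate /addfav URL for them to index.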
Go to my homepage listed in my profile, then look at one of the articles. You will see where you can add an article to your favorites list.
That was getting a little bit off topic but thought I should clarify.
Any other thoughts on getting those pages out of the serps?
Thanks,
Dan
If you could restructure the link so the add-to-favorites action is a single script in the root, e.g.
[example.com...]
then you could easily exclude it like:
User-agent: *
Disallow: /addfav.php
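Your article pages would then link to it with the article passed as a parameter, something like this (addfav.php and the id parameter are just placeholders for whatever you would actually name them):

<a href="/addfav.php?id=123">Add to favorites</a>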
Besides that, Google does support wildcards in robots.txt paths as an extension to the standard, so you might want to try:
User-agent: *
Disallow: /*addfav*
Though I'm honestly not too sure about the "*" at the end.
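For what it's worth, Google documents that its rules match by URL prefix anyway, so the trailing "*" should simply be redundant. Wrapping the directory name in slashes would also stop it from catching pages that merely contain "addfav" somewhere in their names:

User-agent: *
Disallow: /*/addfav/

Crawlers that don't understand wildcards should treat that as a literal path and match nothing, so it ought to be harmless to them as well.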