Forum Moderators: goodroi
RewriteCond %{SERVER_PORT} !443
RewriteRule (.*) https://example.net/ [R]
in my .htaccess, yet my robots.txt will not forward to https.
Basically, my site is being crawled but not indexed like it was before, because the robots.txt and sitemaps are not being followed through the redirect.
Is there something I should add to the robots.txt to tell it to allow the redirect? Another thing: the few listings I do have on the net all take you to the home page. Sometimes my right blocks show and sometimes they don't.
Thanks for taking the time to read this.
Thank You,
Clyde
[edited by: tedster at 8:32 pm (utc) on Mar. 3, 2008]
[edit reason] use example.net - it can never be owned [/edit]
You've captured the requested URL with the pattern (.*), but have not accessed this via $1 in the redirection URL. Try
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(.*) https://example.net/$1 [R=301,L]
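To see why the missing $1 sends every request to the home page, here is a rough Python sketch of the substitution step. It is only a simplified model of the back-reference, not Apache itself, and the function name redirect_target is made up for illustration. It assumes the per-directory (.htaccess) case, where the path is matched without its leading slash.

```python
import re

def redirect_target(path, substitution):
    """Mimic how the pattern ^(.*) captures the path and fills $1."""
    match = re.match(r"^(.*)", path)
    # Without a $1 in the substitution, the captured path is simply dropped.
    return substitution.replace("$1", match.group(1))

# Original rule: no $1, so every request lands on the home page.
print(redirect_target("robots.txt", "https://example.net/"))
# Corrected rule: the requested path is carried over to https.
print(redirect_target("robots.txt", "https://example.net/$1"))
```

With the first form a request for robots.txt or a sitemap ends up at the home page, which would match the symptoms described.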
OK, confirmed: the robots.txt checker detects a redirect and refuses to follow it.
[tool.motoricerca.info...]
This is how it works using my robots checker:
http://example.net/robots.txt = redirect detected, please correct
[example.net...] = refresh of page no changes.
Google's robots checker on https = 200 success, and it lists my robots.txt
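If you want to see what a checker sees, a small script can request robots.txt once and report the raw status instead of following the redirect. This is a minimal sketch using Python's standard http.client, which does not follow redirects on its own. The helper names fetch_status and describe are hypothetical, and example.net stands in for the real host.

```python
import http.client

def fetch_status(host, path="/robots.txt", secure=False, timeout=10):
    """Request path once, without following redirects.

    Returns a (status, Location header or None) tuple.
    """
    conn_cls = http.client.HTTPSConnection if secure else http.client.HTTPConnection
    conn = conn_cls(host, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()

def describe(status, location):
    """Turn a status code and Location header into a one-line summary."""
    if status in (301, 302, 303, 307, 308):
        return f"{status} redirect to {location}"
    return f"{status} (no redirect)"
```

Calling describe(*fetch_status("example.net")) and then describe(*fetch_status("example.net", secure=True)) should show whether the http copy answers with a redirect while the https copy answers 200.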
Thanks for your time, and thank you for the welcome to WebmasterWorld.
What happened is that my search rankings dropped down to just a link or two. When you used the search links to enter my site, you were taken to the home page regardless of the URL, and the left side blocks were missing. My understanding was that this was because the robots.txt was not configured correctly to point to the new https.
Using your code has removed all that, yet it's too early to tell how I will be crawled and indexed. I removed my code yesterday so my site could start getting indexed again. Since removing the code I have gained an extra search ranking. Today I put your code in and am awaiting results.
I basically want the robots.txt to follow the https and enforce the rules set within it, while keeping my site's rankings in search. I could be wrong in where I posted this, but I was told it all depended on the robots.txt.
It certainly doesn't sound like you've been given very clear advice.
Robots.txt is a method to exclude spiders from accessing certain parts of a site. You can optionally use it to specify the URL of an XML sitemap for search engines to locate pages.
A sitemap is not necessary for pages to be included in search engine databases and ranked for keywords (I tend to avoid them in most cases, but that's just my personal preference). Similarly, you don't need a robots.txt to perform in search engines, or indeed to allow search engines to locate a sitemap (you can tell them where it is via the submission tools they provide for sitemaps).
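As an illustration of those two roles (exclusion and the optional sitemap pointer), here is a short sketch using Python's standard urllib.robotparser. The robots.txt content is invented for the example: the /admin/ path and the sitemap URL are assumptions, not taken from the site in question. The site_maps() call needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all spiders out of /admin/ and
# advertise an XML sitemap.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.net/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved spider skips the excluded area...
print(parser.can_fetch("*", "https://example.net/admin/edit"))
# ...but may fetch everything else.
print(parser.can_fetch("*", "https://example.net/articles/1"))
# The sitemap line is purely informational.
print(parser.site_maps())
```

Note that this only describes what a compliant spider will do. Robots.txt does not stop a bad bot from requesting excluded pages.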
Any chance you could clarify this issue you think you were having with the robots file?
Some robots checkers bring up my robots.txt; some don't.
I'm afraid some good spiders are being accused of being bad bots and banned, when I would love for them to crawl my site without letting the bad bots snoop around my site's admin files.
In one instance I was watching Google crawl my site, so I decided to see where exactly it was. It was on a page where you would make an edit. I don't think it should have been there. My being logged in as admin could be why the post appeared in edit mode. I didn't have a chance to test with an anonymous account.
Basically, I want the robots.txt to be obeyed, and I don't want to ban bots that are not on my Disallow list.
Sorry for the jumble; my keyboard goes crazy, bringing up commands or jumping around the page. Some shortcut or key combination on here is doing it.
Thank you for your time and help in this matter.
Thank You,
Clyde