Redirect robots.txt from http to https

Forum Moderators: goodroi

how to redirect properly.

clyde4210

8:16 pm on Mar 3, 2008 (gmt 0)

OK, I have added this code:

RewriteCond %{SERVER_PORT} !443
RewriteRule (.*) https://example.net/ [R]

in my .htaccess, yet my robots.txt will not forward to https.
Basically, my site is being crawled but not indexed like it was before, because the robots.txt and sitemaps are not being followed through the redirect.

Is there something I should add to the robots.txt to tell it to allow the redirect? Another thing: the few listings I do have on the net all take you to the home page. Sometimes my right blocks show, sometimes they don't.

Thanks for taking the time to read this.

Thank You,
Clyde

[edited by: tedster at 8:32 pm (utc) on Mar. 3, 2008]
[edit reason] use example.net - it can never be owned [/edit]

Receptional Andy

2:27 pm on Mar 4, 2008 (gmt 0)

Hi Clyde, and welcome to WebmasterWorld :)

You've captured the requested URL with the pattern (.*), but have not accessed this via $1 in the redirection URL, so every request is redirected to the home page. Try:


RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(.*) https://example.net/$1 [R=301,L]
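
As a variant of the rule above, and only as a sketch: assuming Apache 2.x with mod_rewrite enabled, and assuming the https site answers on the same host name that was requested, the same redirect can test the %{HTTPS} variable instead of the port number and reuse the requested host rather than hard-coding it. The RewriteEngine On line is included in case it is not already set elsewhere in the file.

```apache
# Sketch only: assumes mod_rewrite is enabled and this is the site's root .htaccess
RewriteEngine On
# Match requests that did not arrive over SSL/TLS
RewriteCond %{HTTPS} !=on
# Send the same host and path to https as a permanent (301) redirect
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```

If SSL is terminated by a proxy in front of Apache, %{HTTPS} may always read "off" and this would loop, so the port-based condition may be the safer choice in that setup.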

clyde4210

6:44 pm on Mar 4, 2008 (gmt 0)

Thank you,
I have yet to confirm that it works 100%, because I keep getting dropped while trying to communicate with the server. This could be a hosting issue or the rewrite; I am not sure just yet. It was working fine a second ago.
It redirected fine: I signed up for a new account and actually went to the rerouted confirm-account page.

OK, confirmed: the robots.txt checker detects a redirect and refuses to follow it.
[tool.motoricerca.info...]

This is how it works using my robots checker:
http://example.net/robots.txt = redirect detected, please correct
[example.net...] = refresh of page, no changes
Google's robots checker on https = 200 success, and it lists my robots.txt

Thanks for your time, and thank you for the welcome to WebmasterWorld.

Receptional Andy

7:12 pm on Mar 4, 2008 (gmt 0)

Just to confirm, the code I suggested will essentially force the use of https for every file on the site (including robots.txt). I assume (hopefully correctly!) that this was what you wanted to achieve?
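
If forcing https for robots.txt as well is not what is wanted, for instance because a checker refuses to follow the redirect, one possible variant leaves that single file reachable over plain http. This is only a sketch, assuming the rules live in the root .htaccess; the extra RewriteCond line is an addition for illustration and was not part of the code tested in this thread.

```apache
# Sketch only: assumes mod_rewrite is enabled and this is the root .htaccess
RewriteEngine On
# Only act on requests that did not arrive on the HTTPS port
RewriteCond %{SERVER_PORT} !^443$
# Hypothetical exception: leave /robots.txt reachable over plain http
RewriteCond %{REQUEST_URI} !^/robots\.txt$
# Everything else goes to the same path on https with a permanent redirect
RewriteRule ^(.*)$ https://example.net/$1 [R=301,L]
```

Bear in mind that crawlers treat robots.txt separately for each protocol and host, so a copy served over http would only govern the http URLs.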

clyde4210

7:56 pm on Mar 4, 2008 (gmt 0)

Yes, I wanted the robots.txt to enforce the rules I set within it, while at the same time allowing the sitemap to be utilized.

What happened is that my search rankings dropped down to just a link or two. When you used the search links to enter my site, you were taken to the home page regardless of the URL, and the left side blocks were missing. My understanding was that this was due to the robots.txt not being configured correctly to point to the new https.

Using your code has removed all that, yet it's too early to tell how I will be crawled and indexed. I removed my code yesterday so my site could start getting indexed again. Since removing the code I have gained an extra search ranking. Today I put your code in, and I await the results.

I'm basically wanting the robots.txt to follow the https and enforce the rules set within it, while keeping my site rankings within a search. I could be wrong in where I posted this, but I was told it all depended on the robots.txt.

Receptional Andy

8:23 pm on Mar 4, 2008 (gmt 0)

Hi again :)

It certainly doesn't sound like you've been given very clear advice.

Robots.txt is a method to exclude spiders from accessing certain parts of a site. You can optionally use it to specify the URL of an XML sitemap for search engines to locate pages.

A sitemap is not necessary for pages to be included in search engine databases and ranked for keywords (I tend to avoid them in most cases, but that's just my personal preference). Similarly, you don't need a robots.txt to perform in search engines, or indeed to allow search engines to locate a sitemap (you can tell them where it is via the submission tools they provide for sitemaps).

Any chance you could clarify this issue you think you were having with the robots file?

clyde4210

9:40 pm on Mar 4, 2008 (gmt 0)

Well, what's happening is I am getting a lot of abuse attempts or string attempts, 10 so far. I'm not wanting to use robots.txt to tell it to find a sitemap. I just want to make sure it's being obeyed correctly. As it stands now, I'm not too sure about that. I've put your code in place and I'll wait a day or so to see how things go.

Some robots checkers bring up my robots.txt, some don't.

I'm afraid some good spiders are getting accused and banned for being bad bots, when I would love for them to crawl my site, without letting the bad bots snoop around my site's admin files.

In one instance I was watching Google crawl my site, so I decided to see where exactly it was. It was on a page where you would make an edit. I don't think it should have been there. Me being admin could have made the actual post appear in edit mode. I didn't have a chance to test with an anonymous account.

Basically, I want the robots.txt to be obeyed and not ban bots that are not on my Disallow list.

Sorry for the jumble, my keyboard goes crazy with bringing up commands or jumps around a page. Some shortcut on here or key combination is doing it.

clyde4210

2:59 pm on Mar 6, 2008 (gmt 0)

Well, the code you have given me works wonders. I'd like to thank you for taking the time to deal with my mumble jumble. Most robots.txt checkers hate the redirect, but it's being obeyed, which is what I wanted. I have lost a lot of rankings since going https. I guess, with me not knowing anything about how things should work with https, sitemaps, robots.txt and a 301 .htaccess redirect, that is to be expected.

Thank you for your time and help in this matter.

Thank You,
Clyde