
Forum Moderators: Ocean10000 & phranque

Questions on Checking Changes to https

     
12:22 pm on Jul 19, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts: 3275
votes: 13



Again, give it time. If you have correctly installed the 301 to HTTPS and HTTP paths are no longer possible, indexing (including cache) will catch up eventually.


I've noticed this with my site. I changed to https about 2 weeks ago. I checked the cached version and it gave a 404.

I altered the said URL to http and it showed the cached page. The thing is, it showed yesterday's date - long after I changed to https.

How do I know I've set up the 301 correctly? I'm dumb when it comes to .htaccess

This is what mine says (where I have used a 301)....

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ https://www.example.us/$1 [R=301,L]

RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.us/$1 [R,L]

Could you please tell me if that is correct?
Thank you kindly.



[edited by: not2easy at 3:06 pm (utc) on Jul 19, 2018]
[edit reason] moved/cleanup [/edit]

3:56 pm on July 19, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3902
votes: 222


I should explain that the quoted material above is from a discussion titled "Google Cache returning 404" at [webmasterworld.com...]



It looks like only the index pages (including the homepage) were 301 (permanently) moved. The HTTPS rule uses a bare [R], and Apache's default for that is a 302 (temporary), which would likely cause indexing problems for the new URLs. There are literally thousands of threads here on the topic, but some short, simple answers may be found in a few discussions: [webmasterworld.com...]
or [webmasterworld.com...]

Once you upload the new htaccess file, try to visit a few of your old http pages. If they are not automatically loading as https, there is something wrong in the setup. If you are able to check the live headers during a page load, you can see where the problem lies. If not, look through your most current log files and see whether your requests for http pages show a 200, a 301 or a 302.

Also remember that order counts. If your pages are static html pages, these should be the last lines in your htaccess file after all other rewrites.
4:13 pm on July 19, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts: 3275
votes: 13


Hi there,

Thank you for getting back to me.

I don't understand Apache, and it's not something I use every day.

Everything was working prior to the swap to https.

This...
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ https://www.example.us/$1 [R=301,L]

is supposed to redirect
example.us/index to
example.us/
which it still does.

I googled the answer how to redirect the site to https using .htaccess

RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.us/$1 [R=301,L]

(yes, the last rewrite rule differs from the one in the original post because I didn't copy it from the live version)

Google site command shows the URLs as https, and everything is being redirected that I can tell.

If I type http, it will redirect to https, no matter what URL I choose.

It's Google not showing a cached version of the site that is concerning me.

Thank you.
4:26 pm on July 19, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3902
votes: 222


The order I would use given your examples above is:
RewriteEngine On 
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ https://www.example.us/$1 [R=301,L]

RewriteCond %{HTTP_HOST} !^(www\.example\.us)?$
RewriteRule (.*) https://www.example.us/$1 [R=301,L]

RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.us/$1 [R=301,L]

The main difference is the "[R=301,L]" flag which means permanent vs. your "[R,L]" which means moved temporarily. Without specifying the 301, Apache defaults to 302.
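
To make the difference concrete, here is the same HTTPS rule written both ways. This is only an illustration using the www.example.us placeholder from this thread; the first form is commented out because only one of the two should be active.

```apache
# Temporary: a bare [R] sends "302 Found"
# RewriteCond %{SERVER_PORT} 80
# RewriteRule ^(.*)$ https://www.example.us/$1 [R,L]

# Permanent: [R=301] sends "301 Moved Permanently"
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.us/$1 [R=301,L]
```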

4:30 pm on July 19, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts: 3275
votes: 13


It was a mistake in my first post. I copied it from what I thought was the most recent .htaccess that I have offline. I noticed that the 301 was missing, so I went to the server, pulled the file from there, and copied what was in the live .htaccess.

My live version is
RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.us/$1 [R=301,L]
4:45 pm on July 19, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15019
votes: 665


is supposed to redirect
example.us/index to
example.us/
which it still does.
Typo for index.html, I hope. My personal suggestion for an index redirect is:

RewriteCond %{REQUEST_URI} ^/((?:\w+/)*)index\.html
RewriteRule index\.html$ https://example.com/%1 [R=301,NS,L]
If you really have both html and htm, then use html? as in the first post of this thread. Otherwise the ? isn't needed. (For a wonder, search engines will not randomly try index.htm, only index.html.) If any of your URLs contain non-word characters, replace \w with [\w-] or whatever is appropriate.

The reason I do it this way is that the act of capturing uses a certain amount of server resources--and then on 99 requests out of 100, the capture ends up being thrown away when the server reaches the end of the request and does not see an “index.html”. Instead, defer the capture to the Condition. The [NS] flag means “don’t process this rule, and don’t evaluate its conditions, on subrequests”--in this case, the request from mod_dir that changes /dir/ back to /dir/index.html.

Although HTTPS would seem to be an absolute binary toggle--it's either ON or OFF--many people hereabouts suggest expressing the condition as
RewriteCond %{HTTPS} !on
just-in-case. But note also that this redirect has two conditions, separated by OR, because you're adding the new HTTPS redirect to your existing with/without www redirect to make a new improved canonicalization redirect. One rule, two Conditions, rather than two separate rules leading potentially to two separate redirects.
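
The combined rule itself is not shown in this post. A minimal sketch of what a single canonicalization rule with two OR'd Conditions could look like, assuming www.example.us (the placeholder used elsewhere in this thread) is the one canonical hostname:

```apache
# Redirect if the request is not HTTPS, OR if the host is not the canonical one.
# www.example.us is a placeholder; substitute the real hostname.
RewriteCond %{HTTPS} !on [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.us)?$
RewriteRule (.*) https://www.example.us/$1 [R=301,L]
```

With this form, a request for http://example.us/page.html reaches https://www.example.us/page.html in one redirect rather than two.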
5:07 pm on July 19, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts: 3275
votes: 13


Sorry, my mistake. My head is a mess atm. I can't seem to take anything in.

My .htaccess is at present...

Options +FollowSymLinks
RewriteEngine On
#
# redirect index.htm and index.html to / (do this before non-www to www)
#
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ https://www.example.us/$1 [R=301,L]

(I've only changed the site name to "example" for here.)

All of my pages are .html except for the 404 403 and 500 pages which are...
ErrorDocument 404 /404.shtml
ErrorDocument 500 /500.shtml
ErrorDocument 403 /403.shtml

I have some URLs that have numbers in them. Are they classed as "non-word characters" ?

Just to clear up any errors I've made in my posts, and to clear things up for you all...

example.us/index.html redirects to
example.us/ without problems.

If I type any URL using http, they all redirect to https.

My pages are all static.

Not2easy said...
"Also remember that order counts. If your pages are static html pages, these should be the last lines in your htaccess file after all other rewrites."

Does that include things like...

RewriteEngine on
RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?google\.com [NC]
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?bing\.com [NC]
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?yahoo\.com [NC]
RewriteCond %{HTTP_REFERER} !^https://example\.us/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://www\.example\.us/.*$ [NC]

RewriteRule .*\.(gif|jpe?g|png)$ - [F,NC,L]

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https://(www\.)?example\.us/.*$ [NC]
RewriteRule \.(gif|jpg|jpeg|bmp|zip|rar|mp3|flv|swf|xml|php|png|css|pdf)$ - [F]

RewriteEngine on
# Options +FollowSymlinks
RewriteEngine On
RewriteCond %{HTTP_REFERER} ANOTHERSITE\.info [NC,OR]
RewriteCond %{HTTP_REFERER} ANOTHERSITE2\.info [NC,OR]
RewriteCond %{HTTP_REFERER} ANOTHERSITE3\.info [NC,OR]
RewriteCond %{HTTP_REFERER} ANOTHERSITE4\.net [NC,OR]
RewriteCond %{HTTP_REFERER} ANOTHERSITE\.com [NC]
RewriteRule .* - [F]


deny from .....(removed IP)
deny from .....(removed IP)
deny from.....(removed IP)
deny from.....(removed IP)


If I read you correctly, my

RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.us/$1 [R=301,L]

should go just above the "deny from"?

Thank you both for trying to help out a thicko on this.
6:16 pm on July 19, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15019
votes: 665


I have some URLs that have numbers in them. Are they classed as "non-word characters" ?
A “word character” in RegEx-speak is any alphanumeric and also _ (lowline) but not hyphen. So, within the pure-ASCII realm,
\w = [A-Za-z0-9_]
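
Underscores are therefore already covered, and only the hyphen needs adding. If directory names can contain hyphens, the index redirect suggested earlier could be adapted along these lines (a sketch only, using the www.example.us placeholder from this thread as the target):

```apache
# Same index redirect, but directory names may also contain hyphens.
# [\w-] = letters, digits, underscore, or hyphen
RewriteCond %{REQUEST_URI} ^/((?:[\w-]+/)*)index\.html
RewriteRule index\.html$ https://www.example.us/%1 [R=301,NS,L]
```

The character class only constrains the directory segments in front of index.html; other file names on the site are not affected by it.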
7:00 pm on July 19, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts: 3275
votes: 13


Thank you, Lucy.

Some URLs do contain an underscore _ or a hyphen.

Thank you for all your help. I will look at this again when my brain isn't so fried.

I've tried a number of times over the years with Apache, and it always leaves me more confused than before I started.
5:07 am on July 20, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3902
votes: 222


should go just above the "deny from"?
In your current format, yes.

It is convenient to leave a blank line between each "set" of rewrite rules, so that it is easier to edit/change/remove/work on each Cond/Rule set individually. It lowers the chance for errors. That said, you only need to set this part once before beginning the rewrites section:
Options +FollowSymLinks
RewriteEngine On

Once it is on, it stays on and doesn't need that for each rule. It doesn't hurt anything but it isn't needed.

The anti-hotlinking rewrite sets should come before your index and canonical rewrites and personally I prefer the deny lines before the rewrites as well. Of course if you have a custom error page for your 403, you should be sure to allow that page for all visitors, to avoid server errors.
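
Put together, the ordering described above might be laid out roughly as follows. This is a sketch rather than a drop-in file: www.example.us is the placeholder from this thread, 192.0.2.1 is a documentation-only address standing in for the removed IPs, the hotlinking set is shortened to a single example, and the <Files> block is one common way (in the Apache 2.2-style access syntax used here) to keep the custom 403 page reachable.

```apache
# Set once, at the top of the rewrite section
Options +FollowSymLinks
RewriteEngine On

# 1. Deny lines (placeholder address)
deny from 192.0.2.1

# Let everyone fetch the custom 403 page, so a denied visitor
# gets the error page rather than a second error
<Files "403.shtml">
    Order allow,deny
    Allow from all
</Files>

# 2. Anti-hotlinking: refuse image requests from foreign referers
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https://(www\.)?example\.us/ [NC]
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]

# 3. Index redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ https://www.example.us/$1 [R=301,L]

# 4. HTTP to HTTPS, last
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.example.us/$1 [R=301,L]
```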

You probably already keep a copy of your htaccess file before editing it. If not, that's a good habit to have so that in a worst-case boo-boo you can always put back the old one in its old form if something goes sideways.

12:18 am on July 21, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts: 3275
votes: 13


Sorry for the delay. I've been tied up with things, and I still haven't got over my fried brain.

I do leave a blank line between each rule, and yes, I do have a backup of the .htaccess file, but I also have slightly older ones. I need to clear the older ones out so as not to make the mistake again of copying and pasting from the wrong one.

Thank you for your help on this.

On a side note, I used the same .htaccess file for my other site (different URLs of course).

On that site, Google shows the cached page in the SERPs, yet my main site doesn't.
3:11 am on July 21, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3902
votes: 222


If you are still uncertain about whether the pages are all redirected via 301, you can always type or paste some of your http URLs into your browser's address bar. It helps to know your IP address for this, so that you can find your own requests afterwards. To verify that your site returns the https pages via 301, you can either view your access logs or, on many hosts' cPanel setups, look at "Recent Visitors" in cPanel. Look for the 301 server response on the requests you made for those pages.

From your most recent comment, it sounds more like what happens when Google decides to evaluate your site for inclusion in the Mobile First index. If that is what is causing the problem, you will be receiving an email shortly telling you about their change. If that is what is going on, there isn't more to do except wait a few weeks.
7:34 am on July 21, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 30, 2006
posts: 3275
votes: 13


The pages 301 okay. They did before I even wrote and asked here.

My site isn't mobile friendly. It was written long before they came onto the scene. I was speaking to a friend about this problem the other day. He checked his sites and found he had the same problem... some were cached by google, others not... and they were done 18 months or so ago.

I'll wait and see what's going to happen as I have more pressing things going on atm with Adsense.

Thank you for your help on this.
4:44 pm on July 21, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15019
votes: 665


It was written long before they came onto the scene.
Paradoxically, a very very very old site may be more mobile-friendly than a newer one. HTML is responsive by default. You have to go out of your way to code it not to be, for example by setting an explicit width of 2200px because of course all humans have windows at least that wide.
 
