So confused about htaccess

Help very much appreciated

     
10:53 pm on Jul 12, 2014 (gmt 0)

Junior Member from GB 

10+ Year Member Top Contributors Of The Month

joined:May 24, 2006
posts: 89
votes: 5


I've been going around in circles today with my Drupal website and I'm really worried that the "support" team that is helping me has made a mistake, but I don't understand enough to be 100% sure or to know what to do about it.

The problem started this morning when I noticed that the Google custom search box on my site isn't picking up any content that I've added in the last few weeks. I went to Google Webmaster Tools to check my sitemaps were being indexed, and they are. So I used "fetch as Googlebot" to check one of my recent pages, http://www.example.com/example-page

The response was "redirected" with the following information:

HTTP/1.1 301 Moved Permanently
Server: Apache/2.2.15
Location: http://example.com/example-page
Cache-Control: max-age=1209600
Expires: Sat, 26 Jul 2014 11:26:00 GMT
Content-Type: text/html; charset=iso-8859-1
Content-Length: 255
Accept-Ranges: bytes
Date: Sat, 12 Jul 2014 22:29:38 GMT
X-Varnish: 1728007996 1727157242
Age: 39819
Via: 1.1 varnish
Connection: keep-alive

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://example.com/example-page">here</a>.</p>
</body></html>

My site has always used www. at the front of the domain, and I haven't made any changes myself. I don't know why there should be a 301 redirecting the www to a non-www all of a sudden. In GWT my site is listed as http://www.example.com.

So I emailed my support people with the above information and asked whether somebody might have made some changes for any reason, or perhaps the change might have happened during an update.

Instead of receiving a yes / no answer, I got this response:

<< I've just modified .htaccess file so that your website is indexed both with and without "www". Please let us know if you want it to be available only through "www". >>

I really don't understand the .htaccess file but have a really bad feeling that making it both with and without www is absolutely the wrong thing to do? Am I right in thinking that would mean I have effectively two versions of each page to be crawled?

So I guess my two most immediate questions are

1. Should they have made these changes to the htaccess?

2. If not, what should I ask them to do to fix the problem?

Obviously I've also still got to find out why all my pages are suddenly being redirected away from the www. And why my search box isn't working anymore!

Any help would be much appreciated. I've had a hard time with my website recently anyway and I'm terrified that today's change to the htaccess will be the final straw.

Smallp

[edited by: incrediBILL at 1:41 am (utc) on Jul 13, 2014]
[edit reason] Please use EXAMPLE.COM for all domain names [/edit]

11:23 pm on July 12, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member aristotle is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 4, 2008
posts:3524
votes: 324


Did you previously have a re-direct in the opposite direction, from non-www to www?
11:27 pm on July 12, 2014 (gmt 0)

Junior Member from GB 

10+ Year Member Top Contributors Of The Month

joined:May 24, 2006
posts: 89
votes: 5


Yes, when I moved the site to Drupal a year ago I specified that I wanted www (as it had been www previously).

Today I noticed that the address bar was defaulting to domain.co.uk, though, so something must have been changed.
11:39 pm on July 12, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member penders is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2006
posts: 3123
votes: 0


I've just modified .htaccess file so that your website is indexed both with and without "www".


If they've "modified" your .htaccess file to enable this then that would usually mean they have deleted (or commented out) an existing redirection?

My site has always used www. at the front of the domain, ...


Have you therefore always redirected the bare domain to www?

...making it both with and without www is absolutely the wrong thing to do? Am I right in thinking that would mean I have effectively two versions of each page to be crawled?


Well, potentially, yes. Google could then return either www or non-www in the search results. You can specify this preference in GWT - however - if a redirection states the contrary then this is not going to work.

You should register both the www and bare domain in GWT.

Post your .htaccess file (or the relevant parts from it if it's big) so we can have a look.

Also, clear your browser cache! Your browser will cache redirects, so it will be difficult to check if this has changed.
3:58 am on July 13, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15443
votes: 738


<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

This part makes me uneasy, because why is the server returning a page along with the 301 header at all? (Let alone HTML 2-- two! --instead of the expected HTML 3 that you normally see in server-generated pages like auto-indexes.)
10:38 am on July 13, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member penders is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2006
posts: 3123
votes: 0


This part makes me uneasy...


This actually looks fairly standard to me? It's the same as what my site returns, webmasterworld.com, and... your site!?
12:37 am on July 14, 2014 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11566
votes: 182


According to RFC2616:
Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).
12:40 am on July 14, 2014 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11566
votes: 182


1. Should they have made these changes to the htaccess?
2. If not, what should I ask them to do to fix the problem?

1. No
2. Change it back so all hostname requests are canonicalised to www.example.co.uk
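
For anyone landing here later, the kind of rule usually meant by "canonicalised to www" looks roughly like the sketch below. It is not the poster's actual file, which was never shown in this thread. It assumes mod_rewrite is enabled, that the block sits in the site's root .htaccess above Drupal's own rewrite rules, and that www.example.com stands in for the real hostname.

```apache
# Sketch only: hostname and placement are assumptions, not taken from the real site.
<IfModule mod_rewrite.c>
  RewriteEngine On
  # Only act when a Host header is present...
  RewriteCond %{HTTP_HOST} .
  # ...and it is anything other than the canonical www hostname.
  RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
  # Permanently redirect to the same path on the www host
  # (the query string is carried over automatically).
  RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]
</IfModule>
```

Any rule redirecting www to the bare domain would need to be removed at the same time, otherwise the two redirects will loop. The headers in the first post (Via: 1.1 varnish, a non-zero Age, and a 14-day max-age) also suggest the old 301 is being served from cache, so it may persist until the cached copy expires or is purged.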
1:28 am on July 14, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15443
votes: 738


the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).

Ah. So it's just FF being cute by not showing me everything that gets returned, even though that's supposed to be the whole point of the extension.

In unrelated news, I've just learned that headers are supposed to send line endings in CRLF form (two characters instead of one). Wasn't sure if it was Firefox or the server. I recently met a robot that appended \r (the literal character) to all its redirected requests, which naturally made me curious.

HTML 2 is still pretty funny, though. Every auto-index I've ever seen used HTML 3, and you'd expect them all to be the same.
7:45 am on July 14, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member penders is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2006
posts: 3123
votes: 0


HTML 2 is still pretty funny...


Well, it is a valid HTML 2 document - and this is all the DOCTYPE is saying. So there would seem to be no good reason to specify it as something greater.

Whereas an "auto-index" document uses additional HTML 3.2 elements (TABLE, HR, ...) and attributes (ALIGN, VALIGN, COLSPAN, ...) so requires the higher DOCTYPE.
3:53 pm on July 14, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15443
votes: 738


<topic drift>
Tables didn't exist in HTML 1 and 2? Who knew :) I don't think I even want to know what they added between 1 and 2, then. I do remember that 3 didn't have entities; that was a 4 addition.

Anyway, it's a useful thing to keep in mind when people worry that the growth of HTML 5 could mean that aspects of their HTML 4 pages will stop working.
</topic drift>
4:22 pm on July 14, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member penders is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2006
posts: 3123
votes: 0


<topic drift>
> "Who knew"
- "HTML for Dummies Quick Reference, 2nd Edition, 'NEW! Revised & Updated!' (C)1997"
> "I do remember that 3 didn't have entities; that was a 4 addition."
Weelll, the book says... 3.2 has lots of entities! :)
</topic drift>
8:52 pm on July 14, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15443
votes: 738


1997, heh, so the book was published about 5 minutes before 4.0 came in. Which in turn must have been about 5 minutes before I made my first html page, because years later when I first met the validator, it politely suggested that I seemed to be using 4.01 Transitional. The horse's mouth [w3.org] says
Warning! HTML 3.2 was superseded by HTML 4.0 in December, 1997.

No kidding. 3.1 seems to have been just as invisible as Apache 2.1 and 2.3. (They do explain 3.0.)

When I said "entities" I should have specified named entities. The ones that are referenced as "HTML 4 entities [w3.org]". I vividly remember that MSIE 5 wouldn't display characters unless they had a named HTML4 entity. You didn't have to use the name-- UTF-8, decimal and hexadecimal all worked-- but there had to be one. And even then, it made a right mess of Greek letters.
<!-- Character entity set. Typical invocation:
<!ENTITY % ISOlat1 PUBLIC
"ISO 8879-1986//ENTITIES Added Latin 1//EN//HTML">
%ISOlat1;
-->
<!ENTITY nbsp CDATA "&#160;" -- no-break space -->

et cetera. What fun! So's the .ent extension. I had to go look that up, though the file opened happily in a text editor.*

Someone in a different thread recently discovered that some mobiles won't recognize decimal entities, only hexadecimal. Happily there's no reason to use entities at all.


* SubEthaEdit and GraphicConverter can both open more-or-less everything... including each other's files. This can be unnerving when I've clicked on the wrong icon.
9:37 pm on July 14, 2014 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11566
votes: 182


as invisible as Apache 2.1 and 2.3

the apache minor version numbers alternate between development (odd: 2.1, 2.3) and production (even: 2.2, 2.4) releases.
you would only see 2.1 and 2.3 on development/testing servers.
 
