| 5:48 pm on Jul 9, 2012 (gmt 0)|
Do the URLs with a slash after them resolve, or do they 404?
I would take a look at your site structure; it sounds like you may have an issue with redirects or something similar.
| 6:51 pm on Jul 9, 2012 (gmt 0)|
Sounds very similar to this thread: [webmasterworld.com...]
Given two threads on the topic, maybe Google has a bug? Connect with the original poster there and see if they resolved the issue.
| 10:36 pm on Jul 9, 2012 (gmt 0)|
The page does not 404. It simply doesn't show any images.
| 10:53 pm on Jul 9, 2012 (gmt 0)|
I'd 301 redirect that bad page and add a canonical tag on there to ensure that Google knows which one matters. They should disappear from the index before too long.
| 12:41 am on Jul 10, 2012 (gmt 0)|
I have thousands of .php files, and Google is starting to affect many of them.
| 12:53 am on Jul 10, 2012 (gmt 0)|
You can write one simple rewrite rule to redirect them to a version without the slash.
Better to do it through Apache (or whatever web server you're using) than in php anyway. Doing it in php means unnecessary load and a slower redirect. And for a redirect like this there really is no need to do it in php.
| 1:03 am on Jul 10, 2012 (gmt 0)|
I just started noticing this on my site too so I am adding rel canonical to all sections of my site.
@rango mentions 301 redirect as well as canonical but isn't that redundant? Shouldn't one or the other suffice?
| 1:11 am on Jul 10, 2012 (gmt 0)|
The 301 is just for this known situation. The canonical should help protect against future problems.
For the user who lands on one of those bad pages, the 301 is definitely nicer. They aren't going to be looking at your canonicals ;) And hey, the more good signals to Google the better.
That other thread also mentioned canonical tags not helping with this specific problem mind you.
| 7:44 am on Jul 10, 2012 (gmt 0)|
|The page does not 404. It simply doesn't show any images. |
:: peering into crystal ball ::
The images use relative links, which worked fine as long as you had
side by side with
but now that the user-agent thinks you're in
it goes looking for
| 9:17 am on Jul 10, 2012 (gmt 0)|
Change the links to images to the form
href="/images/file.png" (add a leading slash and the full path to the file) to fix the problem.
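To illustrate why the leading slash matters, here is a minimal sketch using Python's standard urllib.parse.urljoin, which resolves links by the same rules a browser follows. The page path /gallery/page.php and the relative form images/file.png are made up for illustration; only /images/file.png comes from the advice above.

```python
from urllib.parse import urljoin

# Hypothetical page URLs: the intended one and the trailing-slash variant.
GOOD_PAGE = "http://www.example.com/gallery/page.php"
SLASH_PAGE = "http://www.example.com/gallery/page.php/"

def resolve(page_url, link):
    """Resolve a link found on page_url the way a browser would."""
    return urljoin(page_url, link)

# A relative link depends on where the browser thinks the page lives.
relative_ok = resolve(GOOD_PAGE, "images/file.png")
relative_broken = resolve(SLASH_PAGE, "images/file.png")

# A link with a leading slash resolves from the site root in both cases.
absolute_ok = resolve(GOOD_PAGE, "/images/file.png")
absolute_slash = resolve(SLASH_PAGE, "/images/file.png")

print(relative_ok)      # under /gallery/
print(relative_broken)  # under /gallery/page.php/, where nothing exists
print(absolute_ok)
print(absolute_slash)   # same URL as absolute_ok
```

With the trailing slash, the browser treats page.php as a directory, so the relative image request goes to a path that does not exist, while the site-absolute form is unaffected.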
By the way, the thread title is misleading. Google didn't "change" your URLs. When Google accessed those incorrect URLs (from a link they found somewhere on the web, either on your site or on some other) a design error on your site meant they returned 200 OK and were therefore indexed.
| 2:25 pm on Jul 10, 2012 (gmt 0)|
lucy24, I read somewhere that using fixed links to resources is bad.
Can someone confirm?
| 2:27 pm on Jul 10, 2012 (gmt 0)|
g1smd sorry it was misleading ("the title"). (my God, I'm starting to write as Yoda).
English is not my main language, and I found it difficult to explain the situation in one line. I was going to add more info to the title, but there is a limit on the number of characters you can add.
| 5:34 pm on Jul 10, 2012 (gmt 0)|
|I read somewhere that using fixed links to resources is bad. |
Depends whom you ask. Right here in this very thread, you'll see g1 telling you to use site-absolute links for everything-- that is, the kind with / at the beginning. Conversely I'm all for keeping things in packages-- page plus associated files-- so the relative links stay the same even if the package is moved.
If your site is designed from scratch with all the images in one place, then absolute links will always work.
There are some situations where you must use absolute links, notably in error documents.
Oh, and google may not be literally changing your URL. But it has a solid history of inventing URLs out of its fevered imagination-- or out of willful misreading of anchor text, which amounts to the same thing. And depending on extension, the extra bits may calmly attach themselves to the URL. And then you're stuck with two.
| 8:07 pm on Jul 10, 2012 (gmt 0)|
RewriteRule (.*)\.php/ $1.php [R,L]
I hope I'm not screwing my fate.
| 8:17 pm on Jul 10, 2012 (gmt 0)|
Make that [R=301,L] to make it a permanent redirect. Other than that, I think it should work for you.
| 9:07 pm on Jul 10, 2012 (gmt 0)|
| 9:26 pm on Jul 10, 2012 (gmt 0)|
Add the protocol and domain name to the rule target. Never start the rule target with a backreference. It's a huge security risk.
Replace (.*)\.php/ with a more efficient pattern.
RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$1.php [R=301,L]
This parses left to right in one go and does not invoke tons of "back off and retry" trial matching.
| 9:45 pm on Jul 10, 2012 (gmt 0)|
Edit: Proving once again that I type much slower than g1...
Do you think there is the least possibility that your Apache installation does not include mod_rewrite? There's a thought to make the blood run cold. The "IfModule" envelopes are for boilerplate htaccess that comes with CMS packages whose designer has no idea where they will be used. Once you're on an individual site, you either have a given module or you don't. If it exists but the AllowOverride settings don't let you use it, change hosts :)
.* should be expressed as
so the server doesn't have to backtrack after capturing the entire request and then learning that it was supposed to leave room for .php at the end. (Apache works only in one dimension. It can't see what's coming up ahead.) Opening anchor so it can't cheat by ignoring any earlier full stops-- not that there should ever be any in mid-URL. Unless, ahem, your name is apache dot org
+ rather than * because if you get a request for www.example.com/.php/ then the slash is the least of your problems.
Does it also attach / to the names of index pages? If so, you need to do some ruthless redirecting, because "index.php" should never occur at all:
RewriteRule ^(([^./]+/)*)index\.php/? http://www.example.com/$1 [R=301,L]
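As a rough way to check what that index rule would do, the sketch below runs the same pattern through Python's re module. It assumes the .htaccess behaviour where the rule sees the path without its leading slash; the sample paths and the redirect_target helper are hypothetical, and Python stands in for Apache here, so treat it as an approximation and not a test of the live server.

```python
import re

# Pattern copied from the rule above. In .htaccess context mod_rewrite matches
# against the path without its leading slash, which is assumed here.
INDEX_RULE = re.compile(r"^(([^./]+/)*)index\.php/?")
SITE = "http://www.example.com/"  # placeholder domain, as in the rule

def redirect_target(path):
    """Return the Location the rule would send, or None if it does not match."""
    match = INDEX_RULE.match(path)
    if match is None:
        return None
    return SITE + match.group(1)  # $1 is the directory part, possibly empty

for path in ("index.php", "dir/sub/index.php/", "page.php"):
    print(path, "->", redirect_target(path))
```

Requests for index.php, with or without the trailing slash, end up at the bare directory URL, and ordinary pages are left alone.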
| 5:14 am on Jul 12, 2012 (gmt 0)|
I just started to see this problem on my site too... certainly some kind of bug on Google's side.
In my case, the .php/ and the .php versions both show up fine (same content), but Google is indexing only the .php/ version (for a few pages).
The problem is that some of those pages have internal links pointing to relative URLs. For example, www.example.com/somepage.php/ has a relative link to contact.php, so the user ends up at www.example.com/somepage.php/contact.php, which shows the same content as www.example.com/somepage.php because my server then treats /contact.php as some kind of query string.
| 5:20 am on Jul 12, 2012 (gmt 0)|
One page that was "fixed" disappeared from Google's database. Maybe temporarily, maybe it will be "punished".
The previous page to disappear lost a lot of ranking.
I've seen the same behaviour on the site of a new client that had just been hacked: the thief was using a 301 to redirect, and the client lost ranking for 2-3 months.
| 5:21 am on Jul 12, 2012 (gmt 0)|
a more IMPORTANT UPDATE
I just checked the Webmaster tools and it states that I am having a DNS problem. Doesn't say how or why. It is strange. Nobody has been moving anything.
It began on July 2. Weird.
| 5:51 am on Jul 12, 2012 (gmt 0)|
ns2 ping is failing. :S
| 11:19 am on Jul 12, 2012 (gmt 0)|
|Something like |
RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$1.php [R=301,L]
actually it should be:
RewriteRule ^([^/]+/)*([^/.]+)\.php/ http://www.example.com/$2.php [R=301,L]
| 6:55 pm on Jul 12, 2012 (gmt 0)|
No. There's an outer layer of parentheses missing. Add that and change to $1.
RewriteRule ^(([^/]+/)*([^/.]+))\.php/ http://www.example.com/$1.php [R=301,L]
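To make the difference between the capture groups visible, here is a small sketch that feeds both patterns to Python's re module. The request path dir/sub/page.php/ and the captures helper are hypothetical, and the path is given without a leading slash, as mod_rewrite sees it in .htaccess. Python's regex engine is only standing in for Apache's, though for patterns this simple the numbered groups should behave the same way.

```python
import re

# Patterns copied from the rules above; the leading slash is omitted from the
# path because that is how mod_rewrite presents it in .htaccess (assumed here).
ORIGINAL = re.compile(r"^([^/]+/)*([^/.]+)\.php/")
FIXED = re.compile(r"^(([^/]+/)*([^/.]+))\.php/")

def captures(pattern, path):
    """Return the numbered captures ($1, $2, ...) for a request path."""
    match = pattern.match(path)
    return match.groups() if match else None

path = "dir/sub/page.php/"  # hypothetical request path
original_groups = captures(ORIGINAL, path)
fixed_groups = captures(FIXED, path)

print(original_groups)  # $1 is only the last directory, $2 only the filename
print(fixed_groups)     # $1 is the whole path up to the extension
```

In the original pattern neither $1 nor $2 carries the full path, which is why the extra outer parentheses are needed before $1.php can rebuild the URL.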
| 7:21 pm on Jul 12, 2012 (gmt 0)|
Or, in the alternative,
RewriteRule ^(([^/]+/)*([^/.]+)\.php)/ http://www.example.com/$1 [R=301,L]
| 7:36 pm on Jul 12, 2012 (gmt 0)|
I prefer the former as it more clearly highlights that the redirect is to a URL with a filename that includes an extension.
| 5:15 pm on Jul 30, 2012 (gmt 0)|
I had to resurrect this thread, since I recently landed in the same boat.
But unlike the OP, my page is not php, but a simple .html page which now has the trailing slash after it. Unfortunately, it's also my most linked-to page and even in the GWMT it shows as "mypage.html/", as well as in the search results. Needless to say, the page ranks nowhere :(
I have no idea what to do. I don't really want to use a redirect, since Google clearly thinks that "mypage.html" and "mypage.html/" are the same page, and I'm not sure how any form of redirect would affect this.
I don't really know the reasoning behind Google choosing the page with the slash to be the one that displays, when the majority of the links pointing to this page are without the slash. I am assuming some scraper messed up and linked to us with a slash.
Does anyone have anything new to add to this problem?
| 9:32 pm on Jul 30, 2012 (gmt 0)|
This points to extremely lax URL rewriting or some such configuration error. On a normal site, a request for a URL with both an extension and a trailing slash would simply return 404 Not Found.
A redirect will tell Google to no longer index the URL with trailing slash and to replace it with the URL without the trailing slash.