Forum Moderators: phranque

rewrite 404 issue

search not indexing mod_rewritten pages

         

frankstuner

8:21 am on Apr 23, 2012 (gmt 0)

10+ Year Member



I've recently built a new site on WordPress; the bulk of the site content is user profiles. I've added the following mod_rewrite rule so a user profile can be found at www.mysite.com/username

The code:


# BEGIN WordPress
RewriteEngine On
RewriteRule ^index\.php$ - [L]
RewriteRule ^([^/.]+)$ /index.php?profile=$1 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress



Everything works perfectly on the surface, except the profiles aren't being indexed by search engines. I checked the header response on an individual profile page and it's saying HTTP/1.1 404 NOT FOUND.

Now, I haven't technically got the individual profiles registered through the WordPress system as a post or page. I've just added a new custom table to the existing database and am querying the db directly. Could this be the issue? When I enable the WordPress 404 page and go to www.mysite.com/username, it shows the 404 page instead of going to index.php.

I'm not really sure where to start with this issue. If anyone could point me in the right direction I'd appreciate it!

Thanks
Frank

whoisgregg

2:41 pm on Apr 23, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've had this problem before with WordPress.

Basically, if you are invoking WordPress to handle a particular URL, but that URL doesn't actually match up to a WordPress page or post, then WordPress will send a 404.

Override it with this WordPress function in your code before any other output is sent to the browser:
status_header(200);


Note: it's very important to search engines that you still have hard 404 errors when a page truly does not exist... Double check your headers on actual missing URLs and make sure those still send a 404 header.
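
A possible way to act on this advice, sketched under assumptions: instead of calling status_header() from a template, hook it into WordPress's template_redirect action from the theme's functions.php, which runs before any template output. The names prefixed mysite_ are hypothetical, and the check only looks at the profile query variable from the rewrite rule in the first post. A real version would also confirm the username exists in the custom table, so that missing profiles keep returning a hard 404.

```php
<?php
// Assumption: this file is the theme's functions.php, which WordPress
// loads before header.php or any other template produces output.

// Hypothetical helper: true when the rewritten URL carried a profile name.
function mysite_profile_requested(array $query): bool
{
    return isset($query['profile']) && $query['profile'] !== '';
}

// Hypothetical callback: replace WordPress's 404 status for profile URLs.
function mysite_force_profile_status(): void
{
    global $wp_query;

    if (!mysite_profile_requested($_GET)) {
        return; // not a profile URL, so leave genuine 404s alone
    }

    status_header(200);        // WordPress helper that sends the status line
    $wp_query->is_404 = false; // stop the theme from loading its 404 template
}

// Guarded so the file can also be loaded outside WordPress without errors.
if (function_exists('add_action')) {
    add_action('template_redirect', 'mysite_force_profile_status');
}
```

This is not necessarily what fixed the site discussed in this thread; it only shows one place where the call can run early enough.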

frankstuner

4:44 pm on Apr 23, 2012 (gmt 0)

10+ Year Member



Hey, thanks heaps for the reply. I gave that a go but it's still showing up 404 in the response header.

I've got it set so that if the hook variable "profile" is set then run the function.

I set it to the very top of my header.php like so...

<?php if($_GET["profile"]!=""){ status_header(200); } ?>
<!DOCTYPE HTML>
<html>
<head>

I tried it without the if statement, with the same result. Is that the correct way to implement it (above everything in header.php)?

thanks Frank

g1smd

5:19 pm on Apr 23, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Briefly try:

header("HTTP/1.1 200 OK");

frankstuner

5:29 pm on Apr 23, 2012 (gmt 0)

10+ Year Member



Same result, 404.

Does that mean it's something on my end?

frankstuner

11:04 am on Apr 24, 2012 (gmt 0)

10+ Year Member



I've been reading up on header("HTTP/1.1 200 OK"); and status_header(200); and I'm a touch confused about what they do.

Are they meant to override and essentially force the browser to say yep, this is the page you're looking for?

whoisgregg

5:47 pm on Apr 24, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The status_header function in WordPress is simply a convenience wrapper that calls header() with the common status lines.

The important thing is that any header() call needs to come before *any* output is sent to the browser. For example:

<?php 
header('HTTP/1.0 404 Not Found');
echo 'Hi!';
header('HTTP/1.1 200 OK');
?>

In this example, the 200 header is never sent and only the 404 header is sent (assuming output buffering is off, so the echo flushes the headers). Depending on how you have error reporting configured, this may silently fail or you might see warnings in the browser.

Most of the time, the problem with status headers is that they are sent too late to change what's already been sent. You can test this by adding a
echo 'here is the 200 header';
right after you call the status_header(200) function and then view source to see if there is anything output before your message (maybe some html code, or even just a line break).

If there is, you'll have to hunt back through your code until you find a location where you can call status_header(200) before anything else is sent.

If not, well, let us know. :)
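
A shortcut for that hunt, offered as a sketch: PHP's built-in headers_sent() can report the file and line where output first started, which saves reading the page source by eye. The function name describe_header_state is hypothetical; headers_sent() and its two by-reference arguments are standard PHP.

```php
<?php
// Hypothetical helper: report whether header() can still take effect and,
// if not, where the first output came from.
function describe_header_state(): string
{
    $file = '';
    $line = 0;

    // headers_sent() fills $file and $line with the place output began.
    if (headers_sent($file, $line)) {
        return "Output already started in {$file} on line {$line}";
    }

    return 'No output yet; header() calls will still take effect';
}
```

Calling error_log(describe_header_state()); just before status_header(200) would write the result to the PHP error log without adding any output of its own.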

frankstuner

3:42 am on Apr 25, 2012 (gmt 0)

10+ Year Member



I just tried that at the top and, you know, there is an extra line at the top of my source code, but there is no line at the top of my header.php. I then thought it might be one of those BOM things, because they've been a headache for me in the past (not that I have a clue what they are), so I converted it to UTF-8 without BOM and still had no luck.

I also tried to do a header('Location: ...') just to see if it would function, and no go. header() doesn't want to work in general.

the rendered source code is looking like this..

[extra blank line here]
Hi!<!DOCTYPE HTML>
<html>
<head>


my header.php is like this

<?php
header('HTTP/1.0 404 Not Found');
echo 'Hi!';
header('HTTP/1.1 200 OK');
?>
<!DOCTYPE HTML>
<html>
<head>


There are absolutely no spaces or lines being rendered before the header() code is executed in header.php. I'm not sure if you guys are down with WordPress, but does anyone know if there are any precursor pages before header.php? From my understanding, header.php is the first page in the cycle.

I've backspaced a thousand times at the top looking for this extra line but I ain't got no idea what's going on here.

whoisgregg

10:17 pm on Apr 26, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are *many* precursor pages... One place to look is in plugin files. If they close their last php tag and then have a couple extra line breaks, those can carry through and show up at the top of your output page.

Example:

<?php
// some plugin code
?>




Those extra returns at the end there? Those could be your culprit.
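
A rough way to search for that kind of culprit, assuming you can run PHP from the command line against a copy of the site: the sketch below flags files that start with a UTF-8 byte order mark, have whitespace before the opening tag, or have more than one line break after the closing tag (PHP itself swallows a single line break there). Both function names are hypothetical.

```php
<?php
// Hypothetical helper: list the ways a PHP source string could emit
// stray output before or after its code.
function find_stray_output(string $source): array
{
    $problems = [];

    // A UTF-8 byte order mark is sent to the browser as output.
    if (strncmp($source, "\xEF\xBB\xBF", 3) === 0) {
        $problems[] = 'UTF-8 byte order mark before the opening tag';
        $source = substr($source, 3);
    }

    // Anything before the opening tag is echoed verbatim.
    if (preg_match('/^\s+<\?php/', $source) === 1) {
        $problems[] = 'whitespace before the opening <?php tag';
    }

    // PHP drops one line break after the closing tag; anything more is output.
    if (preg_match('/\?>(\s+)\z/', $source, $m) === 1
        && $m[1] !== "\n" && $m[1] !== "\r\n" && $m[1] !== "\r") {
        $problems[] = 'extra whitespace after the closing ?> tag';
    }

    return $problems;
}

// Hypothetical helper: check a list of file paths and keep only offenders.
function scan_files(array $paths): array
{
    $report = [];
    foreach ($paths as $path) {
        $problems = find_stray_output((string) file_get_contents($path));
        if ($problems !== []) {
            $report[$path] = $problems;
        }
    }
    return $report;
}
```

For example, print_r(scan_files(glob('wp-content/plugins/*/*.php'))); would check the top-level files of each plugin, assuming it is run from the WordPress root. Leaving the closing tag off PHP-only files avoids the trailing-whitespace case altogether.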

frankstuner

9:24 am on Jul 5, 2012 (gmt 0)

10+ Year Member



FYI for anyone interested in the outcome of this thread: I couldn't find the culprit causing the whitespace, but I rebuilt the site from a clean install, followed the instructions from whoisgregg, and it is now working well.

Thanks heaps whoisgregg & g1smd, couldn't have solved it without you guys, really appreciated!