a little help rewriting canonical pagination urls

   
8:09 pm on Feb 5, 2014 (gmt 0)

5+ Year Member



I've got a little PHP script I'm using for new content, but I did something a little wrong.

It does what it's supposed to, but it's also adding unnecessary, duplicate URLs.

In .htaccess -


RewriteRule ^my-silly-new-content-([^.]+).html$ my-silly-new-content.html?page=$1



Basically what it does is add the first page which is :

mysite.com/my-silly-new-content.html

and also the pages that increase whenever more content is added...such as :

mysite.com/my-silly-new-content-2.html
mysite.com/my-silly-new-content-3.html

etc., etc. So that's good. (Whenever there are more than 10 items for a page, I need the rest to go on the next page.)

But I just noticed the problem comes in whenever anything else is added after the

my-silly-new-content

part.

Just typing in anything after that part, such as :

mysite.com/my-silly-new-content-blahblahblah.html

or

mysite.com/my-silly-new-content-whateverwhatever.html

results in a new page as well, which displays the default content of

my-silly-new-content.html

Basically, I just want the numbered pages to keep working, while blocking any other extras from being indexed, to prevent duplicate content.

Thought I had the rewrite down flawlessly, but I guess not. :(

10:06 pm on Feb 5, 2014 (gmt 0)

lucy24 (WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month)



In .htaccess -

What flags are attached to your rule? It isn't directly relevant to the question, but I sure hope there's an [L]. Incidentally, the target should start with / (slash = root).

whenever anything is added

Yes, that's one of the long list of Problems You Don't Have To Worry About Unless They Happen. Here the fix is simple: just replace the all-encompassing
[^.]+
with a narrower
\d+
or (if you don't trust mod_rewrite's RegEx engine)
[0-9]+
Also make sure that the php script itself returns a 404 for any non-numeric values of "page".

5:23 pm on Feb 6, 2014 (gmt 0)

5+ Year Member



Good enough. You should really have a Top Contributor tag next to your name. (Mods listening out there?)

Both versions worked whenever any letter or word was added into the mix, but numbers still passed through. For example,

my-silly-new-content-2222.html

still displayed a page, but I don't care. Good enough. Probably will only have 1000 max page total before the cutoff in the database.

The general PHP code that calls that part out :


include "pagination.class.php";

$p = new pagination;

// Items per page
$p->perPage = 120;

// Pagination left from current
$p->paginationLeft = 3;

// Pagination right from current
$p->paginationRight = 3;

// Link href
// $p->path = '?page=%d'; // or $p->path = 'example/%d/';
$p->path = '/feeds/my-silly-new-content-%d.html';

// Pagination appearance
$p->appearance =
array(
'nav_prev' => '<a href="%s" class="prev"><span>prev</span></a>',
'nav_number_link' => '<a href="%s"><span>%d</span></a>',
'nav_number' => '<a href="javascript:;" class="active"><span>%d</span></a>',
'nav_more' => '<a href="javascript:;" class="more"><span>...</span></a>',
'nav_next' => '<a href="%s" class="next"><span>next</span></a>'
);


$count = $db->get_one('SELECT count(*) as cnt FROM feeds');

// Items count
$p->setCount(1000);

// Current page
if(isset($_GET['page'])){
$p->setStart($_GET['page']);
}

7:34 pm on Feb 6, 2014 (gmt 0)

lucy24 (WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month)



both instances worked whenever any letter / word was added into the mix. But numericals still passed through.

This part is probably easier to do in PHP, especially if the cutoff is some arbitrary number: 2134 is valid, 2135 isn't.
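
A rough sketch of what that PHP-side check could look like. This is not the poster's actual code: the helper name parsePageNumber is made up for illustration, and it assumes the script already knows the item count and the per-page size (1000 and 120 in the snippet posted earlier).

```php
<?php
// Hypothetical helper, not part of pagination.class.php.
// Returns the page number as an int, or null when the request deserves a 404.
function parsePageNumber($raw, int $itemCount, int $perPage): ?int
{
    // No "page" parameter at all means the unnumbered first-page URL.
    if ($raw === null) {
        return 1;
    }

    // Digits only, no leading zero, at most 9 digits so the cast cannot overflow.
    // Rejects "blahblahblah", "-1", "007", "" and array values.
    if (!is_string($raw) || preg_match('/^[1-9][0-9]{0,8}$/D', $raw) !== 1) {
        return null;
    }

    $page = (int) $raw;

    // Last page that actually has content; $perPage is assumed to be > 0.
    $lastPage = max(1, (int) ceil($itemCount / $perPage));

    return $page <= $lastPage ? $page : null;
}
```

With 1000 items at 120 per page the last valid page is 9, so parsePageNumber('10', 1000, 120) gives null and the script can answer with a 404 instead of showing the default content.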

You can certainly include a line in your RewriteRule that constrains the URL to some number of digits, like

^my-silly-new-content-(\d{1,4})\.html$

Anything with too many digits would then bypass the rule and meet an ordinary server-generated 404.

You might also think about eliminating leading zeros if they don't have meaning. That's yet another of those Problems You Don't Have Until You... I've got one myself that says
RewriteRule ^dir/dir2/chap0+(\d+\.html) http://example.com/dir/dir2/chap$1 [R=301,L]

10:54 pm on Feb 6, 2014 (gmt 0)

g1smd (WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month)



Your PHP script should be amended to return a 404 HEADER and to INCLUDE your 404 page when a URL request is invalid, such as a non-existent page number.
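
For illustration only, something along these lines would send the 404 status and pull in the error page. The function name sendNotFound and the 404.php path are assumptions, not anything from the original script, and the check shown only catches non-numeric values; a range check against the real page count would go in the same place.

```php
<?php
// Hypothetical helper: sends a 404 status, then shows the site's error page.
function sendNotFound(string $errorPage): void
{
    // The status must be set before any output reaches the browser.
    http_response_code(404);

    if (is_readable($errorPage)) {
        include $errorPage;
    } else {
        // Fallback when the assumed error page file is missing.
        echo 'Page not found';
    }
}

// Near the top of the paginated script, before any HTML is printed.
// A missing "page" parameter is the unnumbered first page, so it is allowed.
if (isset($_GET['page'])
    && (!is_string($_GET['page']) || !ctype_digit($_GET['page']))) {
    sendNotFound(__DIR__ . '/404.php'); // assumed location of the 404 page
    exit;
}
```

The exit matters: without it the script would carry on and print the default listing underneath the error page.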

The RegEx pattern in the Rule can be constrained to ensure the URL ends with digits for the page numbers. I would use -1 for page one here for consistency.
 
