Security Check

before I invest more time

         

willybfriendly

2:02 am on Feb 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just want to be sure I am not creating a large hole into my site.

I am trying to convert an old static site of several hundred pages into a CMS. So, I need to preserve old URLs (lots of deep links, excellent rankings, etc.)

In .htaccess I have:

Options +FollowSymLinks
RewriteEngine on
RewriteRule \.php$ template.php [L]

which directs the old .php URLs to the new template.php page

Then, on template.php I have:

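// Parse only the path, so a query string cannot end up in the file name.
$page = pathinfo((string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));
$page = $page['basename'];
// Strict comparison against the whitelist of valid page names.
if (in_array($page, $pages, true)) {
    include $page;
}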

which captures the old file name, checks it against an array of valid pages

I have this working on my test server, but before I put much more time into it, or worse, take it live, I thought I would get a security check.

Any major holes in the above? If so, strategies to plug them?

WBF

jatar_k

8:29 pm on Feb 21, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I might test the value of $page to make sure it only has allowed characters.

what do you do if(!in_array($page, $pages))

do you serve a 404?

willybfriendly

10:56 pm on Feb 21, 2006 (gmt 0)




Thanks jatar_k

Yes, a 404 if no match.
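
The lookup-then-404 flow described here can be sketched as follows. This is only an illustration, not the poster's actual code: the function name resolve_page is hypothetical, and it assumes $pages holds the bare file names of the old static pages.

```php
<?php
// Hypothetical helper: map a request URI to a whitelisted page file, or null.
function resolve_page(string $requestUri, array $pages): ?string
{
    // Keep only the path so a query string never becomes part of the name.
    $path = (string) parse_url($requestUri, PHP_URL_PATH);
    // basename() drops any directory part, including "../" sequences.
    $page = basename($path);
    // Strict comparison: only exact string matches from the whitelist pass.
    return in_array($page, $pages, true) ? $page : null;
}

// Assumed usage inside template.php ($pages is the whitelist array):
// $page = resolve_page($_SERVER['REQUEST_URI'], $pages);
// if ($page === null) {
//     header('HTTP/1.0 404 Not Found');
//     die();
// }
// include $page;
```

Because the value passed to include always comes out of the whitelist check, anything the visitor types that is not an exact page name falls through to the 404 branch.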

My problem with this site is that it developed over a period of years with no real organization. Very hard to navigate through, but very content rich.

It ranks top 5 across literally hundreds of search terms in its niche, and has many, many deep links.

I have to get it under control and yet preserve what is already there.

Oh, there are already some areas that are DB driven, which muddies the waters even more.

The only saving grace is that all the existing static pages have that .php extension. So, I thought I could use that to map it onto a simple custom CMS. At this point I am thinking flat file, but I am wavering some on that over the past 24 hours. It would be easy enough to pour the current content into a db, and I would have more future flexibility.

But, back to your points.

$page should never come near a db unless it matches an existing page which is in the array $pages.

Are you suggesting that I do something like:

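$flag = false;
// Pattern must be a quoted, delimited string; anchors make it cover the whole name.
if (preg_match('/^[[:alnum:]]+$/', $page))
{
    // Only a name that is also in the whitelist counts as a match.
    if (in_array($page, $pages, true))
    {
        $flag = true;
        // do the page mapping
    }
}
if (!$flag)
{
    header("HTTP/1.0 404 Not Found");
    die();
}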

Something like that should make sure that only alpha-numerics are in $page and screen against obvious probes or injections, correct? (Above is untested, so go easy on me ;))

WBF

jatar_k

11:08 pm on Feb 21, 2006 (gmt 0)




you also will want to accept a . in there as well, though I am only assuming you get the extension in basename also
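
A character check that also lets the dot through might look like the sketch below. The function name is_safe_page_name is hypothetical, and allowing underscores and hyphens is an assumption about the old file names, not something stated in the thread.

```php
<?php
// Hypothetical check: one or more allowed characters, then a literal ".php".
function is_safe_page_name(string $page): bool
{
    // \A and \z anchor the whole string, so a trailing newline cannot slip past.
    // Slashes, extra dots and percent signs are not in the allowed class.
    return preg_match('/\A[[:alnum:]_-]+\.php\z/', $page) === 1;
}
```

Tying the dot to the extension, rather than allowing it anywhere, keeps names like "../x.php" out without needing a separate rule.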

>> My problem with this site is that it developed over a period of years with no real organization

hahaha, that's everyone's problem with any long term site ;)

as long as you don't actually change any pagenames then you should be ok

word of serious warning: make sure you really think all of this through and don't make a mistake, you could tank the whole thing

I assume you are considering the site as a whole since you reference that some is already db driven.

1. does the whole thing need to be db driven?
2. how many pages are we talking?
3. what about using the old include for header and footer and leaving the content in the actual files?
4. what about leaving real pages with includes for header and footer and then putting some kind of id in the file that refs where the content is in the db? a quick function call to output the content and all the pages still exist.
5. why does it really need to change? is the management time horrific?
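
Option 4 above could be sketched roughly like this. Everything named here is hypothetical (render_content, the $contentStore array, header.php, footer.php), and a plain array stands in for the database so the example runs on its own.

```php
<?php
// Hypothetical stand-in for the content table: id => HTML fragment.
// In practice this would be a database query keyed on the same id.
function render_content(int $contentId, array $contentStore): string
{
    // Unknown ids yield an empty string rather than an error.
    return $contentStore[$contentId] ?? '';
}

// Assumed shape of one of the existing page files:
// include 'header.php';
// echo render_content(42, $contentStore);
// include 'footer.php';
```

With this layout every old URL still points at a real file, so no rewrite rule is involved; only the body of each file changes.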

willybfriendly

11:47 pm on Feb 21, 2006 (gmt 0)




Management is HORRIFIC. And we're only talking a few hundred pages of static content.

This has been simmering for a year or more. My first thought was just to revamp the structure/navigation. (Even I can't find things that I know are there!)

That was daunting enough that it kept getting put off. But a combination of events has convinced me I need more flexibility than I now have.

1. A run through with Xenu revealed a large number of dead outgoing links
2. A friendly user email informed me that at least two outgoing links now go to pron sites (they were originally good resource sites)
3. There are affiliate links that need to be maintained
4. While much of the content is "evergreen" there is some that needs to be updated now and then
5. Etc.

I have spent a good deal of time reviewing Tedster's wonderful posts about information architecture (http://www.webmasterworld.com/forum21/7649.htm) and think I have a handle on the site navigation issues. So why not fix the maintenance issues at the same time...

Your warnings are well taken. The site should be able to sustain a minor/temporary hit. In 2005 about 53% of traffic was from SE's, the rest from links, bookmarks, email, etc.

Preserving page names is my first priority. Improving the site nav while preserving the relative balance of internal links is my second priority. Getting some control over the maintenance is my third priority.

Give me a couple of months after I pull the trigger on this thing and I'll come back and tell you how crazy I was ;)

WBF

freeflight2

12:04 am on Feb 22, 2006 (gmt 0)

10+ Year Member



the in_array should make this code fragment 'secure'. I audited a package a couple of weeks ago with a couple of thousand lines of code, and it was missing such a simple restriction. ouchh

jatar_k

2:57 am on Feb 22, 2006 (gmt 0)




it sounds like you have a fairly good handle on it

make sure to test, test, test