Stopping scrapers from the get-go


mcglynn - 3:21 pm on Feb 27, 2011 (gmt 0)


>> John, I think what he's saying is you can force
>> html to be executed as a php script (instead of
>> viewed as a static file). That way I can inject
>> php code into the top of the html files, and
>> it'll work just fine.
>>
>> I suppose I could also just rename everything from .html to .php.

Perhaps an academic point if you've already launched content and begun linkbuilding, but there is a tried-and-true method of creating a mini-CMS on the fly to serve static content and enforce access restrictions.

In a nutshell:
1- store the content outside the docroot, e.g. /home/httpd/static/
2- write a php script called "widgets" (no extension)
3- publish content URLs like example.com/widgets/1, example.com/widgets/2, etc

The "widgets" script enforces access rules and, if it determines the client should see the requested file, imports /home/httpd/static/1 or /home/httpd/static/2 or whatever was requested.
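
A minimal sketch of what such a "widgets" script might look like. It assumes the server is already set up to run the extensionless file as PHP. The function names, the User-Agent rule and the blocked tokens are all made up for illustration; a real access rule would be whatever policy you actually want to enforce.

```php
<?php
// Storage directory outside the docroot (the example path from the post).
const STATIC_DIR = '/home/httpd/static';

// Placeholder access rule (hypothetical): refuse clients with an empty
// User-Agent or one containing a blocked token. Not a real blocklist.
function client_is_allowed(string $userAgent, array $blockedTokens = ['curl', 'wget', 'libwww']): bool
{
    if (trim($userAgent) === '') {
        return false;
    }
    foreach ($blockedTokens as $token) {
        if (stripos($userAgent, $token) !== false) {
            return false;
        }
    }
    return true;
}

// Decide what to send: returns [HTTP status, file path or null].
function decide(string $name, string $userAgent, string $baseDir = STATIC_DIR): array
{
    if (!client_is_allowed($userAgent)) {
        return [403, null];
    }
    // Accept only simple names here, such as "1" or "blue-widget".
    if (!preg_match('/^[A-Za-z0-9_-]+$/', $name)) {
        return [404, null];
    }
    $path = $baseDir . '/' . $name;
    return is_file($path) ? [200, $path] : [404, null];
}

// Dispatch only when running under a web server, not from the command line.
if (PHP_SAPI !== 'cli') {
    // Last path segment of the request, e.g. "1" for /widgets/1.
    $name = basename(parse_url($_SERVER['REQUEST_URI'] ?? '', PHP_URL_PATH) ?: '');
    [$status, $path] = decide($name, $_SERVER['HTTP_USER_AGENT'] ?? '');
    http_response_code($status);
    if ($path !== null) {
        header('Content-Type: text/html; charset=utf-8');
        readfile($path);
    }
}
```

Keeping the decision in a function that returns a status and a path makes the rule easy to test from the command line without a web server.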

Needless to say, you can have full-text filenames, even paths too.

The point is that the static content remains static, but you get the full benefit of intelligent access control without modifying every static file or importing them into a database. The server config is minimal, and it's a trivial task to have the "widgets" script parse Apache's REQUEST_URI to figure out which static file was requested.
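
For the REQUEST_URI parsing, something along these lines could work once you allow full-text filenames and paths. This is only a sketch: the function name resolve_static_path and the "/widgets/" prefix argument are assumptions of mine, and the realpath() check is there so that a request containing ".." cannot reach files outside the storage directory.

```php
<?php
// Map a REQUEST_URI such as "/widgets/guides/intro?x=1" to a file under
// $baseDir, or return null if the request should not be served.
function resolve_static_path(string $requestUri, string $prefix, string $baseDir): ?string
{
    // Keep only the path component; the query string is not part of the name.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    $path = rawurldecode($path);

    // The request must sit below the script's own URL prefix.
    if (strncmp($path, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    $relative = (string) substr($path, strlen($prefix));
    if ($relative === '' || strpos($relative, "\0") !== false) {
        return null;
    }

    // realpath() resolves "..", "." and symlinks, and returns false
    // when the target does not exist.
    $root = realpath($baseDir);
    $real = realpath($baseDir . '/' . $relative);
    if ($root === false || $real === false || !is_file($real)) {
        return null;
    }

    // Refuse anything that resolved to a location outside the storage directory.
    $rootWithSlash = rtrim($root, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($real, $rootWithSlash, strlen($rootWithSlash)) === 0 ? $real : null;
}
```

Usage would be something like resolve_static_path($_SERVER['REQUEST_URI'], '/widgets/', '/home/httpd/static'), serving the file when the result is not null and answering 404 otherwise.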

Alternatively, put all the logic into the 404 handler.
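
A rough sketch of the 404-handler variant, assuming Apache has something like "ErrorDocument 404 /handler.php" configured. The handler filename and the function name handle_missing are hypothetical, and the access checks are left out to keep it short. One thing to watch is that the response is normally still marked 404 when the handler runs, so the script has to set the status itself when it does serve a file.

```php
<?php
// Hypothetical 404 handler; assumes a directive along the lines of
//   ErrorDocument 404 /handler.php
// so that every URL with no matching file in the docroot lands here.

// Returns [HTTP status, path to send or null] for the originally requested URL.
function handle_missing(string $requestUri, string $baseDir): array
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    // Only simple one-segment names such as "/blue-widget" are mapped in this sketch.
    if (!is_string($path) || !preg_match('#^/([A-Za-z0-9_-]+)$#', $path, $m)) {
        return [404, null];
    }
    $file = $baseDir . '/' . $m[1];
    return is_file($file) ? [200, $file] : [404, null];
}

// Dispatch only when running under a web server, not from the command line.
if (PHP_SAPI !== 'cli') {
    // With a local ErrorDocument, REQUEST_URI still holds the URL the client asked for.
    [$status, $file] = handle_missing($_SERVER['REQUEST_URI'] ?? '/', '/home/httpd/static');
    // Set the status explicitly so served files go out as 200 rather than 404.
    http_response_code($status);
    if ($file !== null) {
        readfile($file);
    } else {
        echo 'Not found';
    }
}
```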


Thread source: http://www.webmasterworld.com/search_engine_spiders/4267704.htm