jdMorgan - 4:32 pm on Feb 18, 2011 (gmt 0)
One thing you can do to limit the performance impact of long and complex access-control code sections is to be very selective about how and when you invoke them.
For an example specific to the anti-scraper access control being discussed here, consider this: It is not necessary to control access to images, CSS files, JavaScript files, or other objects included on the page; it is only necessary to control access to the page itself.
This is because if the scraper cannot fetch the page, then it cannot know the URLs of the objects included on that page, and so cannot fetch them either.
So, to expand on what wilderness stated above, you can put your 'pages' into one subdirectory, put your images and other included objects into one or more additional subdirectories, and then put the really heavy access-control code into the 'page' subdirectory only. Conversely, you can put your anti-hotlinking code only into the included-object directories, so that it never runs for page requests.
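As a rough sketch of that split, assuming mod_rewrite is available and using hypothetical directory names (/pages/ and /images/), hypothetical user-agent strings, a documentation-only IP range, and example.com as the site's own hostname, the two .htaccess files might look something like this. The block list shown is only a stand-in for a real, much longer one.

```apache
# ----- File: /pages/.htaccess -----
# Heavy anti-scraper rules; evaluated only for requests that map into /pages/.
# Per-directory mod_rewrite needs FollowSymLinks (or SymLinksIfOwnerMatch) enabled.
RewriteEngine On
# Hypothetical entries standing in for a long access-control list
RewriteCond %{HTTP_USER_AGENT} (?:BadBot|SiteSucker) [NC,OR]
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
# Any matching request gets a 403 Forbidden response
RewriteRule ^ - [F]

# ----- File: /images/.htaccess -----
# Anti-hotlinking only; the block list above is never consulted here.
RewriteEngine On
# Let blank referrers through, since many legitimate clients send none
RewriteCond %{HTTP_REFERER} !^$
# Let referrers from our own site through (example.com is an assumption)
RewriteCond %{HTTP_REFERER} !^https?://(?:www\.)?example\.com/ [NC]
# Everything else asking for an image gets a 403 Forbidden response
RewriteRule \.(?:gif|jpe?g|png)$ - [F,NC]
```

With this layout, a request for an image never touches the long scraper list, and a request for a page never runs the referrer checks.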
Be aware that by using mod_rewrite or ISAPI Rewrite, you can avoid having to include these new subdirectory paths in your URLs; just internally rewrite the requested URL example.com/logo.gif to the filepath /images/logo.gif. Because this is an internal rewrite and not a redirect, the URL that the client sees does not change at all.
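A minimal sketch of such an internal rewrite with mod_rewrite, assuming it goes in the .htaccess file in the document root, that the images live in a directory named /images/, and that only the listed file extensions need to be handled (all of these are illustrative assumptions):

```apache
# ----- File: /.htaccess (document root) -----
RewriteEngine On
# Only rewrite when the requested name actually exists under /images/
RewriteCond %{DOCUMENT_ROOT}/images/$1 -f
# Internally map /logo.gif to /images/logo.gif. No [R] flag is used,
# so no redirect is sent and the client-visible URL stays /logo.gif.
RewriteRule ^([^/]+\.(?:gif|jpe?g|png))$ /images/$1 [L]
```

The pattern deliberately matches only names with no slash in them, so a path that has already been rewritten to images/logo.gif is not matched a second time.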
Because .htaccess files are "per-directory" configuration files, you can use subdirectories to control when and how access-control code is executed. This can mitigate the server performance issues of large access-control lists.
An aside: I tend to emphasize .htaccess file techniques simply because the majority of Web sites are hosted on name-based virtual servers, where server-level config files are not accessible to the Webmaster. For those on dedicated or virtual private servers, the technique described above can be applied even more efficiently by putting the access-control code into <Directory> sections in your httpd.conf or other server-level configuration file(s) instead of into individual .htaccess files in the subdirectories themselves.
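For those who do have access to the server-level configuration, the same idea might be sketched as follows. The filesystem paths, user-agent strings, and hostname are hypothetical, and the rules are abbreviated stand-ins for real lists. Setting AllowOverride None tells Apache not to look for .htaccess files in these directories at all, which is where much of the efficiency gain comes from.

```apache
# ----- File: httpd.conf (or an included virtual-host config file) -----
<Directory "/var/www/example.com/pages">
    # Do not search for or parse .htaccess files under this directory
    AllowOverride None
    # Per-directory mod_rewrite requires symlink following to be enabled
    Options +FollowSymLinks
    RewriteEngine On
    # Hypothetical entries standing in for a long access-control list
    RewriteCond %{HTTP_USER_AGENT} (?:BadBot|SiteSucker) [NC]
    RewriteRule ^ - [F]
</Directory>

<Directory "/var/www/example.com/images">
    AllowOverride None
    Options +FollowSymLinks
    RewriteEngine On
    # Anti-hotlinking: allow blank referrers and our own site only
    RewriteCond %{HTTP_REFERER} !^$
    RewriteCond %{HTTP_REFERER} !^https?://(?:www\.)?example\.com/ [NC]
    RewriteRule \.(?:gif|jpe?g|png)$ - [F,NC]
</Directory>
```

Unlike .htaccess files, which are read on every request, these sections are parsed once when the server starts or is restarted, so the server must be reloaded after any change.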
Jim