Custom HTTP header rel="canonical" for .pdf

lostdog

2:09 pm on Feb 2, 2023 (gmt 0)

Hello,

I'm looking to create a rewrite rule to cover all .pdf files in multiple folders. Important note: some of these PDFs appear in multiple folders, so I am not sure what to do here, since this is for a rel="canonical" header for the .pdf files.

In the sample I created below, the folders /sample1/sample2/ will need to be wildcards, as these folders change. Alternatively, I could create 4 different rules for the first subfolder /sample1/, as there are only 4 of those, but for /sample2/ there are many, many more. Any advice on how to do this?


RewriteRule ([^/]+)\.pdf$ - [E=FILENAME:$1]
<FilesMatch "\.pdf$">
Header add Link '<http://www.example.com/sample1/sample2/%{FILENAME}e>; rel="canonical"'
</FilesMatch>

lucy24

5:25 pm on Feb 2, 2023 (gmt 0)

Couple of tangential questions:

Why did you choose "Header add" rather than the conventional "Header set"? Granted, the Link header (unlike some) can carry multiple values, but it still risks confusing the recipient. Is there any possibility that your site will be sending out other Link headers for other purposes?

What is the underlying purpose of the header? It seems to be saying, with all pdfs, “The present URL is the canonical”. Are there any circumstances where your site would be sending out pdfs that are not canonical, and if so, how would you prevent the header from being sent?

Coincidentally, there are a couple of recent ongoing threads pertaining to environment variables. One conclusion gleaned from all of them is to use SetEnvIf whenever possible, and here it's definitely possible, using the syntax
SetEnvIf Request_URI etcetera
This can go inside the FilesMatch envelope, bypassing concerns about inheritance and execution order.

:: detour to check ::

OK, yes: by default mod_headers applies response headers late, just as the response is sent, so it will execute after both mod_rewrite and mod_setenvif have set their variables.
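
For illustration, here is a minimal sketch of the SetEnvIf approach described above. It assumes the goal is for each PDF to declare its own request path as canonical, so the folder names never need to be written out. The variable name PDF_PATH is made up for this example, and the host is the placeholder from the original post. Untested, so treat it as a starting point rather than a finished rule:

```apache
<FilesMatch "\.pdf$">
# Capture the requested path into PDF_PATH (hypothetical variable name).
# Request_URI is the path only, without the query string.
SetEnvIf Request_URI "^(/.+\.pdf)$" PDF_PATH=$1
# "set" rather than "add"; env= sends the header only when PDF_PATH was captured.
Header set Link '<http://www.example.com%{PDF_PATH}e>; rel="canonical"' env=PDF_PATH
</FilesMatch>
```

If copies of the same PDF in different folders should all point to a single location, the pattern would instead need to capture only the filename, as the original RewriteRule does, with the canonical folder written out in the header value.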