Fixing dominant pdf rankings on Google with Apache redirects?

         

verso

3:38 am on Jul 23, 2012 (gmt 0)

10+ Year Member



Here's my scenario:

1. A user clicks a link to download a file, and is taken to a file "landing page" which contains the file (a PDF) in an iframe.

2. Whenever anyone downloads a file from the site, I want them to go to the landing page, not the file itself.

3. Currently, many of the actual PDFs rank very well under their "real" URLs (example.com/abcd.pdf) and account for nearly three quarters of the site's total search-engine traffic. Bad planning on my part? Indeed.

4. I added a rewrite rule to my .htaccess that sends direct "abcd.pdf" requests to the landing page with arguments to load "abcd.pdf" properly.
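
For illustration, the rule in step 4 could look roughly like the sketch below. It assumes a hypothetical landing script at /landing.php that takes the file name in a "file" argument; neither name comes from the actual site. The referer condition is one assumed way to let the landing page's own iframe still load the PDF, since without some exception the iframe request would be redirected as well.

```apache
RewriteEngine On

# Assumption: the landing page lives on example.com and iframes the PDF,
# so requests that carry a same-site referer are left alone.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]

# Send direct requests such as /abcd.pdf to the hypothetical landing script.
# R=302 keeps the move temporary; a permanent move would use R=301.
RewriteRule ^([^/]+\.pdf)$ /landing.php?file=$1 [R=302,L,NC]
```

Referers can be missing or spoofed, so this is only a rough filter, not access control.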

The question:
I want to maintain the established "pdf presence" in the search engines, and for that to happen, search engines need to be able to access the actual PDF, rather than the landing page(*). Can I redirect users to a landing page while sending [some] search engines to the PDF itself? Or will this trigger a "looks like you're being sneaky" error?

* Alternatively:
I could do my best to extract the text from each PDF (I have had issues with that so far...) and simply do a permanent redirect from any request for a file to that file's landing page. I'm mainly afraid of shaking things up too much by permanently redirecting 70% of the site (about 8000 pages) and losing the current PDF rankings.


Any advice on these or other "solutions" is appreciated.

levo

6:04 am on Jul 23, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You can use the 'rel=canonical' HTTP header, but I guess the content should be identical.

[googlewebmastercentral.blogspot.com...]
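
A rough sketch of how that header might be sent from .htaccess, assuming mod_setenvif and mod_headers are available and that each PDF has a landing page at a hypothetical /landing.php?file=NAME URL (that URL pattern is an assumption, not something from this thread):

```apache
# Capture the requested PDF's file name (e.g. abcd.pdf) into PDF_NAME.
SetEnvIf Request_URI "([^/]+\.pdf)$" PDF_NAME=$1

# Only when PDF_NAME is set, add a canonical Link header that points at
# the hypothetical landing page for that file.
Header set Link "<http://example.com/landing.php?file=%{PDF_NAME}e>; rel=\"canonical\"" env=PDF_NAME
```

The header is only seen if the PDF itself is actually served, so it would not combine with a rule that redirects every PDF request.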

lucy24

8:00 am on Jul 23, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One approach is to put all your pdfs into a protected directory. Use the Satisfy Any directive to admit specific named search-engine robots, or humans with the appropriate referer or cookie.
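
Something along these lines, as a sketch only, in Apache 2.2 syntax (in 2.4 these directives live in mod_access_compat). The robot names, the example.com referer pattern, the allow_pdf variable name and the password-file path are all placeholders, not tested against anyone's real setup:

```apache
# .htaccess inside the protected PDF directory

# Flag requests from named robots, or with a referer from your own site.
SetEnvIfNoCase User-Agent "Googlebot" allow_pdf
SetEnvIfNoCase User-Agent "bingbot" allow_pdf
SetEnvIfNoCase Referer "^https?://(www\.)?example\.com/" allow_pdf

# Host-based access: deny everyone except flagged requests.
Order Deny,Allow
Deny from all
Allow from env=allow_pdf

# Password fallback for anyone who is not flagged.
AuthType Basic
AuthName "PDF downloads"
AuthUserFile /path/to/.htpasswd
Require valid-user

# Admit the request if EITHER the Allow rule or the login succeeds.
Satisfy Any
```

User-Agent and Referer are both trivially faked, so this sorts visitors rather than securing anything.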

There have been previous threads about analogous questions over in Apache. But it isn't something that comes up eight times a day, every day ;)

Generic page here >> [httpd.apache.org...]