Forum Moderators: phranque

Quick review of short htaccess file.

         

kahuna

8:07 pm on Feb 27, 2016 (gmt 0)

10+ Year Member



Thanks again group.. I switched over to PHP a la WordPress a couple of years ago.. but had lots of Perl scripts that I had previously used.
And previously called them with SSI... and now in PHP could only call them in an <iframe> tag...
But just found out that Google was now also listing those scripts in the directory... because they were in an <iframe>,
and looked at them as a "webpage", and was afraid of it being thought of as duplicate content with the page with the <iframe>.

So I set about only letting the scripts be called from my own IP address and my server address.
So this seems to be working...

Please check and review and comment...

RewriteEngine On
# This line only lets the iframe be called by the website.
RewriteCond %{HTTP_REFERER} !mydomain\.com/directory/ [NC]
# My IP address, so I can check the scripts.
RewriteCond %{REMOTE_ADDR} !^123\.123\.123\.123
# The server address, for cron jobs etc.
RewriteCond %{REMOTE_ADDR} !^124\.124\.124\.124
RewriteRule ^ - [F,L]

Thanks again group !

whitespace

11:00 pm on Feb 27, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



...in an <iframe>, and looked at them as a "webpage", and was afraid of it being thought of as duplicate content with the page with the <iframe>


It's not really "duplicate content", they are separate pages. But together they form one page that the user sees.

So I set about only letting the scripts be called from my own IP address and my server address.
So this seems to be working...


Presumably by "scripts" you mean the contents of the iframes? But this is going to block the rest of the world (including Google) from seeing your scripts/pages?

lucy24

11:31 pm on Feb 27, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{HTTP_REFERER} !mydomain\.com/directory/ [NC]
I would call this a textbook case of when [NC] is not appropriate. Since it's your own site, you know the correct casing and can use it. (Why does it matter? Because [NC] doubles the server's workload: mydomain\.com/directory/ [NC] = [Mm][Yy][Dd][Oo] character by character, et cetera, you get the idea.)

RewriteRule ^ - [F,L]
You don't technically need the [L] flag here, because [F] carries an implied L. It does no harm, but why not save the two bytes ;)
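
Applied to the rules from the first post, those two suggestions might look like the sketch below. The domain and both IP addresses are the placeholders from the original post, and the trailing `$` anchors on the IP patterns are an extra assumption, not something discussed above.

```apache
RewriteEngine On
# All three conditions must be true for the rule to fire (they are ANDed).
# 1. The request was not referred by a page under the site's own directory.
#    No [NC]: the casing of your own domain and path is known.
RewriteCond %{HTTP_REFERER} !mydomain\.com/directory/
# 2. The request does not come from the owner's IP address (placeholder).
RewriteCond %{REMOTE_ADDR} !^123\.123\.123\.123$
# 3. The request does not come from the server itself (placeholder, cron jobs etc.).
RewriteCond %{REMOTE_ADDR} !^124\.124\.124\.124$
# [F] returns 403 Forbidden and already implies [L].
RewriteRule ^ - [F]
```

Behaviour should be the same as the original: a direct request for the script URL gets a 403, while the same URL loaded inside the iframe passes because the browser sends the parent page as referer.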

I slap a noindex header on any and all scripts (<FilesMatch> envelope in htaccess), because you never know what people will search for. (I see this on one page where I evilly quote assorted error messages-- and, in consequence, will sometimes get visited by people seeking information on that exact error message, although that is in no way what the page is about.) If you want people to share your scripts, put them into html pages as visible text.
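
A minimal sketch of that kind of <FilesMatch> envelope, assuming mod_headers is available and that the scripts end in .cgi or .pl (both extensions are assumptions here; adjust the pattern to whatever the scripts are actually called):

```apache
# Only emit the header when mod_headers is loaded, so the .htaccess
# file does not cause a 500 error on servers without it.
<IfModule mod_headers.c>
  # Match script files by extension (assumed: .cgi and .pl).
  <FilesMatch "\.(cgi|pl)$">
    # Ask search engines not to index these URLs.
    Header set X-Robots-Tag "noindex"
  </FilesMatch>
</IfModule>
```

Note that a crawler has to be able to fetch the URL to see this header, so it does nothing for requests that are already answered with a 403.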

kahuna

1:16 am on Feb 28, 2016 (gmt 0)

10+ Year Member



@whitespace...
Right now... if one would search for 'dogs eat poo' the script aka cgi (perl) is in the index.. along with the page that is using the script.
I only want google and people to see the page.. not the script in the index.
People see the page with the information included by the iframe....
Hopefully google will only see that presentation as well, and not include the script URL (that is in the iframe) as a separate page.
Before using PHP a la WordPress... my regular HTML pages that called the scripts with SSI...
the page was treated as one page, not a page and a script, in the index.... it's the iframe (now) that makes Google segregate the two.
Well... at least that is what it seems is going on.. now that I have to call the scripts in iframes in php.

In my little htaccess file... if I don't include my personal IP address, then I can't see (ie 403 error) the script just by trying to get it from a URL.. ie website/cgi-bin/dogseatpoo.cgi . But if I go to the webpage that has the script in the iframe.. all is good.

@Lucy
Thanks for the info.. I am always learning.. I haven't used <FilesMatch>... but I like that idea very much. I will learn that soon.

Thank you Lucy and Whitespace for your time and comments. It is very much appreciated.

not2easy

2:19 am on Feb 28, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If you can style the page to look like your WP theme, you can still serve it the same way you were and just link to it from WP. (It works even if it doesn't look the same, but it 'fits' a visitor's expectations better if it does.) I have many parts of some sites that are static html using includes and scripts and also entire directories that have their own taxonomy and they are all integrated into the WP part of the domain via links in the menus. Google doesn't seem confused by it.

whitespace

10:15 am on Feb 28, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Before using PHP a la WordPress... my regular HTML pages that called the scripts with SSI...
the page was treated as one page, not a page and a script, in the index.... it's the iframe (now) that makes Google segregate the two.


If you switched over to PHP then maybe you should be using some kind of PHP include(), rather than an IFRAME? After all, a PHP include is really the equivalent of the SSI, not an IFRAME. Then it would certainly be one page (again), not two. If you need to, PHP can make an external call to your CGI script (providing PHP is configured appropriately). For example:


<?php
include('http://example.com/cgi-bin/dogseatpoo.cgi');


In my little htaccess file... if I don't include my personal IP address, then I can't see (ie 403 error) the script just by trying to get it from a URL.. ie website/cgi-bin/dogseatpoo.cgi . But if I go to the webpage that has the script in the iframe.. all is good.


I'm not clear how this is working? Isn't your CGI-script the SRC of the IFRAME? If it is then it would be the client/user making the request - so they would be blocked as well? No?

kahuna

7:02 pm on Feb 28, 2016 (gmt 0)

10+ Year Member



@Whitespace

Yes you are correct using php include would be the purer way to go..
But each "post" has its individual script... not all posts use the scripts.. just a specific type.
So I would have to include each "php include" with an injection of the specific script...
I know there are plugins that I can insert php code with shortcodes.. and I use them for
other situations..
But it would add one or two or three steps to the process... and thus generate as many more
steps to produce the page as it loads.

I tried using Order allow,deny by IP and variations thereof to solve the problem but ...
just as you asked in your question "so they would be blocked as well as you?"

After much searching and hours of throwing the pasta on the ceiling to see if it stuck and was done...
I found someone with a similar situation and tried their approach and it worked.

And thusly posted here for a final review and comments.. and very much grateful
as I am better at cooking pasta than figuring these things out.
And no.. my kitchen ceiling is not embedded with Rigatoni :)

All I know is it works ! and I learned along the way.
Entering new information today.. and checking my error logs...
The scripts are being blocked at "raw" access... but not at the page level, which is as it should be.

Thanks again group... too much time on this stuff... I need to go fishing !

not2easy

7:17 pm on Feb 28, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Most WP sites require some kind of plugin to use a php include in a page or a post and I've never found one that works reliably enough to depend on. The pages/posts don't physically exist but are assembled on the fly from components in the sql tables. The functions that add the components are php, so the content they pull in is not normally processed as php again afterwards.

kahuna

8:09 pm on Feb 28, 2016 (gmt 0)

10+ Year Member



@not2easy
Actually... during this adventure...
The plugin that I use to successfully insert php code where I need it.. didn't work with
<?php include('http://example.com/cgi-bin/dogseatpoo.cgi') ?>

So I guess what you just described explains the failure.

whitespace

9:15 pm on Feb 28, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



I'm not clear how this is working? Isn't your CGI-script the SRC of the IFRAME? If it is then it would be the client/user making the request - so they would be blocked as well? No?


Ah, moment of clarity! Sorry, my internal logic was fogged! I understand how users can see the contents of the IFRAME now! And whilst Google doesn't normally send a referer, it does when the document is contained in an IFRAME.

And I assume this .htaccess file is in the directory where your included scripts are located?


<?php include('http://example.com/cgi-bin/dogseatpoo.cgi') ?>


Whether this works or not is also dependent on a couple of settings on your server (php.ini) being enabled. allow_url_fopen and allow_url_include both need to be enabled. The latter is not enabled by default, so might not be available on some shared servers. If only allow_url_fopen is enabled then you might be able to work around this restriction by reading the file by other means and then echo'ing it out. But then you still have whatever restrictions WordPress imposes.

kahuna

10:54 pm on Feb 28, 2016 (gmt 0)

10+ Year Member



wrong post here...

kahuna

10:57 pm on Feb 28, 2016 (gmt 0)

10+ Year Member



<?php include('http://example.com/cgi-bin/dogseatpoo.cgi') ?>

I am not using that.