rewrite newbie | image files?

can't see images for rewritten page

         

cooch17

2:41 pm on Feb 2, 2012 (gmt 0)

10+ Year Member



Hello --

Been using Apache for a while now -- currently 2.2.22 on a GNU/Linux box running RHEL 5.7. Finally decided to take a stab at the voodoo of mod_rewrite. Recompiled httpd with mod_rewrite built in, and created a simple test directory off my webroot. For convenience, I'll call this directory /writest.

In httpd.conf, I made sure that AllowOverride was enabled for this directory.

In the directory, I created two index files: index.html and index2.html. What I'm trying to do is have the URL rewritten to index2.html if the user accessing the site has a specific IP address. Now, each of these pages (index.html and index2.html) uses images in a subdirectory called /images. So, the overall directory structure is
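A minimal sketch of what that httpd.conf block could look like, assuming the directory path that appears later in the rewrite log. `AllowOverride FileInfo` is the narrowest override class that permits mod_rewrite directives in .htaccess, and per-directory rewriting also requires symlink following to be enabled. Whether the server already set these elsewhere is not known.

```apache
# httpd.conf (server config); the path is taken from the rewrite log
<Directory "/home/users/www/htdocs/writest">
    # FileInfo is the override class that covers mod_rewrite directives
    AllowOverride FileInfo
    # Per-directory rewriting needs FollowSymLinks (or SymLinksIfOwnerMatch);
    # the leading + adds the option instead of replacing the inherited set
    Options +FollowSymLinks
</Directory>
```

The server has to be restarted or reloaded after a change to httpd.conf; changes to .htaccess take effect on the next request.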

/webroot/writest
/webroot/writest/images

The images are referenced in index.html and index2.html using the absolute path (i.e., <img src="/webroot/writest/images/test.png">)

Decided to try the .htaccess way to happiness. Here is my .htaccess file (using a fake IP address to demonstrate):


RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^ab\.cd\.ef\.gh$
RewriteCond %{REQUEST_URI} !^index2\.html
RewriteRule .* index2.html
RewriteRule ^images/(.+)?$ images/$1 [L]


Now, everything works pretty well -- when a user from IP ab.cd.ef.gh tries to access /writest, index2.html shows up. Great!

Except for the one problem -- the images aren't there. As per the above, I tried adding the RewriteRule to also rewrite the location of the /images subdirectory, but that hasn't worked.

I'm not getting any overt error messages, so not sure what to try next. Missing images is almost invariably a path issue of some sort, and I'm sure there is a simple thing I'm missing.

Pointers to my obvious mistake (and solution thereto) greatly appreciated.

Thanks very much in advance.

g1smd

3:06 pm on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Begin the link to images, css files and javascript files with a leading slash and state the full path to the folder.

It is the browser that works out the location of these objects. This location is a URL.

From this page: "example.com/folder/thispage.html" a link to "images/image.jpg" points to "example.com/folder/images/image.jpg".

If your image is really at "example.com/images/image.jpg" then you need the links on your pages to instead point to "/images/image.jpg" with a leading slash.

Do not use the ../ notation here. That adds even more unwanted complexity.

cooch17

4:28 pm on Feb 2, 2012 (gmt 0)

10+ Year Member



Thanks -- some follow-up questions:


Begin the link to images, css files and javascript files with a leading slash and state the full path to the folder.


Where? In the .htaccess file? In the index2.html file I referred to in my original post?



It is the browser that works out the location of these objects. This location is a URL.

From this page: "example.com/folder/thispage.html" a link to "images/image.jpg" points to "example.com/folder/images/image.jpg".


This much I know.


If your image is really at "example.com/images/image.jpg" then you need the links on your pages to instead point to "/images/image.jpg" with a leading slash.


Not sure how this helps (or, I'm missing your point).

Using your example, my directory structure looks like

example.com/folder
example.com/folder/images

In index.html, and index2.html, both have links to images that are in the /images subfolder. These links use the full path.

So, if I'm reading your note correctly, I should change links from (say) <img src="http://example.com/folder/images/test.png"> to <img src="/images/test.png">. Correct? For both index.html and index2.html, or just the latter?

Do I need to make changes to the .htaccess file? Specifically,

RewriteRule ^images/(.+)?$ images/$1 [L]


Thanks again in advance.

cooch17

6:18 pm on Feb 2, 2012 (gmt 0)

10+ Year Member



To make things as simple as possible, the files index.html and index2.html are *identical* in every respect -- just different file names. So, if I look at index.html or index2.html with .htaccess turned off, they're identical. If I turn .htaccess back on, then index.html shows me the images, but index2.html (which I'm rewriting to) does not.

lucy24

8:02 pm on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No you do not need to rewrite the images!

The html within the pages themselves should give the full address of the image. If the image lives at

www.example.com/images/picture23.jpg

then the html would say <img src = "/images/picture23.jpg"> with leading slash. With this format, it does not matter where the html file lives.

Now, if index.html and index2.html live in the same directory, and are identical in every way, but only one of them shows the images-- then you've got a head-scratcher.

Incidentally, since you're rewriting rather than redirecting, and the two files are identical, how do you know which one you're looking at? That is, how can you be sure the rewrite is happening? Do they at least have different titles?

cooch17

8:32 pm on Feb 2, 2012 (gmt 0)

10+ Year Member




No you do not need to rewrite the images!

The html within the pages themselves should give the full address of the image. If the image lives at

www.example.com/images/picture23.jpg

then the html would say <img src = "/images/picture23.jpg"> with leading slash. With this format, it does not matter where the html file lives.


I'm pretty sure I tried this permutation, but I'll have another go at it tomorrow. More intriguingly (to me), I have been using the full path for images (I find that relative links usually lead to issues, so I routinely specify links to images using the full path). How can something where the full path is specified 'break'?


Now, if index.html and index2.html live in the same directory, and are identical in every way, but only one of them shows the images-- then you've got a head-scratcher.


Which only exacerbates current tendencies towards hair loss. I digress...


Incidentally, since you're rewriting rather than redirecting, and the two files are identical, how do you know which one you're looking at? That is, how can you be sure the rewrite is happening? Do they at least have different titles?


Fair question -- yes -- different titles.

Thanks, again. I'll post the results of tomorrow's experiments as they come.

cooch17

9:00 pm on Feb 2, 2012 (gmt 0)

10+ Year Member



Found a few moments:

1\ took the image rewrite out of .htaccess

2\ in index2.html, tried /images/picture23.jpg (i.e., leading slash), images/picture23.jpg (no leading slash), and /full-path-to/images/picture23.jpg. I even tried [site.name...] Heck, I even tried putting the image in the directory where index2.html resides, and using src="picture23.jpg" (i.e., eliminating the path issue).

Made no difference -- images not visible.

So, there *must* be something in either the .htaccess file, or httpd.conf, or both, that either needs fixing, or turning off, or on...


Now I know why people refer to mod_rewrite as 'voodoo'. ;-)

lucy24

9:55 pm on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hm, you don't happen to have an anti-hotlinking routine somewhere do you? Something that only allows those specific images to be called by a specific filename?

Wait, scratch that. If it's a rewrite, I'm pretty sure the original name would come through as referer. But...

D'oh!

What do your logs have to say on the subject? Are the images getting requested and 403'd (or possibly 404) or are they not getting requested at all?

:: whapping self upside of head at this much-overdue obvious information source ::
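For reference, a sketch of how rewrite tracing can be switched on in Apache 2.2, the version mentioned in the first post. The log file name and the level are assumptions. The question about the image itself (403, 404, or no request at all) is answered by the status column of the ordinary access log rather than by this trace.

```apache
# httpd.conf, server or virtual-host context (not permitted in .htaccess).
# Apache 2.2 only: both directives were removed in 2.4, where
# "LogLevel alert rewrite:trace3" is the rough equivalent.
RewriteLog "logs/rewrite.log"    # relative to ServerRoot; hypothetical name
RewriteLogLevel 3                # 0 disables logging, 9 logs almost everything
```

High levels are very verbose and slow the server noticeably, so the trace is normally switched off again once debugging is done.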

cooch17

11:57 pm on Feb 2, 2012 (gmt 0)

10+ Year Member



No, no anti-hotlinking stuff.

here is the relevant part of the logfile:


cs/writest/] strip per-dir prefix: /home/users/www/htdocs/writest/ ->
cs/writest/] applying pattern '.*' to uri ''
cs/writest/] RewriteCond: input='aa.bb.cc.dd' pattern='^aa\.bb\.cc\.dd$' => matched
cs/writest/] RewriteCond: input='/writest/' pattern='!^index2\.html' => matched
cs/writest/] rewrite '' -> 'index2.html'
cs/writest/] add per-dir prefix: index2.html -> /home/users/www/htdocs/writest/index2.html
cs/writest/] strip document_root prefix: /home/users/www/htdocs/writest/index2.html -> /writest/index2.html
cs/writest/] internal redirect with /writest/index2.html [INTERNAL REDIRECT]
www/htdocs/writest/] strip per-dir prefix: /home/users/www/htdocs/writest/index2.html -> index2.html
www/htdocs/writest/] applying pattern '.*' to uri 'index2.html'
www/htdocs/writest/] RewriteCond: input='aa.bb.cc.dd' pattern='^aa\.bb\.cc\.dd$' => matched
www/htdocs/writest/] RewriteCond: input='/writest/index2.html' pattern='!^index2\.html' => matched
www/htdocs/writest/] rewrite 'index2.html' -> 'index2.html'
www/htdocs/writest/] add per-dir prefix: index2.html -> /home/users/www/htdocs/writest/index2.html
www/htdocs/writest/] initial URL equal rewritten URL: /home/users/www/htdocs/writest/index2.html [IGNORING REWRITE]

cooch17

12:06 am on Feb 3, 2012 (gmt 0)

10+ Year Member



Perhaps this will help. When the page comes up, the image isn't there, but you do see the little placeholder. If I right click, to select 'save as', the right filename shows up, but when I actually try the save, I get a 403 error.

cooch17

12:27 am on Feb 3, 2012 (gmt 0)

10+ Year Member



Glimmer of understanding. I changed the directory structure somewhat, and moved the images to a different folder. Now the directory structure looks like

\images
\writest

(instead of

\writest
\writest\images)


In index2.html, I now use

<img src="../images/picture.jpg">

and...it works *perfectly*.

So, now the question can be framed as follows: why can't the rewritten index2.html "see" the images in a subdirectory downstream from the directory where index2.html is, when it *can* see directories that are in a different part of the directory structure?

g1smd

12:58 am on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do not use the ../ notation here. Start the link with a leading slash.

Now you have explained some more, it looks like you should have previously linked to /images/filename.ext and set up a RewriteRule to rewrite external requests for example.com/images/<something> to the internal filepath /writest/images/<something>

URLs are used "out there" on the web. Filepaths are used "here" inside the server. They are not at all the same thing. They are merely "related" by the action of a server, and the way that server is configured.
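The mapping described above could be sketched as follows, assuming the rule lives in an .htaccess file in the document root and that the pages link to /images/<something>. It only illustrates the URL-to-filepath idea; nothing in the thread confirms it as the fix for this particular problem.

```apache
# .htaccess in the document root (placement is an assumption)
RewriteEngine On

# URL      /images/test.png
# is served from the file <docroot>/writest/images/test.png
RewriteRule ^images/(.+)$ /writest/images/$1 [L]

```

The browser keeps requesting example.com/images/<something>; only the server knows that the file sits under /writest/images/.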

cooch17

1:13 am on Feb 3, 2012 (gmt 0)

10+ Year Member




Do not use the ../ notation here. Start the link with a leading slash.


Didn't work when I tried that. Only worked when I used ../images/picture.jpg


Now you have explained some more, it looks like you should have previously linked to /images/filename.ext and set up a RewriteRule to rewrite external requests for example.com/images/<something> to the internal filepath /writest/images/<something>

URLs are used "out there" on the web. Filepaths are used "here" inside the server. They are not at all the same thing. They are merely "related" by the action of a server, and the way that server is configured.


OK, fine. But at the risk of getting to the point:

given

\writest
\writest\images

(which was my original directory structure) what should the .htaccess file look like? I'd be happy if someone simply took the following and showed me the necessary changes. Heck, it's all of 4 lines. Once I see the corrections, I can then figure out what has changed, and why.

RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^ab\.cd\.ef\.gh$
RewriteCond %{REQUEST_URI} !^index2\.html
RewriteRule .* index2.html

g1smd

1:27 am on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What is the *URL* used "out there" on the web when you want to look at the image? (this answer will include a domain name).

What is the filepath "here" inside the server where the file actually resides? (this answer will consist of folders and a filename).

wilderness

1:44 am on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Begin the link to images, css files and javascript files with a leading slash and state the full path to the folder.


g1smd,
I've been seeing these references and I'm wondering about the logic.
Is the full path a requirement for mod_rewrite to work properly?

I used relative paths for more than a decade (with a very minimum of rewrites for either pages or images) and don't recall having any issues.

TIA

Don

lucy24

4:05 am on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You'll only have an issue if you're rewriting to a different directory, so the browser "thinks" the files are in one place and they're really in another. Most obvious example: I recently caved in and put all my error documents into a single directory so they could use a shared css. This style sheet has to be called via an absolute link
/boilerplate/errorstyles.css
since the ostensible URL could be absolutely anything.
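A sketch of that arrangement, assuming the error pages are named 403.html and 404.html (hypothetical names; only /boilerplate/errorstyles.css is given in the post):

```apache
# Error pages collected in one directory; file names are hypothetical
ErrorDocument 403 /boilerplate/403.html
ErrorDocument 404 /boilerplate/404.html
```

Because an error page is delivered under whatever URL was requested, a relative stylesheet link inside it would be resolved against that URL. The root-relative /boilerplate/errorstyles.css does not depend on it.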

www/htdocs/writest/] strip per-dir prefix: /home/users/www/htdocs/writest/index2.html -> index2.html
www/htdocs/writest/] applying pattern '.*' to uri 'index2.html'
www/htdocs/writest/] RewriteCond: input='aa.bb.cc.dd' pattern='^aa\.bb\.cc\.dd$' => matched
www/htdocs/writest/] RewriteCond: input='/writest/index2.html' pattern='!^index2\.html' => matched

There is something here that I am seriously Not Getting. Is "writest" part of the URL or isn't it?

When I asked about logs, I meant specifically the part that calls-- or doesn't call-- the image. That's probably a few lines after the part you quoted.

cooch17

7:47 pm on Feb 3, 2012 (gmt 0)

10+ Year Member



OK, here it is, in full.

1\ create directory off of web server root, called writest. Inside this directory I create a subdirectory called images.

So,

\serveroot\writest\
\serveroot\writest\images

2\ In the writest dir, I have two index files: index.html, and index2.html. They are essentially the same -- I've made them as minimalist as possible. In the images subdirectory, a single graphic (which I'll call test.jpg).

Here are the two index files, in full. First, index.html


<html>
<head>

<title>index page (1)</title>

<body>
<p>
This is index page 1.
</p>
<center>
<img src="images/test.jpg">
</center>
</body>
</html>


and then index2.html


<html>
<head>

<title>index page (2)</title>

<body>
<p>
This is index page 2.
</p>
<center>
<img src="images/test.jpg">
</center>
</body>
</html>


So, the url for the page has a structure like: www.site.com/writest

index.html resolves perfectly, regardless of how I 'pull in' the image. In other words, it doesn't (and shouldn't) matter if I use

src="images/test.jpg"

src="/home/www/htdocs/writest/images/test.jpg"

src="http://www.site.com/writest/images/test.jpg"


3\ in httpd.conf, I add the following, after which I restart the server:


<Directory /home/users/www/htdocs/writest>
AllowOverride All
</Directory>


4\ I test both index.html and index2.html, using


http://www.site.com/writest/index.html


and


http://www.site.com/writest/index2.html


Both read in perfectly, and both show the image.

5\ create an .htaccess file in \writest, which contains the following directives (shown with fake ip address as string aa.bb.cc.dd).


RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^aa\.bb\.cc\.dd$
RewriteCond %{REQUEST_URI} !^index2\.html
RewriteRule .* index2.html


Chmod 644 .htaccess

6\ now, the problem. With .htaccess 'active' in directory \writest, accessing

[site.com...]

is successfully rewritten (redirected?) to index2.html. In other words, the RewriteCond rules are successfully triggered, and the RewriteRule is parsed. I know this because I see that the title for the resolved page is now 'index (2)', and not 'index (1)'.

However, the graphic (test.jpg) is not resolved.

7\ I tried (in turn, as separate experiments) each of the following changes to index2.html, with respect to sourcing the image file:

src="images/test.jpg" (note: original syntax, which fails)

src="/images/test.jpg" (fails to show image)

src="/home/www/htdocs/writest/images/test.jpg" (fails to show image)

src="/http://www.site.com/writest/images/test.jpg" (again, no image)

8\ tried moving test.jpg into the same directory as index.html, and index2.html. Then, ran through 4 permutations of sourcing the image:

src="test.jpg" <fails to show image>

src="./test.jpg" <fails to show image>

src="/home/www/htdocs/writest/test.jpg" <fails to show image>

src="http://www.site.com/writest/test.jpg" <fails to show image>

9\ as a last resort, moved the image test.jpg to another directory (which I'll call testimage). So now the directory structure looks like

\webroot\writest
\webroot\testimage

Modify image srcing in index2.html to

src="..\testimage\test.jpg"


and -- VOILA! -- it works perfectly.

Summary: absolutely total fail in getting the image in index2.html to be shown, so long as the image is in \writest or nested in a subdirectory within writest.


For people interested in the error log during 'failed' experiments, here (appended to the bottom) is the full log for the experiment where I used the original

src="images/test.jpg"


Enjoy!

Now, any chance anyone can explain what is happening (or not), and why? Makes *no* sense to me that I can't see the images, unless they're moved to a different directory.

g1smd

8:09 pm on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



shouldn't matter if I use

src="images/test.jpg"

src="/home/www/htdocs/writest/images/test.jpg"

src="http://www.site.com/writest/images/test.jpg"

Don't use the first one as it will look relative to current folder level of current page.

The second is not valid, as "/home/www/htdocs/" is not in the URL.

No need for the third one. Drop the protocol and domain name, and begin the link with a leading slash.


RewriteRule .* index2.html

The above code rewrites all URL requests so that they are served by the file index2.html. It rewrites requests for the root, for robots.txt, image files, stylesheets, absolutely everything except for index2.html as shown in the preceding RewriteCond. The .* pattern is wrong. It should define requests that should be rewritten, not match all requests.
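One way to narrow the pattern, sketched under the assumption that only the directory URL and index.html should be swapped for the given address. Note that the rewrite log earlier in the thread shows the REQUEST_URI condition matching even for '/writest/index2.html': REQUEST_URI always starts with a slash and includes the directory name, so '!^index2\.html' excludes nothing. In .htaccess the rule pattern is tested against the path with the per-directory prefix stripped, which is why '' and 'index.html' are the two inputs to match.

```apache
# /writest/.htaccess (aa.bb.cc.dd stands in for the real address)
RewriteEngine On

# Rewrite only the bare directory request ('') and index.html.
# Requests such as images/test.jpg no longer match, so they are
# served as ordinary files instead of being turned into index2.html.
RewriteCond %{REMOTE_ADDR} ^aa\.bb\.cc\.dd$
RewriteRule ^(index\.html)?$ index2.html [L]

```

The REQUEST_URI condition is dropped because the anchored pattern can never match 'index2.html', so there is nothing left to exclude.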

cooch17

8:35 pm on Feb 3, 2012 (gmt 0)

10+ Year Member



D'oh. So after all that, eh? Simply change RewriteRule to

RewriteRule index.html index2.html

Works fine after that. And of course, in hindsight, makes perfect sense.

Thanks, very much.

g1smd

8:56 pm on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You shouldn't include the index filename in the href part of the links on your page. You should link to the canonical URL and this should end with a trailing slash after the parent folder name (or just a slash for root).

The DirectoryIndex directive sets the filename that is served for such a request. It's a special type of rewrite, and much more efficient.

It's a good idea to redirect URL requests that include a reference to an index filename to strip that filename from the request.

In your original post I missed the Rule with .* in it. One tip for laying out mod_rewrite code for clarity: include a blank line after every RewriteRule.
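The index-stripping redirect mentioned above could be sketched like this, assuming an .htaccess file in the document root and index.html as the index filename. The condition on THE_REQUEST (the raw request line sent by the client) is there so that only external requests naming the file are redirected, not the internal lookup that DirectoryIndex performs.

```apache
# .htaccess in the document root (placement is an assumption)
DirectoryIndex index.html

RewriteEngine On

# Redirect only if the client literally asked for .../index.html,
# e.g. "GET /writest/index.html HTTP/1.1"  ->  301 to /writest/
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]*/)?index\.html[?\s]
RewriteRule ^(.*/)?index\.html$ /$1 [R=301,L]

```

Links on the pages would then point at the folder URL with its trailing slash, so the redirect is only a safety net for old links and typed-in URLs.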

cooch17

9:05 pm on Feb 3, 2012 (gmt 0)

10+ Year Member




You shouldn't include the index filename in the href part of the links on your page. You should link to the canonical URL and this should end with a trailing slash after the parent folder name (or just a slash for root).


I know -- I just wrote them out so there was no 'ambiguity' in what file I was pointing to.


The DirectoryIndex directive sets the filename that is served for such a request. It's a special type of rewrite, and much more efficient.


Yup, had that in my httpd.conf for years.


It's a good idea to redirect URL requests that include a reference to an index filename to strip that filename from the request.


Now that is something I *should* add. Good advice.


In your original post I missed the Rule with .* in it. One tip for laying out mod_rewrite code for clarity: include a blank line after every RewriteRule.


Sage advice, if (when?) I need to seek help for another problem.

Thanks again...

lucy24

9:14 pm on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Quick edit:
While you and g1 were having that whole discussion, I was reading other posts in other tabs, and then came by here to post my own version {stuff deleted here} winding up with:

***

Oh, wait. You've got an absolutely generic Rewrite.

RewriteCond %{REMOTE_ADDR} ^aa\.bb\.cc\.dd$
RewriteCond %{REQUEST_URI} !^index2\.html
RewriteRule .* index2.html

Wouldn't that mean that the image itself gets rewritten to index2.html? But it can't display, because .html isn't an image extension.

Just for the ### of it see what happens if you constrain the Rule to

(\.html|/)$

where you currently have .*

***

Which I guess was the right answer ;)
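An alternative to narrowing the rule pattern, sketched here as an assumption rather than anything posted in the thread: keep the catch-all but exempt static assets and the target page itself. The extension list is illustrative only.

```apache
# /writest/.htaccess (aa.bb.cc.dd stands in for the real address)
RewriteEngine On

RewriteCond %{REMOTE_ADDR} ^aa\.bb\.cc\.dd$
# Leave images, stylesheets and scripts alone
RewriteCond %{REQUEST_URI} !\.(png|jpe?g|gif|css|js)$
# REQUEST_URI is the full URL-path, so match the end of it
RewriteCond %{REQUEST_URI} !/index2\.html$
RewriteRule .* index2.html [L]

```

This keeps "everything else goes to index2.html" behaviour for that address, at the cost of having to maintain the list of exempt extensions.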

g1smd

8:10 am on Feb 4, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes it was. I was somewhat distracted by a site re-launch while this thread was going on, especially the bit where I had to fix two bugs lurking somewhere in 750 lines of mod_rewrite code, bugs that were missed in dev site testing and only came to life as soon as the site was live on the web.

cooch17

2:03 am on Feb 5, 2012 (gmt 0)

10+ Year Member



Thanks again for the feedback.

(As an aside, I was distracted by having to figure out why one of my servers seems to have suddenly forgotten that the RAID is hardware, not software. Nothing like not being able to access anything to get your attention).

lucy24

2:35 am on Feb 5, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Funny you should say that, because I was about to ask g1 if he wanted to trade bugs. I have reason to believe there is a cockroach living in my modem.

:: now off to try that mod_dir test ::