I have set Apache up as a reverse proxy to a back-end Content Management System. Apache is running on port 80, the back-end CMS on port 3128.
The reverse proxying works fine, but my back-end CMS is producing hard-coded absolute URLs in all the HTML links. These links have the 3128 port hard-coded into them.
Consequently, when I request www.sample.com/cms the page loads fine. However, all the links in the page are of the form www.sample.com:3128/xxxx
Therefore I am using mod_substitute to filter out the port number from the response body.
Here is the config from my httpd.conf:
# www.sample.com
<VirtualHost *:80>
    ServerAdmin support@sample.com
    DocumentRoot /usr/local/apache2/htdocs
    ServerName www.sample.com
    ErrorLog logs/error_log
    CustomLog logs/access_log common
    ProxyRequests Off
    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>
    <Location /cms>
        ProxyPass [sample.com:3128...]
        ProxyPassReverse [sample.com:3128...]
        AddOutputFilterByType SUBSTITUTE text/html
        Substitute s/3128/80/inq
        #Substitute "s|http://www.sample.com:3128|http://www.sample.com|inq"
    </Location>
</VirtualHost>
Unfortunately, nothing is happening: mod_substitute is not rewriting the HTML, and there are no error messages in error_log.
Any ideas?
It turned out that mod_substitute was working fine. The problem was that the CMS was gzipping its pages to speed up delivery, so the response body passing through the filter was compressed and the pattern never matched.
Most modern browsers accept gzipped pages (they send the "Accept-Encoding" request header to tell the web server so). All I needed to do was switch off gzipping in the CMS, and mod_substitute sprang into life!
(Note: I did try adding gzip MIME types to my AddOutputFilterByType directive and it *didn't* work: AddOutputFilterByType SUBSTITUTE text/html application/x-Gzip multipart/x-gzip)
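If switching off compression in the CMS is not an option, the same effect can usually be had on the proxy side. The following is a minimal sketch, not something tested in this thread: it assumes mod_headers is loaded and strips the Accept-Encoding header from the proxied request, so the back end has no reason to compress its reply. The ProxyPass lines are left as they are in the config above.

```apache
<Location /cms>
    # ProxyPass / ProxyPassReverse lines unchanged from the original config.

    # Assumption: mod_headers is loaded. Removing Accept-Encoding from the
    # request means the back end should send an uncompressed body.
    RequestHeader unset Accept-Encoding

    # Same filter setup as before; "|" is used as the delimiter so the
    # slashes in the URL need no escaping.
    AddOutputFilterByType SUBSTITUTE text/html
    Substitute "s|http://www.sample.com:3128|http://www.sample.com|inq"
</Location>
```

Another route, assuming mod_deflate is loaded, is to decompress inside the filter chain with SetOutputFilter INFLATE;SUBSTITUTE;DEFLATE in place of the AddOutputFilterByType line. The trade-off is that SetOutputFilter applies to every response under /cms, not only text/html.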
Further investigation has shown that there are other ways of rewriting the HTML output besides mod_substitute:
1. mod_proxy_html
2. mod_sed
3. mod_ext_filter to invoke some other program to filter the output:
# mod_ext_filter directive to define a filter which
# replaces text in the response
ExtFilterDefine external_sed mode=output intype=text/html cmd="/bin/sed s/california/CA/g"
<Location />
    SetOutputFilter external_sed
</Location>
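For comparison, options 1 and 2 might look roughly like this for the port problem above. Both are untested sketches. The first assumes the mod_proxy_html bundled with Apache 2.4 (the older third-party module is enabled with SetOutputFilter proxy-html rather than ProxyHTMLEnable); the second assumes mod_sed is loaded. Use one or the other, not both, and note that either one needs an uncompressed response body just as mod_substitute does.

```apache
# Option 1: mod_proxy_html rewrites URLs found in HTML link attributes.
<Location /cms>
    ProxyHTMLEnable On
    ProxyHTMLURLMap http://www.sample.com:3128 http://www.sample.com
</Location>

# Option 2: mod_sed applies sed-style commands to the response body.
# The unescaped dots match any character, which is harmless here.
<Location /cms>
    SetOutputFilter Sed
    OutputSed "s/www.sample.com:3128/www.sample.com/g"
</Location>
```

mod_proxy_html parses the markup and only touches URLs in link attributes, whereas mod_sed (like mod_substitute) does a plain text replacement anywhere in the body.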
Jim