| 8:03 pm on Mar 16, 2009 (gmt 0)|
Yes it is possible.
You need a rewrite that accepts 'root' URL requests and rewrites them to fetch content from the script in the sub-folder.
The difficult parts will be:
- restricting the rewrite so that it only captures valid URL requests that should be rewritten (i.e. it should not capture requests for robots.txt and such like),
- making sure that links to CSS, JS, and image files still work. If they are 'relative' links, then the rewrite will 'break' them; they will need to be changed to root-relative links (beginning with /) instead.
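In .htaccess terms, the idea might look something like this minimal sketch. The /wiki/ folder name and the title parameter are assumptions about the particular script; the file/directory conditions are one common way to keep the rewrite from capturing requests it shouldn't:

```apache
# Rewrite 'root' URL requests to the script in the sub-folder.
# Only rewrite when the request does not map to a real file or directory,
# so robots.txt, CSS, JS, and image files are left alone.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ wiki/index.php?title=$1 [L,QSA]
```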
| 8:35 pm on Mar 16, 2009 (gmt 0)|
OK, I was wondering why it was going so slow and the CSS was all messed up. I tried this:
RewriteRule ^(.*)$ /wiki/index.php?title=$1 [L,QSA]
And I got a 500 Internal Server error. Then I tried this:
RewriteRule ^(.*)$ wiki/index.php?title=$1 [L,QSA]
Without the front slash in front of wiki, it worked; however, it was slow and the CSS was all broken. So it looks like I may need to put the CSS and image files in the root?
Also, how do I do this:
|- restricting the rewrite so that it only captures valid URL requests that should be rewritten (i.e. it should not capture requests for robots.txt and such like), |
| 9:05 pm on Mar 16, 2009 (gmt 0)|
For more information, read this thread from our library [webmasterworld.com].
[edited by: jdMorgan at 11:15 pm (utc) on Mar. 16, 2009]
| 12:16 am on Mar 17, 2009 (gmt 0)|
Thanks for the info. I have most of this working now, including the CSS and Image rewrites. However, I still do not understand this statement:
|it should not capture requests for robots.txt |
I don't see why this is a problem?
| 12:50 am on Mar 17, 2009 (gmt 0)|
Does your wiki/index.php script generate a correct robots.txt file for the site? If not, then rewriting robots.txt requests to your script is a problem.
| 1:15 am on Mar 17, 2009 (gmt 0)|
My robots.txt file is in the root, created by me (not in the wiki folder), and I have it blocking only the /wiki/ folder. This way, only my root article pages will be spidered. I have nothing else in the root directory, just robots.txt and /wiki/. This is a subdomain, and its only purpose is to run the wiki.
If this setup is a problem, I still don't see why, despite the warnings you and MediaWiki give.
| 1:23 am on Mar 17, 2009 (gmt 0)|
In that case, I suggest you test your code and your site by requesting robots.txt with your browser. If it works properly, and you see your robots.txt file contents in your browser, then fine.
[edited by: jdMorgan at 1:23 am (utc) on Mar. 17, 2009]
| 1:34 am on Mar 17, 2009 (gmt 0)|
Yes, try to view the robots.txt in your browser. Can you see it, or does your script return 'junk' for that request?
| 1:40 am on Mar 17, 2009 (gmt 0)|
Ohhhh, whoops. I understand the warnings now. The robots.txt won't show. Arggg, is there any solution to this other than using another directory?
| 2:00 am on Mar 17, 2009 (gmt 0)|
Yes. Make the rewrite less permissive so that only URL requests that are supposed to be handled by the script are actually going to match the URL pattern in the rewrite.
On a site I recently worked on, all URL paths to be fed to the rewrite were simply a ten digit number with no extension. A simple pattern ensured that only those matched, and everything else was not rewritten to the script.
You need to study your URL patterns closely to derive a simple regex pattern that matches everything it needs to match while not matching robots.txt, images, CSS, JS, SE verification files, and other such files. You can 'negative match' specific names with a RewriteCond, but you need to be thorough.
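As a sketch of that 'negative match' approach (the file extensions listed here are examples only, not a complete inventory of what a given site serves, so adapt the list to your own URLs):

```apache
# Leave robots.txt, common static-file extensions, and the /wiki/ path alone;
# everything else is fed to the script.
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{REQUEST_URI} !\.(css|js|png|gif|jpe?g|ico|txt|xml)$ [NC]
RewriteCond %{REQUEST_URI} !^/wiki/
RewriteRule ^(.+)$ wiki/index.php?title=$1 [L,QSA]
```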
Do you know what Google would have done with whatever you were actually returning for the robots.txt URL? No? Neither do I. That's why it is very important to be very careful with this stuff. Bad rewrites/redirects can drop your entire site out of Google.
| 4:19 am on Mar 17, 2009 (gmt 0)|
Thanks for your help. Now I get it. I allowed for the robots.txt to be displayed. If I can view this file in the browser, is it safe to say that Google can read robots.txt correctly?
| 12:53 pm on Mar 17, 2009 (gmt 0)|
I still have one more problem....
Note these facts:
1. Wiki is in /wiki/
2. I am redirecting URLs successfully to root with this code:
RewriteRule ^(robots)\.txt - [L]
RewriteRule ^[^:]*[./] - [L]
RewriteRule ^(.+)$ wiki/index.php?title=$1 [PT,L,QSA]
The top line of code allows robots.txt
The second line of code negative-matches any request containing a slash or a dot and leaves it alone (allowing css/skins/images to work)
The last line redirects to root.
Now to the problem:
Edit and history pages do not display correctly. When I click on edit for instance, this is the url that is displayed:
Let's say I am editing a page called "Page_Name". Usually, if there is text on the page, it will say "Editing Page_Name" and of course I could edit. But instead, I get a page with no text in the edit box, and the page title says "Editing Wiki/index.php"
Any assistance on this would be greatly appreciated. I am trying my best here.
| 8:35 pm on Mar 17, 2009 (gmt 0)|
*** I am redirecting URLs successfully to root with this code ***
That is not how it works. You have this backwards.
The code takes URL requests for URLs in the root and rewrites the URL request to get the content from a different internal filepath within the server.
That is, a rewrite does not 'make' a URL. URLs exist when they appear in a link that someone can click on. The rewrite connects that request for URL 'A' to the place on the server, at 'B', where that content resides.
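To make the direction concrete (a sketch reusing the rule from earlier in the thread): the visitor requests URL 'A', and the server internally fetches the content from place 'B', without the browser ever seeing 'B':

```apache
# Request (URL 'A'):   GET /Page_Name
# Served from ('B'):   wiki/index.php?title=Page_Name
# The browser's address bar still shows /Page_Name; no redirect is issued.
RewriteRule ^(.+)$ wiki/index.php?title=$1 [L,QSA]
```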