Can a Bot Reverse Engineer a ModRewrite?

         

HoboTraveler

7:49 pm on Jan 24, 2008 (gmt 0)

10+ Year Member



Hi All,

I am wondering how a search engine can reverse engineer a mod-written URL.

I have a bunch of mod-written URLs. The variables are sent to a PHP file specified in the .htaccess file. The .htaccess file has permissions set so that it is inaccessible from a browser.

I was going through the logs and found requests from the search bot cuill.com that pass GET variables directly to the PHP file mentioned in the .htaccess. The variables passed match those that are mod-written. These URLs are not search engine friendly and have the ? in the URL.

I am curious, though: how was cuill able to figure out the variables and the file that needs to be called? The raw bot-unfriendly URLs were never listed anywhere.

Is it possible to figure out the bot-unfriendly format from a mod-written URL?

TIA

phranque

8:53 pm on Jan 24, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



if it's an internal rewrite, i can't see how it could happen.
perhaps your intended internal rewrite url was exposed with an external redirect.

HoboTraveler

3:05 pm on Jan 29, 2008 (gmt 0)

10+ Year Member



*bump*

Anyone else have any ideas?

TIA

jdMorgan

8:02 pm on Jan 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, you say that you have internally rewritten static URLs to dynamic-type filepaths. But if the dynamic filepaths/URLs ever appeared in public logs, or were linked-to accidentally, or were 'exposed' by a redirect that occurred after the static-to-dynamic rewriting, then a search engine could have picked up those dynamic URLs.
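
As a rough illustration of that last case, here is a sketch of how rule order alone can leak the dynamic URL from a per-directory .htaccess file. The hostname example.com, the script index.php, the id parameter and the item/ path are all made up for the example; nothing in this thread shows what the actual rules look like.

```apache
RewriteEngine On

# Internal rewrite (no [R] flag, no hostname in the substitution).
# Hypothetical static form: /item/42  ->  /index.php?id=42
RewriteRule ^item/([0-9]+)$ /index.php?id=$1 [L]

# Hostname canonicalization placed AFTER the internal rewrite.
# In .htaccess the rules are re-run on the rewritten path, so a request
# for http://example.com/item/42 reaches this rule as index.php with
# the query string id=42, and the Location header sent to the client
# shows the dynamic URL.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

With rules shaped like these, moving the hostname redirect above the internal rewrite should stop the leak, because the redirect then fires while the URL is still in its static form.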

So... in addition to the static-to-dynamic internal rewrite, have you put in place an external redirect to redirect any direct requests for the dynamic URLs back to their corresponding static forms? It's only a little tricky, as long as each form of the URL contains all the information needed to 'build' the other form.
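
A minimal sketch of that pairing, assuming a hypothetical script index.php that takes a single numeric id parameter, a static form of /item/<id>, and the hostname www.example.com (none of these come from the original post, so adjust to suit):

```apache
RewriteEngine On

# 1. External redirect: fires only when the CLIENT itself asked for the
#    dynamic URL. THE_REQUEST holds the original request line, which the
#    internal rewrite below does not change, so the two rules cannot loop.
#    The trailing "?" drops the old query string from the redirect target.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php\?id=([0-9]+)\ HTTP/
RewriteRule ^index\.php$ http://www.example.com/item/%1? [R=301,L]

# 2. Internal rewrite: static URL to the dynamic filepath.
RewriteRule ^item/([0-9]+)$ /index.php?id=$1 [L]
```

This works here only because the id appears in both forms of the URL; if the static form carried less information than the dynamic one, the redirect would have nothing to build the target from.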

Lots of past threads on that subject here...

Jim