Forum Moderators: phranque


Using bots to create static html pages

I'm new to bot scripting and could use some resources

         

crowthercm

7:24 pm on Jul 27, 2003 (gmt 0)

10+ Year Member



Hello,

I am interested in creating a bot that will convert search queries into static html pages. The pages do not need to be persistent, but I would like them to be indexable by a search engine should a bot arrive.

For instance, suppose someone searched the term "blue widgets". I'd like to then create a page like "/category/blue-widgets.html". From there I could include information relevant to blue-widgets.
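To make the URL idea concrete, here is a minimal sketch in Python of how a search term might be mapped to a path like the one above. The function name `term_to_path` and the `/category/` prefix are assumptions for illustration only, not an established method.

```python
import re


def term_to_path(term: str, prefix: str = "/category/") -> str:
    """Turn a search term into a path such as /category/blue-widgets.html.

    Hypothetical helper: the name and the default prefix are illustrative.
    """
    # Lowercase, then collapse every run of non-alphanumeric characters
    # (spaces, punctuation) into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", term.lower()).strip("-")
    if not slug:
        raise ValueError("search term has no usable characters")
    return f"{prefix}{slug}.html"


print(term_to_path("blue widgets"))
```

Restricting the slug to letters, digits, and hyphens also keeps user input from injecting path separators into the generated file name.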

Does anyone have any idea how this could be done or where I can learn some more about it?

Thanks,
Chris

lorax

7:59 pm on Jul 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> I am interested in creating a bot that will convert search queries into static html pages.

I'm not sure how you'd do this; search engine bots don't accept instructions. Unless you're talking about a search engine on your own site, in which case all you need to do is write the code the way you want it.

But I think what you might be in search of is the .htaccess file (Apache servers). There are several threads here on WebmasterWorld about how to make dynamic pages look static, etc. Search for "htaccess" or "mod_rewrite".

khuntley

9:25 pm on Jul 27, 2003 (gmt 0)

10+ Year Member



Lorax, I think Chris is getting at something else. I believe he is referring to the practice of creating a static page for every search term entered at his own site. So if the site is about large blue widgets and someone searches the site for pretty gold small widgets, a static page for the searched term is created. Then, two months later, after spidering and an SE update, voila: the next time someone searches for pretty gold small widgets, you are in the SERPs.

This has been discussed here before and some have used the technique with much success. I don't remember the technology used, although it was discussed in the last thread I read about this. I don't know if it was PHP or some other script. Good luck finding the threads though...I don't even know what you'd search for here at WW.

I might add that others here have considered it a spammy technique. And if I saw a competitor doing it I would send in the spam report faster than certain politicians jump on vulnerable oil reserves in certain countries.

Kevin

john316

9:31 pm on Jul 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You need to log the searches and use the log file as the db, use an SSI call to display the db (log file), and somewhere in there rewrite the URL to something friendly.

Obviously not a step by step tutorial, but the concepts are there.
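As a rough sketch of the logging half of that idea, assuming a plain text file with one term per line serves as the "db": the names `log_search` and `read_terms` are made up for illustration, and the SSI display and URL rewriting are left out.

```python
from pathlib import Path


def log_search(term: str, log_path: Path) -> None:
    """Append one normalized search term to the log, one term per line."""
    # Lowercase and collapse whitespace so "Blue  Widgets" and
    # "blue widgets" are stored as the same term.
    cleaned = " ".join(term.lower().split())
    if cleaned:
        with log_path.open("a", encoding="utf-8") as fh:
            fh.write(cleaned + "\n")


def read_terms(log_path: Path) -> list[str]:
    """Return the distinct logged terms in first-seen order."""
    if not log_path.exists():
        return []
    lines = log_path.read_text(encoding="utf-8").splitlines()
    # dict.fromkeys removes duplicates while keeping insertion order.
    return list(dict.fromkeys(lines))
```

Whatever displays the log can then call `read_terms` and link each term to its friendly URL.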

lorax

9:38 pm on Jul 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thank you khuntley for straightening me out. I had not heard of this before though I should have guessed. *sigh* so much to learn.

khuntley

9:47 pm on Jul 27, 2003 (gmt 0)

10+ Year Member



Lorax,
Think of the possibilities if we were not ethical. Using this technique over time you would rule the world! (well maybe not). And how in the world would there be an automated process to catch it? Your own visitors are developing the content and relevance for you.

I am going to reconsider the honesty thing.

Kevin

lorax

9:50 pm on Jul 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, it is mind boggling. But then you need to have the content ready to go in some form or another otherwise the search is of little use.

So, where does the content come from? Do you use a bot of your own to collect content from other sites and then strip, cleanse, reorganize, reformat, and deliver the new page?

khuntley

10:37 pm on Jul 27, 2003 (gmt 0)

10+ Year Member



Actually, by content I was just referring to the new keywords on the static page. The last example I heard of this was that the content consisted of:

"Looking for small purple widgets? Try our little blue widgets."

And at least ink has an excellent natural language system to detect garbled keywords so you couldn't just steal content with a bot and rearrange.

Alas, I can think of no automated way to provide any semi-unique stolen content. Anyone have any ideas around here?

I guess the best thing would be to steal content and leave as is. Then you could strip out all of the hrefs and have the link to your real content in a table above with lots o' white space. Then just contend with the occasional angry email and remove the page then or rearrange.

BTW, this is just all out of curiosity as I am one of those spam-cop types that think if it unnaturally helps your rankings...don't do it.

Kevin

[edited by: khuntley at 12:51 am (utc) on July 28, 2003]

crowthercm

11:10 pm on Jul 27, 2003 (gmt 0)

10+ Year Member



Kevin,

That is in fact exactly what I'd like to do. Namely have a search on my own site that creates the "static" and SE listable page.

I just noticed someone doing exactly this and I've got to admit, it's really got me intrigued on how it's done.

John, thanks for the advice, I will try that out and see how it goes.

Chris

khuntley

11:28 pm on Jul 27, 2003 (gmt 0)

10+ Year Member



No Chris! Don't Do it! I'm kinda kidding...it's your site and you can do what you want with it.

However, I can guarantee you that if the site is in any category where those at the top of the serps are making any money, when you start approaching them they will delight in going to extreme lengths to remove you from the engine altogether. And such an artificial traffic generating system is the perfect ammo for them.

Sure you might make a little money for a little while, but if the site isn't a throw-away in the long term I wouldn't do it. I'd say you would last 3-4 months with this system in place.

Kevin

crowthercm

5:26 am on Jul 28, 2003 (gmt 0)

10+ Year Member



hehe, well I'm sort of split on the ethics of it. If it's packaged right, I don't see why the site should get blacklisted by a SE. Not to say that the search engines will agree with me and certainly competitors won't. :P At any rate, I don't expect it will take a lot of effort to put together and it sounds kinda fun. :P

Chris

ncw164x

7:20 am on Jul 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There was a thread a few weeks ago about what you are asking, but I can't find it.
It concerned a search engine that does what you require: building a static html page from a search result.

There are scripts and programs available that build the pages first. Those pages are spidered by the search engines to bring visitors to your site, and then your site can be searched for more results.

lorax

3:28 pm on Jul 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I could see the value of it if - and only if - I had legitimate rights to the content. And only if I could rely on the script to generate human-readable and grammatically correct pages. Now we're talking AI.

trillianjedi

3:53 pm on Jul 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> It is a search engine which is doing the same as you require, building a static html page from a search result.

How do you do that without creating an orphaned page?

TJ

crowthercm

4:39 pm on Jul 28, 2003 (gmt 0)

10+ Year Member



Could you not just build the static html page (I'd probably do a static php) based upon a template you've set up, including adequate and relevant back-linking, TJ? Assuming that's what you mean by orphaned.

What I've been thinking about doing is simply:
1) Saving search queries into a file.
2) Scheduling an hourly parse of that file to find the most popularly searched terms.
3) Generating x number of top search pages based upon a given template and wiping the log file.

What I'm most worried about is efficiency: if I start getting hundreds of thousands of entries, what's going to happen to my scheduled task, and how big will the log file get?
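For what it's worth, the three steps could be sketched in Python roughly as below. The `PAGE` template, the `build_top_pages` name, and the one-term-per-line log format are all assumptions; treat it as a starting point rather than a finished job.

```python
import html
import re
from collections import Counter
from pathlib import Path
from string import Template

# Hypothetical template; a real one would carry navigation and back-links.
PAGE = Template(
    "<html><head><title>$term</title></head>"
    '<body><h1>$term</h1><p><a href="/">Home</a></p></body></html>'
)


def build_top_pages(log_path: Path, out_dir: Path, top_n: int = 10) -> list[Path]:
    """Count logged terms, write a page for the top_n, then wipe the log."""
    counts = Counter()
    # Stream the file line by line so memory grows with the number of
    # distinct terms, not with the total size of the log.
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            term = " ".join(line.lower().split())
            if term:
                counts[term] += 1

    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for term, _count in counts.most_common(top_n):
        slug = re.sub(r"[^a-z0-9]+", "-", term).strip("-")
        if not slug:
            continue  # nothing usable for a file name
        page = out_dir / f"{slug}.html"
        # Escape the term: it is user input going into HTML.
        page.write_text(PAGE.substitute(term=html.escape(term)), encoding="utf-8")
        written.append(page)

    log_path.write_text("", encoding="utf-8")  # step 3: wipe the log
    return written
```

On the efficiency worry: a single pass with a counter should cope with hundreds of thousands of lines, and wiping the log each run keeps the file small. One caveat is that searches logged between the read and the wipe are lost; renaming the log before parsing it would avoid that.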

killroy

4:49 pm on Jul 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmm... Google never had a problem indexing my search results... I think the bigger issue is how, where, and whether to link to search result pages on your own site.

One could always do a "Recent Searches" list on the home page listing the last 10 searches... since in my case searches are in the form of domain.com/search/searchterms, Google eats that up...
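A minimal sketch of such a list in Python, assuming the /search/searchterms URL form mentioned above. The `RecentSearches` class is hypothetical, and it keeps the list in memory only; a real site would persist it somewhere.

```python
import html
from collections import deque
from urllib.parse import quote


class RecentSearches:
    """Keep the most recent distinct search terms, newest first."""

    def __init__(self, limit: int = 10):
        # A bounded deque drops the oldest entry once the limit is reached.
        self._terms = deque(maxlen=limit)

    def add(self, term: str) -> None:
        term = " ".join(term.split())
        if not term:
            return
        # Move a repeated term to the front instead of listing it twice.
        if term in self._terms:
            self._terms.remove(term)
        self._terms.appendleft(term)

    def render(self) -> str:
        """Return an HTML list linking each term to /search/<term>."""
        items = "".join(
            f'<li><a href="/search/{quote(t)}">{html.escape(t)}</a></li>'
            for t in self._terms
        )
        return f"<ul>{items}</ul>"
```

Because every result page is linked from the home page, none of them is orphaned. The term is URL-quoted in the link and HTML-escaped in the text, since it comes straight from visitors.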

Don't see how that would be in any way "illegal".

SN

ncw164x

4:50 pm on Jul 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You could do that from any list of keywords or search terms. What trillianjedi said is true: you are creating orphaned pages unless you build an index.

If you are going to the trouble of creating an index, you might as well use a directory script and build a directory; then you can target the pages for your keywords.

trillianjedi

5:09 pm on Jul 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, I see. I like the "recent searches" idea...

TJ