need to create text ala mad libs, around keywords

tool for text creation on the fly from grammar rules

         

han solo

6:18 pm on Feb 15, 2001 (gmt 0)



I've found some tools to do this, but I am really looking for some ideas on how to put it together.

Searching Google for "random sentence generator" gives me something along the lines of what I want, but I am trying to put together the software myself.

An example of what I want might be a topical mad lib generator, if there is such a thing. You simply select the topic, and it spits out some paragraphs that are pretty generic for you to use.

This is for SEO, obviously, and no, I'm not a college student looking for help with his homework (I saw one page about a class at Stanford for something along these lines, so I figured I would head off the snickers, or at least some of them).

Thanks for the help... I'm guessing that there are several here who have such software, but I'm not sure they'd be willing to share. Tips would be okay.

Cheers,

Han Solo

msgraph

6:35 pm on Feb 15, 2001 (gmt 0)




If you ever find one, Han, can you sticky mail me the link? I've been all over the globe trying to find one as well.

littleman

6:44 pm on Feb 15, 2001 (gmt 0)



Mr. Solo, I advise against this approach. Random sentences are going to get your domain flagged very quickly by some of the SEs. You may be able to get away with it with FAST and AV(?), but you are sure to get nailed by Inktomi, and there is a good chance Google will catch you.

It's fine if you can throw away domains/IPs, but I'd be careful.

I'm not talking from any moral high horse; it is just that such a strategy could be expensive.

han solo

6:51 pm on Feb 15, 2001 (gmt 0)



I'm not talking about completely random....

e.g.,

select one element from each (simplistic approach):

subject, for example "our company"

verb, "serves"

object, "clients"

prepositional phrase, "in the"

industry, "e commerce area."

Hmm... and then you apply some logic to the grammar interaction between subjects, etc. Arrange them in terms of sentence parts, e.g., prepositional phrase, subject, object, predicate, etc.
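
The simplistic "one element from each" idea above could be sketched roughly as follows. This is a minimal illustration in Python rather than Perl; the slot names follow the post, but every word list entry other than the first in each slot is a made-up placeholder, and `make_sentence` is a hypothetical helper name.

```python
import random

# Hypothetical word lists. Only the first entry in each slot comes from the
# post above; the rest are placeholders to show the selection step.
SLOTS = [
    ("subject", ["our company", "our team", "our firm"]),
    ("verb", ["serves", "supports", "advises"]),
    ("object", ["clients", "customers", "businesses"]),
    ("prepositional phrase", ["in the", "across the", "throughout the"]),
    ("industry", ["e commerce area", "retail sector", "travel industry"]),
]


def make_sentence(rng):
    """Pick one entry per slot, in slot order, and join them into a sentence."""
    words = [rng.choice(options) for _name, options in SLOTS]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


if __name__ == "__main__":
    print(make_sentence(random.Random()))
```

Passing in the random number generator, instead of using the module-level one, makes it possible to seed it and get the same sentence back on a later run.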

Would this really be bad? There are ways to combine lists of data and throw together some rules, and I don't see how this would be distinguishable from the content, albeit not great, served on other sites.

Thanks for the reply littleman, you were one of the people I was thinking of when I posted this. ;)

Wanna share how you create content for pages on the fly? :)

Cheers,

Han Solo

littleman

7:05 pm on Feb 15, 2001 (gmt 0)



The problem lies in the randomness of that approach. When the spiders cross-reference the content of the page and it is significantly different each time, you are going to get flagged. You need to keep the individual page's content the same on each spider visit.

>Wanna share how you create content for pages on the fly?
Some day when I am out of the game.

han solo

7:10 pm on Feb 15, 2001 (gmt 0)



So you're saying, if I kept the content the same each visit, it would be okay?

What I am thinking is pumping this data into my sites after the random creator does its thing... not on the fly, or prior to a spider visit.

I don't want to change anything, ever, except for those variables I feel like I've nailed. I know that might sound weird, but fact is, I believe that a 90 percent static page, with no content differential between spider visits, does the best, granted that is based on my own empirical evidence.

Any other pointers? BTW, I appreciate the help. One of these days, when we are all out of the game, I'll have to swap some stories with you, Brett, and the others I've probably been watching like a hawk, and trying to grab ideas from...

Han

littleman

7:24 pm on Feb 15, 2001 (gmt 0)



That sounds like a better approach. PeteU does something similar, I believe (I just got that from what he has said here). He has a script that generates the pages on the fly, then he saves them and uploads them onto his server.

>swap some stories with you
That would be cool.

Brett_Tabke

7:32 pm on Feb 15, 2001 (gmt 0)




If you do go with generated stuff, get around the randomization hit by generating it as static content at the true destination URL/filename. I do, and then use embedded SSI to control the headers, footers, etc.

Although word-scrambled stuff is easy enough to do, I go with full paragraphs that are in context. You'd be surprised at the number of freely available and freely distributable documents there are on the web once you start looking. If your site does have content, try pouring the whole site into a file, stripping the HTML, preserving the paragraphs, and see what you have left. It is an education in content itself.
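
For the "strip the HTML, preserve the paragraphs" step, one possible sketch in Python uses the standard library's `html.parser`. The set of tags treated as paragraph boundaries and the names `ParagraphExtractor` and `html_to_paragraphs` are assumptions made for illustration, not anything described in the post.

```python
from html.parser import HTMLParser

# Tags treated as paragraph boundaries (an assumption; adjust for your markup).
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "td", "tr", "blockquote"}
# Tags whose contents are dropped entirely.
SKIP_TAGS = {"script", "style"}


class ParagraphExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._buf = []
        self._skip_depth = 0

    def _flush(self):
        # Collapse runs of whitespace and store the paragraph if non-empty.
        text = " ".join("".join(self._buf).split())
        if text:
            self.paragraphs.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._buf.append(data)


def html_to_paragraphs(html):
    """Return the visible text of an HTML document as a list of paragraphs."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep any trailing text after the last block tag
    return parser.paragraphs
```

Running each page of a site through `html_to_paragraphs` and appending the results to one file gives the kind of paragraph dump described above.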

You can find many free docs that are often in context and on topic for your site. Although I've gone down to the word level and custom-created sentences before, I don't care to do that for the reasons littleman pointed out: it's just too risky. With as few "all star pages" as there are anymore, you'd be time ahead actually sitting down and working on old-fashioned content first.

han solo

7:32 pm on Feb 15, 2001 (gmt 0)



This has all been pretty good, but... I was looking for some advice on programming the thing, in Perl preferably.

Any takers?

Han

han solo

7:37 pm on Feb 15, 2001 (gmt 0)



Oops, missed your post while posting, Brett.

How would I leverage content creation for hundreds of thousands of keywords?

The scope of that is a little beyond me...I've got help, but even then...I don't think anybody writes that much! Especially when you consider the breadth of industries, etc....

Han

han solo

6:39 pm on Feb 16, 2001 (gmt 0)



Hmm...no tips on the actual software creation, eh?

What I was hoping for is some thoughts on setting up the grammars, and the formatting between elements...

and then some ideas on organizing the logic for stringing it all together.

Thanks,

Han Solo

PeteU

7:04 pm on Feb 16, 2001 (gmt 0)




Littleman, I tried, but gave up on it :) Too complex/CPU intensive.
To generate good random content that follows grammar rules and has no "machine created feel" requires a long set of rules, if/then loops, file accesses, etc., way too much for a busy web server to handle.
Also, if you change content on every request, Slurp/si will absolutely hammer you to death.
One option is to create pages offline using some complicated algorithm; speed there is not important.
If you insist on creating on the fly, then probably the best method would be to make a collection of on-topic short sentences, about 100, and then randomly include 10 of them on each page. I've seen it done, and it looked passable to a human eye.
I would still create it once and then save a copy on the server for subsequent requests.
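
A rough Python sketch of that last suggestion, picking about 10 sentences out of a pool of about 100 and saving the result so later requests get the same text. The function names, the cache file layout, and the idea of seeding the choice from the page name are illustrative assumptions, not something described in the thread.

```python
import hashlib
import random
import tempfile
from pathlib import Path


def pick_sentences(pool, page_key, count=10):
    """Choose `count` sentences from `pool`, always the same ones for a page_key."""
    # Seed from a stable hash so the choice survives restarts
    # (Python's built-in hash() of a string varies between runs).
    digest = hashlib.sha256(page_key.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return rng.sample(pool, min(count, len(pool)))


def get_page_text(pool, page_key, cache_dir):
    """Build the text on the first request, then serve the saved copy."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha256(page_key.encode("utf-8")).hexdigest()[:16] + ".txt"
    cached = cache_dir / name
    if cached.exists():
        return cached.read_text(encoding="utf-8")
    text = " ".join(pick_sentences(pool, page_key))
    cached.write_text(text, encoding="utf-8")
    return text


if __name__ == "__main__":
    # Placeholder pool standing in for ~100 hand-written on-topic sentences.
    pool = [f"Placeholder sentence number {i}." for i in range(100)]
    with tempfile.TemporaryDirectory() as tmp:
        print(get_page_text(pool, "widgets.html", tmp))
```

Because the saved copy wins over regeneration, a spider that returns later sees identical text even if the sentence pool has since been edited.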

sugarkane

11:29 pm on Feb 19, 2001 (gmt 0)




Han, did you come up with anything for this?

Seems to me like the only feasible way to do it would be to define sentence structures in advance, along with a dictionary of words / phrases for each part of the sentence structure you've used in your definitions, and basically just drop a random entry from the dictionary into the correct place in the definition.

The topic would then be dependent on the dictionary entries.

Did that make any sense at all, and am I barking up the right tree?
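
The scheme described in this post (sentence structures defined in advance, plus a dictionary of entries for each part) might look something like the following in Python. The structures, the dictionary contents, and the `{slot}` placeholder syntax are all hypothetical choices for the sketch.

```python
import random
import re

# Hypothetical sentence structures; each {slot} names a key in the dictionary.
STRUCTURES = [
    "{subject} {verb} {object} {place}.",
    "{place}, {subject} {verb} {object}.",
]

# Hypothetical dictionary for one topic; swap it out to change the topic.
DICTIONARY = {
    "subject": ["our company", "our team"],
    "verb": ["serves", "supports"],
    "object": ["clients", "small businesses"],
    "place": ["in the e-commerce area", "in the online retail market"],
}

SLOT = re.compile(r"\{(\w+)\}")


def fill(structure, dictionary, rng):
    """Replace every {slot} with a random entry from dictionary[slot]."""
    text = SLOT.sub(lambda m: rng.choice(dictionary[m.group(1)]), structure)
    return text[0].upper() + text[1:]


def generate(rng, count=3):
    """Produce `count` sentences, each from a randomly chosen structure."""
    return [fill(rng.choice(STRUCTURES), DICTIONARY, rng) for _ in range(count)]


if __name__ == "__main__":
    for line in generate(random.Random()):
        print(line)
```

Keeping the structures separate from the dictionary is what makes the topic depend only on the dictionary: the same structures can be reused with a different word list per subject area.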

han solo

6:38 pm on Feb 20, 2001 (gmt 0)



Yes, you are SugarKane...I've just been on holiday, or extended weekend as it is in the US (we just had a three day break at work, and I forced myself to just stay away from the computer, although now I'm spending hours and hours catching up.)

I was sort of hoping for somebody to give some examples of how to code the thing. As you can tell from what you said and the example I gave previously, you are right on the money with what I'm trying to do.

The thing is, who wants to write original content for 1 million individual pages? I certainly don't. Besides, when I sit down to do that, I am doing so with the idea of getting it into print :)

I don't believe the larger SEO types create individualized, hand-written content for each page, targeted to each key phrase they are optimizing for. Imagine the troubles of getting a rapid client influx: what do you do, double your writing staff? Hire contractors?

Those seem expensive solutions, and highly unnecessary. The cat is out of the bag already; I think people know that wherever I am, I cloak... and not to mention I love doing research on anybody else in the field I can find, especially since they might know more than me.

A lot of those types are here, which is why I'm here too (I know this is off topic), and I figure, shoot, we share enough industry secrets with our "masks" on, why not this one? Oh, well... looks like I'll just have to sit down and map out the sentence structures like you mentioned.

I already have a few; I was just hoping for some tips on how to create some, or to hear whether anyone knew which of the myriad linguistic software packages might help along these lines.

Thanks as always,

Han

han solo

7:15 pm on Feb 20, 2001 (gmt 0)



What I'm working on is for "random filler", I can't really be more specific than that. ;)

Here is a little breakdown of what I have done:

researched, researched, and researched.

hung out here every day, (at least for work :) )

put together, based on everything I've seen, read, or heard of, a new system for creating and optimizing pages,

and then decided that instead of writing even partial sentences, I'd rather create data on the fly from a database.

I need rule sets, and yes, the one from Don Cross is one of those that I've seen. The grammars are fairly detailed, so you say that you've already reverse engineered some of that? Care to share?

I would like to put together one with the same elements but different sentence structures, which means different orders of plug-ins for the words, which also entails different grammars and subroutines, one customized for each sentence type.
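
One way to get different sentence structures out of the same elements, without a separate subroutine per sentence type, is a small grammar whose rules are expanded recursively. The Python sketch below is only an illustration under that assumption; the rule names and word lists are invented, and it is not a reconstruction of any rule set mentioned in this thread.

```python
import random

# Hypothetical grammar: each key is a rule name, each value a list of
# alternatives. An alternative is a list of tokens; tokens that are rule
# names are expanded further, anything else is emitted as literal text.
GRAMMAR = {
    "SENTENCE": [["NP", "VP"], ["PP", ",", "NP", "VP"]],
    "NP": [["our company"], ["our team"]],
    "VP": [["VERB", "OBJ"], ["VERB", "OBJ", "PP"]],
    "VERB": [["serves"], ["supports"]],
    "OBJ": [["clients"], ["customers"]],
    "PP": [["in the", "INDUSTRY"]],
    "INDUSTRY": [["e-commerce area"], ["travel sector"]],
}


def expand(symbol, rng, grammar=GRAMMAR):
    """Recursively expand a rule name into a flat list of literal words."""
    if symbol not in grammar:
        return [symbol]
    words = []
    for token in rng.choice(grammar[symbol]):
        words.extend(expand(token, rng, grammar))
    return words


def sentence(rng):
    """Expand the top-level rule and tidy spacing and capitalization."""
    text = " ".join(expand("SENTENCE", rng)).replace(" ,", ",")
    return text[0].upper() + text[1:] + "."


if __name__ == "__main__":
    print(sentence(random.Random()))
```

Adding a new sentence type then means adding one more alternative under `SENTENCE` rather than writing new code. The grammar above has no rule that refers back to itself, so expansion always terminates.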

hmm...so want to share any more details? You could sticky mail me, too...if you don't want to post it.

Cheers,

Han Solo

toolman

6:00 am on Jun 23, 2001 (gmt 0)




How did it all work out for you, jeremy?

jeremy goodrich

1:02 pm on Jun 23, 2001 (gmt 0)




I thought it went rather well, however that opinion is probably a little off, because I changed careers after that, toolman. ;)

TPK, is the source code for that available? Might be interesting to take a look at it, since reading through the story inputs, it wouldn't take too much to make that into something like what I was looking for before.

theperlyking

7:47 pm on Jun 23, 2001 (gmt 0)




Sorry, don't want to seem stingy, but it's not really available.