
Paid Inclusion Engines and Topics Forum

Symbols in URLS: %?,
grnidone



 
Msg#: 124 posted 4:13 pm on Sep 18, 2000 (gmt 0)

I know that search engines such as AltaVista will stop at a % or a ?.

What about commas? I talked to Danny Sullivan about commas in URLs and he said they were not a problem, but I am starting to wonder.

What do you guys think?

-G

 

seth_wilde

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 124 posted 5:42 pm on Sep 18, 2000 (gmt 0)

I've never heard of commas being a problem. Of course, to my knowledge it's not common to even have commas in a URL. They targeted question marks because they are very common in dynamic sites with millions of pages, and they were afraid that their spiders would get trapped in a never-ending loop. Last I knew, it was still possible to get URLs with "?" indexed (if you submitted them directly); spiders just wouldn't follow links containing "?". Has anybody experimented with this lately?

JamesR

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 124 posted 6:53 pm on Sep 18, 2000 (gmt 0)

I have been submitting LookSmart directory pages and have seen them get listed on AltaVista.

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 124 posted 5:51 pm on Sep 25, 2000 (gmt 0)

I wouldn't trust it to survive. If you can rework your CGIs to accept a comma, why not just learn Apache mod_rewrite? Once you get the basics, you can twist a URL any way you want. You can then do things like:
foo.com/bar/zippy-form-valueone-formvalue-two.htm

Then mod_rewrite can strip it all down and toss it to the proper script. The search engine never knows the difference; it looks like a standard HTML file to them.
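
As a rough illustration of the mapping described above (not an actual mod_rewrite configuration), the translation from a static-looking URL to a script call might look like the following Python sketch. The URL layout (script name followed by key-value pairs separated by hyphens), the `/cgi-bin/` location, and the function name `to_dynamic` are all assumptions made for the example.

```python
import re
from urllib.parse import urlencode

# Assumed layout: /bar/<script>-<key1>-<value1>-<key2>-<value2>.htm
STATIC_URL = re.compile(
    r"^/bar/(?P<script>[a-z]+)-(?P<pairs>[a-z0-9]+(?:-[a-z0-9]+)*)\.htm$"
)


def to_dynamic(path):
    """Return the internal script URL for a static-looking path, or None."""
    match = STATIC_URL.match(path)
    if match is None:
        return None
    parts = match.group("pairs").split("-")
    if len(parts) % 2 != 0:
        return None  # keys and values must come in pairs
    params = list(zip(parts[0::2], parts[1::2]))
    # Hypothetical script location; a real site would use its own path.
    return "/cgi-bin/{}.cgi?{}".format(match.group("script"), urlencode(params))


print(to_dynamic("/bar/zippy-color-red-size-large.htm"))
```

The spider only ever sees the `.htm` path; the query string exists only on the server side.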

grnidone



 
Msg#: 124 posted 12:51 pm on Sep 26, 2000 (gmt 0)

>I wouldn't trust it to survive. I'd think if you can reroll your cgi's to accept a comma, why not just educate yourself on apache ModRewrite.

Yeah. That would work, but we are using Vignette StoryServer. We currently have a mod_rewrite-like thing in place, but we can't get rid of the commas.

-G

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 124 posted 1:07 pm on Sep 26, 2000 (gmt 0)

OK, is it on Apache? You can call the story server via SSI, the same way I am using it here to present the posts. Everything here is of course dynamically generated, but it sits behind pretty-looking .htm URLs.

Such as this post is located at:
[webmasterworld.com]

Which actually calls the cgi at:
[webmasterworld.com]

via a simple SSI:
<!--#include virtual="/discussion.cgi?forum=13&discussion=100" -->
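
One way to picture this setup is a small generator that writes out the body of each static stub page, where the whole page is just the SSI directive. The sketch below assumes that approach; the function name `ssi_stub` is hypothetical, and only the `/discussion.cgi` path and its `forum` and `discussion` parameters come from the post above. Note that Apache SSI directives are written inside HTML comments.

```python
def ssi_stub(forum, discussion):
    """Build the SSI directive a static .htm stub would contain."""
    # The query string stays hidden inside the server-side include;
    # the public URL of the stub itself has no "?" in it.
    target = "/discussion.cgi?forum={}&discussion={}".format(forum, discussion)
    return '<!--#include virtual="{}" -->'.format(target)


print(ssi_stub(13, 100))
```

The server must have SSI parsing enabled for the stub's file type for the include to be processed.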

georged

10+ Year Member



 
Msg#: 124 posted 1:18 pm on Sep 26, 2000 (gmt 0)

Commas are not a problem for indexing.
E.g.
[altavista.com...]
Look at the ninth spot.
No problem getting indexed. It's just a real pain not being able to name the pages properly. I'm thinking about this because a client of mine is using Vignette StoryServer as well.

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 124 posted 7:18 am on Oct 3, 2000 (gmt 0)

I wonder how well they rank, George. There are so few of them with commas in AltaVista that it is hard to deduce what kind of rankings they are getting across the board.

georged

10+ Year Member



 
Msg#: 124 posted 10:13 am on Oct 3, 2000 (gmt 0)

I see them when I'm searching for sports people on AV and they seem to do OK. Never seen them at number one, though. These searches typically return fewer than 20,000 results and the pages in question aren't optimised.
I suspect the sites that the pages come from have high link popularity, as they are generally sports news sites or official sites of clubs. I would also suspect that if you optimised them they'd do as well as any other page, since they can be indexed.
Probably the reason why we don't see more of them is that they don't get submitted due to the size of the sites they're from, they don't get optimised because these sites are just grinding out hundreds of these pages, and also because they are not up very long (or not linked to for very long).
I wonder if this is the case with grnidone's site: high page turnover, etc.? It is with my client's site, and I'm trying to get them to introduce some absolutely static no-change pages, so I can test-drive my 'oh, hell, just optimise it and see' strategy. :)


Ted

10+ Year Member



 
Msg#: 124 posted 3:04 pm on Oct 3, 2000 (gmt 0)

I have seen AV index ? as well as % for a while now.

search [altavista.com]

This is a local Swedish search engine that has 52 search result pages indexed by AV.

The URLs look like this:
[4en.net...]

They even index their own URLs, including the ?

search [altavista.com]

Check listing number 200: the Add URL page at AV is indexed. That gives a good link to that adult site, I suppose.

Another example is koll.se, also a Swedish search engine. They have over 2000 pages indexed with a ? and/or % in the URL.
Just type "host:koll.se" and see the result.

What do you all say?

[edit]shortened the urls[/edit]

henki

10+ Year Member



 
Msg#: 124 posted 3:29 pm on Oct 3, 2000 (gmt 0)

Funny to notice that altavista.se, for example, does not accept question marks and says so when you submit such a page.

But AltaVista.com is accepting them.

rencke

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 124 posted 3:37 pm on Oct 3, 2000 (gmt 0)

It certainly seems that AV has started to index dynamic pages, all right. Think, think, think....

AV must have choked on their coffee when Fast announced last Feb that they had a bigger index, and must have sent their spiders out day and night in order to beat Fast to the billion pages they had promised for the new year.

A quick way to accumulate lots of pages fast would be not to stop at "?" but to keep going. There is always the risk of getting stuck with lots of duplicates, but that risk is great even for static pages and besides, there are supposed to be 500 billion pages hidden in databases. So why not harvest what you can from them?

I wonder if other SEs are going the same way? The business is turning into a numbers game and everyone wants to have a billion pages. Google says they already have that.

Life will become a lot easier for sites with dynamic content if this is the new trend.

seth_wilde

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 124 posted 7:33 pm on Oct 3, 2000 (gmt 0)

AV has been indexing URLs with "?" marks for months; the only catch is that the page has to be directly submitted. This still allows them to protect their spiders from never-ending loops but at the same time allows them to index quality dynamic pages.

It's still not clear if there is any kind of penalty for these types of pages, but overall they make up a very small percentage of top-60 results (less than half a percent).
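
The policy described above can be sketched as a simple rule. This is only an illustration of the behaviour as reported in this thread, not documented AltaVista logic; the function name `should_fetch` and its parameters are made up for the example.

```python
def should_fetch(url, directly_submitted):
    """Decide whether a crawler following the reported policy fetches a URL."""
    if directly_submitted:
        # Pages submitted by hand are fetched even if they are dynamic.
        return True
    # Links discovered while crawling are skipped when they contain "?",
    # which keeps the spider out of endless dynamically generated loops.
    return "?" not in url


print(should_fetch("/discussion.cgi?forum=13&discussion=100", True))
print(should_fetch("/discussion.cgi?forum=13&discussion=100", False))
```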

mark roach

10+ Year Member



 
Msg#: 124 posted 11:31 pm on Oct 4, 2000 (gmt 0)

There does seem to be a trend towards indexing dynamic sites recently. Excite crawled a load of pages with ? in the URLs last week and yesterday Google did the same. AV has never crawled any of my pages though :(

seth_wilde

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 124 posted 11:33 pm on Oct 4, 2000 (gmt 0)

Mark-

Are you directly submitting the dynamic pages?

mark roach

10+ Year Member



 
Msg#: 124 posted 12:08 am on Oct 5, 2000 (gmt 0)

No

seth_wilde

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 124 posted 1:34 am on Oct 5, 2000 (gmt 0)

Try directly submitting to the add url page. You should see a much better success rate this way.

uksitesubmit

10+ Year Member



 
Msg#: 124 posted 9:03 am on Oct 7, 2000 (gmt 0)

>Av has never crawled any of my pages though
I think that AV does crawl all the pages of a site; they just don't list them.

Brett_Tabke wrote earlier..
>The same way I am using it here to present the posts. Everything here is of course dynmically generated but laying under pretty looking htm urls.
I think you know exactly what you are at and where you're going :)
I am working on the same thing at the moment but with more options!!
Please let me know more, as I am sceptical about one thing; if you want, I'll show it to you.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved