Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
exclude bookingform that is included on pages
helenp




msg:4415869
 1:29 pm on Feb 9, 2012 (gmt 0)

Hi, I have a booking form that is included on all property pages, with PHP code that picks up the name of the property.
I'm not sure whether these are indexed as pages.
For example, I have a URL:
www.mysite.com/nameofproperty.php
then if the person clicks on the booking form I get a URL like this:
[mysite.com...]

I think I should not let these pages be indexed, as there are many of them and they are crap.
How can I exclude them in robots.txt?
Is disallowing bookingform.php enough?
Thanks,

 

helenp




msg:4416194
 9:54 am on Feb 10, 2012 (gmt 0)

It was a stupid question.
I've seen that they are indexed, and I have excluded them using Disallow: /bookingform.php
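
A minimal sketch of how that rule can be checked locally, assuming the robots.txt contains only the single Disallow line mentioned above under a catch-all User-agent. The example URLs and the property name `villa-sol` are made up for illustration. It uses Python's standard `urllib.robotparser`, which matches rules by path prefix, so the parameterized URLs fall under the same rule. Note that this only tells you whether a crawler may fetch the URL, not whether the URL stays out of the index.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: one rule for all crawlers (not taken from the real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /bookingform.php
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a generic crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    # Hypothetical URLs; "villa-sol" is an invented property name.
    for url in (
        "http://www.example.com/bookingform.php",
        "http://www.example.com/bookingform.php?property=villa-sol",
        "http://www.example.com/nameofproperty.php",
    ):
        print(url, "crawlable" if is_crawlable(url) else "blocked")
```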

phranque




msg:4416225
 12:20 pm on Feb 10, 2012 (gmt 0)

part of the solution is to redirect with an HTTP response using a 301 status code and a Location: header.
like this:
GET http://www.example.com/bookingform.php?property=nameofproperty
301 Moved Permanently
Location: http://www.example.com/nameofproperty.php

assuming your server is apache you can use mod_rewrite to look for these patterns with regular expressions and RewriteCond/RewriteRule directives.
you should even be able to find examples of rewriting parameter values to php files in the apache forum on WebmasterWorld.

the other part of the solution is to refer to the canonical url from the booking form.

if you use the disallow in robots.txt the search engine will never see the redirect to the canonical url and the non-canonical url will likely get indexed without a snippet.
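
The URL mapping in the redirect example above can be sketched as follows. This is only an illustration of how the Location: value would be derived from the request URL, not a server configuration. The function name `canonical_location` and the character whitelist are assumptions, and the parameter name `property` is taken from the example request.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit, urlunsplit


def canonical_location(url: str, param: str = "property") -> Optional[str]:
    """Map a bookingform.php URL to the property page it belongs to.

    Returns None when the URL is not a booking-form URL or when the
    parameter is missing or contains unexpected characters.
    """
    parts = urlsplit(url)
    if parts.path != "/bookingform.php":
        return None
    values = parse_qs(parts.query).get(param)
    if not values:
        return None
    name = values[0]
    # Assumption: property names only use letters, digits, "-" and "_".
    if not re.fullmatch(r"[A-Za-z0-9_-]+", name):
        return None
    return urlunsplit((parts.scheme, parts.netloc, f"/{name}.php", "", ""))


if __name__ == "__main__":
    print(canonical_location(
        "http://www.example.com/bookingform.php?property=nameofproperty"
    ))
```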

helenp




msg:4416572
 11:16 am on Feb 11, 2012 (gmt 0)


Let's see if I understood you: the robots.txt disallow I added is not enough.
I don't understand the last part about canonical and non-canonical...
The solution you gave me, doing a 301:
GET http://www.example.com/bookingform.php?property=nameofproperty
301 Moved Permanently
Location: http://www.example.com/nameofproperty.php
is too complicated, as there are many properties; new ones come and others disappear...
So isn't there any way to exclude everything starting with GET http://www.example.com/bookingform.php? ?

And this I did not understand either, but it sounds easier as all properties use the same booking form:
"the other part of the solution is to refer to the canonical url from the booking form."

Thanks a lot. We have lost some positions in Google, but not badly, and maybe this could be the reason.

helenp




msg:4416576
 11:37 am on Feb 11, 2012 (gmt 0)

I just saw Google's parameter handling in Webmaster Tools,
and it says this:
Currently Googlebot isn't experiencing problems with coverage of your site, so you don't need to configure URL parameters. (Incorrectly configuring parameters can result in pages from your site being dropped from our index, so we don't recommend you use this tool unless necessary.)
If I click on configure,
I have 45 with the value propiedad.
If I understand it correctly, there are 45 indexed that, according to Webmaster Tools, are not causing problems; however, there should be about 100, so not all of them are indexed.

And I sort of don't like it.

lucy24




msg:4416582
 12:24 pm on Feb 11, 2012 (gmt 0)

Do you want to add the 55 that are not indexed, or get rid of the 45 that are? Maybe g### thinks things are OK because those 45 different properties lead to 45 different pages. By google standards that is not a huge number.

As I understand it, the parameters tool is for getting rid of parameters that don't affect the content of the page. Or garbage parameters that will always create something even if you feed in a ridiculous value.

If you tell google to exclude the propriedades, will there be anything left of the page? That is, can the page be created without this parameter? Is that page worth indexing?

helenp




msg:4416588
 12:48 pm on Feb 11, 2012 (gmt 0)


I don't want to add the missing ones :)
I read in the Google forum that removing crap pages gave a boost, so I started to look. And yes, it is simply a booking form that can be viewed without the parameters; the parameter only fills in the property field, so the user doesn't have to go back and look up the name. So the only difference between booking form 1, 2, 3, etc. is the value of the property field; it's not 100% duplicated, but 99% in my view. And there is no reason to index them unless one wants more pages in Google; the only use people could have for them in Google's index is to spam them. And we must not forget about Yahoo, Bing, etc. So if they should be excluded, the best is not to use Google's tools but to exclude them for all spiders.
However, the number 45 could easily be 200, as I have more than 100 properties and getting rid of old pages takes time.
Thanks

phranque




msg:4416594
 2:16 pm on Feb 11, 2012 (gmt 0)

i misunderstood your description in the OP but now i see what your problem is.

you need to add the following to the head of the documents returned by the bookingform.php urls:
<meta name="robots" content="noindex">

and then remove the disallow in robots.txt - otherwise the search engine will never see the noindex directive and the url will likely get indexed without a snippet.

added after edit:
add the urls you meta-robots-noindexed to the sitemap xml file until they get recrawled and dropped from the index.
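
A rough sketch of generating such a temporary sitemap, assuming the booking-form URLs follow the pattern bookingform.php?propiedad=NAME seen elsewhere in this thread. The base URL and the property names are invented for illustration, and the helper `build_sitemap` is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, property_names, param="propiedad"):
    """Return sitemap XML listing one booking-form URL per property name."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for name in property_names:
        url_el = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url_el, "loc")
        # urlencode escapes characters that are not safe in a query string.
        loc.text = f"{base_url}/bookingform.php?{urlencode({param: name})}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical property names; a real list would come from the site's data.
    print(build_sitemap("http://www.example.com", ["villa-sol", "casa azul"]))
```

Once the noindexed URLs have been recrawled and dropped, the same entries can be taken out of the sitemap again.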

helenp




msg:4416598
 2:45 pm on Feb 11, 2012 (gmt 0)


Thanks, I also thought about the classic noindex, but thought that, as the URL is ?parameter, maybe it's not enough.
I will do it, thanks.

However, the last part I don't understand: "add the urls you meta-robots-noindexed to the sitemap xml file until they get recrawled and dropped from the index."
In sitemap.xml one puts the files one wants indexed, no? Or is there any code to say drop?

helenp




msg:4416606
 3:06 pm on Feb 11, 2012 (gmt 0)

Ah, I think I get it:
you mean I should add bookingform.php to sitemap.xml to force Google to quickly spider the page and get the noindex?
Is that what you mean?
Do I have to add bookingform.php?propiedad=property?
It is sort of complicated to find them all, I guess,
or is bookingform.php enough?

helenp




msg:4416723
 12:16 am on Feb 12, 2012 (gmt 0)

Hope it helped: deleted the disallow in robots.txt, added the meta tag, added to the sitemap.
Submitted the sitemap, and URL parameters rose from 45 to 47.
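
One way to double-check the meta tag step is to scan the HTML that bookingform.php returns for a robots noindex directive. The sketch below works on an HTML string so it stays self-contained; the class and function names are hypothetical, and fetching the live page is left out.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the document carries a robots noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex"></head></html>'
    print(has_noindex(sample))
```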


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved