Google's crawling of members-only content

urgent: plz post your experiences

         

aris1970

8:21 am on Oct 4, 2004 (gmt 0)

10+ Year Member



Hello All,

We have a business website that has offered free access to visitors for the last 3 years. It's in the top 5 Google results for a very competitive business keyword. We have now developed a membership system in PHP and MySQL that will either restrict specific pages to members, or give visitors free access while still detecting whether they are members or not ("Welcome Guest" or "Welcome YourName" will be displayed on the free pages).

For this reason we will be using a PHP script at the top of each page (free and restricted) that will check, based on a cookie, whether a visitor is already registered or not.
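To make the cookie check concrete, here is a minimal sketch of the greeting logic. It is written in Python purely for illustration (the site itself is PHP), and the cookie name `member_id` and the `MEMBERS` lookup table are hypothetical stand-ins for whatever the real membership script uses.

```python
from http.cookies import SimpleCookie


def greeting(cookie_header, members):
    """Return the welcome line for the Cookie header of one request.

    cookie_header: raw value of the Cookie request header (may be None).
    members: dict mapping a member id to a display name (hypothetical store).
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("member_id")  # hypothetical cookie name
    if morsel is not None and morsel.value in members:
        return "Welcome " + members[morsel.value]
    # No cookie, or an unknown id: the page is still served, just as a guest.
    return "Welcome Guest"


# Hypothetical member table standing in for the MySQL lookup.
MEMBERS = {"42": "BigDave"}

if __name__ == "__main__":
    print(greeting("member_id=42; theme=dark", MEMBERS))
    print(greeting("", MEMBERS))
```

A crawler sends no cookie, so under this sketch it falls through to the guest branch and receives the same page as any first-time visitor.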

I would appreciate it if you could share your experiences of how Googlebot is likely to behave after this major change in content structure, and how we can be sure that all the pages will be crawled (can we confirm it for restricted pages as well?), so that we do not lose our high ranking.

Thanks for all your help!

kaled

10:50 pm on Oct 4, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, I imagine you'll want to switch off caching by Google for starters. This can only be done with a robots meta tag in the <head>. The required value is something like NOCACHE, but I'm half asleep and I don't think that's quite right.

Kaled.

aris1970

8:03 am on Oct 5, 2004 (gmt 0)

10+ Year Member



Any thoughts on this issue?

Our website has 15,000+ pages of unique content, so we want to be sure that making some content members-only will not stop Google from crawling it.

Thanks

aris1970

1:53 pm on Oct 6, 2004 (gmt 0)

10+ Year Member



I thought that Google crawling of member-only content would be an issue that many friends here would have faced before...

Anyway, I am still trying to find some friends to share their experiences on this.

Thanks

Sanenet

2:01 pm on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you can only access those pages via a login form, then Google won't crawl / display ads for those pages (it won't be able to access them).

Otherwise, nothing much should change if you are only adding a line with the username.

aris1970

2:19 pm on Oct 6, 2004 (gmt 0)

10+ Year Member



Thanks for your feedback Sanenet.

Is there any way to let Googlebot specifically crawl the pages that require login?

kaled

5:11 pm on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you wish to allow googlebot to crawl member-only pages you will have to test the environment variable "HTTP_USER_AGENT" and treat user-agent strings that contain 'googlebot' as members.
Alternatively, you could test the ip address instead of user-agent.
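A minimal sketch of the user-agent test described above, in Python for illustration only (in PHP the value would come from the HTTP_USER_AGENT variable). The function name is hypothetical. Note that the header is trivially spoofed by any client, and that replies further down this thread warn that letting the bot in while visitors get a login page counts as cloaking.

```python
def looks_like_googlebot(user_agent):
    """Return True if the User-Agent string contains 'googlebot'.

    Case-insensitive substring test only. It proves nothing about who is
    really making the request, because any client can send this header.
    """
    return "googlebot" in (user_agent or "").lower()


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    browser = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
    print(looks_like_googlebot(bot))
    print(looks_like_googlebot(browser))
```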

If you have written a membership system in php, you should be able to figure out the details of this without difficulty.

To prevent member pages being cached by google, you will have to ensure that each is delivered with the robots meta NOARCHIVE (not nocache as I suggested above).
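For reference, a small sketch of emitting that tag only on member pages. It is Python for illustration, and the function name and `members_only` flag are hypothetical. The returned string belongs inside the page's <head>.

```python
def robots_meta(members_only):
    """Return the robots meta tag for a page's <head>, or '' if none is needed.

    members_only: True for pages that should not be kept in Google's cache.
    """
    if members_only:
        # NOARCHIVE asks search engines not to offer a cached copy.
        return '<meta name="robots" content="noarchive">'
    return ""


if __name__ == "__main__":
    print(robots_meta(True))
```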

Kaled.

BigDave

5:46 pm on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You don't exactly make it clear in your initial post whether all pages are accessible by "guest" or only some of the pages.

In other words, if I follow a link from google to any of your pages, would I get the page I am looking for, or would I get a login page?

If it is the latter (i.e. login REQUIRED) then you can expect that you will get yourself in trouble by allowing googlebot, but not a surfer to access the page.

You should always depend on your free content to attract users through search engines. Users don't like the first page they get on a site to be a login/register page. And if users don't like it, then google won't like sending users there. I have never seen a site that cloaks to allow googlebot into member only pages that lasts more than a few days.

If you are just calling a user without a cookie "Guest" but still allowing access to the page, then you should be just fine.

sabai

7:08 pm on Oct 6, 2004 (gmt 0)

10+ Year Member



"I have never seen a site that cloaks to allow googlebot into member only pages that lasts more than a few days."

It's frequently done with good results, but it is definitely black hat and can land you in trouble. My advice is: don't do it - make enough of the site free that you keep your keyword positions.

aris1970

7:32 pm on Oct 6, 2004 (gmt 0)

10+ Year Member



kaled, BigDave & sabai,

Thank you all for sharing your thoughts on this. Let me try to make it more clear:

Our site has more than 15,000 pages of unique, real content, and our new membership script (I am not the developer) will divide pages into 2 categories:

1. Pages that are free to read (no login required) but there will be a cookie-check in order to welcome our registered members (i.e. "Welcome BigDave" or "Welcome Guest" if no cookie available).

2. Pages that will require a login in order to be viewed. This category will certainly include the minority of our pages.

So I wonder the following:

- Will the 1st category have any problem being indexed by Google?
- Is there any way (legal and without ANY risk of a penalty!) to allow Google to crawl the 2nd category pages successfully?

Please note that joining our site is free (maybe this is helpful) and furthermore our site - as many others! - tries to be very credible; for this reason we would never like to risk any kind of penalty by Google or any other search engine.

Does Google consider such Googlebot manipulation spam or a reason for a penalty? I mean allowing Googlebot to access these pages (and does it make any difference if we also let it archive them?).

THANKS again for your help. I am the founder of the site but don't have much experience on such Googlebot issues.

Best Regards from Athens, Greece
Aris

PS: If you can share some thoughts about the AdSense impact of the above scenario, that would be very useful too.

BigDave

8:17 pm on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Will the 1st category have any problem being indexed by Google?"

No, unless you are doing something stupid with the code. Just let them be browsed as guest.

"Is there any way (legal and without ANY risk of a penalty!) to allow Google to crawl the 2nd category pages successfully?"

I sure hope not.

"Please note that joining our site is free (maybe this is helpful)"

It doesn't matter in the least.

Think of this from a searcher's perspective. They see a somewhat interesting listing in google. They click on the link. They get a page that is totally unrelated to the page they were looking for asking them for personal information.

Your page is not giving them what they are looking for. It is not giving them what you showed to googlebot.

The surfer has not had a chance to find out if they would like to become a member of your site before you ask them to register.

What is the likely response to your site? The back button in probably 98% of the cases.

Does this sound like the sort of thing that google would want to consider as a top 10 result?

Remember, you have to sell your site to the user if you want them to put in the effort, and what you are trying to do is wasting everyone's time and effort.

"and furthermore our site - as many others! - tries to be very credible; for this reason we would never like to risk any kind of penalty by Google or any other search engine."

Well, what you are suggesting is not credible.

"Does Google consider such Googlebot manipulation spam or a reason for a penalty?"

Yes. You are serving something different to the bot than you are to the user.

Put up some intro pages to the full articles, and let the bot crawl those.

aris1970

8:35 pm on Oct 6, 2004 (gmt 0)

10+ Year Member



Thanks BigDave,

Your last note is really what I was ready to do, but I was still interested in understanding the problem with google and member-only pages.

You were very clear and it seems I am "forced" to fully agree with your perspective. :)

I do appreciate your valuable feedback.

Sanenet

8:50 pm on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That said, the cloaking scenario would allow you to run AdSense on those pages (if the user agent / IP is Mediabot, let it in). Not sure about the "legality" of this, though.

Symbios

9:06 pm on Oct 6, 2004 (gmt 0)

10+ Year Member



Probably the best thing to do is as BigDave suggests, but it shouldn't harm you to serve up pages with the login required plus some snippets of the paid/member content, provided that the pages are identical for the bot and for visitors.
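The intro-page idea can be sketched roughly as follows. This is an illustrative Python sketch, not the site's actual PHP. `render_article`, the `article` dict, and the 200-character cutoff are all hypothetical. The point is that the decision depends only on membership and never on the user agent, so Googlebot and a logged-out visitor receive the same page.

```python
LOGIN_PROMPT = "[Log in or register for free to read the full article]"


def render_article(article, is_member, intro_chars=200):
    """Return the text to serve for one article.

    Members get the full body. Everyone else, crawler or human, gets the
    same intro excerpt followed by a login prompt.
    """
    if is_member:
        return article["body"]
    intro = article["body"][:intro_chars].rstrip()
    return intro + "...\n" + LOGIN_PROMPT


if __name__ == "__main__":
    # Hypothetical article record standing in for a database row.
    article = {"title": "Example", "body": "Lorem ipsum dolor sit amet. " * 30}
    print(render_article(article, is_member=False))
```

Because the function takes no user-agent argument at all, there is no code path by which the bot could be shown something a visitor cannot see.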