Parsing referer URL

Will this upset Google in any way?

         

leef50

6:22 pm on Jan 7, 2003 (gmt 0)

10+ Year Member



I want to write my own stats tool to parse the referrer URL and get things like the search terms used. Is there anything I need to be aware of that Googlebot may not like? I think I read something here in the past about this, but it may have been about checking the user agent for cloaking, or whatever it's called.

andreasfriedrich

8:19 pm on Jan 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Analyzing the server logs has no effect on the ability of GoogleBot to spider your page.

Andreas

jatar_k

8:31 pm on Jan 7, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



If you are building a script to collect and/or analyze this info on the fly, don't worry: bots can't see what is done server side, as in PHP or Perl. A bot can't get upset about something it doesn't know about.

As andreas mentioned, you can get all of that info from your server logs, and analyzing them has no effect on any bots.

rincey

8:50 pm on Jan 7, 2003 (gmt 0)

10+ Year Member



Like the other posters said, Googlebot can't see what you do on the server side.

For over 12 months now I have logged every page accessed on any of my projects. They are all served out of my own small CMS, which can react to the user agent information (e.g. replace highly dynamic parts of pages with static info for Googlebot) and parses page accesses on the fly.

E.g. the server sends me a mail when Googlebot starts spidering a domain, or when a new page is first hit with a referer URL matching "%google%". It also extracts the search query string from search engine referer URLs and logs it to a special table.

Anyway, I would be quite interested in whatever efforts you make in this direction. I am still trying to make my tools more flexible and intelligent.

Rincey
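A minimal sketch of the "extract the search query string from a referer URL" step described above. It is written in Python purely for illustration (the thread mentions PHP and Perl), and it is not Rincey's actual code. The SEARCH_PARAMS table and the function name search_terms are assumptions made up for this example; the only engine listed is Google, whose result URLs carry the query in the q parameter.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical lookup table: host fragment -> query parameter holding the
# search phrase. Only Google is listed here; extend it for other engines.
SEARCH_PARAMS = {"google.": "q"}


def search_terms(referrer):
    """Return the search phrase from a search-engine referrer, or None."""
    if not referrer:
        return None
    parts = urlparse(referrer)
    host = parts.netloc.lower()
    for fragment, param in SEARCH_PARAMS.items():
        if fragment in host:
            # parse_qs decodes "+" and %XX escapes for us.
            values = parse_qs(parts.query).get(param)
            return values[0] if values else None
    return None
```

Since this runs entirely on the server against data the visitor's browser sent, nothing about it is visible to a spider, which is the point made earlier in the thread.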

leef50

9:02 pm on Jan 7, 2003 (gmt 0)

10+ Year Member



So... does this mean I could strip all the pretty HTML so Google sees a better text-to-HTML ratio, or is this dodgy ground? Sounds a bit spammy, like feeding Googlebot something else.

jatar_k

9:07 pm on Jan 7, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



now you are getting into dodgy ground, that goes into cloaking

puzzled

9:16 pm on Jan 7, 2003 (gmt 0)

10+ Year Member



jatar_k is right. Googlebot doesn't want to be discriminated against. It wants to be treated like a human being.

If you do it anyway, it will report you to the Google guys. And they will punish you!

jatar_k

9:20 pm on Jan 7, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



paranoia aside, I am saying that when you start serving different content based on who or what is accessing your site you get into the cloaking neighbourhood.

If you serve different content based on human/bot visits for the explicit reason of increasing rankings, you are playing with something engines hate.

I am not saying don't do it, I am just saying make sure you do it right because an overly apparent mistake could cost you. The learning curve becomes very steep.

atadams

10:00 pm on Jan 7, 2003 (gmt 0)

10+ Year Member



I don't know if most people know this, but, if you come to a WebmasterWorld thread from a Google keyword search, WebmasterWorld parses the search terms from the referrer info and highlights them in the forum content.
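One way such highlighting could work, sketched in Python. This is only an illustration of the idea and is not how WebmasterWorld actually implements it; the function name highlight and the choice of <b> tags are assumptions. It treats the content as plain text, escapes it, and then wraps each matched term.

```python
import html
import re


def highlight(text, terms):
    """Wrap each search term found in text with <b> tags, case-insensitively."""
    escaped = html.escape(text)
    # Escape the terms the same way as the text, then make them regex-safe.
    words = [re.escape(html.escape(t)) for t in terms if t]
    if not words:
        return escaped
    pattern = re.compile("(" + "|".join(words) + ")", re.IGNORECASE)
    return pattern.sub(r"<b>\1</b>", escaped)
```

The terms would come from the referrer's query string, split on whitespace. Note that this simple version can also match inside an escaped entity such as &amp;, so a real implementation would need more care.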

Rumbas

11:09 pm on Jan 7, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Don't know much about this stuff, but Brett uses it right here:

Try this search [google.com] and click the 4th spot (this forum). See the search term being highlighted?

It can be done ;)

<added>Gee, if I only read the whole thread - better keep out of this forum *blush*</added>

FlameOut

1:41 am on Jan 8, 2003 (gmt 0)

10+ Year Member



Please let me rephrase the original question in another way. My site is hosted so I don't have access to the server logs. However, I can create an embedded image link <img=.../cgi?referrer=...> that makes a CGI query and passes the referrer (pulled from JavaScript).

Could this be considered dangerous? I don't have any other way to get referrer info on internal pages.

Thanks.

duckhunter

2:05 am on Jan 8, 2003 (gmt 0)

10+ Year Member



FlameOut,
I store the following four pieces of data in a small database table that can be truncated whenever I need to free up server space.

1) Referring URL
2) Browser Type
3) First page hit
4) IP Address

I can run queries on the data however I like now. Believe me, it's valuable information to have in a database format.
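A rough sketch of the four-column table described above, using Python's built-in sqlite3 module. The table name visits, the column names, and the helper functions are all made up for this example; duckhunter does not say which database or schema is actually used.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the visit log; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS visits ("
        "referrer TEXT, browser TEXT, first_page TEXT, ip TEXT)"
    )
    return conn


def log_visit(conn, referrer, browser, first_page, ip):
    """Store one visit: referring URL, browser type, first page hit, IP."""
    conn.execute(
        "INSERT INTO visits VALUES (?, ?, ?, ?)",
        (referrer, browser, first_page, ip),
    )
    conn.commit()


def top_referrers(conn, limit=10):
    """Return (referrer, hits) rows, most frequent first."""
    return conn.execute(
        "SELECT referrer, COUNT(*) AS hits FROM visits "
        "GROUP BY referrer ORDER BY hits DESC, referrer LIMIT ?",
        (limit,),
    ).fetchall()
```

Freeing up space is then a single DELETE FROM visits, and any other report is just another query against the same table.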

FlameOut

2:15 am on Jan 8, 2003 (gmt 0)

10+ Year Member



Thanks duckhunter. I absolutely agree on the importance of this data. That's why I'm going to such lengths to capture it.

Again, I don't have access to the server logs. I am using an embedded image tag <img src=../cgi/save?referrer=> (where referrer is captured from the JavaScript Document object) to forward this information from each internal page on my site. Could this appear to be a cloaking technique to a penalty filter or does anybody see any other penalty dangers with this technique?
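A small sketch of the round trip behind this kind of image beacon, in Python for illustration only. The page would build the image URL in JavaScript; the first function just shows what the encoded URL has to look like, and the second shows how a CGI script could recover the value. Both function names and the endpoint path in the usage note are hypothetical. The main point is that the referrer must be URL-encoded, otherwise its own ? and & characters get mixed up with the beacon's query string.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def beacon_url(endpoint, referrer):
    """Build the image URL, encoding the referrer so its ? and & survive."""
    return endpoint + "?" + urlencode({"referrer": referrer})


def referrer_from_request(request_url):
    """Server side: recover the referrer from the beacon request URL."""
    values = parse_qs(urlsplit(request_url).query).get("referrer")
    return values[0] if values else None
```

For example, calling beacon_url with a made-up endpoint such as "/cgi/save" and a Google results URL yields a single referrer parameter, and referrer_from_request returns the original URL unchanged. A direct visit with no referrer gives an empty parameter, which comes back as None.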

jatar_k

2:28 am on Jan 8, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



a lot of tracking software uses that type of thing and no it will not get you in trouble.

webtrendslive, hitbox, etc.

jamesyap

8:30 am on Jan 8, 2003 (gmt 0)

10+ Year Member



I have seen a site (one that I don't like) hold very good rankings (some #1) for very, very competitive keywords.

And surprise, surprise: this guy, whom I have seen posting in another forum, who likes to bash other people's web sites and claims to be the BEST, is actually cloaking...

Why do I say this? This guy really has a lot of inbound links to his site, all from related areas. If you go directly to the site, you get what looks like the main page. But if you come in through one of the other sites, it parses the referrer URL and shows a different page, which is a status of that linking site.

Since Googlebot never sends a referrer URL, it will always see the main page.

I am sure this is called cloaking; I am just not sure whether it is LEGAL cloaking or ILLEGAL cloaking. I haven't reported this case. I would like to gather more information and comments before I do so... I would like to hear what you all think.