
Advanced Google checking

Check ranking, number of pages indexed, and inbound links from all Google datacenters

         

speedmax

8:30 pm on Jun 22, 2003 (gmt 0)

10+ Year Member



I'm not sure if it's OK to post this or not..

Anyway, I got inspired by the discussion [webmasterworld.com...]

so I quickly wrote a PHP script (tested on PHP 4.3.1). I can say it is very handy for webmasters; it saves you going through loads of pages typing in all that "allinxxx" stuff.
I call it the Advanced Google Checker. It reports:

ranking position in the top 100 results
number of pages indexed
number of inbound links

from every Google datacenter (9 of them).

Keep in mind that this script is pretty slow, since it parses about 30 pages (9 datacenters x 3 requests each) to do its checks; performance also depends on your connection speed.

This code is for educational use only, use at your own risk.

Enjoy it!

<?php
/*
File name: google.php
Advanced Google Checker
Author: SPEEDMAX

HOW TO USE
[yourdomain.com...]
*/
$query     = $_GET['key'];
$domain    = $_GET['url'];
$numResult = 100;

// To add or remove datacenters, edit this line.
$datacenter = array("www","www2","www3","www-ex","www-fi","www-cw","www-sj","www-ab","www-zu");

echo "Google Checker : $domain <br>";
echo '<table border=1><tr><td>Data Center</td><td>Ranking</td><td>Pages indexed</td><td>Inbound Links</td></tr>';

foreach ($datacenter as $v) {
    // 1. Ranking: fetch the top $numResult results and find the position of $domain.
    $url  = "http://$v.google.com/ie?q=" . urlencode($query) . "&hl=en&lr=&ie=ISO-8859-1&num=$numResult&start=0&sa=N";
    $file = implode("", file($url));
    $file = explode("<NOBR>", $file);

    $result = "-"; // reset each time, so a miss doesn't repeat the previous datacenter's rank
    foreach ($file as $key => $value) {
        // case-insensitive match for the domain anywhere in this result
        if (preg_match('/' . preg_quote($domain, '/') . '/i', $value)) {
            $result = $key;
        }
    }
    echo "<tr><td>" . $v . "</td><td>" . $result . "</td>";

    // 2. Pages indexed: a site: query.
    $url2 = "http://$v.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=site%3A$domain+-wx34A&num=1";
    $file = implode("", file($url2));
    if (preg_match("/ of about \<b\>([0-9,]*)/", $file, $text)) {
        echo "<td>" . $text[1] . "</td>";
    }

    // 3. Inbound links: a link: query.
    $url3 = "http://$v.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=link%3A$domain&num=1";
    $file = implode("", file($url3));
    if (preg_match("/ of about \<b\>([0-9,]*)/", $file, $text)) {
        echo "<td>" . $text[1] . "</td>";
    }
    echo "</tr>";
}
echo "</table>";
?>

Brett_Tabke

10:37 pm on Jun 22, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It will also be very effective at getting your IP banned by Google for abuse. At least use a stock agent name.

speedmax

3:00 am on Jun 23, 2003 (gmt 0)

10+ Year Member



I thought the same at first, but then I realised that it only downloads 3 files from each server. Plus, in the Apache log it will show up like this, which is very hard to track:
---------------------------------------------------
192.168.0.1 - - [23/Jun/2003:11:00:09 -0500] "GET /search?q=keyword HTTP/1.1" 200 4953 "-" "-"
---------------------------------------------------

I believe it is OK even if you run this script on a daily basis. I mean, who cares if someone downloads 3 pages from your site, compared to millions of requests? :P
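For what it's worth, if you did want to send a stock agent name, PHP's http stream wrapper (which file() uses) can be given one via ini_set(). A minimal sketch; the agent string below is just an example value, not anything the script above requires:

```php
<?php
// Sketch only: make file()/fopen() on http:// URLs send a stock
// browser user agent instead of PHP's default empty one. The agent
// string here is an example, not a recommendation.
ini_set("user_agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)");

// Any subsequent file()/fopen() call on an http:// URL sends this header.
echo ini_get("user_agent");
?>
```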

[edited by: speedmax at 4:46 am (utc) on June 23, 2003]

caustic

4:10 am on Jun 23, 2003 (gmt 0)

10+ Year Member



Interesting. Shouldn't be a problem if you use your API key?

speedmax

4:52 am on Jun 23, 2003 (gmt 0)

10+ Year Member



For those who are interested, this is the output of the script:
=========================================
Google Checker
URL : www.example.com
Keyword : lyrics

Data Center | Ranking | Pages indexed | Inbound Links
www     | 97 | 19,500 | 154
www2    | 90 | 17,000 | 154
www3    | 90 | 17,000 | 154
www-ex  | 98 | 19,100 | 154
www-fi  | 90 | 17,000 | 154
www-cw  | 97 | 19,500 | 154
www-sj  | 97 | 19,100 | 154
www-ab  | 97 | 17,200 | 154
www-zu  | 97 | 17,200 | 154

georgeek

6:18 am on Jun 23, 2003 (gmt 0)

10+ Year Member



It will also be very effective at getting your ip banned by google for abuse.

I thought this myth had been debunked once and for all by GG when he confirmed in a previous thread...

...the things you mentioned are good reasons why IP address penalties are usually not a good idea. All the comments about manual penalties apply, plus it's easy for a bad guy to switch to a different IP. I haven't seen an IP-based penalty in a long time.

There are very good reasons not to make a habit of programmed interrogation of Google servers - but this does not appear to be one of them.

RBuzz

11:12 am on Jun 23, 2003 (gmt 0)

10+ Year Member



> interesting. Shouldn't be prob if you use your
> API key?

You couldn't run this program via the API -- the API doesn't allow access to all the Google servers referenced.

Josk

11:14 am on Jun 23, 2003 (gmt 0)

10+ Year Member



Hmmm... Well, in the past a company I was working at did lots of programmed requests to Google, and the IP address used got banned from searching Google. Oops.

GoogleGuy

4:47 pm on Jun 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



georgeek, that post was talking about doing an IP address ban for web sites that spam--that's a bad idea. However, we have no problem banning IP addresses that send automated queries to Google. If that happens, you'll get a 403 Forbidden error any time you want to access Google from that IP address. Sometimes, your ISP will want to talk to you, because we can drop them reports of abuse as well.

Overall, I highly recommend *not* hitting Google with automated queries.

coolasafanman

4:58 pm on Jun 23, 2003 (gmt 0)

10+ Year Member



There's a website that people use all the time as a 'link popularity check'; how do they get away with it and maintain a PR 7?
Also, how does using the free Google API software differ from automated queries written in PHP and other languages?

georgeek

5:15 pm on Jun 23, 2003 (gmt 0)

10+ Year Member



GoogleGuy

...we have no problem banning IP addresses that send automated queries to Google...

Thanks for the clarification - I hope you have your official hat on for this one :)

Bandwidth theft gets punished on my sites too ;)

GoogleGuy

5:47 pm on Jun 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



coolasafanman, with the API we can throttle back on queries if we needed to--that's why the API is there, so that people can do low-level queries on a non-commercial basis without worrying at all.

Any IP that does automatic or programmatic queries can get banned, so I would use the API as the legitimate way to access Google instead.

coolasafanman

6:11 pm on Jun 23, 2003 (gmt 0)

10+ Year Member



Sounds good to me, GG. Now all I need to do is figure out how to get it to work ;-)

The Subtle Knife

6:12 pm on Jun 23, 2003 (gmt 0)

10+ Year Member




I thought the same at first, but then I realised that it only downloads 3 files from each server. Plus, in the Apache log it will show up like this, which is very hard to track:
---------------------------------------------------
192.168.0.1 - - [23/Jun/2003:11:00:09 -0500] "GET /search?q=keyword HTTP/1.1" 200 4953 "-" "-"
---------------------------------------------------
I believe it is OK even if you run this script on a daily basis. I mean, who cares if someone downloads 3 pages from your site, compared to millions of requests? :P

Can someone else independently confirm this? Surely, running this script a few times a day, you could never be tracked or banned by Google?

Can someone explain clearly and concisely (no wishy-washy answers) why sites like google-dance don't get banned then? Is it because they're just URLs/frames, and no automatic queries are ever sent to Google?

Surely, if some PHP in the same pages counted how many times your domain appeared, that would be OK? (Rather than manually looking at each datacenter and trying to find your domain.)

abcdef

6:19 pm on Jun 23, 2003 (gmt 0)

10+ Year Member



So all I want to know is:

how, with Google tools, can I do a search on our domain with a tandem keyword and find out where we are ranked for that keyword?

Until now I have been checking the referring URLs in my who's-on-now statistics tracking to find out what page we show up on for the various terms we get visitors from.

Thanks, anybody.

speedmax

10:15 pm on Jun 23, 2003 (gmt 0)

10+ Year Member



I think it would be pretty easy to add the API thing.

But anyway, this is nice and simple..

ogletree

10:35 pm on Jun 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Would Google be willing to work with a 3rd party to offer a paid service for automated searches? I would provide a server and a dedicated T1 to a Google datacenter or any other hardware needed. They could send me updates at off peak times. I would pay a licensing fee or whatever.

bigace

1:15 am on Jun 24, 2003 (gmt 0)

10+ Year Member



Hypothetically: would Google consider an automated search tool abusive if there were a 10-second delay between each query?

I would like to find out how I stand on each of my important keyword phrases. I can sit there and enter the phrases one at a time (which usually takes about 10 seconds between queries), or it would be quite simple to write a little auto-query program with a 10-second delay to do it for me. The impact on bandwidth would be exactly the same.

Any opinions on this option?
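A minimal sketch of that idea, assuming a hypothetical fetch_ranking() placeholder for the actual Google request; note that even at a polite 10 seconds per query, a day still holds 86,400 / 10 = 8,640 possible queries:

```php
<?php
// Sketch of a throttled auto-query loop. fetch_ranking() is a
// hypothetical placeholder, not a real function from the script above.
function max_queries_per_day($delaySeconds) {
    return (int) floor(86400 / $delaySeconds); // 86,400 seconds in a day
}

$phrases = array("blue widgets"); // add your other phrases here
$delay   = 10;                    // seconds between queries

foreach ($phrases as $i => $phrase) {
    if ($i > 0) {
        sleep($delay); // wait before every query after the first
    }
    // echo fetch_ranking($phrase); // one query per phrase
}

echo max_queries_per_day($delay); // 8640
?>
```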

caustic

2:31 am on Jun 24, 2003 (gmt 0)

10+ Year Member



speedmax, if you add the API code we would be grateful. I can't see anyone using it more than a few times a day, though. Thanks for the cool script.

Lippsy

8:30 pm on Jun 24, 2003 (gmt 0)

10+ Year Member



I could be wrong, but I don't think the Google API will let you query other datacenters besides www.

eaden

5:45 am on Jun 27, 2003 (gmt 0)

10+ Year Member



Exactly, Lippsy.

The Google API is of no use if you want to check your rankings on multiple datacenters.

GoogleGuy

5:54 am on Jun 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Depends, bigace. 10 seconds between queries would still allow 8640 queries a day, which starts to add up if everyone does it. :)

Personally, I don't recommend following queries down to comparing data center by data center. See my "knob" post for the reasons why. :)

eaden

6:29 am on Jun 27, 2003 (gmt 0)

10+ Year Member



GoogleGuy said:
Personally, I don't recommend following queries down to comparing data center by data center.


Well, why is it possible then? Why not just redirect www-fi.google.com to www.google.com? If Google doesn't want people to check individual datacenters, they should do something about it.

Balboa

9:17 am on Jun 27, 2003 (gmt 0)

10+ Year Member



This is the "knob" post from GG
[webmasterworld.com...]

ogletree

11:28 pm on Jul 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



He is not saying that looking at the datacenters will somehow hurt us; he just thinks people who do it are wasting their time, and that there is no value in doing so. I like to watch the SETI@home screensaver or the defrag display on my computer, but it really doesn't help me find aliens or keep my computer any less fragmented next time.