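$url = strtolower(urldecode($referer));
// Google referer: www.google plus one or more 2-3 letter extensions, then /search
if (preg_match('#www\.google(\.[a-z]{2,3})+/search#', $url))
{
    // $keywords[2] holds the text between q= and the next & (or the end of the string)
    preg_match('/(\?|&)q=(.*?)(&|$)/s', $url, $keywords);
    $this->referer = "Google";
}
$keys = substr($url, strpos($url, "q="));
$keys = substr($keys, 2);
// strpos() returns false when there is no &, and 0 is a valid position
if (strpos($keys, "&") !== false)
    $keys = substr($keys, 0, strpos($keys, "&"));
$keywords = urldecode($keys);
echo $keywords;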
Put simply, would this be detected as cloaking by Googlebot?
Thanks
Drew
Maybe this was just well written pseudo code?
It's also interesting that you have some really cool regular expressions but then switch over to the strpos and substr functions to get the query terms.
I code by the seat of my pants and I am happy when things work no matter how I got there, so I'm just trying to learn a bit here.
www.google.com/search
www.google.co.uk/search
etc...
If we detect it's a Google search referer, then we know the keywords will be between
q= ...keywords... &
For this reason we can strip everything to the left and right of it!
To be honest, that keyword code should be within the if statement. I think that is what you're referring to, but I just copied and pasted some of my test code. You're right, it should be in the if.
the result is the searchengine name and the keywords
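For comparison, the same extraction can be sketched with PHP's built-in parse_url() and parse_str() rather than substr() and strpos(). This is only a minimal sketch, not the code from the posts: the function name extract_google_keywords() is made up for illustration, and it assumes the referer is a full URL including the scheme.

```php
<?php
// Hypothetical helper (not from the original posts): returns the decoded
// search phrase from a Google search referer, or null if it is not one.
function extract_google_keywords(string $referer): ?string
{
    $parts = parse_url($referer);
    if ($parts === false || !isset($parts['host'], $parts['path'])) {
        return null;
    }
    // Host check: www.google plus one or two 2-3 letter extensions.
    if (!preg_match('/^www\.google(\.[a-z]{2,3}){1,2}$/i', $parts['host'])) {
        return null;
    }
    if ($parts['path'] !== '/search') {
        return null;
    }
    // parse_str() splits the query on & and URL-decodes each value.
    parse_str($parts['query'] ?? '', $params);
    if (!isset($params['q']) || !is_string($params['q'])) {
        return null;
    }
    return $params['q'];
}
```

Because parse_str() does the decoding per parameter, an encoded & inside the search phrase is not mistaken for a parameter separator, which can happen when the whole URL is decoded before it is split.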
Is this clearer?
Drew
I'll change the code above.
preg_match('#www\.google((\.[a-z]{2,3}){1,2})/search#', $url)
the earlier code was just a bit of a bash :)
This expression allows:
www.google.co.uk = OK
www.google.com = OK
www.google.coms = INVALID, extension longer than 3 characters
www.google.co.uk.uk = INVALID, more than 2 extensions
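A quick way to check those four cases is to run the pattern over them. The sketch below is an illustration rather than the poster's code: it ports the expression to preg_match() and assumes each host is followed by a plain forward slash and "search".

```php
<?php
// The host pattern from the post, written for preg_match().
// Assumption: only a forward slash separates the host from "search".
$pattern = '#www\.google((\.[a-z]{2,3}){1,2})/search#';

$examples = [
    'www.google.co.uk/search',
    'www.google.com/search',
    'www.google.coms/search',
    'www.google.co.uk.uk/search',
];

$results = [];
foreach ($examples as $url) {
    // preg_match() returns 1 on a match, 0 on no match.
    $results[$url] = preg_match($pattern, $url) === 1;
    echo $url, ' => ', $results[$url] ? 'OK' : 'INVALID', "\n";
}
```

Run as written, this reports OK for the first two hosts and INVALID for the last two.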
So the final code looks a bit like this:
if (preg_match('#www\.google((\.[a-z]{2,3}){1,2})/search#', $url))
{
    // Check we have some keywords first, or at least the structure for them:
    // starts with ? or &, then q=, followed by as many chars as you like, ending in & or the end of the string.
    // $keywords[0] is the whole match, [1] the leading ? or &, [2] the keys, [3] the trailing & (or empty)
    $this->keywords = preg_match('/(\?|&)q=(.*?)(&|$)/', $url, $keywords)
        ? urldecode($keywords[2])
        : "Unknown Keywords";
    $this->referer = "Google";
}
feedback would be great
Thanks
Drew
you don't know the exact format of the referer
www.google.com/search
www.google.co.uk/search
I would not have thought of that. I probably would have tested for strpos($url,"google.com/search?")
That's why I'm just a hack :)
The reason I'm kind of interested in this is that I was just about to write some code to do exactly the same thing on raw log files.
I was going to sort the search terms by engine and then by the number of times each term was searched.
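The grouping and sorting step could look something like the sketch below. It is only an illustration under stated assumptions: the function name count_terms() is hypothetical, and it assumes the engine name and search phrase have already been pulled out of each log line, since the log format is not given here.

```php
<?php
// Hypothetical helper: takes [engine, search phrase] pairs and returns
// engine => [phrase => count], engines A-Z, most searched phrase first.
function count_terms(array $hits): array
{
    $counts = [];
    foreach ($hits as [$engine, $term]) {
        // Treat "PHP Regex" and "php regex" as the same phrase.
        $term = strtolower(trim($term));
        $counts[$engine][$term] = ($counts[$engine][$term] ?? 0) + 1;
    }
    ksort($counts);                // engines alphabetically
    foreach ($counts as &$terms) {
        arsort($terms);            // highest count first, keys preserved
    }
    unset($terms);                 // break the reference left by foreach
    return $counts;
}
```

For example, feeding it two Google hits for "php regex" and one for "cloaking" gives a Google entry with "php regex" counted twice and listed first.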
I have looked around the web and have yet to see any free scripts that will do this basic task.
I have a few scripts that do some very specific things that others may be interested in but I'm afraid to put them on the web because I'll be the laughing stock of the coding community. It sometimes takes me 10 lines to do what others can do with one.
I'm attempting to produce a PHP class that is far superior to anything out there, with a few hidden gems :) and all open source. Unfortunately, with PHP being server side, it's difficult getting info about the user's environment. I can get browser, OS, etc., but things like screen resolution are difficult. I have a few options that I am investigating which avoid the use of JavaScript.
At present I can detect 20 different search engines and identify a handful of bots/spiders.
Feel free to give me a hand with the code, I need all the help I can get :)
If there are any features you would like to see in it, just let me know; I'll be happy to include them.
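One way to handle many engines is a lookup table that maps a host fragment to an engine name and its query parameter. The sketch below is a guess at the shape of such a table, not the class described in the post: the ENGINES constant and identify_search() are hypothetical names, and the two entries are illustrative only.

```php
<?php
// Illustrative table only: host fragment => [engine name, query parameter].
const ENGINES = [
    'google.'       => ['Google', 'q'],
    'search.yahoo.' => ['Yahoo', 'p'],
];

// Hypothetical helper: returns [engine name, search phrase] or null.
function identify_search(string $referer): ?array
{
    $host  = parse_url($referer, PHP_URL_HOST);
    $query = parse_url($referer, PHP_URL_QUERY);
    if (!is_string($host) || !is_string($query)) {
        return null;
    }
    parse_str($query, $params);   // splits on & and URL-decodes values
    foreach (ENGINES as $fragment => [$name, $param]) {
        if (stripos($host, $fragment) !== false
            && isset($params[$param])
            && is_string($params[$param])) {
            return [$name, $params[$param]];
        }
    }
    return null;
}
```

Adding an engine then only means adding a row to the table, rather than another if block.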
Drew