ThWboard Support Forum (Archive)

Location: / Board overview / Code Hack Requests / Control "Spiders" Statistics



fingers wrote on 08.05.2005 at 01:36

I was wondering if anyone is using any code that stops spiders from accessing particular pages, e.g.
register
login
etc.

and stops them from showing up in the statistics as IPs?

fingers
PS: I saw this on a forum for phpNuke: http://nukecops.com/postp181209.html

$guest_online_num = $db->sql_numrows($db->sql_query("SELECT uname FROM ".$prefix."_session WHERE guest='1' AND host_addr NOT REGEXP '^68[.]142[.](19[2-9]|2[0-4][0-9]|25[0-5])[.]'"));
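
The REGEXP in that query is meant to leave one crawler address block, 68.142.192.x through 68.142.255.x, out of the guest count. The same range test can be sketched in PHP with ip2long(). The function name in_spider_range is hypothetical, and the range is simply the one taken from the query, not a verified list of crawler addresses.

```php
<?php
// Hypothetical helper (not part of phpNuke or ThWboard): true when $ip lies
// in 68.142.192.0 - 68.142.255.255, the block the quoted query tries to exclude.
function in_spider_range($ip)
{
    $long = ip2long($ip);
    if ($long === false) {
        return false; // not a valid IPv4 address
    }
    return $long >= ip2long('68.142.192.0')
        && $long <= ip2long('68.142.255.255');
}
```

A numeric comparison avoids the trap of treating a bracket expression such as [192-255] as a number range; inside a regular expression it is only a set of single characters.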

Luki wrote on 10.05.2005 at 09:52

I tried robots.txt ... it doesn't work!

Now I check the user agent and stop Google & co. from indexing the login, profile and PM pages, or better, rewrite those links to plain text.
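
A minimal sketch of the kind of user-agent check described here, assuming the crawler announces itself in the User-Agent header. The function name is_spider and the signature list are hypothetical and not part of ThWboard; the substrings are only examples and would need extending from your own access.log.

```php
<?php
// Hypothetical helper, not part of ThWboard: guess from the User-Agent
// header whether the request comes from a search engine spider.
function is_spider($user_agent)
{
    // Illustrative substrings only; a spider that hides its identity
    // will not be caught by this check.
    $signatures = array('googlebot', 'slurp', 'msnbot');
    $user_agent = strtolower($user_agent);
    foreach ($signatures as $signature) {
        if (strpos($user_agent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Usage: the header may be missing, so fall back to an empty string.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$visitor_is_spider = is_spider($agent);
```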

fingers wrote on 10.05.2005 at 13:32

Maybe a variable could be set in header.inc.php to write the correct meta tag into frame.html?

//Robots Meta Tag Examples: 

<META NAME="ROBOTS" CONTENT="INDEX,FOLLOW">
<META NAME="ROBOTS" CONTENT="NOINDEX,FOLLOW">
<META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW">
<META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">

Source: http://searchengineworld.com/metatag/robots.htm

Luki wrote on 11.05.2005 at 01:20

Yeah, why not, sounds like a good solution!

fingers wrote on 11.05.2005 at 04:09

Maybe the script could be included in 2.85 RC4? :)

Here is a little code I will try out. I called it testmeta.php, but in the board itself the code would need to be placed at the end of the header.inc.php file, before the last ?>

<?php
// testmeta.php
// code to give spiders directions
?>
<html>
  <head>
   <title>
        SCRIPT_NAME CHECKING
   </title>
<?php
// begin header.inc.php code
/*
#################################
Start: <META NAME="ROBOTS" Hack
#################################
Tags emitted by this hack:
<META NAME="ROBOTS" CONTENT="INDEX,FOLLOW">
<META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">
The next two could use a second array and an elseif branch:
<META NAME="ROBOTS" CONTENT="NOINDEX,FOLLOW">
<META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW">
*/
// Scripts whose pages spiders should neither index nor follow.
$no_index_file = array("testmeta.php", "newevent.php", "newtopic.php", "v_profile.php", "logon.php", "register.php", "reply.php");
// SCRIPT_NAME holds the script path without the query string;
// basename() keeps only the file name, however deep the board is installed.
$required = basename($_SERVER["SCRIPT_NAME"]);
if (in_array( $required, $no_index_file))
{
   $metaname = '   <META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">
   <META NAME="KEYWORDS" CONTENT="thwboard,forum,members">';
}
else
{
   $metaname = '   <META NAME="ROBOTS" CONTENT="INDEX,FOLLOW">
   <META NAME="KEYWORDS" CONTENT="thwboard,forum,members">';

}
/*
The variable $metaname needs to be added to
 frame.html, inside <head> ... </head>
#################################
End: <META NAME="ROBOTS" Hack
#################################
*/
//end header.inc.php code
echo $metaname;
?>

</head>
<body>
View the page source to check that the tag is set correctly <br />
Script_name text: <?php echo $_SERVER["SCRIPT_NAME"]; ?>
</body>
</html>
<?php
// last line
?>

***Edited the code to allow installs deeper in the directory tree

Luki wrote on 11.05.2005 at 11:58

^^ very cool!

fingers wrote on 13.05.2005 at 08:52

An update after reading some more and looking at the raw access.log file:

This will not work for all spiders, as not all of them respect the robots meta tag.
***And as I have found, they still hit the page, so the visit goes into the site statistics even though the page stays out of the search engine index. Keeping them out of the forum stats would need something in the code that does not display the links to search engine spiders at all... over to theDon!
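
One way to read the "do not display the links" idea, as a rough sketch: print links to the no-index scripts as plain text once the visitor has been identified as a spider, so the crawler has no URL to request. The helper nav_link and its parameters are hypothetical; how ThWboard builds its links and records its statistics is not shown in this thread.

```php
<?php
// Hypothetical helper: emit a clickable link for normal visitors and
// plain text for spiders. $visitor_is_spider is assumed to come from
// some earlier check, e.g. on the User-Agent header.
function nav_link($url, $label, $visitor_is_spider)
{
    $label = htmlspecialchars($label);
    if ($visitor_is_spider) {
        return $label; // no href, so nothing for the crawler to follow
    }
    return '<a href="' . htmlspecialchars($url) . '">' . $label . '</a>';
}

// Usage with the register page from the no-index list:
$for_member = nav_link('register.php', 'Register', false);
$for_spider = nav_link('register.php', 'Register', true);
```

This only helps with spiders that discover the pages through the board's own links; a crawler that already knows the URL will still request it.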

Here is more install info.
In the template frame.html, add
$metaname
e.g.

<html>
<head>
<title>$titleprepend $config[board_name]</title>
$metaname

and in header.inc.php, add this code:

// Scripts whose pages spiders should neither index nor follow.
$no_index_file = array( "memberlist.php", "edit.php", "misc.php", "search.php", "postops.php", "newevent.php", "newtopic.php", "v_profile.php", "logon.php", "register.php", "reply.php");
// SCRIPT_NAME holds the script path without the query string;
// basename() keeps only the file name, however deep the board is installed.
$required = basename($_SERVER["SCRIPT_NAME"]);
if (in_array( $required, $no_index_file))
{
   $metaname = '   <META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">
   ';
}
else
{
   $metaname = '   <META NAME="REVISIT-AFTER" CONTENT="1 DAYS">
   <META NAME="RATING" CONTENT="GENERAL"> 
   ';

}
$metaname .= '<META NAME="DESCRIPTION" CONTENT="';
$metaname_content = "THWB Forum";
$metaname .= $metaname_content.'">
';
$metaname = $metaname.'   <META NAME="KEYWORDS" CONTENT="your words, and phrases">';

Insert it above the last ?> of header.inc.php:

?>
