Create A Forum - Support Forums

Support => General Support => Topic started by: CB on December 03, 2011, 07:33:34 pm

Title: Web Crawlers
Post by: CB on December 03, 2011, 07:33:34 pm
I'm curious to know how Web Crawlers, Spiders, and Bots work. More often than not, when I check to see who is online, Guests (Spiders) are attempting to post in existing threads. Is this a glitch in the software? I can't really believe there are that many dumb people out there who would continually try to do this if they haven't yet registered on the site.


I also get a lot of people "Registering to the forum", but I never get any email notification. And when I check in Admin and look under Waiting for Approval, there aren't any there to approve.


Just odd.
Title: Re: Web Crawlers
Post by: CB on December 06, 2011, 04:05:50 pm
Can anyone give me some info on this?
Title: Re: Web Crawlers
Post by: aIURbliS on December 11, 2011, 05:38:45 pm
Well, the bots follow every link they see. They also record all the text on the page, so if the word "mario" appears a hundred times on a page and I go to Google and type in "mario", then I can expect your forum to show up somewhere between pages 1 - 30 on Google, because the bot figures there'll be more information on mario there, which would make your forum more relevant.
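
To make that a bit more concrete, here is a rough Python sketch of the two things described above: collecting every link on a page and counting the words in its text. It is only an illustration and not how any real search engine works. The sample page, its URLs, and the names `PageScanner` and `scan_page` are all made up for this example.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class PageScanner(HTMLParser):
    """Collects link targets and counts words in the page text."""

    def __init__(self):
        super().__init__()
        self.links = []         # href values found in <a> tags
        self.words = Counter()  # lowercase word -> number of occurrences

    def handle_starttag(self, tag, attrs):
        # A crawler does not judge links; it simply queues every href it sees.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Split the visible text into lowercase words and tally them.
        self.words.update(re.findall(r"[a-z0-9']+", data.lower()))


def scan_page(html):
    """Return (links, word_counts) for one HTML document."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.links, scanner.words


# Hypothetical forum page, invented for this example.
SAMPLE_PAGE = """
<html><body>
<h1>Mario fan forum</h1>
<p>Talk about Mario, Mario Kart and more.</p>
<a href="/index.php?action=register">Register</a>
<a href="/index.php?action=post;topic=1">Reply</a>
</body></html>
"""

links, words = scan_page(SAMPLE_PAGE)
print(links)           # every link the sketch would go on to visit
print(words["mario"])  # how often "mario" appears in the page text
```

Notice that the register and reply links end up in the list just like any other link, which would fit with guests showing up as "Registering" or "Posting" in the who's online list.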
Title: Re: Web Crawlers
Post by: CB on December 11, 2011, 05:46:33 pm
Okay. Thanks.


Any info regarding the "Registering to Forum" or "Posting In Thread" entries that I see a lot when checking to see who is online? I know that the people/bots that are posting are not in fact registered with the forum.
Title: Re: Web Crawlers
Post by: aIURbliS on December 11, 2011, 05:51:40 pm
Well, the bot doesn't have common sense, 'cause it's a bot... If it sees a link, it goes there. They might be stuck at that link because they can't get past it.
Title: Re: Web Crawlers
Post by: CB on December 11, 2011, 05:53:19 pm
Well, if a Bot is stuck at a link, then I guess I shouldn't be too concerned with them rising up and taking over the human race.  ;D
Title: Re: Web Crawlers
Post by: aIURbliS on December 11, 2011, 06:02:41 pm
They might be smart enough to all pile up on a website and overload the server :P
Title: Re: Web Crawlers
Post by: investigator1 on December 27, 2011, 12:48:10 pm
How do you get the bots and spiders on the forum? Basically, how do you activate them or allow them on your forum?
Title: Re: Web Crawlers
Post by: aIURbliS on December 27, 2011, 09:03:47 pm
They automatically come to your forum. No need to activate them whatsoever.
Title: Re: Web Crawlers
Post by: investigator1 on January 06, 2012, 11:37:50 pm
But I have never seen one yet  ??? ??? ???
Title: Re: Web Crawlers
Post by: Alex on January 07, 2012, 03:05:04 am
They'll be listed under guests. If you see a random guest every now and then, it's quite possible they are bots. You just have to check their IPs.
Title: Re: Web Crawlers
Post by: CB on January 07, 2012, 03:30:34 am
Is there a difference between real IPs and Bot IPs? What do we look for that tells us one way or another?
Title: Re: Web Crawlers
Post by: aIURbliS on January 07, 2012, 07:30:45 am
I can't confirm whether this list is valid, but this is a list of known IP addresses for Google bots: http://www.iplists.com/google.txt
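
If you do keep a list like that, checking a guest's IP against it could look roughly like this in Python. This is just a sketch: the ranges below are placeholder addresses reserved for documentation, not real crawler addresses, and the names `KNOWN_BOT_RANGES` and `match_known_bot` are made up for the example.

```python
import ipaddress

# Placeholder ranges reserved for documentation (RFC 5737).
# These are NOT real crawler addresses; swap in a list you trust.
KNOWN_BOT_RANGES = {
    "ExampleBot": ["192.0.2.0/24"],
    "OtherBot": ["198.51.100.0/25"],
}


def match_known_bot(ip_text, ranges=KNOWN_BOT_RANGES):
    """Return the bot name whose range contains ip_text, or None."""
    ip = ipaddress.ip_address(ip_text)
    for bot_name, cidrs in ranges.items():
        for cidr in cidrs:
            if ip in ipaddress.ip_network(cidr):
                return bot_name
    return None


for guest_ip in ("192.0.2.15", "198.51.100.200", "203.0.113.9"):
    print(guest_ip, match_known_bot(guest_ip))
```

A match only means the address is on your list. An address that is not on the list could still be a bot, so this can't prove a guest is human.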
Title: Re: Web Crawlers
Post by: Alex on January 08, 2012, 08:44:06 pm
Click on the IP next to the name Guest, and it should bring up a "Track IP" page.

Go down to "Look up IP on a regional whois-server" and click on the bold link. From there you will be taken to a lookup page that tells you information about that IP address, including who it is registered to. If it's Google, Yahoo, etc., then it will tell you this.
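
The lookup page has a lot of text on it, but the lines that say who owns the address are usually only a few. As a rough Python sketch, assuming you paste the whois text into a string, something like this pulls out the owner lines. The field names are ones commonly seen in whois output, the sample text is invented for the example, and `owner_lines` is a hypothetical helper.

```python
# Field names that commonly carry the owner of an address block in
# whois output. Different registries use different labels, so treat
# this tuple as a starting point rather than a complete list.
OWNER_FIELDS = ("orgname", "org-name", "organization", "netname", "descr", "owner")


def owner_lines(whois_text):
    """Return (field, value) pairs that describe who an IP is registered to."""
    found = []
    for line in whois_text.splitlines():
        # Skip comment lines and anything that is not "Field: value".
        if line.startswith(("#", "%")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if key.strip().lower() in OWNER_FIELDS and value.strip():
            found.append((key.strip(), value.strip()))
    return found


# Made-up whois response, for illustration only.
SAMPLE_WHOIS = """\
# Sample output, not from a real lookup
NetRange:       192.0.2.0 - 192.0.2.255
NetName:        EXAMPLE-NET
OrgName:        Example Search Company
Country:        US
"""

for field, value in owner_lines(SAMPLE_WHOIS):
    print(field, "=", value)
```

If the organisation named there is a search engine, the guest is probably its crawler. If it is an ordinary internet provider, it could be a person or a bot and the whois page alone won't tell you.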
Title: Re: Web Crawlers
Post by: CB on January 08, 2012, 08:46:42 pm
Thanks.
Title: Re: Web Crawlers
Post by: CB on January 13, 2012, 12:57:22 am
I know how to get to and open the pages for the IP address, but what are the telltale signs to look for once the page is open to know whether it's a legit Human looking at the site or a Bot? I can't make heads or tails of all the jargon on the page.
Title: Re: Web Crawlers
Post by: aIURbliS on January 14, 2012, 01:07:27 am
We can't really differentiate between human and bot guests, only by known ip addresses that we know if it's a bot
Title: Re: Web Crawlers
Post by: CB on January 14, 2012, 01:20:27 am
Quote
...only by known ip addresses that we know if it's a bot


Really not sure what you mean here. The way it's worded suggests that you are able to look at an IP address and know it's a Bot. So, if that's the case, then what is it in the address that tells you it's a Bot?
Title: Re: Web Crawlers
Post by: aIURbliS on January 14, 2012, 02:23:06 am
Well, some people try to track which IP addresses are Google Bots, for example. They make a list and we read that list to know which IP addresses a Google Bot uses. However, the IP addresses change from time to time, unfortunately. Here's something I found on Google's website: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=80553
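
As far as I understand that page, the idea is to do a reverse DNS lookup on the IP, check that the host name ends in googlebot.com or google.com, and then do a forward lookup on that host name to make sure it points back to the same IP. Here is a rough Python sketch of that check. The function names are made up for this example, it only handles IPv4, and I haven't tested it against real Googlebot traffic.

```python
import socket

# Host name endings to accept, as I understand Google's instructions.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Reverse DNS: IP address -> host name."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname):
    """Forward DNS: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def looks_like_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True if ip reverse-resolves to a Google host that resolves back to ip."""
    try:
        hostname = reverse(ip)
    except OSError:
        return False  # no reverse DNS record or the lookup failed
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # The forward lookup guards against a faked reverse DNS record.
        return ip in forward(hostname)
    except OSError:
        return False
```

You would call `looks_like_googlebot()` with the guest's IP from the "Track IP" page. The `reverse` and `forward` arguments are only there so the check can be tried out with stand-in lookups and no network access.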