Ticket #38 (assigned enhancement)

Opened 2 years ago

Last modified 2 months ago

ADD HERE UNIDENTIFIED BOTS / SEARCH ENGINE / BROWSERS / OS

Reported by: admin Owned by: admin
Priority: minor Milestone: 1.7.*
Component: general Version: 1.7
Keywords: govt, bots Cc:

Description

Hi, if you want to add new bots, search engines, browsers, or operating systems that WassUp doesn't currently recognize, please add a reply here with their details and I will add them in an upcoming release. Thank you.

Attachments

baidu_spider.txt (271 bytes) - added by w3566391169900405@… 15 months ago.
Baidu Search Engine Spider information.

Change History

  Changed 2 years ago by admin

  • status changed from new to assigned

  Changed 2 years ago by hellioness1@…

Hi Michele, I have another one for you. I think this is Google's "Advanced Search". I clicked on the referrer link and the results do show my site on page 1.

-Helene D.


Unidentified Search Engine:

122.107.190.187 /imagegallery/screencaps/ 2008-02-19 03:25:46 Referrer: http://www.google.com/search?as_q=Supernatural screencaps Skin&hl=en&num=100&btnG=Google Search&as_epq=&as_oq=&as_eq=&lr=&cr=&as_ft=i&as_filetype=&as_qdr=all&as_nlo=&as_nhi=&as_occt=any&as_dt=i&as_sitesearch=&as_rights=&safe=images Hostname: c122-107-190-187.eburwd5.vic.optusnet.com.au UserAgent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12 us OS: Mac OS X BROWSER: Mozilla Firefox 2.0.0.12

  • 03:24:09 ->/imagegallery/screencaps/?album=3&gallery=23

  Changed 2 years ago by admin

Added, thank you.

  Changed 2 years ago by selveste.radiohode@…

I found the following undetected search engines: http://verden.abcsok.no http://www.kvasir.no/nettsok/ http://www.start.no/sok/

And there are more of these. "sok" means "search" in Norwegian (or rather "søk").

  Changed 2 years ago by DeuceD

I have tons of spammer hits from hosts like:

bzq-XXX-XXX-XXX-XXX.red.bezeqint.net

I've found that it's included among spammers with dynamic IPs in this list:

http://www.elandsys.com/resources/antispam/blacklist.html

So I think that red.bezeqint.net should be identified as a spammer in WassUp.

  Changed 2 years ago by admin

Added, thank you

  Changed 2 years ago by hellioness1@…

Here are some more unidentified search engines, bots, feeds, etc.
Unidentified Search Engine:
{{{ 205.152.238.75 /imagegallery

2008-03-10 21:51:20 Referer: http://www.dogpile.com/dogpile/ws/results/Web/Supernatural Creatures/1/302349/RightNav/Relevance/iq=true/zoom=off/_iceUrlFlag=7?_IceUrl=true Hostname: 205.152.238.75 OS: Windows XP BROWSER: MSIE 7.0

21:51:45 ->/imagegallery/?album=1&gallery=15

68.161.191.245 /blog 2008-03-05 11:39:41 Referrer: http://search.earthlink.net/search?q=dean and sam in the target changing room&area=earthlink-ws&channel=narrowband&FD=0& Hostname: pool-68-161-191-245.nycmny.east.verizon.net UserAgent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/523.12.2 (KHTML, like Gecko) Version/3.0.4 Safari/523.12.2 OS: Mac OS X BROWSER: Safari

11:39:32 ->/category/fan-fiction 11:39:41 ->/blog

80.192.125.69 /tag/visitor_information 2008-03-07 17:28:39 Referrer: http://search.mywebsearch.com/mywebsearch/GGimage.jhtml?pg=GGmain&action=click&searchfor="sam" "supernatural"&tpr=null&ss=sub&st=kwd&ptnrS=ZNxmk571DRGB&ct=CI Hostname: 80-192-125-69.cable.ubr11.edin.blueyonder.co.uk UserAgent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FunWebProducts; .NET CLR 1.1.4322) OS: Windows XP BROWSER: MSIE 7.0}}}

Unidentified Feed Reader:
{{{ 206.196.125.91 /feed

2008-02-29 07:16:47 Referer: Direct hit Hostname: icerocket.com UserAgent: BlogSearch/1.1-pre-$Rev: 118 $-svn +http://www.icerocket.com/ OS: N/A N/A BROWSER: N/A

71.246.93.35 /feed/atom 2008-03-04 00:17:02 Referer: Direct hit Hostname: pool-71-246-93-35.bltmmd.east.verizon.net UserAgent: Apple-PubSub/61 OS: N/A N/A BROWSER: N/A (This is the Safari built-in feed reader for Safari 3 Beta on Windows)}}}

Unidentified Bots:
{{{ 71.41.200.74 /robots.txt

2008-03-07 17:10:17 Referer: Direct hit Hostname: rrcs-71-41-200-78.sw.biz.rr.com UserAgent: MLBot (www.metadatalabs.com) OS: N/A N/A BROWSER: N/A}}}

Undetected Browser:
{{{ 208.80.193.42 /

2008-02-29 06:21:22 Referer: Direct hit Hostname: static-208-80-193-42.as13448.com UserAgent: Mozilla/5.0 (compatible; Konqueror/3.1-rc3; i686 Linux; 20020924) OS: Linux BROWSER: N/A}}}

  Changed 2 years ago by anonymous

This one just visited my blog, never seen it before: UserAgent: Mozilla/5.0 (compatible; AboutUsBot/0.9; +http://www.aboutus.org/AboutUsBot)

in reply to: ↑ 9   Changed 22 months ago by anonymous

Replying to anonymous:

Search engine of bluewin.ch: http://search.bluewin.ch/bw/search/web/de/result.jsp?query=gott+auf+planeten&service=search&mode=simple&region=world&pager.offset=20

A couple of bots/crawlers/spiders that keep popping up:

ENOM: hostname match for *.lmhost3.com
Discovery Engine: referrer match for *discobot* and/or *discoveryengine.com*
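
For anyone who wants to try wildcard rules like these, here is a rough Python sketch (WassUp itself is PHP, so this is illustration only). The rule dictionaries and the `match_rules` and `identify` names are assumptions made up for the example; only the wildcard patterns come from the comment above.

```python
from fnmatch import fnmatch

# Wildcard rules taken from the comment above; the rule layout is illustrative.
HOSTNAME_RULES = {"ENOM": ["*.lmhost3.com"]}
REFERRER_RULES = {"Discovery Engine": ["*discobot*", "*discoveryengine.com*"]}


def match_rules(value, rules):
    """Return the first rule name whose wildcard pattern matches value."""
    value = value.lower()
    for name, patterns in rules.items():
        if any(fnmatch(value, pattern) for pattern in patterns):
            return name
    return None


def identify(hostname="", referrer=""):
    """Check hostname rules first, then referrer rules."""
    return match_rules(hostname, HOSTNAME_RULES) or match_rules(referrer, REFERRER_RULES)
```

Values are lowercased before matching, so the patterns need to be written in lowercase as well.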

  Changed 22 months ago by anonymous

another one for user-agent matching: http://www.proximic.com

  Changed 22 months ago by hellioness1@…

Michele, I submitted a new list of search engines/bots/feeds but it got rejected as spam. Please retrieve the comment from your akismet log. Thanks. -Helene.

  Changed 21 months ago by admin

  • milestone changed from 1.5.* to 1.6.*

  Changed 21 months ago by anonymous

Hostname: google.uni-koeln.de

  • User Agent: gsa-crawler (Enterprise; M2-BNUZRREKCA6JB; rrzk-wwwadmin@…)

  Changed 19 months ago by reharmonizer@…

Ever since upgrading to 1.6.1 there are a couple of bots that show up on the Latest Hits display even when I select "No spider":

TAILRANK
IP: host-nnnn.tailrank.com (64.34.195.xxx, but I'm not sure what the exact IP range is)
User Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:Tailrank (Spinn3r 2.2); http://spinn3r.com/robot) Gecko/20021130

CRAWL.YAHOO.NET
IP: rz311510.crawl.yahoo.net (67.195.54.250; again, I'm not sure of the range)
User Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.4) Gecko/20080721 BonEcho/2.0.0.4

I assume there must be a list of patterns that are used to identify spiders, etc. If so, why not expose it for updating between WassUp releases and/or for user customization?
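
To make the suggestion concrete, a user-editable pattern list could look roughly like the Python sketch below. This is not WassUp's actual code (WassUp is PHP); `DEFAULT_SPIDER_PATTERNS`, `is_spider`, and the `extra_patterns` argument are hypothetical names, and the substrings are simply taken from user agents reported in this ticket.

```python
# Hypothetical sketch: substring patterns for spotting spiders by user agent.
# The names below are illustrative and are not part of WassUp.
DEFAULT_SPIDER_PATTERNS = [
    "Tailrank",     # aggregator:Tailrank (Spinn3r 2.2)
    "Twiceler",     # Cuil crawler
    "Baiduspider",
    "YoudaoBot",
    "AboutUsBot",
    "MLBot",
]


def is_spider(user_agent, extra_patterns=()):
    """Return True if the user agent contains any known spider pattern.

    extra_patterns lets a site owner add entries between releases.
    Matching is case-insensitive.
    """
    ua = user_agent.lower()
    patterns = list(DEFAULT_SPIDER_PATTERNS) + list(extra_patterns)
    return any(pattern.lower() in ua for pattern in patterns)
```

A visitor that sends an ordinary browser user agent, like the crawl.yahoo.net one above, would not be caught this way and would need a hostname check instead.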

follow-up: ↓ 18   Changed 19 months ago by admin

Thank you, added.

in reply to: ↑ 17   Changed 19 months ago by anonymous

Replying to admin:

Thank you, added.

When will there be an update for the crawl.yahoo.net and tailrank.com bots showing up in normal activity stats? It is driving me nuts, as I cannot see through them to the organic visitors.

  Changed 18 months ago by anonymous

  • type changed from task to enhancement

Hello, it would be nice to have the cuill.com (http://www.cuil.com/info/webmaster_info/) crawler added.

Thank you. William Maddler http://www.eth0.it

  Changed 18 months ago by anonymous

hosts of crawlers:

*.cuill.com
bigfinder.de
metager2.de

searchengines:

"Metager|metager.de|eingabe|",
"ICQ Search|search.icq.com|q|",
"T-Online|suche.t-online.de|q|"
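
The entries above appear to follow a `name|domain|query parameter|` layout. Assuming that reading is right, a rough Python sketch of how such entries could be used to pull the search terms out of a referrer might look like this; `parse_engine` and `search_terms` are hypothetical names and this is not WassUp's own code.

```python
from urllib.parse import urlparse, parse_qs

# Entries copied from the comment above; the "name|domain|param|" reading
# of the format is an assumption.
ENGINES = [
    "Metager|metager.de|eingabe|",
    "ICQ Search|search.icq.com|q|",
    "T-Online|suche.t-online.de|q|",
]


def parse_engine(entry):
    """Split one entry into (name, domain, query parameter)."""
    name, domain, param = entry.split("|")[:3]
    return name, domain, param


def search_terms(referrer, engines=ENGINES):
    """Return (engine name, search terms) for a referrer, or None."""
    url = urlparse(referrer)
    host = url.hostname or ""
    query = parse_qs(url.query)
    for entry in engines:
        name, domain, param = parse_engine(entry)
        # Match the domain itself or any subdomain of it.
        if (host == domain or host.endswith("." + domain)) and param in query:
            return name, query[param][0]
    return None
```

Referrers that match no listed domain, or that lack the expected query parameter, simply return `None`.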

Thank you. Sven, http://www.wappler.eu

  Changed 18 months ago by dictionary

Some Czech and Slovak search engines

Seznam CZ | http://*.seznam.cz/?q=wassup
Zoznam SK | http://www.zoznam.sk/hladaj.fcgi?co=odkazy&fsearch=&scope=all&a=search&s=wassup

  Changed 18 months ago by anonymous

Jyxo CZ | http://jyxo.1188.cz/s?q=wassup&d=cz
Atlas CZ | http://search.atlas.cz/?q=wassup&t=a

  Changed 18 months ago by anonymous

Zoohoo CZ | http://www.zoohoo.cz/?q=wassup&c=2
OpenDir CZ | http://www.opendir.cz/od.x?cohledas=wassup&msearch=OpenDir

  Changed 18 months ago by anonymous

Beedly INT | http://www.beedly.us/search.php?q=wassup

  Changed 18 months ago by anonymous

Crawlers

screenshot(*any_number*).seznam.cz

  Changed 18 months ago by anonymous

Spider:

Baiduspider+(+http://www.baidu.com/search/spider.htm)

  Changed 18 months ago by anonymous

  • priority changed from minor to major

  Changed 17 months ago by anonymous

  • type changed from enhancement to task

  Changed 17 months ago by anonymous

Hello. Can you add

Tiscali CZ | http://hledani.tiscali.cz/web/search.php?lang=cs&query=wassup&kde=cz_internet

Thanks

  Changed 17 months ago by anonymous

kosmix.com

  Changed 17 months ago by anonymous

Hi, can you add this search engine too?

http://search1.incredimail.com/?q=mapy&lang=english&channel=1037137463

  Changed 16 months ago by anonymous

another search engine

http://search.conduit.com/Results.aspx?q=wassup

  Changed 16 months ago by PBurch

http://www.host-tracker.com is something we use to keep up with uptime. Wassup is recording them as regular visitors and it shouldn't.

  Changed 16 months ago by anonymous

another search engine: crawl-14.cuill.com *.cuill.com

thanks

Changed 15 months ago by w3566391169900405@…

Baidu Search Engine Spider information.

in reply to: ↑ description   Changed 15 months ago by w3566391169900405@…

  • priority changed from major to minor
  • type changed from task to defect
  • Visit type: Regular visitor
  • IP: 38.99.44.104
  • Hostname: crawl-13.cuill.com
  • Url Requested: /2008/12/15/
  • User Agent: Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)
  • Referrer:
  • OS: N/A
  • Wassup ID: 81ee85a5f47a6253c07a3fa71807a211
  • End timestamp: 2008-12-17 07:51:07 ( 1229500267 )

The browser used by this bot (which WassUp has yet to identify) is Opera, according to another site traffic analyzer I have installed.

  Changed 15 months ago by w3566391169900405@…

  • keywords youdao, sragent, baidu added
  • priority changed from minor to major
  • version set to 1.6
  • Visit type: Feed - feed reader
  • IP: 122.49.118.84
  • Hostname: 122.49.118.84
  • User Agent: SragentRssCrawler; sragent@…
  • Referrer:
  • OS: N/A
  • Wassup ID: 30d86117e8615b2728140f175eb5c4e6
  • End timestamp: 2008-12-17 23:27:43 ( 1229556463 )
  • Visit type: Regular visitor
  • IP: 61.135.249.202
  • Hostname: 61.135.249.202
  • User Agent: Mozilla/5.0 (compatible; YoudaoBot/1.0; http://www.youdao.com/help/webmaster/spider/; )
  • OS: N/A
  • Locale/Language: cn
  • Wassup ID: 83a54cd41b5d1a2deebe1ffe1bc460db
  • End timestamp: 2008-12-17 23:25:29 ( 1229556329 )

Also, another IP address associated with the Baidu spider: 61.135.168.39

  Changed 14 months ago by admin

Version 1.6.4 includes every spider and search engine listed here so far.

  Changed 14 months ago by spesielt@…

  • priority changed from major to trivial
  • type changed from defect to task

This is a bot: Hostname: mail3.q0.ru

ip: 81.176.229.194

  Changed 14 months ago by anonymous

MSN BOT

# Visit type: Regular visitor
# IP: 65.55.109.143
# Hostname: msnbot-65-55-109-143.search.msn.com
# User Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)
# Search Engine: Windows Live COM
# OS: Win2003
# Browser: IE 6.0
# Locale/Language: us

  Changed 14 months ago by can10eezcf00l@…

  • keywords spellsbook, freescripts added; youdao, sragent, baidu removed
  • priority changed from trivial to minor

Not sure if this is a bot or a hack attempt. But I thought I'd put it in here just to be identified, and to let people watch out if it's not just a simple bot attempt. If anyone knows what it is, please let me know.

# Visit type: Regular visitor
# IP: 88.198.112.187
# Hostname: static.88-198-112-187.clients.your-server.de
# Url Requested: /
# User Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
# Referrer: http://spellsbook.com/freescripts/[your url here]/
# OS: WinXP
# Browser: IE 6.0
# Locale/Language: de

  Changed 14 months ago by scardovi@…

From the host crawl-66-249-72-145.googlebot.com I get many hits identified as a Nokia 6820. I think it's a Google spider for mobile sites.

  Changed 13 months ago by anonymous

My "feed=rss2" URL is getting constant hits from a Linux RSS client called Liferea 1.4, in bursts that happen about once every 5 minutes.

It would be great if you could list this client as a spider so that I can filter it out of my log results, as it fills up the logs and makes Wassup nearly impossible to use for tracking actual visitors.

  Changed 12 months ago by anonymous

  Changed 10 months ago by dez@…

Cuil.com

User Agent: Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)

  Changed 10 months ago by w3566391169900405@…

  • keywords hacker, bots added; spellsbook, freescripts removed

Unidentified (U/I) bots - hit up robots.txt:
61.135.220.17 61.135.220.18 61.135.220.19 61.135.220.20 61.135.220.22 61.135.220.24 61.135.220.27 61.135.220.76 61.135.220.83 61.135.220.179
61.135.249.15 61.135.249.23 61.135.249.26 61.135.249.62 61.135.249.86 61.135.249.145 61.135.249.147 61.135.249.169 61.135.249.178
209.249.53.137 209.249.53.203 209.249.53.221
216.129.119.43
220.181.61.215

61.247.221.83 - Unidentified (U/I) from Korea.

193.47.80.45 -from crawl09.exabot.com

Is it possible to just ID the ones that hit up your robots.txt file as a crawler automatically?
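
As a rough idea of what that could look like, here is a Python sketch that flags any IP that requested /robots.txt and then labels all of that IP's hits as crawler traffic. The `(ip, url_requested)` record shape and the function names are simplifications invented for the example, not how WassUp stores its data.

```python
def crawler_ips(hits):
    """Return the set of IPs that requested /robots.txt.

    hits is an iterable of (ip, url_requested) pairs; this record shape
    is a simplification for the example.
    """
    return {ip for ip, url in hits if url.split("?")[0] == "/robots.txt"}


def classify(hits):
    """Label every hit 'crawler' or 'visitor' based on its IP."""
    hits = list(hits)
    bots = crawler_ips(hits)
    return [(ip, url, "crawler" if ip in bots else "visitor") for ip, url in hits]
```

This would only be a heuristic: a curious human can also open robots.txt, and a badly behaved bot may never request it.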

222.73.173.10 -hacker (hit up a bunch of random non-existent .asp files)

58.30.252.204 -hacker (hit up a bunch of random non-existent .rar files)

  Changed 10 months ago by dez@…

butterfly.topsy.com

New tool, and they have an explanation link in their user agent.

  Changed 10 months ago by w3566391169900405@…

  • keywords govt, added; hacker, removed
  • version changed from 1.6 to 1.7
  • milestone changed from 1.6.* to 1.7.*

Says the guy's from Washington DC. I think it actually might be from the government! In either case, it's a bot that went through too many pages on my website too quickly.

  • Visit type: Regular visitor
  • IP: 38.105.83.12
  • Hostname: 38.105.83.12
  • Url Requested: /
  • User Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2)
  • Referrer:
  • OS: Win2003
  • Browser: IE 7
  • Locale/Language: us
  • Wassup ID: f59c54d504b4e87808af74979c4c7734
  • End timestamp: 2009-05-26 11:46:36 ( 1243338396 )

  Changed 9 months ago by dez@…

Fairshare uses the following user agent:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1 + FairShare-http://fairshare.cc)

All of their user agents (I'm not sure if this is the only one) contain the:

"+ FairShare-http://fairshare.cc" portion.

If you sign up for Fairshare (good service), they check the feed quite often, which has inflated my feed visits. Being able to throw these into the spider category would be very helpful.

  Changed 8 months ago by anonymous

Visit type: Regular visitor
IP: 216.129.119.11
Hostname: crawl-1c.cuil.com
User Agent: Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)
Referrer:
OS: N/A
Wassup ID: a50ce305cc6c159f09b0380f378f987d
End timestamp: 2009-07-01 21:57:36 ( 1246485456 )

  Changed 8 months ago by tyler

The most popular Russian search engine: Yandex (yandex.ru/yandsearch?text= )

  Changed 8 months ago by qcz

Twitterfeed's UA: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3 twitterfeed

  Changed 2 months ago by annonymous

  • type changed from task to enhancement

173.45.230.10 (173-45-230-10.static.cloud-ips.com)
These come from a range of IP addresses: 173.45.229-230.0-254. All of the hostnames end in static.cloud-ips.com.
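
A range like this could be checked by address or by hostname ending. The Python sketch below reads the reported range as every address from 173.45.229.0 through 173.45.230.254 (so it also covers 173.45.229.255), which is an assumption about what was meant; the function names are made up for the example.

```python
import ipaddress

# Range reported above, read as 173.45.229.0 through 173.45.230.254.
RANGE_START = ipaddress.ip_address("173.45.229.0")
RANGE_END = ipaddress.ip_address("173.45.230.254")
HOST_SUFFIX = ".static.cloud-ips.com"


def in_reported_range(ip):
    """True if ip falls inside the reported address range."""
    return RANGE_START <= ipaddress.ip_address(ip) <= RANGE_END


def is_cloud_ips_host(ip, hostname=""):
    """True if either the IP range or the hostname ending matches."""
    return in_reported_range(ip) or hostname.lower().endswith(HOST_SUFFIX)
```

Checking the hostname ending as well would catch addresses from the same provider that fall outside the two reported blocks.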
