Malbots
You have spent a lot of time, money, or both building your website, and you have struggled with the search engines for better results. Yet despite all your efforts ... something is wrong!
You notice that, although the number of visitors per day has not increased, some days show a lot of traffic, as if the visitors suddenly appreciated the content and wanted to read every single page of your website! It is obvious that "someone" is visiting all of the pages one by one...
You should know that this happens because you just got discovered by the malbots of the internet!
If you take a close look at your cPanel statistics, you will see one IP that stands out from the rest, having visited all of your pages. What can you do?
- IP: Mark the suspicious IP.
- Logs: Open the Apache log in a text editor (to do that you must have an FTP account). If you don't know where it is, ask the company that provides you with hosting services.
- Bot IP: Locate the records of the "bad IP" in the log file that you just opened.
- Identity: Find its identity. Here is an example of such a record:
77.249.25.51 - - [11/Jan/2004:11:12:25 0200] "GET / HTTP/1.1" 500 - "-" "Java/1.6.0_21"
The last quoted field is the user agent, so the identity of this visitor is "Java/1.6.0_21". Using "Java/", we will try to block it through the .htaccess file.
- .htaccess: Place the following code in the .htaccess file (caution: replace the placeholder "BadBot" names with the real names of bots):
RewriteEngine On
RewriteCond %{HTTP_REFERER} q=Guestbook [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BadBot01 [OR]
RewriteCond %{HTTP_USER_AGENT} ^BadBot02 [OR]
RewriteCond %{HTTP_USER_AGENT} ^BadBot03 [OR]
....
RewriteCond %{HTTP_USER_AGENT} ^BadBot100
RewriteRule ^.* - [F,L]
- Replacement: Replace the BadBot names you see above with the names of all the bots that you want to block.
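The log inspection steps above (spotting the busy IP and reading its identity) can also be done with a short script instead of a text editor. The following is only a sketch: it assumes the log uses Apache's Combined Log Format, as the example record does, and the names parse_line and busiest_clients are made up for this illustration. Requests that contain escaped quotes are simply skipped.

```python
import re
from collections import Counter

# Combined Log Format (assumed):
# IP, identity, user, [timestamp], "request", status, size, "referer", "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

def busiest_clients(lines, top=5):
    """Count requests per (IP, user agent) pair and return the most frequent pairs."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry:
            counts[(entry["ip"], entry["agent"])] += 1
    return counts.most_common(top)

# The record quoted in the article, used here as sample input.
sample = [
    '77.249.25.51 - - [11/Jan/2004:11:12:25 0200] "GET / HTTP/1.1" 500 - "-" "Java/1.6.0_21"',
]
for (ip, agent), hits in busiest_clients(sample):
    print(hits, ip, agent)
```

To run it against a real log, pass an open file instead of the sample list, for example busiest_clients(open(path)), where path is the location your hosting company gives you for the Apache log.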
Here is a list of known malbot names:
BlackWidow, CherryPicker, ChinaClaw, Crescent, Custo, DISCo, Download Demon, eCatch, EirGrabber, EmailCollector, EmailSiphon, EmailWolf, Express WebPictures, ExtractorPro, EyeNetIE, FlashGet, GetRight, GetWeb!, Go!Zilla, Go-Ahead-Got-It, GornKer, GrabNet, Grafula, HMView, HTTrack, Image Stripper, Image Sucker, Indy Library, InterGET, Internet Ninja, Irvine, Java/, JetCar, JOC Web Spider, larbin, LeechFTP, Mass Downloader, Microsoft.URL, MIDown tool, Mister PiX, Mozilla.*NEWT, Navroad, NearSite, NetAnts, NetSpider, Net Vampire, NetZIP, NICErsPRO, Octopus, Offline Explorer, Offline Navigator, PageGrabber, Papa Foto, pavuk, pcBrowser, dloader(NaverRobot), ReGet, SearchExpress, SiteSnagger, SmartDownload, SuperBot, SuperHTTP, Surfbot, Siphon, tAkeOut, Teleport Pro, VoidEYE, Web Image Collector, Web Sucker, WebAuto, WebBandit, WebCopier, WebFetch, WebGo IS, WebLeacher, WebReaper, WebSauger, Website eXtractor, Website Quester, WebStripper, WebWhacker, WebZIP, Wget, Widow, WWWOFFLE, Xaldon WebSpider, Zeus, ZyBorg
The following conditions match the requested URI instead of the user agent and can be added to the same block (each one followed by [OR] unless it is the last condition):
RewriteCond %{REQUEST_URI} /_vti_
RewriteCond %{REQUEST_URI} cltreq.asp$
RewriteCond %{REQUEST_URI} owssvr.dll$
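Typing one RewriteCond line per name by hand is error-prone, so the block can be generated from the list instead. This is a rough sketch under a few assumptions: the function name build_block_rules and its parameters are invented for this example, each pattern is wrapped in double quotes so that names with spaces such as "Download Demon" stay in one argument, and the [NC] flag is added to ignore case. Entries that are already patterns, such as Mozilla.*NEWT, go in raw_patterns so they are not escaped.

```python
import re

def build_block_rules(names, raw_patterns=(), anchor_start=False):
    """Return .htaccess lines that answer 403 Forbidden to the given user agents."""
    # Literal names are escaped so characters such as "." or "(" match themselves;
    # raw_patterns are inserted unchanged.
    patterns = [re.escape(name) for name in names] + list(raw_patterns)
    if not patterns:
        raise ValueError("at least one bot name or pattern is required")
    # The article's example anchors each name with "^" (start of the user agent).
    prefix = "^" if anchor_start else ""
    lines = ["RewriteEngine On"]
    for index, pattern in enumerate(patterns):
        # Every condition except the last one needs the OR flag.
        is_last = index == len(patterns) - 1
        flags = "[NC]" if is_last else "[NC,OR]"
        lines.append(f'RewriteCond %{{HTTP_USER_AGENT}} "{prefix}{pattern}" {flags}')
    lines.append("RewriteRule ^.* - [F,L]")
    return "\n".join(lines)

print(build_block_rules(["Java/", "Download Demon", "HTTrack"], ["Mozilla.*NEWT"]))
```

Without anchor_start the name may appear anywhere in the user agent string, which catches more variants but can also hit legitimate visitors when a name is short (Widow, Zeus), so check the generated lines before placing them in the .htaccess file.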
Good luck!