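Over the past few days traffic spiked and both bandwidth and I/O usage shot up, so I checked the Nginx logs and found that a large share of the entries were spider crawls. Google's and Baidu's spiders account for most of them; the rest come from seven or eight others such as Youdao, Microsoft Bing, and Yahoo. The Yahoo spider, though, seems more stubborn than the rest.
I have a go page that redirects visitors to commenters' websites, so when a spider reaches it the response is almost always a 302, and in the end the spiders move on. The Yahoo spider alone refuses to give up: it keeps retrying N times every day before it stops.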
110.75.171.110 - - [20/Mar/2013:17:07:17 +0800] "GET /20120554.html HTTP/1.1" 200 7965 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:10:20 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.107 - - [20/Mar/2013:17:10:26 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.108 - - [20/Mar/2013:17:10:35 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.112 - - [20/Mar/2013:17:10:48 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.111 - - [20/Mar/2013:17:10:54 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:10:56 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:10:58 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.107 - - [20/Mar/2013:17:11:00 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.108 - - [20/Mar/2013:17:11:02 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.112 - - [20/Mar/2013:17:11:07 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.111 - - [20/Mar/2013:17:11:17 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.110 - - [20/Mar/2013:17:11:26 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:11:32 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.107 - - [20/Mar/2013:17:11:36 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.108 - - [20/Mar/2013:17:11:38 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.112 - - [20/Mar/2013:17:11:39 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.111 - - [20/Mar/2013:17:11:41 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.110 - - [20/Mar/2013:17:11:44 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.109 - - [20/Mar/2013:17:11:48 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.172.107 - - [20/Mar/2013:17:11:55 +0800] "GET /go/BeT0h1 HTTP/1.1" 302 5 "-" "Yahoo! Slurp China" -
110.75.173.195 - - [20/Mar/2013:17:59:07 +0800] "GET /201302280.html HTTP/1.1" 200 6544 "-" "Yahoo! Slurp China" -
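The above are yesterday's records, and this happens pretty much every day. If it stays this stubborn I really will have to find a countermeasure and ban the Yahoo spider; hardly anyone uses Yahoo search anyway.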
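To put numbers on how much of the log a single spider is responsible for, a short script can tally requests per user agent and status code. The following is only a minimal sketch, assuming the log layout shown in the excerpt above; the names `LOG_PATTERN`, `tally_by_agent`, and `tally_paths` are made up for illustration, and the log path is whatever file you pass on the command line.

```python
import re
import sys
from collections import Counter

# Assumed layout (taken from the excerpt above): IP, two placeholder fields,
# [timestamp], "request line", status, body size, "referer", "user agent".
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally_by_agent(lines):
    """Count requests per (user agent, status code); skip unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue
        counts[(match.group("agent"), match.group("status"))] += 1
    return counts


def tally_paths(lines, agent_substring):
    """Count requested paths for user agents containing agent_substring."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None or agent_substring not in match.group("agent"):
            continue
        counts[match.group("path")] += 1
    return counts


if __name__ == "__main__":
    # Usage: python tally.py /path/to/access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for (agent, status), count in tally_by_agent(log_file).most_common(20):
            print(f"{count:8d}  {status}  {agent}")
```

Calling `tally_paths(lines, "Slurp")` on the same file would show which URLs the Yahoo spider keeps coming back to, which makes it easy to confirm that the repeated hits are concentrated on the go redirect.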
Finally, here is a small sample of Yahoo China search: