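Officially starting the COAP project. Following the approach from 虫师's article, the first step was to fetch a page. Unexpectedly, fetching the AMS page failed right away and returned the following message: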
An Error Occurred Setting Your User Cookie This site uses cookies to improve performance. If your browser does not accept cookies, you cannot view this site.
It looks like Python needs to simulate cookie handling. I found this reference: http://www.jb51.net/article/57161.htm
The Firefox developer tools show the following cookie in the message headers: __utma=16122406.1750209584.1411475670.1431938642.1432023770.7; __utmz=16122406.1432023770.7.6.utmcsr=gfsoso.net|utmccn=(referral)|utmcmd=referral|utmcct=/; is_returning=1; MAID=dzAjIuUXaPeh6b+Yq+nUfw==; MAID=dzAjIuUXaPeh6b+Yq+nUfw==; __utma=204447755.516480521.1411475728.1433503003.1435135708.81; __utmz=204447755.1431959196.71.52.utmcsr=gfsoso.net|utmccn=(referral)|utmcmd=referral|utmcct=/scholar; __atuvc=2%7C18%2C0%7C19%2C10%7C20%2C14%7C21%2C1%7C22; __atssc=google%3B11; _ga=GA1.2.1750209584.1411475670; SERVER=WZ6myaEXBLEBggwSoVsKlQ==; SERVER=WZ6myaEXBLEBggwSoVsKlQ==; JSESSIONID=aaabcsH1XJkK18aZ3n83u; JSESSIONID=aaabcsH1XJkK18aZ3n83u; __utmb=204447755.6.10.1435135708; __utmc=204447755; __utmt=1
I had no idea where to go from here...
On working with cookies:
http://www.jb51.net/article/57144.htm
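I switched to urllib2, requested the page so that its cookies were picked up, and then printed the HTML. Done! That was enough to fetch the page, and it turned out to be quite simple. That said, I still don't really understand the methods and attributes involved...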
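For reference, here is a minimal sketch of the kind of cookie-aware request described above. It is not the exact code I ran: urllib2 and cookielib exist only in Python 2, so this uses their Python 3 equivalents, urllib.request and http.cookiejar. The helper names build_cookie_opener and fetch_html and the User-Agent value are made up for illustration.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar). The opener saves cookies from responses
    into the jar and sends them back on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Assumption: a browser-like User-Agent; the site may not require it.
    opener.addheaders = [("User-Agent", "Mozilla/5.0")]
    return opener, jar


def fetch_html(url, opener, timeout=30):
    """Open url with the given opener and return the body as text."""
    with opener.open(url, timeout=timeout) as resp:
        # Fall back to UTF-8 when the response does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Typical use would be opener, jar = build_cookie_opener() followed by html = fetch_html(page_url, opener), where page_url is the page to fetch. Because the same opener handles every hop, a cookie set on the first response is sent back when the site redirects, which is most likely why the "Error Occurred Setting Your User Cookie" page goes away. Iterating over jar afterwards lists the cookies the server set.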