LZN's Blog CodePlayer

【COAP】python学习抓取页面之cookie处理

2015-06-24
LZN

正式开始进行COAP计划,尝试使用虫师文章中的方法,第一步尝试抓页面,没想到抓AMS页面直接出错,获取到如下信息:

An Error Occurred Setting Your User Cookie
This site uses cookies to improve performance. If your browser does not accept cookies, you cannot view this site.

看起来需要python模拟cookie,查到了这篇资料: http://www.jb51.net/article/57161.htm

用firefox开发者工具查看到如下cookie于消息头: __utma=16122406.1750209584.1411475670.1431938642.1432023770.7; __utmz=16122406.1432023770.7.6.utmcsr=gfsoso.net|utmccn=(referral)|utmcmd=referral|utmcct=/; is_returning=1; MAID=dzAjIuUXaPeh6b+Yq+nUfw==; MAID=dzAjIuUXaPeh6b+Yq+nUfw==; __utma=204447755.516480521.1411475728.1433503003.1435135708.81; __utmz=204447755.1431959196.71.52.utmcsr=gfsoso.net|utmccn=(referral)|utmcmd=referral|utmcct=/scholar; __atuvc=2%7C18%2C0%7C19%2C10%7C20%2C14%7C21%2C1%7C22; __atssc=google%3B11; _ga=GA1.2.1750209584.1411475670; SERVER=WZ6myaEXBLEBggwSoVsKlQ==; SERVER=WZ6myaEXBLEBggwSoVsKlQ==; JSESSIONID=aaabcsH1XJkK18aZ3n83u; JSESSIONID=aaabcsH1XJkK18aZ3n83u; __utmb=204447755.6.10.1435135708; __utmc=204447755; __utmt=1

完全没头绪……

关于cookie的操作

http://www.jb51.net/article/57144.htm

换用urllib2,去请求该页面的cookie,然后输出html,done!这样就将页面获得了,还是很简单的。不过那些方法属性,确实不懂……

QQ截图20150624220531


Comments

Content