Reply to 6# hijackle
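I made a few changes, and it can now extract more than two thousand records.
@echo off
rem Break the HTML so that </TR> and each <TD start on their own line
sed "s#></TR>#>\n</TR>#; s#><TD#>\n<TD#g" web.html > web_0.html
rem Drop lines that still hold <TR and <TD together, then keep each "<TR bgColor=" row plus the 9 lines after it
grep -v "<TR.*<TD" web_0.html | grep -A 9 "<TR bgColor=" > web_1.html
rem Delete <TR lines, blank out </TR> lines, reduce each "点击查看" link to its URL, and strip the remaining tags
sed -r "/^<TR/d; s#^</TR>.*##; /点击查看/ s/.*a href=(http[^ ]+) .*/点击查看 \1/; s/<[^>]+>//g" web_1.html > web_2.html
rem Join each blank-line-separated record onto one line, keep the lines containing http:, and drop duplicates
gawk -v RS="\n\n+" "$1=$1" web_2.html | findstr "http:" | gawk "!a[$0]++" > web.txt
rem Remove the intermediate files
del /f web_0.html web_1.html web_2.html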