
[Text Processing] [Solved] How can a batch file extract this string from a text file?

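I have several text files, each containing a single line in the format shown below. Could someone show me how to use a .bat file to extract the 8-digit number that follows "pdfnum" and discard everything else? The number of such values differs from file to file, but every file contains at least one.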

{"indexCadalInfoList":[{"id":"539168","title":"古史研究","title_old":"古史研究·第二集·上册","title_keyword":"古史研究·第二集·上册","title_standard":"古史研究·第二集·上册","title_handle":"古史研究·第二集·上册","pdfnum":"06342858","pdfnum","status":"1","tag_library":"古史;研究;上册;三十年代;**二十六年;**;专著","borrow_times":null,"booklist_id":null,"catalogue":null,"collect_num":"0","click_num":"0","comment_num":"0","creator":"卫聚贤(编)","creator_old":"卫聚贤(编)","creator_keyword":"卫聚贤(编)","creator_stop":"卫聚贤(编)","subject":null,"subject_old":null,"description":"本书出版者不详。","description_standard":"本书出版者不详。","description_old":"本书出版者不详。","publisher":null,"publisher_old":null,"date":"1937","date_standard":"1937","date_old":"1937-04(**二十六年)","title_standard":"古史研究·第二集·下册","title_handle":"古史研究·第二集·下册","pdfnum":"06342859","status":"1","tag_library":"古史;研究;下册;三十年代;**二十三年;**;专著","borrow_times":null,"booklist_id":null,"catalogue":null,"collect_num":"0","click_num":"0","comment_num":"0","creator":"卫聚贤(编)","creator_old":"卫聚贤(编)","creator_keyword":"卫聚贤(编)","creator_stop":"卫聚贤(编)","subject":"史评-中国-古代-文集","subject_old":null,"description":"本书出版者不详。","description_standard":"本书出版者不详。",

    @echo off
    rem Print every pdfnum value found in each .txt file in this script's folder
    rem and save all of the values to #result.log (one per line).
    mode con lines=3000
    cd /d "%~dp0"
    powershell -NoProfile -ExecutionPolicy bypass ^
        $enc=[Text.Encoding]::UTF8;[System.Collections.ArrayList]$s=@();^
        $files=@(dir^|?{('.txt' -eq $_.Extension) -and ($_ -is [System.IO.FileInfo])});^
        for($i=0;$i -lt $files.length;$i++){^
            write-host $files[$i].Name;^
            write-host ('-'*20);^
            $text=[IO.File]::ReadAllText($files[$i].FullName,$enc);^
            $m=[regex]::matches($text,'\"pdfnum\":\"(\d+)\"');^
            foreach($j in $m){^
                write-host $j.Groups[1].value;^
                [void]$s.add($j.Groups[1].value);^
            };^
        };^
        [IO.File]::WriteAllLines('#result.log', $s, [Text.Encoding]::Default);
    pause
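
For readers who want to cross-check the result without PowerShell, the same steps (scan each .txt file, match the pattern, collect the captured digits, write them to a log) might be sketched in Python as follows. This is not the poster's script: the names `extract_pdfnums`, `collect_pdfnums`, and `result.log` are made up for illustration, and the sketch assumes UTF-8 input and exactly eight digits per value, as the question states.

```python
import re
from pathlib import Path

# Assumption: every value is exactly eight digits, as stated in the question.
PDFNUM_RE = re.compile(r'"pdfnum":"(\d{8})"')


def extract_pdfnums(text):
    """Return every 8-digit pdfnum value found in text, in order of appearance."""
    return PDFNUM_RE.findall(text)


def collect_pdfnums(folder, result_name="result.log"):
    """Scan each .txt file in folder and write all values to result_name."""
    folder = Path(folder)
    found = []
    for path in sorted(folder.glob("*.txt")):
        # Assumption: the files are UTF-8 encoded.
        found.extend(extract_pdfnums(path.read_text(encoding="utf-8")))
    # One value per line; the hypothetical log name is not a .txt file,
    # so it is never picked up by a later scan.
    (folder / result_name).write_text(
        "".join(value + "\n" for value in found), encoding="utf-8"
    )
    return found
```

Calling `collect_pdfnums(".")` from the folder that holds the text files returns the list of values and writes them to `result.log`. A stray `"pdfnum",` with no value, like the one in the sample line, is skipped because the pattern requires the `:"…"` part.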

Reply to 2# zaqmlp

I can't follow this, I'm afraid. Do I just copy it, save it as a .bat file, and run it?


Last edited by batbat001 on 2019-10-24 20:03

Reply to 2# zaqmlp
It worked! Thank you very much!
How do I send a tip? Do I just scan the avatar? Is there a WeChat option?


Reply to 4# batbat001
Yes, just scan the avatar.

Reply to 1# batbat001
    grep -Po "\"pdfnum\":\"\K[0-9]{8}(?=\")" 1.txt | more > 2.txt
I recommend downloading a grep command and trying it:
http://bcn.bathome.net/s/tool/index.html?key=grep
Code I help write is free of charge. If you insist on paying, please send the money to everyone in the WeChat or QQ group instead.

Reply to 6# Batcher

Very impressive. One more question: if I want to extract the content after "title" instead, how should I change it?
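
The thread does not answer this follow-up. One way the pattern could be adapted is sketched below in Python rather than as a change to the grep command. The name `extract_titles` is hypothetical, and the sketch assumes that title values never contain an escaped quote (`\"`). Keeping the quotes around the key is what stops `title_old`, `title_keyword`, and the other `title_*` fields from matching.

```python
import re

# The quotes around the key restrict the match to the exact field "title".
# Assumption: a title value never contains an escaped quote (\").
TITLE_RE = re.compile(r'"title":"([^"]*)"')


def extract_titles(text):
    """Return the value of every "title" field found in text."""
    return TITLE_RE.findall(text)


# Shortened excerpt of the sample line from the first post.
sample = (
    '{"id":"539168","title":"古史研究",'
    '"title_old":"古史研究·第二集·上册","pdfnum":"06342858",'
)
print(extract_titles(sample))  # ['古史研究']
```

The same idea applies to any other field: swap the key name and replace the `\d{8}` digit class with `[^"]*` so the capture runs up to the closing quote.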


Using the Raku programming language:
    my $line = '{"indexCadalInfoList":[{"id":"539168","title":"古史研究","title_old":"古史研究·第二集·上册","title_keyword":"古史研究·第二集·上册","title_standard":"古史研究·第二集·上册","title_handle":"古史研究·第二集·上册","pdfnum":"06342858","pdfnum","status":"1","tag_library":"古史;研究;上册;三十年代;**二十六年;**;专著","borrow_times":null,"booklist_id":null,"catalogue":null,"collect_num":"0","click_num":"0","comment_num":"0","creator":"卫聚贤(编)","creator_old":"卫聚贤(编)","creator_keyword":"卫聚贤(编)","creator_stop":"卫聚贤(编)","subject":null,"subject_old":null,"description":"本书出版者不详。","description_standard":"本书出版者不详。","description_old":"本书出版者不详。","publisher":null,"publisher_old":null,"date":"1937","date_standard":"1937","date_old":"1937-04(**二十六年)","title_standard":"古史研究·第二集·下册","title_handle":"古史研究·第二集·下册","pdfnum":"06342859","status":"1","tag_library":"古史;研究;下册;三十年代;**二十三年;**;专著","borrow_times":null,"booklist_id":null,"catalogue":null,"collect_num":"0","click_num":"0","comment_num":"0","creator":"卫聚贤(编)","creator_old":"卫聚贤(编)","creator_keyword":"卫聚贤(编)","creator_stop":"卫聚贤(编)","subject":"史评-中国-古代-文集","subject_old":null,"description":"本书出版者不详。","description_standard":"本书出版者不详。",';
    .say for $line.comb(/pdfnum'":"'<( \w+ )>'"'/);