
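[Text Processing] Batch: split a TXT file by keyword

Last edited by qd2024 on 2023-2-2 14:32

In the TXT file, each split point is a line that starts with a keyword, for example ★★★★★.

Using "★★★★★" as the marker, split one TXT file into several TXT files.
Each generated TXT file begins with the line that carries "★★★★★" and takes the text of that line as its file name.
The first "★★★★★" marks where the first file starts; the second "★★★★★" is both the start of the second file and the end of the first; the last file runs to the end of the source file.
However, neither the first line of a generated file nor its file name should contain "★★★★★".

The sample text below should end up as 4 txt files: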
Module1 unit1 What a delicious smell
Module1 unit2 You sound just like me!
Module2 unit1 What are you doing
Module2 unit2 They have been to many interesting places.

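If the punctuation in a title includes characters that are illegal in file names, just delete them; they do not need to go into the file name.


Thanks



Sample source text: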
★★★★★Module1 unit1 What a delicious smell
Tony: Mnn…What a delicious smell! Your pizza looks so nice.
Betty: Thanks! Would you like to try some?
Tony: Yes, please. It looks lovely, it smells delicious and mm, it tastes good.
Daming: What’s that on top?
Betty: Oh, that’s cheese. Do you want to try a piece?
Daming: Ugh! No, thanks. I’m afraid I don’t like cheese. It doesn’t smell fresh. It smells too strong and it tastes a bit sour.
Betty: Well, my chocolate cookies are done now. Have a try!
Daming: Thanks! They taste really sweet and they feel soft in the middle.
Tony: Are you cooking lots of different things? You look very busy!
Betty: Yes, I am! There’s some pizza and some cookies, and now I’m making an apple pie and a cake.
Daming: Apple pie sounds nice. I have a sweet tooth, you know. Shall I get the sugar?
Betty: Yes, please. Oh, are you sure that’s sugar? Taste it first. It might be salt!
Daming: No, it’s OK. It tastes sweet. It’s sugar.
Tony: What’s this? It tastes sweet too.
Betty: That’s strawberry jam, for the cake.
Daming: Good, everything tastes so sweet! It’s my lucky day!
★★★★★Module1 unit2 You sound just like me!
Hi Lingling,
Thanks for your last message. It was great to hear from you, and I can't wait to meet you.
I hope you will know me from my photo when I arrive at the airport. I'm quite tall, with short fair hair, and I wear glasses. I'll wear jeans and a T-shirt for the journey, but I'll also carry my warm coat. I've got your photo - you look very pretty. I'm sure we'll find each other!
Thanks for telling me about your hobbies. You sound just like me! I spend a lot of time playing classical music with my friends at school, but I also like dance music - I love dancing! I enjoy sports as well, especially tennis. My brother is in the school tennis team - I'm very proud of him! He's good at everything, but I'm not. Sometimes I get bad marks at school, and I feel sad. I should work harder.
You asked me, "How do yo feel about coming to China? "Well, I often feel a bit sad at first when I leave my mum and dad for a few days, and I'm quite shy when I'm with strangers. I feel nervous when I speak Chinese, but I'll be fine in a few days. I'm always sorry when I don't know how to do things in the right way, so please help me when I'm with you in China! Oh, I'm afraid of flying too. But I can't tell you how excited I am about going to China!
See you next week!
★★★★★Module2 unit1 What are you doing?
Tony: Hi, Lingling. What are you doing?
Lingling: I'm entering a competition.
Tony: What kind of competition?
Lingling: A speaking competition.
Tony: "Great. "It'll help you improve your speaking. And maybe you will win a prize.
Lingling: The first prize is "My dream holiday".
Tony: Have you ever won any prizes before?
Lingling: No, I haven't. I've always wanted to go on a dream holiday. But I can't afford it. The plane tickets are too expensive.
Tony: Well, good luck! I've also entered lots of speaking competitions, but haven't won any prizes. I've stopped trying now.
Lingling: That's a pity. Have you ever thought about other kinds of competitions?
Tony: What do you mean?
Lingling: look! Here's a writing competition: Around the world in 80 Days. To win it, you need to write a short story about a place you've visited.
Tony: That sounds wonderful, but I haven't travelled much. How can I write about it?
Lingling: Don't worry. It doesn’t need to be true! You can make it up.
Tony: You're right. I'll try. I hope I will win, then I will invite you to come with me.
Lingling: Sorry! The first prize is only the book called Around the World in 80 Day!
★★★★★Module2 unit2 They have been to many interesting places.
Mike Robinson is a fifteen-year-old American boy and his sister Clare is fourteen. At the moment, Mike and Clare are in Cairo in Egypt, one of the biggest and busiest cities in Africa.
They moved here with their parents two years ago. Their father, Peter works for a very big company. "The company has offices in many countries, and it has sent Peter to work in Germany, France and China before. "Peter usually stays in a country for about two years. Then the company moves him again. His family always goes with him.
The Robinsons love seeing the world. They have been to many interesting places. For example, in Egypt, they seen the Pyramids, travelled on a boat on the Nile River, and visited the palaces and towers of ancient kings and queens.
Mike and Clare have also begun to learn the language of the country, Arabic. This language is different from English in many ways, and they find it hard to spell and pronounce the words. However, they still enjoy learning it. So far they have learnt to speak German, French, Chinese and Arabic. Sometimes they mix the languages. "It's really fun, "said Clare.
The Robinsons are moving again. The company has asked Peter to work back in the US. Mike and Clare are happy about this. They have friends all over the word, but they also miss their friends in the US. They are counting down the days.

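The OP seems to have forgotten that some characters cannot appear in file names. For example, the name of the third generated file would contain the character "?", which is illegal.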


Reply to 2# qixiaobin0715


    Got it, thanks.
If there are illegal characters, simply deleting them is fine.
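
For the simple case, characters can be deleted from a variable with cmd's string substitution. The sketch below is not from the thread; the variable `name` and the hard-coded title are assumptions used only for illustration.

```bat
@echo off
setlocal enabledelayedexpansion
rem Hypothetical title taken from the sample text, used only for illustration.
set "name=Module2 unit1 What are you doing?"
rem Delete one character per line by replacing it with nothing.
set "name=!name:?=!"
set "name=!name::=!"
set "name=!name:/=!"
set "name=!name:\=!"
set "name=!name:|=!"
set "name=!name:<=!"
set "name=!name:>=!"
set "name=!name:"=!"
echo !name!.txt
```

This should print `Module2 unit1 What are you doing.txt`. The asterisk is left out on purpose: a leading `*` has a special meaning in the search part of the substitution, so removing it needs a different technique.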


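Save the batch file in ANSI encoding:
@echo off
rem Delayed expansion is on for the whole script, so exclamation marks in the text are dropped.
setlocal enabledelayedexpansion
rem Record the line numbers of the marker lines as variables named _N.
findstr /n /rb "★★★★★" 1.txt>1.log
for /f "delims=:" %%a in (1.log) do set _%%a=true
del 1.log
rem for /f skips empty lines, so n only matches the findstr numbering when 1.txt has none.
for /f "tokens=* delims=★" %%i in (1.txt) do (
    set /a n+=1
    if defined _!n! (
        rem The file name is built from the first two words of the title only.
        for /f "tokens=1,2" %%a in ("%%i") do set filename=%%a %%b.txt
    )
    >>"!filename!" echo,%%i
)
pause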

For the sample text, this can actually be done without adding any markers.
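
One plausible reading of this remark, assuming every article title starts with `Module<number> unit<number>` as in the sample, is that `findstr` can locate the title lines by pattern instead of by ★★★★★. A minimal sketch (the file name `1.txt` is an assumption carried over from the script above):

```bat
@echo off
rem Sketch only: list the lines of 1.txt that start like an article title.
rem /r treats the /c: string as a regular expression, /b anchors it at line start.
findstr /n /r /b /c:"Module[0-9][0-9]* unit[0-9]" "1.txt"
pause
```

Each printed line is prefixed with its line number, which is the information a splitting loop needs to decide where a new file begins.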


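Reply to 5# qixiaobin0715


    How would it work without markers? My goal is to put English articles for my kid into separate TXT files, one article per file.
What would be the simplest way to do that?

Many thanks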


Reply to 4# qixiaobin0715


It shows this message:
The system cannot find the path specified.

I can't get it to work.


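Last edited by qixiaobin0715 on 2023-2-2 15:29

Upload your source file to a net disk and I will test it for you.
Copying your sample text and running the script on it works without problems.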


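Reply to 8# qixiaobin0715

I found a problem: in a line like the one below (the 11th line from the end), everything from "!The first" onward is treated as a variable and dropped.
Lingling: Sorry! The first prize is only the book called Around the World in 80 Day!
Batch newbie here, advice is welcome! Thanks!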
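
The behaviour reported here matches how cmd treats exclamation marks while delayed expansion is enabled: when a `for` variable such as `%%i` is expanded, exclamation marks in its text are consumed, and anything between two of them is read as a variable name. A common way around it, sketched below under the assumption that the input file is called `1.txt`, is to copy the line into a variable while delayed expansion is off and to switch it on only to read the variable back.

```bat
@echo off
rem Sketch only: 1.txt is an assumed input file name.
setlocal disabledelayedexpansion
for /f "usebackq delims=" %%a in ("1.txt") do (
    rem Copy the line while delayed expansion is off, so exclamation marks survive.
    set "line=%%a"
    rem Turn delayed expansion on only for the moment the variable is read back.
    setlocal enabledelayedexpansion
    echo(!line!
    endlocal
)
pause
```

In this form `for /f` still skips empty lines and lines that start with `;`, so it only shows the toggling pattern, not a complete splitter.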


Reply to 8# qixiaobin0715


    Link: https://pan.baidu.com/s/1HW-cqi8FXobeeYjs-3wkQw?pwd=28h1
Extraction code: 28h1


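@echo off
rem Prefix every line with its number so that for /f keeps empty lines.
for /f "delims=" %%a in ('type "文本.txt" ^| findstr /n .*') do (
    rem Store the line before delayed expansion is switched on, so exclamation marks survive.
    set "line=%%a"
    setlocal enabledelayedexpansion
    rem Strip the number prefix, then build line2 with the marker removed.
    set "line=!line:*:=!"
    set "line2=!line:★★★★★=!"
    if not "!line2!" equ "!line!" (
        if not "!line2!" equ "★★★★★=" (
            rem Marker line: drop question marks and keep the title in a temp file.
            set "line2=!line2:?=!"
            >xxx.temp echo !line2!
        )
    )
    set /p filename=<xxx.temp
    rem The author appears to use the next comparison to catch empty lines.
    if "!line2!" equ "★★★★★=" (
        (echo,!line!)>>"!filename!.txt"
    )
    if "!line2!" equ "!line!" (
        (echo,!line!)>>"!filename!.txt"
    )
    endlocal
)
del xxx.temp
pause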

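I thought about it for a long time and still ended up using a temp file... it is not very general...
Batch newbie here, advice is welcome! Thanks!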


Let's handle the special characters as well:
@echo off
rem Reads the first line of this script into w; kept as posted.
set/p w=<%~fs0 >nul
set "s=★★★★★"
setlocal enabledelayedexpansion
for /f "tokens=*" %%i in (1.txt) do (
    set "str=%%~i"
    if "!str:~,5!" == "!s!" (
        rem Marker line: take the text after the marker and build a clean file name.
        set "file=!str:*%s%=!.txt"
        set "filename="
        call :loop "!file!"
    ) else if defined filename (>>"!filename!" echo;!str!)
)
pause & exit
rem Append the pieces between illegal characters to filename, one piece per call.
:loop
for /f tokens^=1*delims^=:\/*?^<^>^" %%a in ("%~1") do (
    set filename=!filename!%%a
    call :loop "%%b"
)
exit /b

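Reply to 10# qd2024
You can handle it like this:
1. Open the Word file you want to process;
2. In Save As, choose Plain Text as the file type; in the dialog, under text encoding, choose Other encoding and then GB2312, and confirm;
3. In the saved text, delete every line before the Module1 unit1 line, and delete the full-width spaces at the start of each line;
4. Save the code below in ANSI encoding and run the batch file.
@echo off
rem Record the line numbers of lines that start like a title, such as Module1 unit1.
findstr /n /rb "Module[0-9]*.unit[0-9]" 1.txt>1.log
for /f "delims=:" %%a in (1.log) do set _%%a=true
del 1.log
rem findstr /n numbers every line, so empty lines are kept as well.
for /f "tokens=1* delims=:" %%i in ('findstr /n .* 1.txt') do (
    rem A title line becomes the file name as is; illegal characters are not removed here.
    if defined _%%i set "filename=%%j.txt"
    set "str=%%j"
    setlocal enabledelayedexpansion
    echo,!str!>>!filename!
    endlocal
)
pause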

Reply to 1# qd2024

Download gawk ( http://bcn.bathome.net/tool/4.1.3/gawk.exe ), make sure both the text and the script are saved in ANSI encoding, then run the following to get the desired result:
gawk -F"^★★★★★" "/^★★★★★/{F_n=gensub(/[!&<>/\|:*?\"]+/,\"\",\"g\",$2)}F_n{print $0^>F_n}" 文本.txt

Reply to 13# qixiaobin0715


    It worked. Thank you so much!
