Chinese2PY: converting Chinese characters to pinyin

Posted 2015-8-9 23:20:58
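  Usage
  Chinese2PY -f file
  Chinese2PY "string"
  For polyphonic characters, every reading is printed (-_-||)
  Most of the data comes from here:
  http://pan.baidu.com/share/link?shareid=3895381237&uk=1124163200
  One gripe: this character table has over ten thousand characters, yet it somehow lacks "特"...
  So I had to find a Modern Chinese Dictionary (现代汉语词典) list online and merge it in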
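The lookup described in the usage notes can be sketched roughly as follows. This is only an illustration in Python, not the actual Chinese2PY code, which is not shown in this thread; the `LEXICON` table and the `to_pinyin` function are hypothetical names, and the three entries are a tiny made-up sample.

```python
# Hypothetical in-memory lexicon: character -> list of readings.
# The real Chinese2PY lexicon is far larger and its format is an assumption here.
LEXICON = {
    "日": ["ri"],
    "深": ["shen"],
    "殷": ["yan", "yin"],  # polyphonic: every reading is listed
}

def to_pinyin(text, lexicon=LEXICON, sep="|"):
    """Convert text character by character.

    Polyphonic characters get all readings joined by `sep`;
    characters missing from the lexicon pass through unchanged.
    """
    parts = []
    for ch in text:
        readings = lexicon.get(ch)
        parts.append(sep.join(readings) if readings else ch)
    return " ".join(parts)

print(to_pinyin("殷日深"))  # yan|yin ri shen
```

Passing unknown characters through unchanged is one possible choice; it makes gaps in the lexicon (such as the missing "特" mentioned above) easy to spot in the output.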
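exe download: http://pan.baidu.com/s/1c0DEBF6
Merged character table download: http://pan.baidu.com/s/1i3nLYjb
-------------
Optimized the code.
PS: Converting a 20 MB novel took 13 s, but when I opened the result the Chinese punctuation looked wrong. It turned out that npp (Notepad++) had detected the file as UTF-8; switching the encoding to GB2312 fixed it.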
Posted 2016-1-18 23:56:03
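It looks like this does not use any third-party command-line tools, so I expect it is not very efficient, and it also has to ship with a dictionary file. Still, people's ingenuity knows no bounds.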
Posted 2024-6-25 14:34:49
Last edited by 娜美 on 2024-7-2 21:41

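OP, is this project still maintained?
I looked around a lot and tried many tools, and yours still seems the best: it handles polyphonic characters, the pinyin lexicon is comprehensive, and many rare characters convert too.
But a few things could be improved.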

1. The OP's pinyin lexicon has some errors
http://pan.baidu.com/s/1i3nLYjb
  ○ ling
  慌 .huang表示难以忍受
  呒 
Also, some of the pinyin produced for single characters does not look like standard pinyin:
  be
  ceok
  ceom
  ceon
  ceor
  cis
  dem
  dim
  eo
  eol
  eos
  gib
  go
  hal
  hol
  hwa
  jou
  kal
  kos
  kweok
  meo
  myeo
  myeon
  myeong
  nem
  neus
  ngag
  ngai
  ngam
  nung
  oes
  ol
  on
  pak
  peol
  phas
  phdeng
  phoi
  phos
  ppun
  ram
  saeng
  sal
  sed
  sei
  seo
  seon
  sol
  tae
  tol
  uu
  zo
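I also sampled the output for some polyphonic characters. Some of those readings likewise do not look like standard pinyin, though fortunately the problems are minor and limited to rare characters.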

  髟 bia | bian | biao | piao | shankun        'shankun' is not standard pinyin
  欕 eom | yan        'eom' is not standard pinyin
  甴 gad | you | zha        'gad' is not standard pinyin
  哼 heng | hng        'hng' is not standard pinyin
  乧 dou | dul        'dul' is not standard pinyin
  櫷 gui | kwi        'kwi' is not standard pinyin
  浼 mei | mel        'mel' is not standard pinyin
  嗯 en | n | ng        'n' and 'ng' are not standard pinyin
  昷 on | wen        'on' is not standard pinyin
  挼 luo | rua | ruo | sui        'rua' is not standard pinyin
  乷 sal | sha        'sal' is not standard pinyin
  涁 lin | qin | sei | shen        'sei' is not standard pinyin
  垈 dai | tae        'tae' is not standard pinyin
  折 she | shw | ti | zhe        'shw' is not standard pinyin
  獤 dun | ton        'ton' is not standard pinyin
  膸 sui | wie        'wie' is not standard pinyin
  曱 yue | zad        'zad' is not standard pinyin
  咗 zo | zuo        'zo' is not standard pinyin
  褡 d | da        'd' is not standard pinyin
  乁 i | ji | yi        'i' is not standard pinyin
  瑁 mao | q        'q' is not standard pinyin
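A check like the one done by hand above could be automated by comparing each reading against a set of accepted syllables. The following is a minimal Python sketch under stated assumptions: `STANDARD_SYLLABLES` is only a small excerpt chosen for the demo, `READINGS` copies three entries from this post, and `find_nonstandard` is a hypothetical helper name.

```python
# Excerpt only; a real check would load a full list of accepted syllables.
STANDARD_SYLLABLES = {"gui", "wen", "zuo", "dou", "dai", "sha"}

# Readings copied from three of the entries listed in this post.
READINGS = {
    "櫷": ["gui", "kwi"],
    "昷": ["on", "wen"],
    "咗": ["zo", "zuo"],
}

def find_nonstandard(readings_by_char, standard):
    """Return {character: [readings not found in `standard`]}."""
    report = {}
    for char, readings in readings_by_char.items():
        bad = [r for r in readings if r not in standard]
        if bad:
            report[char] = bad
    return report

report = find_nonstandard(READINGS, STANDARD_SYLLABLES)
for char, bad in report.items():
    print(char, ", ".join(bad))
```

The result depends entirely on which syllable list is treated as the standard, so the flagged readings are candidates for review rather than confirmed errors.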

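2. For polyphonic characters, it would be better to separate the readings with a comma ","; the extra spacing makes them easier to tell apart, and wrapping them in brackets makes it easier still.
  Example: these are the readings of a single character; with some spacing they are easy to tell apart
  【yan,yao,yin】 ri shen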

When the readings are separated with "|", they sit too close together and the output is dizzying to read:
  huang|kang yan|yao|yin ri shen
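The bracketed, comma-separated style suggested in point 2 could be produced by a small formatting step. This is a Python sketch only; `format_readings` is a hypothetical helper, and its input (one list of readings per character) is an assumed intermediate form, not something Chinese2PY is known to expose.

```python
def format_readings(per_char_readings):
    """Join per-character readings into one line.

    Characters with several readings are wrapped in 【】 and
    comma-separated; single readings are printed bare.
    """
    out = []
    for readings in per_char_readings:
        if len(readings) > 1:
            out.append("【" + ",".join(readings) + "】")
        else:
            out.append(readings[0])
    return " ".join(out)

print(format_readings([["yan", "yao", "yin"], ["ri"], ["shen"]]))
# 【yan,yao,yin】 ri shen
```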
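3. I suggest that whoever takes over maintenance separate the pinyin lexicon from the code, so the lexicon can easily be edited, corrected, and extended.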

  A few lines to add on top of the OP's pinyin lexicon:

  慌 huang
  欸 ei
  睖 ling
  碐 ling
  稜 ling
  羐 ling
  誒 ei
  诶 ei
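Keeping the lexicon in a separate text file, as point 3 suggests, might look like the sketch below. It is written in Python and assumes one entry per line in the form "character reading [reading ...]", which matches the lines above but is otherwise a guess; `parse_lexicon`, `load_lexicon`, and the GB2312 default encoding are all assumptions.

```python
def parse_lexicon(lines):
    """Build {character: [readings]} from 'char reading ...' lines.

    Blank or incomplete lines are skipped; repeated entries for the
    same character are merged without duplicating readings.
    """
    lexicon = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        char, readings = fields[0], fields[1:]
        bucket = lexicon.setdefault(char, [])
        for reading in readings:
            if reading not in bucket:
                bucket.append(reading)
    return lexicon

def load_lexicon(path, encoding="gb2312"):
    """Read a lexicon file; the encoding default is an assumption."""
    with open(path, encoding=encoding) as handle:
        return parse_lexicon(handle)

sample = ["慌 huang", "欸 ei", "", "诶 ei", "慌 huang"]
print(parse_lexicon(sample))
```

With the data in its own file, corrections such as the added lines above become plain text edits and need no rebuild of the program.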
Here are 420 standard pinyin syllables, which should cover essentially all rare characters.

  "a","wen","ming","hua","wei","xiao","hai","guo","hong","jun","yu","jian","chun","ping","zhi","lin","yun","jin","rong","yong","xin","dong","ying","cheng","li","long","de","feng","jie","fang","hui","qing","zhong","min","sheng","guang","qiang","yan","xiang","xiu","ling","fei","liang","jia","xing","mei","bao","xue","bo","bin","ya","jiang","peng","chao","xia","rui","fu","zheng","zhen","lan","song","an","juan","tao","qiu","gang","jing","zi","shi","chang","yuan","yi","bing","tian","qin","wu","xu","ze","yang","quan","you","hao","gui","kai","qun","yue","ning","ai","ren","si","shun","xian","pei","shan","gen","da","kun","yin","dan","shu","chuan","lian","xi","ting","fen","ji","zong","na","meng","chen","fa","xiong","cai","shao","qi","ke","le","ru","lei","kang","he","yao","zhao","wan","heng","hu","ju","mao","han","nan","shuang","qiong","gao","en","lai","cui","zeng","sen","shui","zhe","zhang","dao","su","huai","zu","fan","qiao","ye","shou","qian","cun","wang","run","kui","huan","ding","cong","ran","tong","zhan","zhou","jiao","zhu","ben","e","bi","bai","zhuo","nian","jiu","lu","lun","di","chong","xuan","tie","shang","shuai","ni","biao","man","hang","ruo","ri","deng","can","guan","tai","tang","liu","nai","bang","hou","neng","er","xun","zuo","san","kuan","mu","gong","miao","chu","teng","shen","sai","pin","bei","dian","dai","pu","zai","sha","duo","ceng","suo","chan","zan","ge","shuo","geng","jiong","huang","duan","zhuang","nv","huo","chi","pan","lie","she","ci","bu","gai","jue","tuan","dun","gan","lang","nong","gu","luo","kong","sun","po","chuang","ce","zun","kan","pi","lou","mou","cen","ma","cang","mi","dang","te","tu","lv","ang","sui","ti","kou","dui","nuo","mang","ao","dou","ou","shuan","niu","rang","la","mo","die","zhuan","rou","sang","kuo","xie","ka","du","luan","ku","mian","zao","chai","tuo","cao","wa","qu","tan","zhun","mai","kao","chui","bian","nuan","keng","piao","wai","kuang","cha","ban","kuai","fo","diao","sa","ba","liao","rao","men","leng","ta","lao","zuan","pian","che","zhai","ha","pai","gou","wo","tou","zhui","nen","se","re","sao","beng","nie","qia","ga","hei","pang","niang","zui","chou","niao","zou","weng","zha","que","sou","qie","nei","tiao","ken","cuo","tui","nao","tun","hun","nu","hen","shai","reng","ruan","nang","me","miu","cou","ne","suan","pao","o","gun","pie","guai","bie","pen","gua","cu","mie","pa","seng","gei","kua","zang","za","fou","zhuai","diu","cuan","zhua","ca","ei","chuo","yo","shua","pou","nin","zei","chuai","zen","lo","nou","dei","den","ron","chua","dia","eng","lia","ho","ki","ko","so","to","ra","ro","tei","lue","nue","nun","shei","zhei","lve","nve"
I wonder whether anyone would be willing to take over maintaining and updating this project.