Posted on 2018-1-30 11:00:14
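Update:
1. chdir to my path, or failing that to the OP's path, so the same code runs unchanged both in my directory and in the OP's environment:
    chdir '.\TEDTXTUNICODE' or
    chdir 'C:\Users\CH6\Desktop\TEDTXTUNICODE' or quit( $! );
2. Added timestamp matching: lines are merged only when their times correspond. If a time has no match, the script prints "missing $time at $filename".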
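The matching in item 2 can be tried in isolation with a minimal sketch like the one below. It runs on in-memory lines instead of files; the helper name index_by_time, the timestamps, and the "-->" timeline layout are assumptions made for illustration, not taken from the real subtitle files.

```perl
use strict;
use warnings;
use utf8;
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical sample lines standing in for one English and one Chinese file.
my @eng = (
    "00:00:50,000 --> 00:00:52,000", "Now, I know what you're thinking.",
    "00:00:54,000 --> 00:00:56,000", "Well, sometimes you have to get captured",
);
my @chs = (
    "00:00:50,000 --> 00:00:52,000", "我知道你在想什么",
);

# Map each timeline's start time (hh:mm:ss) to the text line that follows it.
sub index_by_time {
    my @lines = @_;
    my %text;
    for my $i ( 0 .. $#lines - 1 ) {
        $text{$1} = $lines[ $i + 1 ]
            if $lines[$i] =~ /(\d+:\d+:\d+).*\d+:\d+:\d+/;
    }
    return \%text;
}

my ( $en, $cn ) = ( index_by_time(@eng), index_by_time(@chs) );
my ( @merged, @missing );
for my $time ( sort keys %$en ) {
    if ( exists $cn->{$time} ) { push @merged,  "$en->{$time} $cn->{$time}" }
    else                       { push @missing, $time }
}
print "$_\n"          for @merged;
print "missing $_\n"  for @missing;
```

Keying both sides by the start time means a cue that exists in only one file is reported instead of shifting every later line out of step.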
=info
523066680@163.com
Match timestamps, improve output messages
=cut
use Encode;
use File::Basename;
use Term::ReadKey;
use IO::Handle;    # provides STDOUT->autoflush on Perl older than 5.14
STDOUT->autoflush(1);

# Try my directory first, then fall back to the OP's path.
chdir '.\TEDTXTUNICODE' or
chdir 'C:\Users\CH6\Desktop\TEDTXTUNICODE' or quit( $! );

my $path_eng = '.\eng1246';
my $path_chs = '.\chs1203';
my $path_merge = '.\merge';
mkdir $path_merge unless -e $path_merge;

my ($en, $merge);
# Merge every Chinese file that has an English file of the same name.
for my $cn ( glob "$path_chs\\*.txt" )
{
    $en = "$path_eng\\". basename($cn);
    $merge = "$path_merge\\". basename($cn);
    merge( $en, $cn, $merge ) if ( -e $en );
}
quit("Done");
sub merge
{
    my ( $en, $cn, $merge ) = @_;
    my ( %ha, %hb, $mix );
    print "Processing $merge\n";
    load( \%ha, $en );
    load( \%hb, $cn );
    $mix = "";
    # Walk the English timestamps; merge only when the Chinese side has the same one.
    for my $time ( sort keys %ha )
    {
        unless ( exists $hb{$time} )
        {
            print " missing $time at $cn\n";
            next;
        }
        $mix .= $ha{$time} ." ". $hb{$time} ."\r\n";
    }
    # Write the result as UTF-16LE, preceded by a BOM.
    open my $fh, ">:raw", $merge or quit( "Cannot write $merge: $!" );
    print $fh "\xff\xfe". encode('utf16-le', $mix);
    close $fh;
}
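sub load
{
    my ( $href, $file ) = @_;
    open my $fh, "<:encoding(utf16-le)", $file or quit( "Cannot read $file: $!" );
    my @arr = <$fh>;
    close $fh;
    # A timeline line contains two hh:mm:ss times; key the line after it by the first time.
    for my $id ( 0 .. $#arr - 1 )
    {
        if ( $arr[$id] =~/(\d+:\d+:\d+).*\d+:\d+:\d+/ )
        {
            $href->{$1} = $arr[$id+1];
            $href->{$1} =~s/\r?\n//;
        }
    }
}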
sub quit
{
    print $_[0];
    ReadKey -1;
    exit;
}
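As an aside on the output step in merge: the script writes the BOM bytes by hand and calls encode itself. A possible alternative, sketched here and not part of the script above, lets a PerlIO encoding layer do the conversion and prints the BOM as the character U+FEFF. The temporary file and the sample string are assumptions for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; the real script writes into .\merge instead.
my ( $tmp, $path ) = tempfile();
close $tmp;

# :raw first, so "\r\n" is written exactly as given; the encoding layer
# then turns characters into UTF-16LE bytes.
open my $out, '>:raw:encoding(UTF-16LE)', $path or die "open $path: $!";
print {$out} "\x{FEFF}", "Hello \x{4F60}\x{597D}\r\n";   # BOM, then the text
close $out or die "close $path: $!";

# Read the raw bytes back to check what landed on disk.
open my $in, '<:raw', $path or die "open $path: $!";
my $bytes = do { local $/; <$in> };
close $in;
unlink $path;

printf "first two bytes: %s\n", join ' ', map { sprintf '%02X', ord } split //, substr( $bytes, 0, 2 );
printf "total bytes: %d\n", length $bytes;
```

Both approaches should produce the same bytes; the layer version simply keeps the encoding decision in the open call.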
Test with the Thor subtitles (one timestamp removed by hand):
Processing .\merge\Thor.txt
missing 00:00:54 at .\chs1203\Thor.txt
.\merge\Thor.txt
Now, I know what you're thinking. 我知道你在想什么
"Oh, no! Thor's in a cage. How did this happen?" 不 托尔被关在笼子里了 怎么回事
Well, sometimes you have to get captured 有时 你得先被抓住
just to get a straight answer out of somebody. 才能从某人那里问出个所以然来
It's a long story, but basically, I'm a bit of a hero. 说来话长 但其实 我算是个英雄
Sharing a sample package of the Thor subtitles, including the directory structure.