读取文件名
Perl 不会尝试解码内置函数或模块返回的文件名。表示文件名的此类字符串应始终显式解码,以便 Perl 将它们识别为 Unicode。
use v5.14;
use Encode qw(decode_utf8);
# Ensure that possible error messages printed to screen are converted to UTF-8.
# For this to work: Check that you terminal emulator is using UTF-8.
binmode STDOUT, ':utf8';
binmode STDERR, ':utf8';
# Example1 -- using readdir()
my $dir = '.';
opendir(my $dh, $dir) or die "Could not open directory '$dir': $!";
while (my $filename = decode_utf8(readdir $dh)) {
# Do something with $filename
}
close $dh;
# Example2 -- using getcwd()
use Cwd qw(getcwd);
my $dir = decode_utf8( getcwd() );
# Example3 -- using abs2rel()
use File::Spec;
use utf8;
my $base = 'ø';
my $path = "$base/b/æ";
my $relpath = decode_utf8( File::Spec->abs2rel( $path, $base ) );
# Note: If you omit $base, you need to encode $path first:
use Encode qw(encode_utf8);
my $relpath = decode_utf8( File::Spec->abs2rel( encode_utf8( $path ) ) );
# Example4 -- using File::Find::Rule (part1 matching a filename)
use File::Find::Rule;
use utf8;
use Encode qw(encode_utf8);
my $filename = 'æ';
# File::Find::Rule needs $filename to be encoded
my @files = File::Find::Rule->new->name( encode_utf8($filename) )->in('.');
$_ = decode_utf8( $_ ) for @files;
# Example5 -- using File::Find::Rule (part2 matching a regular expression)
use File::Find::Rule;
use utf8;
my $pat = '[æ].$'; # Unicode pattern
# Note: In this case: File::Find::Rule->new->name( qr/$pat/ )->in('.')
# will not work since $pat is Unicode and filenames are bytes
# Also encoding $pat first will not work correctly
my @files;
File::Find::Rule->new->exec( sub { wanted( $pat, \@files ) } )->in('.');
$_ = decode_utf8( $_ ) for @files;
sub wanted {
my ( $pat, $files ) = @_;
my $name = decode_utf8( $_ );
my $full_name = decode_utf8( $File::Find::name );
push @$files, $full_name if $name =~ /$pat/;
}
注意:如果你担心文件名中的 UTF-8 无效,则上述示例中 decode_utf8( ... )
的使用应该可以替换为 decode( 'utf-8', ... )
。这是因为 decode_utf8( ... )
是 decode( 'utf8', ... )
的同义词,编码 utf-8
和 utf8
之间存在差异(请参阅下面的备注以获取更多信息),其中 utf-8
对 utf8
的接受程度更严格。