source: trunk/third/perl/lib/utf8.pm @ 14545

Revision 14545, 1.8 KB checked in by ghudson, 24 years ago (diff)
This commit was generated by cvs2svn to compensate for changes in r14544, which included commits to RCS files with non-trunk default branches.
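package utf8;

# Bit in $^H (the compile-time hints variable) that marks utf8 as enabled.
$utf8::hint_bits = 0x00800000;

# "use utf8": set the hint bit for the scope currently being compiled.
sub import {
    $^H |= $utf8::hint_bits;
    # Record an optional argument, keyed by the calling package.
    $enc{caller()} = $_[1] if $_[1];
}

# "no utf8": clear the hint bit again.
sub unimport {
    $^H &= ~$utf8::hint_bits;
}

# Load the rest of the implementation on first use, then re-dispatch
# to the sub that was originally called.
sub AUTOLOAD {
    require "utf8_heavy.pl";
    goto &$AUTOLOAD;
}

1;
__END__
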
=head1 NAME

utf8 - Perl pragma to enable/disable UTF-8 in source code

=head1 SYNOPSIS

    use utf8;
    no utf8;

=head1 DESCRIPTION

WARNING: The implementation of Unicode support in Perl is incomplete.
See L<perlunicode> for the exact details.

The C<use utf8> pragma tells the Perl parser to allow UTF-8 in the
program text in the current lexical scope.  The C<no utf8> pragma
tells Perl to switch back to treating the source text as literal
bytes in the current lexical scope.

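As a minimal sketch of the lexical scoping described above (it assumes
that the file is itself saved as UTF-8 and that the accented letter is
stored as a single precomposed character; the variable names are
illustrative only), the same literal can be measured in characters
inside C<use utf8> and in bytes inside C<no utf8>:

    my ($chars, $bytes);
    {
        use utf8;                   # literal is read as UTF-8 text
        $chars = length("é");       # counts one character
    }
    {
        no utf8;                    # literal is read as raw bytes
        $bytes = length("é");       # counts the two encoded bytes
    }
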
This pragma is primarily a compatibility device.  Perl versions
earlier than 5.6 allowed arbitrary bytes in source code, whereas
in the future we would like to standardize on the UTF-8 encoding for
source text.  Until UTF-8 becomes the default format for source
text, use this pragma so that Perl recognizes UTF-8 in the source.
When UTF-8 becomes the standard source format, this pragma will
effectively become a no-op.

Enabling the C<utf8> pragma has the following effects:

=over

=item *

Bytes in the source text that have their high bit set will be treated
as being part of a literal UTF-8 character.  This includes most literals
such as identifiers, string constants, constant regular expression patterns
and package names.

=item *

In the absence of inputs marked as UTF-8, regular expressions within the
scope of this pragma will default to using character semantics instead
of byte semantics.

    @bytes_or_chars = split //, $data;  # may split to bytes if
                                        # $data isn't UTF-8
    {
        use utf8;                       # force char semantics
        @chars = split //, $data;       # splits characters
    }

=back

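The first effect can be sketched as follows, assuming a UTF-8 encoded
source file; the names C<$café> and C<@letters> are hypothetical and
chosen only for illustration:

    use utf8;
    my $café    = "crème";              # non-ASCII identifier and string
    my @letters = split //, $café;      # five characters, not six bytes
    print "match\n" if $café =~ /è/;    # constant pattern read as UTF-8
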
=head1 SEE ALSO

L<perlunicode>, L<bytes>

=cut