=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlhacktut.pod

=head1 NAME

perlhacktut - Walk through the creation of a simple C code patch

=head1 DESCRIPTION

This document takes you through a simple patch example.

If you haven't read L<perlhack> yet, go do that first! You might also
want to read through L<perlsource> too.

Once you're done here, check out L<perlhacktips> next.

=head1 EXAMPLE OF A SIMPLE PATCH

Let's take a simple patch from start to finish.

Here's something Larry suggested: if a C<U> is the first active format
during a C<pack> (for example, C<pack "U3C8", @stuff>), then the
resulting string should be treated as UTF-8 encoded.
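
To make "UTF-8 encoded" concrete, here is a minimal standalone sketch of
how a code point above 127 becomes several bytes. The function
C<sketch_utf8_encode> is a hypothetical helper written for this
illustration; it is not part of the Perl source, and it only handles
code points below 0x10000.

```c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical helper, not part of the Perl source: encode one code
     * point (below 0x10000, for brevity; surrogates are not rejected)
     * as UTF-8 into out[] and return the number of bytes written. */
    size_t sketch_utf8_encode(unsigned long cp, unsigned char *out)
    {
        assert(out != NULL);
        if (cp < 0x80) {                 /* one byte: 0xxxxxxx */
            out[0] = (unsigned char)cp;
            return 1;
        }
        if (cp < 0x800) {                /* two bytes: 110xxxxx 10xxxxxx */
            out[0] = (unsigned char)(0xC0 | (cp >> 6));
            out[1] = (unsigned char)(0x80 | (cp & 0x3F));
            return 2;
        }
        assert(cp < 0x10000);            /* three bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }

```

With this sketch, 300 encodes to the two bytes 0xC4 0xAC and 4000 to the
three bytes 0xE0 0xBE 0xA0, while 1 and 20 stay as single bytes.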

If you are working with a git clone of the Perl repository, you will
want to create a branch for your changes. This will make creating a
proper patch much simpler. See L<perlgit> for details on how to do
this.

=head2 Writing the patch

How do we prepare to fix this up? First we locate the code in question
- the C<pack> happens at runtime, so it's going to be in one of the
F<pp> files. Sure enough, C<pp_pack> is in F<pp.c>. Since we're going
to be altering this file, let's copy it to F<pp.c~>.

[Well, it was in F<pp.c> when this tutorial was written. It has now
been split off with C<pp_unpack> to its own file, F<pp_pack.c>]

Now let's look over C<pp_pack>: we take a pattern into C<pat>, and then
loop over the pattern, taking each format character in turn into
C<datumtype>. Then for each possible format character, we swallow up
the other arguments in the pattern (a field width, an asterisk, and so
on) and convert the next chunk of input into the specified format,
adding it onto the output SV C<cat>.
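
The shape of that loop can be sketched in standalone C. This is only an
illustration under stated assumptions, not the code in C<pp_pack>: the
C<sketch_item> structure and the C<sketch_parse> function are
hypothetical names, and the sketch records each format character with
its count or asterisk instead of packing any data.

```c

    #include <assert.h>
    #include <ctype.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical record for one parsed format item; not a Perl type. */
    struct sketch_item {
        char type;   /* the format character, e.g. 'U' or 'C' */
        int  star;   /* 1 if the format was followed by '*' */
        long len;    /* explicit field width, or 1 if none was given */
    };

    /* Walk the pattern, fill items[], return the number of items found. */
    size_t sketch_parse(const char *pat, struct sketch_item *items, size_t max)
    {
        const char *patend;
        size_t n = 0;

        assert(pat != NULL && items != NULL);
        patend = pat + strlen(pat);
        while (pat < patend && n < max) {
            int datumtype = (unsigned char)*pat++;
            if (isspace(datumtype))
                continue;
            items[n].type = (char)datumtype;
            items[n].star = 0;
            items[n].len  = 1;
            if (pat < patend && *pat == '*') {
                /* swallow the asterisk */
                items[n].star = 1;
                pat++;
            }
            else if (pat < patend && isdigit((unsigned char)*pat)) {
                /* swallow the field width */
                long len = 0;
                while (pat < patend && isdigit((unsigned char)*pat))
                    len = len * 10 + (*pat++ - '0');
                items[n].len = len;
            }
            n++;
        }
        return n;
    }

```

For the pattern C<"U3C8"> this sketch yields two items: a C<U> with
width 3 and a C<C> with width 8.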

How do we know if the C<U> is the first format in the C<pat>? Well, if
we have a pointer to the start of C<pat> then, if we see a C<U> we can
test whether we're still at the start of the string. So, here's where
C<pat> is set up:

    STRLEN fromlen;
    char *pat = SvPVx(*++MARK, fromlen);
    char *patend = pat + fromlen;
    I32 len;
    I32 datumtype;
    SV *fromstr;

We'll have another string pointer in there:

    STRLEN fromlen;
    char *pat = SvPVx(*++MARK, fromlen);
    char *patend = pat + fromlen;
 +  char *patcopy;
    I32 len;
    I32 datumtype;
    SV *fromstr;

And just before we start the loop, we'll set C<patcopy> to be the start
of C<pat>:

    items = SP - MARK;
    MARK++;
    SvPVCLEAR(cat);
 +  patcopy = pat;
    while (pat < patend) {

Now if we see a C<U> which was at the start of the string, we turn on
the C<UTF8> flag for the output SV, C<cat>:

 +  if (datumtype == 'U' && pat==patcopy+1)
 +      SvUTF8_on(cat);
    if (datumtype == '#') {
        while (pat < patend && *pat != '\n')
            pat++;

Remember that it has to be C<patcopy+1> because the first character of
the string is the C<U> which has been swallowed into C<datumtype>!

Oops, we forgot one thing: what if there are spaces at the start of the
pattern? C<pack("  U*", @stuff)> will have C<U> as the first active
character, even though it's not the first thing in the pattern. In this
case, we have to advance C<patcopy> along with C<pat> when we see
spaces:

    if (isSPACE(datumtype))
        continue;

needs to become

    if (isSPACE(datumtype)) {
        patcopy++;
        continue;
    }
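
Put together, the pointer bookkeeping can be tried out in isolation. The
following is a minimal standalone sketch, assuming a plain C string in
place of the pattern SV and an C<int> flag in place of
C<SvUTF8_on(cat)>; the function name C<sketch_first_active_is_U> is
hypothetical and does not appear in the Perl source.

```c

    #include <assert.h>
    #include <ctype.h>
    #include <string.h>

    /* Hypothetical stand-in for the patched loop: returns 1 if the first
     * active format in pat is 'U', and 0 otherwise. */
    int sketch_first_active_is_U(const char *pat)
    {
        const char *patend;
        const char *patcopy;
        int utf8_flag = 0;      /* stands in for SvUTF8_on(cat) */

        assert(pat != NULL);
        patend  = pat + strlen(pat);
        patcopy = pat;
        while (pat < patend) {
            int datumtype = (unsigned char)*pat++;
            /* pat has already moved past the 'U', hence patcopy + 1 */
            if (datumtype == 'U' && pat == patcopy + 1)
                utf8_flag = 1;
            if (isspace(datumtype)) {
                patcopy++;      /* keep patcopy level with pat */
                continue;
            }
            /* a real loop would swallow counts and pack data here */
        }
        return utf8_flag;
    }

```

In this sketch, C<"U*"> and C<"  U*"> both set the flag, while
C<"C0U*"> does not, because C<patcopy> falls behind C<pat> as soon as a
non-space character has been consumed.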

OK. That's the C part done. Now we must do two additional things before
this patch is ready to go: we've changed the behaviour of Perl, and so
we must document that change. We must also provide some more regression
tests to make sure our patch works and doesn't create a bug somewhere
else along the line.

=head2 Testing the patch

The regression tests for each operator live in F<t/op/>, and so we make
a copy of F<t/op/pack.t> to F<t/op/pack.t~>. Now we can add our tests
to the end. First, we'll test that the C<U> does indeed create Unicode
strings.

F<t/op/pack.t> has a sensible C<ok()> function, but if it didn't we
could use the one from F<t/test.pl>:

 require './test.pl';
 plan( tests => 159 );

So instead of this:

 print 'not ' unless "1.20.300.4000" eq sprintf "%vd",
                                               pack("U*",1,20,300,4000);
 print "ok $test\n"; $test++;

we can write the more sensible version below (see L<Test::More> for a
full explanation of C<is()> and other testing functions):

 is( sprintf("%vd", pack("U*",1,20,300,4000)), "1.20.300.4000",
                                       "U* produces Unicode" );

Now we'll test that we got that space-at-the-beginning business right:

 is( sprintf("%vd", pack("  U*",1,20,300,4000)), "1.20.300.4000",
                                     "  with spaces at the beginning" );

And finally we'll test that we don't make Unicode strings if C<U> is
B<not> the first active format:

 isnt( sprintf("%vd", pack("C0U*",1,20,300,4000)), "1.20.300.4000",
                                       "U* not first isn't Unicode" );

We mustn't forget to change the number of tests at the top of the file,
or else the automated tester will get confused. The line will look
either like this:

 print "1..156\n";

or this:

 plan( tests => 156 );

We now compile up Perl, and run it through the test suite. Our new
tests pass, hooray!

=head2 Documenting the patch

Finally, the documentation. The job is never done until the paperwork
is over, so let's describe the change we've just made. The relevant
place is F<pod/perlfunc.pod>; again, we make a copy, and then we'll
insert this text in the description of C<pack>:

 =item *

 If the pattern begins with a C<U>, the resulting string will be treated
 as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a string
 with an initial C<U0>, and the bytes that follow will be interpreted as
 Unicode characters. If you don't want this to happen, you can begin
 your pattern with C<C0> (or anything else) to force Perl not to UTF-8
 encode your string, and then follow this with a C<U*> somewhere in your
 pattern.

=head2 Submit

See L<perlhack> for details on how to submit this patch.

=head1 AUTHOR

This document was originally written by Nathan Torkington, and is
maintained by the perl5-porters mailing list.
