NAME
    Lingua::JA::NormalizeText - text normalizer

SYNOPSIS
      use Lingua::JA::NormalizeText;
      use utf8;

      my @options = ( qw/nfkc decode_entities/, \&dearinsu_to_desu );
      my $normalizer = Lingua::JA::NormalizeText->new(@options);

      print $normalizer->normalize('鳥が㌧㌦でありんす&hearts;');
      # -> 鳥がトンドルです♥

      sub dearinsu_to_desu
      {
          my $text = shift;
          $text =~ s/でありんす/です/g;

          return $text;
      }

    # or

      use Lingua::JA::NormalizeText qw/nfkc decode_entities/;
      use utf8;

      my $text = '鳥が㌧㌦でありんす&hearts;';
      print dearinsu_to_desu( decode_entities( nfkc($text) ) );
      # -> 鳥がトンドルです♥

      sub dearinsu_to_desu
      {
          my $text = shift;
          $text =~ s/でありんす/です/g;

          return $text;
      }

DESCRIPTION
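    Lingua::JA::NormalizeText normalizes Japanese text by applying a
    user-selected sequence of normalization functions, such as Unicode
    normalization, HTML entity decoding, and kana and character-width
    conversion.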

METHODS
  new(@options)
    Creates a new Lingua::JA::NormalizeText instance.

    The following options are available.

      OPTION                 SAMPLE INPUT        OUTPUT FOR SAMPLE INPUT
      ---------------------  ------------------  -----------------------
      lc                     DdD                 ddd
      uc                     DdD                 DDD
      nfkc                   ㌦                  ドル (length: 2)
      nfkd                   ㌦                  ドル (length: 3)
      nfc
      nfd
      decode_entities        &hearts;            ♥
      strip_html             <em>あ</em>         あ
      alnum_z2h              ＡＢＣ１２３        ABC123
      alnum_h2z              ABC123              ＡＢＣ１２３
      space_z2h
      space_h2z
      katakana_z2h           ハァハァ            ﾊｧﾊｧ
      katakana_h2z           ｽｰﾊｰｽｰﾊｰ            スーハースーハー
      katakana2hiragana      パンツ              ぱんつ
      hiragana2katakana      ぱんつ              パンツ
      unify_3dots            �胯��������          �胯����
      wave2tilde             〜                  ～
      tilde2wave             ～                  〜
      wavetilde2long         〜, ～              ー
      wave2long              〜                  ー
      tilde2long             ～                  ー
      fullminus2long         ���                   ��
      dashes2long            ��                   ��
      drawing_lines2long     ��                   ��
      unify_long_repeats     ヴァーーー          ヴァー
      nl2space               \n                  (space)
      unify_long_spaces      (space)(space)      (space)
      remove_head_space      (space)あ(space)あ  あ(space)あ
      remove_tail_space      ああ(space)(space)  ああ
      modernize_kana_usage   ゐヰゑヱ            いイえエ
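
    The length difference noted for the nfkc and nfkd rows can be checked
    with the core Unicode::Normalize module. This is only a sketch: it
    assumes the nfkc and nfkd options behave like NFKC and NFKD from that
    module, and it uses U+3326 (SQUARE DORU) as an illustrative
    compatibility character.

```perl
use strict;
use warnings;
use Unicode::Normalize qw/NFKC NFKD/;

# U+3326 SQUARE DORU, a single compatibility character (assumed sample).
my $doru = "\x{3326}";

# NFKC expands it and keeps the voiced kana precomposed:
# U+30C9, U+30EB.
my $composed = NFKC($doru);

# NFKD expands it and splits off the combining voiced sound mark:
# U+30C8, U+3099, U+30EB.
my $decomposed = NFKD($doru);

printf "nfkc length: %d\n", length $composed;
printf "nfkd length: %d\n", length $decomposed;
```

    Both results display the same way, so the difference matters mainly
    when later steps compare or match strings.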

    Options are applied in the order in which they appear in @options:
    the first element is applied first, and the last element is applied
    last.

    User-defined functions can also be added by passing code references.
    (See the dearinsu_to_desu function in the SYNOPSIS section.)
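
    Because each step receives the output of the previous one, the order
    can change the result. The sketch below does not use this module; it
    imitates the pipeline with a hypothetical apply_in_order helper, the
    core Unicode::Normalize module, and a hypothetical user-defined step
    named doru_to_dollar, to show why a custom function may need to come
    after nfkc.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Normalize qw/NFKC/;

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical stand-in for the pipeline: run each step in list order.
sub apply_in_order
{
    my ($text, @steps) = @_;
    $text = $_->($text) for @steps;

    return $text;
}

# Hypothetical user-defined step, in the style of dearinsu_to_desu.
sub doru_to_dollar
{
    my $text = shift;
    $text =~ s/\x{30C9}\x{30EB}/dollar/g;   # katakana "doru"

    return $text;
}

my $nfkc  = sub { NFKC(shift) };
my $input = "\x{3326}";                      # U+3326 SQUARE DORU

# nfkc first: the square character is expanded, then replaced.
my $first = apply_in_order($input, $nfkc, \&doru_to_dollar);

# Custom step first: nothing matches yet, so only the expansion happens.
my $second = apply_in_order($input, \&doru_to_dollar, $nfkc);

print "$first\n";
print "$second\n";
```

    With the module itself, the equivalent choice is the position of the
    code reference within the list passed to new.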

  normalize($text)
    Applies the options passed to new, in order, to $text and returns the
    normalized text.

AUTHOR
    pawa <pawapawa@cpan.org>

SEE ALSO

LICENSE
    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.