Skynet

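Which scripting tool for which job:
1. shell: coordinates and runs the other scripts, combined sensibly with commands such as crontab, ps -ef and kill.
2. perl: short, quick processing jobs.
3. python: anything with more complex structure and logic.

This post covers Perl command-line one-liners, aiming to keep them short and fast.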

# The simplest case

$ perl -e 'print "Hello World\n"'

# Process a file line by line

$ perl -n -e 'print $_' file1

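# Encoding conversion

# decode("GBK",$_) turns GBK bytes into Perl characters; encode("UTF-8", ...) writes them back out as UTF-8, the default on Linux

perl -MEncode -ne 'print encode("UTF-8", decode("GBK",$_));' file.txt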

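# Using regular expressions
# if($_=~/.*\/(.*)$/){ print $1 ;}  capture variables such as $1 are one of Perl's big conveniences

# next LINE jumps to the next iteration of the implicit -n loop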

$ perl -n -e 'next LINE unless /pattern/; print $_'

# Strip the newline. chomp is the usual tool; splitting on \n also works:

perl -e 'print split(/\n/,"asg\n");'

# BEGIN and END blocks, as in awk (this one counts the words in a file)

$ perl -ne 'END { print "$t\n" } @w = /(\w+)/g; $t += @w' file.txt

# Split lines into fields, like awk -F"x"

# -a turns on autosplit mode

# @F holds the fields after splitting

perl -F'\t' -ane '
if($F[1]=~/侃侃/ and $F[2]=~/爱情啊/){
  print "$F[3]\t$F[4]\t$F[5]\n"
}
' all2_data.sort.st

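A real-world example:

perl -F'\|\|' -ane '
my ($actor, $music);          # parentheses are needed to declare both variables
if  ( $F[3] ){                # four fields: title and artist are in fields 3 and 4
  $music=$F[2];
  $actor=$F[3];
}else{
  $music=$F[0];
  $actor=$F[1];
}
  $music =~ tr/A-Z/a-z/;      # lowercase; tr takes character ranges, not brackets
  $music =~ s/^\s+|\s+$//g;   # trim leading and trailing whitespace, newline included
  $actor =~ tr/A-Z/a-z/;
  $actor =~ s/^\s+|\s+$//g;
print "$actor-$music\n";
' ring.utf8.txt  |sort -u  > ring.actor_music.sort.utf8.txt
# run in the foreground so that wc sees the finished file
wc -l ring.actor_music.sort.utf8.txt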

# Substitute like sed
# -i works like sed -i: Perl's output replaces the original file.txt

$ perl -i -pe 's/\bPHP\b/Perl/g' file.txt

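# Passing parameters from outside
# With -n every remaining argument is treated as an input file, so take the extra ones off @ARGV in a BEGIN block

perl -ne 'BEGIN { ($p1, $p2) = splice(@ARGV, 1) } print "$p1\t$p2\n"' file.txt 'par1' 'par2'
# Result: one "par1	par2" line for each line of file.txt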

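# Count how often each value occurs in a column and list the counts
# (cut -d takes exactly one character; tab is already the default delimiter)
cut -f 2 .collection_mobile.data |perl -ne '
   BEGIN{ %a =(); }
   chomp;
   $a{$_}+=1;
   END{
     while (($key,$value)=each(%a)){print $key,"=",$value,"\n";};
   }
'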

Result:
Ring=532895
CRBT=68500
RingBoxes=880
Song=96765

# Some real-world uses :)

find . -name "*.mp3" | perl -pe 's/.\/\w+-(\w+)-.*/$1/' | sort | uniq
perl -F'\t' -ane 'BEGIN { ($p1, $p2) = splice(@ARGV, 1) } if($F[1]=~/$p1/ and $F[2]=~/$p2/){print "$F[3]\t$F[4]\t$F[5]\n"}' all2_data.sort.st '侃侃' '爱情啊'

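# Combined with find: perl -e takes the file name from $ARGV[0] to batch-convert Excel files to text
find . -maxdepth 1 -name "*.xls" -exec perl -e '
require("/home/xj_liukaiyi/src/perl/excel/excelUtil.pl");   # the author'"'"'s own helper, which provides parse_excel
my $file=$ARGV[0];
# callback: append one row to $file.data as tab-separated text
sub myRead{
  open(my $out, ">>", "$file.data") or die "cannot open $file.data: $!";
  print $out join("\t", @_), "\n";
  close($out);
}
&parse_excel("$file",0,\&myRead);
print "$file\n";
' {} \;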

References:
http://www.ibm.com/developerworks/cn/linux/sdk/perl/l-p101/index.html
http://bbs.chinaunix.net/viewthread.php?tid=499434

Compiled by www.blogjava.net/Good-Game

Posted on 2009-04-01 14:12 by 刘凯毅, category: perl

Feedback

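# re: The happy life of perl 2009-04-15 11:49 刘凯毅

So-called "multithreading": slice the input with something like sed -n '1,1000p' and run each slice as a background job. :)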
ls mp3/ |sed -n '4000,6000p'|perl -ne '
require "/home/xj_liukaiyi/src/perl/util/perlUtil.pl";
my $tmp=$_;
chomp($tmp);
my $to="yd_MP3_stereo_48kbps";
&set_log_input_file("log_mp3_48");
unless ( (-e "$to/$tmp") && ($tmp=~/.*\..*/) ){
&system_util("lame -S --resample 44.1 --abr 16 -m s -b 48 \"./mp3/$tmp\" \"./$to/$tmp\" ");
}
' &


# re: Using perl 2009-09-01 23:46 skynet

Intersection of two sets:
cat ddata | perl -ne 'BEGIN{
  $p1="p1";
  $p2="p2";
  $ssplit="\t";
}END{
  my @inter = grep {$a{$_}} keys %b; # keys present in both sets
  print $p1,"=",join(",",keys %a),"\n";
  print $p2,"=",join(",",keys %b),"\n";
  print "intersection size: ",scalar @inter," \n";
}
chomp;
@lis=split /$ssplit/;
if( $lis[1] eq $p1 ){
  $a{$lis[0]}++;
}
if( $lis[1] eq $p2 ){
  $b{$lis[0]}++;
}
'