PIG-安装使用
16 June 2013
author : xiajun
一.资料:
http://mfido-sina-cn.iteye.com/blog/1454873安装指南
http://pig.apache.org/docs/r0.11.1/func.html内置函数</br> 二.内置函数使用:
id name sex age address
1 zs F 25 bj
2 ls M 22 hb
3 ww F 23 hn
4 zl M 28 bj
5 kk M 25 hn
命令:
u = load 'user.txt' using PigStorage(',') as (id:int,name:chararray,sex:chararray,age:int,address:chararray); //load 装载数据 PigStorage 列的分隔符
g = group u by address;//根据address进行分组
a = foreach g generate u.address ,AVG(u.age);//avg 时必须经过分组 generate 后是分组列
dump a; //将变量a的值输出到屏幕
store a into '/home/xxx/xxx';//将变量保存到系统路径</br>
describe u;//查看u的结构。
CONCAT (expression, expression)
X = foreach u generate CONCAT(name,address);//将name和address进行连接 结果如:zsbj 注意 concat 是大写
COUNT(expression)
b = group u by address;
x = foreach b generate COUNT(u);//必须对u进行分组
COUNT_STAR(expression)
X = FOREACH b GENERATE COUNT_STAR(u);
MAX(expression)
X = FOREACH B GENERATE group, MAX(u.age);
limit = LIMIT u 2;//查询前2条数据
x = order u by age;
x = filter u by age>30;
内连接Inner Join
tmp_table_left_join = JOIN u BY age LEFT OUTER,u1 BY age;//注意join大写 右连接
tmp_table_right_join = JOIN u BY age RIGHT OUTER,u1 BY age;
去重复
tmp_table_distinct = DISTINCT tmp_table_distinct;
blog comments powered by Disqus