Hive supports the following two categories of data types.
1. Primitive data types
1) Numeric data type
- tinyint : 1-byte
- smallint : 2-byte
- int : 4-byte
- bigint : 8-byte
- float : 4-byte single-precision
- double : 8-byte double-precision
- decimal : 17-byte (38 digits)
create table test (id bigint, price decimal(10,2));
2) String data type
- string : sequence of characters
- varchar : variable-length character (1 ~ 65535)
- char : fixed-length character ( 1 ~ 255 )
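The three string types can be compared side by side with a small sketch. The table and column names below are hypothetical, and the truncation shown in the comment is the behavior described in the Hive documentation for values longer than the declared varchar length.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE str_demo (
  note STRING,        -- no declared length
  code VARCHAR(20),   -- up to 20 characters
  flag CHAR(3)        -- always 3 characters, padded with spaces
);

-- A value longer than the declared length is truncated on conversion.
SELECT CAST('abcdef' AS VARCHAR(3));   -- 'abc'
```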
3) Date/Time data type
- Timestamp : YYYY-MM-DD HH:MM:SS.fffffffff
- Date : YYYY-MM-DD
(A Date value can be converted only to or from Date, Timestamp, or String.)
create table test (id int, created_dt date, update_dt timestamp);
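The conversions between Date, Timestamp, and String mentioned above can be sketched with CAST. The literal values are arbitrary examples, and the statements assume a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
-- String -> Date
SELECT CAST('2024-01-15' AS DATE);

-- Date -> Timestamp: midnight of that day
SELECT CAST(CAST('2024-01-15' AS DATE) AS TIMESTAMP);

-- Timestamp -> Date: the time-of-day part is dropped
SELECT CAST(CAST('2024-01-15 10:30:00' AS TIMESTAMP) AS DATE);

-- Date -> String in YYYY-MM-DD form
SELECT CAST(CAST('2024-01-15' AS DATE) AS STRING);
```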
4) Other types
- Boolean : true or false
- Binary : sequence of bytes
2. Complex data types
1) Struct : similar to a C struct.
struct<field : data_type, ...>
Reference a field as struct_column.field.
create table test1 (id int, name struct<first:string, last:string>);
2) Map : key-value pairs
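A minimal sketch of declaring and reading a map column follows. The table name, column name, and key are hypothetical; `map(...)` is Hive's built-in constructor for map values.

```sql
-- Hypothetical table with a map from string keys to int values.
CREATE TABLE map_demo (id INT, props MAP<STRING, INT>);

-- Look up a value by key.
SELECT props['height'] FROM map_demo;

-- Build a map inline and index it.
SELECT map('a', 1, 'b', 2)['b'];   -- 2
```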
3) Array : ordered sequence of elements of the same type, such as string.
For an array t holding ['a', 'b', 'c'], t[0] is 'a' and t[1] is 'b' (the index starts at 0).
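The array access above might look like this in practice. The table and column names are hypothetical; `array(...)` and `size(...)` are Hive built-ins.

```sql
-- Hypothetical table with an array-of-string column.
CREATE TABLE array_demo (id INT, t ARRAY<STRING>);

-- Index from 0 and ask for the element count.
SELECT t[0], t[1], size(t) FROM array_demo;

-- Build an array inline and index it.
SELECT array('a', 'b', 'c')[0];   -- 'a'
```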
4) UNIONTYPE : holds a value of exactly one of several declared data types.
uniontype<data_type, ...>
create table test2(c1 uniontype<int, double, array<string>, struct<age:int, country:string>>);
3. Operators
1) Comparison operators
- A = B
- A != B
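- A <=> B : same as = for non-NULL operands; TRUE if both A and B are NULL, FALSE if only one is NULL
- A <> B
- A < B : NULL if A or B is NULL, TRUE if A is less than B, otherwise FALSE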
- A <= B
- A >= B
- A between B and C : NULL if A, B, or C is NULL
- A not between B and C
- A is null
- A is not null
- A like B : NULL if A or B is NULL
- A not like B
- A RLIKE B : NULL if A or B is NULL, TRUE if any substring of A matches the regular expression B
- A regexp B : same as RLIKE
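The NULL handling and pattern-matching rules in this list can be checked with a single query. This is a sketch with arbitrary literals and hypothetical column aliases; it assumes a Hive version that allows SELECT without FROM.

```sql
SELECT
  CAST(NULL AS INT) =   CAST(NULL AS INT) AS eq_null,       -- NULL
  CAST(NULL AS INT) <=> CAST(NULL AS INT) AS safe_eq_both,  -- TRUE
  1 <=> CAST(NULL AS INT)                 AS safe_eq_one,   -- FALSE
  5 BETWEEN 1 AND 10                      AS in_range,      -- TRUE
  'foobar' LIKE 'foo%'                    AS like_prefix,   -- TRUE
  'foobar' LIKE 'oba'                     AS like_whole,    -- FALSE: LIKE matches the whole string
  'foobar' RLIKE 'oba'                    AS rlike_substr;  -- TRUE: RLIKE matches any substring
```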
2) Arithmetic operators
If A or B is NULL the result is NULL; every result is a number.
A + B, A - B, A * B, ...
hive> select 10+10 from test;
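A few more arithmetic cases, as a sketch with arbitrary literals and hypothetical aliases. Note that `/` returns a double even for integer operands, while `DIV` returns the integer part.

```sql
SELECT
  10 + 10                AS sum_val,    -- 20
  7 / 2                  AS quotient,   -- 3.5
  7 DIV 2                AS int_div,    -- 3
  7 % 2                  AS remainder,  -- 1
  10 + CAST(NULL AS INT) AS with_null;  -- NULL
```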
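3) Logical operators
The operators include AND, OR, and NOT. A result is TRUE or FALSE, or NULL when
an operand is NULL and the other operand does not already decide the result
(for example, FALSE AND NULL is FALSE).
A and B, A && B, A OR B, !A, A IN (v1, v2..), A in (subquery),
A not in (v1, v2..), A not in (subquery), exists (subquery),
not exists (subquery)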
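The logical operators can be tried out with constant expressions. This is a sketch with hypothetical aliases, assuming SELECT without FROM is available.

```sql
SELECT
  TRUE AND FALSE                 AS and_val,    -- FALSE
  TRUE OR FALSE                  AS or_val,     -- TRUE
  NOT TRUE                       AS not_val,    -- FALSE
  TRUE AND CAST(NULL AS BOOLEAN) AS and_null,   -- NULL
  3 IN (1, 2, 3)                 AS in_list,    -- TRUE
  3 NOT IN (1, 2)                AS not_in;     -- TRUE
```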
4) Operators on complex types
A[i] (array element, index from 0), B[key] (map value for key), C.a (struct field a)
select A[2] from test;
select C.a from test;
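Putting the three access operators together: the sketch below assumes a hypothetical table whose columns A, B, and C are an array, a map, and a struct, matching the names used above.

```sql
-- Hypothetical table; column names follow the A / B / C notation above.
CREATE TABLE complex_demo (
  A ARRAY<INT>,
  B MAP<STRING, STRING>,
  C STRUCT<a:INT, b:STRING>
);

-- Third array element, map value for 'key', and struct field a.
SELECT A[2], B['key'], C.a FROM complex_demo;
```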
