Unable to load data in Hive partitioned table

To load data into a specific Hive partition, you have to provide the value of the partition itself in the LOAD DATA statement. In this case, your query would look like this:

LOAD DATA INPATH '/salary_data.csv' OVERWRITE INTO TABLE employee PARTITION (JOB_TITLE_SUBGROUP="Value");

Here "Value" is the partition into which you are loading your data. This matters because Hive uses "Value" to create the directory in which your .csv is going to be stored, which will look something like this: .../employee/JOB_TITLE_SUBGROUP=Value. I hope this helps.
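Assuming the table was declared with `PARTITIONED BY (JOB_TITLE_SUBGROUP STRING)` (the table and column names here follow the example above), you can confirm the partition was registered after the load:

```sql
-- List the partitions Hive knows about for the table;
-- after the LOAD above you should see JOB_TITLE_SUBGROUP=Value.
SHOW PARTITIONS employee;

-- Query just that partition (Hive reads only its directory).
SELECT * FROM employee WHERE JOB_TITLE_SUBGROUP = 'Value' LIMIT 10;
```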

Check the documentation for details on the LOAD DATA syntax.

EDITED

Since the table uses dynamic partitioning, one solution is to load the .csv into an external staging table (e.g. employee_external) and then execute an INSERT like this:

INSERT OVERWRITE TABLE employee PARTITION (JOB_TITLE_SUBGROUP)
SELECT CASE_NUMBER, CASE_STATUS, (...), JOB_TITLE_SUBGROUP
FROM employee_external;
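Note that when every partition column in the INSERT is dynamic (as JOB_TITLE_SUBGROUP is here, with no static value given), Hive requires dynamic partitioning to be enabled in nonstrict mode first:

```sql
-- Allow dynamic partitions at all, and allow all partition
-- columns to be dynamic (no static partition value required).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
```

The partition column (JOB_TITLE_SUBGROUP) must also be the last column in the SELECT list, since Hive maps dynamic partition columns by position.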

I might be a little late to reply, but you can try the steps below:

  1. Set the properties below first:

    set hive.exec.dynamic.partition.mode=nonstrict;
    set hive.exec.dynamic.partition=true;
    
  2. Create a temp (staging) table first:

    CREATE EXTERNAL TABLE IF NOT EXISTS employee_temp(
    ID STRING,
    Name STRING,
    Salary STRING)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    tblproperties ("skip.header.line.count"="1");
    
  3. Load Data in temporary table:

    hive> LOAD DATA INPATH 'filepath/employee.csv' OVERWRITE INTO TABLE employee_temp;
    
  4. Create Partitioned Table:

    CREATE EXTERNAL TABLE IF NOT EXISTS employee_part(
    ID STRING,
    Name STRING)
    PARTITIONED BY (Salary STRING)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    tblproperties ("skip.header.line.count"="1");
    
  5. Load data into the partitioned table from the intermediate / temp table:

    INSERT OVERWRITE TABLE employee_part PARTITION (SALARY) SELECT * FROM employee_temp;
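
After the INSERT, you can verify that Hive created one partition per distinct Salary value (table names follow the example above; the salary value shown is hypothetical):

```sql
-- Each distinct Salary value becomes its own partition directory.
SHOW PARTITIONS employee_part;

-- Query a single partition; '50000' is just an illustrative value.
SELECT * FROM employee_part WHERE salary = '50000' LIMIT 5;
```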
    

Tags:

Hadoop

Hive