hive locks

hive locks

How Table Locking Works in Hive

Exclusive locks are not acquired when using dynamic partitions



beeline commands for Hive

How to use beeline commands to access hive database and tables ?

beeline commands

To connect hive server2 on hive server:

beeline -u jdbc:hive2://localhost:10000

To run a query from shell prompt:

beeline -u jdbc:hive2://localhost:10000 -e “show databases;”

Run silent mode to suppress messages and just get query output:

beeline -u jdbc:hive2://localhost:10000 –silent ¬†-e “show databases;”

Change output format from table to csv:

beeline -u jdbc:hive2://localhost:10000 –silent –outputformat=csv2 -e “show databases;”

Turn off the header too:

beeline -u jdbc:hive2://localhost:10000 –silent –outputformat=csv2 –showheader=false -e “show databases;”

More to come keep looking this space … ūüôā

Reference Outputs:

[cloudera@quickstart Downloads]$ beeline -u jdbc:hive2://localhost:10000 -e “show databases;” –silent

scan complete in 7ms

Connecting to jdbc:hive2://localhost:10000

Connected to: Apache Hive (version 1.1.0-cdh5.13.0)

Driver: Hive JDBC (version 1.1.0-cdh5.13.0)


INFO  : Compiling command(queryId=hive_20190601201515_a226e5a1-40d4-408e-b591-9d89877f25cc): show databases

INFO  : Semantic Analysis Completed

INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)

INFO  : Completed compiling command(queryId=hive_20190601201515_a226e5a1-40d4-408e-b591-9d89877f25cc); Time taken: 0.184 seconds

INFO  : Concurrency mode is disabled, not creating a lock manager

INFO  : Executing command(queryId=hive_20190601201515_a226e5a1-40d4-408e-b591-9d89877f25cc): show databases

INFO  : Starting task [Stage-0:DDL] in serial mode

INFO  : Completed executing command(queryId=hive_20190601201515_a226e5a1-40d4-408e-b591-9d89877f25cc); Time taken: 0.084 seconds



| database_name  |


| default        |


1 row selected (0.851 seconds)

Beeline version 1.1.0-cdh5.13.0 by Apache Hive

Closing: 0: jdbc:hive2://localhost:10000

$ beeline -u jdbc:hive2://localhost:10000 –silent -e ¬†“show databases;”


| database_name  |


| default        |


[cloudera@quickstart Downloads]$ beeline -u jdbc:hive2://localhost:10000 –silent –-outputformat=csv2 -e “show databases;”



[cloudera@quickstart Downloads]$beeline -u jdbc:hive2://localhost:10000 –silent –outputformat=csv2 –showheader=false -e “show databases;”




How to deploy custom jar files in apache hive (hortonworks hdp)

Below activity need to be performed in all hive servers, hive metastore and hive client nodes.

  1. Create the folder if not exists ¬†“/usr/hdp/”
  2. copy the custom build jar into this folder “customserde.jar”
  3. Restart the hive service
  4. verify with “ps -ef|grep -hive|grep customserde”. Hive process should have loaded this file along with path in section “–hiveconf hive.aux.jars.path=”

Hive Tuning

10 Best Practices for Apache Hive


hadoop:Hive: HiveQL DDL

Create/Describe/ALTER/Drop database

  1. Create Database ‘test’ with db properties and location

hive> create database IF NOT EXISTS test
¬†¬†¬† > COMMENT “this is test database”
¬†¬†¬† > LOCATION ‘/tmp’
¬†¬†¬† > WITH DBPROPERTIES ( ‘createdBy’ = ‘RajuKV’, ‘createdOn’ = ‘2016-02-08’);
Time taken: 2.277 seconds

2. To see databases

hive> show databases;
Time taken: 5.082 seconds, Fetched: 4 row(s)

3. How to use a database ?

hive> use test;
Time taken: 0.376 seconds

Note: There is no way to find out existing database, hence better to use this in every script to ensure you are using right database.

4. To see database properties like comment, location and privileges;

hive> describe database test;
test    this is test database    hdfs://    hive    USER  

Note: This one did not display DBProperties set while creating test database.

5. To get Database DB Properties use extended functionality in describe.

hive>describe database extended test;
test    this is test database    hdfs://    hive    USER    {createdOn=2016-02-08, createdBy=RajuKV}

6. Time to clean up, how to delete a database or how to drop a database ? (Do not test on real environments)

hive> drop database if exists test;
Time taken: 0.737 seconds

7. How to alter/modify database properties ?

hive> ALTER DATABASE testdb
¬†¬†¬† > SET DBPROPERTIES ( ‘created’ = ‘RAJUKV’, ‘date’ = ‘2016-02-08’);
Time taken: 0.564 seconds

Create/Describe/ALTER/Drop TABLES

  1. How to create Hive Manage table with CSV store format ?

hive> create table IF NOT EXISTS testdb.mytable ( Name string, ID int )
¬†¬†¬† > COMMENT ‘this is test table in testdb’
    > STORED AS textfile;
Time taken: 0.702 seconds

hive> use testdb;
Time taken: 0.316 seconds
hive> show tables;
Time taken: 0.188 seconds, Fetched: 1 row(s)

See table extended porperties (notice many default properties not specified while creating table)

hive> describe extended mytable;
name                    string
id                      int

Detailed Table Information    Table(tableName:mytable, dbName:testdb, owner:hive, createTime:1454921870, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:name, type:string, comment:null), FieldSchema(name:id, type:int, comment:null)], location:hdfs://, inputFormat:org.apache.hadoop.mapred.TextInputFormat,, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=,, field.delim=,}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{transient_lastDdlTime=1454921870, comment=this is test table in testdb}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.603 seconds, Fetched: 4 row(s)

As it’s empty table, folder created with table name under /tmp is empty

hive> dfs -ls /tmp/mytable/;

Load data into table

create/copy/download data file, here I created mytable.csv file with 4 rows delimited by ‘,’

hive> dfs -cat /tmp/mytable.csv;
hive> load data inpath ‘/tmp/mytable.csv’
    > overwrite into table mytable;
Loading data to table testdb.mytable
Table testdb.mytable stats: [numFiles=1, numRows=0, totalSize=38, rawDataSize=0]
Time taken: 1.165 seconds

Above command simply moved csv file from /tmp to /tmp/mytable folder in hdfs

hive> dfs -ls /tmp/mytable;
Found 1 items
-rwx-wx-wx   1 hive hadoop         38 2016-02-08 09:31 /tmp/mytable/mytable.csv

Run a query and see for the data

hive> select * from mytable;
Raju    90
Sateesh    92
Sreeni    99
Bairi    95
Time taken: 0.231 seconds, Fetched: 4 row(s)

When we loaded with another csv file, it moved existing csv file and copied loaded file.

hive> load data inpath ‘/tmp/tsttable.csv’
> overwrite into table mytable;
Loading data to table testdb.mytable
Moved: ‘hdfs://’ to trash at: hdfs://
Table testdb.mytable stats: [numFiles=1, numRows=0, totalSize=38, rawDataSize=0]

 TO change delimter of the table;

hive> alter table mytable
¬†¬†¬† > set SERDEPROPERTIES (‘field.delim’ = ‘:’);
Time taken: 0.456 seconds

What happend to existing csv data ? let’s check

hive> dfs -cat /tmp/mytable/tsttable.csv
> ;

data file is still csv file. So Alter will modify only metadata not the data file. Can I access the data now as metadat delimiter modified to “:” but data file has “,” delimter ? let’s check

hive> select * from mytable;
Raju,80    NULL
Sateesh,82    NULL
Sreeni,89    NULL
Bairi,85    NULL
Time taken: 0.304 seconds, Fetched: 4 row(s)

Data fetched is wrong as it’s looking for ‘colon’ delimiter in data file.

Hive has no intelligence to look for data integrity, it’s loading data without cross verification. When table metadata is defined with “:” delimiter, it accepted .csv file to load data.

hive> load data inpath ‘/tmp/mytable.csv’
> overwrite into table mytable;
Loading data to table testdb.mytable
Moved: ‘hdfs://’ to trash at: hdfs://
Table testdb.mytable stats: [numFiles=1, numRows=0, totalSize=38, rawDataSize=0]
Time taken: 1.536 seconds
hive> dfs -ls /tmp/mytable/
> ;
Found 1 items
-rwx-wx-wx   1 hive hadoop         38 2016-02-08 14:12 /tmp/mytable/mytable.csv