What is the difference between the Hive JDBC client and the Hive metastore Java API?

As far as I understand, there are two ways to connect to Hive:

  1. Through the Hive metastore server, which in turn connects in the background to a relational database such as MySQL, where the schema metadata is stored. It generally runs on port 9083.
  2. Through the Hive JDBC server, called HiveServer2, which generally runs on port 10000 (10001 is its default port in HTTP transport mode).

In earlier releases of Hive, HiveServer2 was not very stable, and its multi-threading support was also limited. Things have probably improved on that front, I'd imagine.

So, for the JDBC API: yes, it lets you communicate with Hive using JDBC and SQL.
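To make the JDBC route concrete, here is a minimal sketch that uses only the standard java.sql API. It assumes the Hive JDBC driver (the hive-jdbc jar) is on the classpath and that HiveServer2 is reachable. The class name HiveJdbcSketch, its helper methods, and the host, port, database and user values are made up for illustration, so adjust them to your setup.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Hive.
class HiveJdbcSketch {

    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs a query through HiveServer2 and returns the first column of each row.
    // Needs a running HiveServer2 and the Hive JDBC driver on the classpath.
    static List<String> firstColumn(String url, String user, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        // try-with-resources closes the result set, statement and connection in reverse order
        try (Connection conn = DriverManager.getConnection(url, user, "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }
}
```

You would call it with something like HiveJdbcSketch.firstColumn(HiveJdbcSketch.jdbcUrl("localhost", 10000, "default"), "hive", "select * from employee"), where the host, port and user are placeholders for your own HiveServer2 instance.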

On the metastore side, there appear to be two things you can do:

  1. Run SQL queries (DML).
  2. Perform DDL operations.

DDL -

For DDL, the metastore API comes in handy: the org.apache.hadoop.hive.metastore.HiveMetaStoreClient class can be used for that purpose.

DML -

What I have found useful in this regard is the org.apache.hadoop.hive.ql.Driver class (https://hive.apache.org/javadocs/r0.13.1/api/ql/org/apache/hadoop/hive/ql/Driver.html). It has a method called run() which lets you execute a SQL statement and get the result back. For example, you can do the following:

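import org.apache.hadoop.hive.cli.CliSessionState;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.ql.Driver;
import org.apache.hadoop.hive.ql.session.SessionState;
// hiveConf is an existing HiveConf; db and table are the database and table names
Driver driver = new Driver(hiveConf);
HiveMetaStoreClient client = new HiveMetaStoreClient(hiveConf);
SessionState.start(new CliSessionState(hiveConf));
// DML example
driver.run("select * from employee");
// DDL example
client.dropTable(db, table);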