>hive.metastore.connect.retries: Number of retries while opening a connection to the metastore.
>hive.metastore.client.connect.retry.delay: Number of seconds for the client to wait between consecutive connection attempts.
>hive.metastore.batch.retrieve.max: Maximum number of objects (tables/partitions) that can be retrieved from the metastore in one batch. The higher the number, the fewer round trips are needed to the Hive metastore server, but it may also increase the memory requirement on the client side.
>javax.jdo.option.ConnectionURL: JDBC connect string for a JDBC metastore.
>javax.jdo.option.ConnectionDriverName: Driver class name for a JDBC metastore.
>hive -S -e "describe formatted <table_name> ;" | grep 'Location' | awk '{ print $NF }'
>hive.server2.table.type.mapping = CLASSIC
HIVE: Exposes Hive's native table types like MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW.
CLASSIC: More generic types like TABLE and VIEW.
>hive.security.authenticator.manager = org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
(or) org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator
>hive.security.authorization.manager: For the Hive CLI, set it to org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory. This ensures that any tables or views created by hive-cli have default privileges granted for the owner.
For HiveServer2: hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
>hive.security.metastore.authenticator.manager: Set to org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator.
>hive.security.metastore.authorization.manager: Add org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly to hive.security.metastore.authorization.manager. (It takes a comma-separated list, so you can add it along with the StorageBasedAuthorizationProvider value if you want to enable that as well.)
MetaStoreAuthzAPIAuthorizerEmbedOnly: This setting disallows any of the authorization API calls from being invoked in a remote metastore. HiveServer2 can be configured to use an embedded metastore, which allows it to invoke the metastore authorization API. The Hive CLI and any other remote metastore users are denied authorization when they try to make authorization API calls. This restricts the authorization API to the privileged HiveServer2 process. You should also ensure that access to the metastore RDBMS is restricted to the metastore server and HiveServer2.
Possible values:
org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly
DefaultHiveMetastoreAuthorizationProvider
This tells Hive which metastore-side authorization provider to use. The default setting uses DefaultHiveMetastoreAuthorizationProvider, which implements the standard Hive grant/revoke model. To use an HDFS permission-based model (recommended) for authorization, use StorageBasedAuthorizationProvider as instructed above.
Storage Based Authorization:
Users who have access to the Hive CLI, HDFS commands, the Pig command line, the 'hadoop jar' command, etc., are considered privileged users. In an organization, it is typically only the teams that work on ETL workloads that need such access. These tools don't access the data through HiveServer2, and as a result their access is not authorized through the SQL Standard Based Hive Authorization model. For Hive CLI, Pig, and MapReduce users, access to Hive tables can be controlled using storage based authorization enabled on the metastore server.
Note that through the use of HDFS ACLs (available in Hadoop 2.4 onwards) you have a lot of flexibility in controlling access to the file system, which in turn provides more flexibility with Storage Based Authorization. This functionality is available as of Hive 0.14.
>HiveServer2 has an API that understands rows and columns (through the use of SQL), and is able to serve just the columns and rows that your SQL query asked for.
SQL Standards Based Authorization (introduced in Hive 0.13.0, HIVE-5837) can be used to enable fine-grained access control. It is based on the SQL standard for authorization, and uses the familiar grant/revoke statements to control access. It needs to be enabled through HiveServer2 configuration.
>The two models can be used together. That is, you can have storage based authorization enabled for metastore API calls (in the Hive metastore) and have SQL standards based authorization enabled in HiveServer2 at the same time.
SQL Standard Based Hive Authorization model:
When a user runs a Hive query or command, the privileges granted to the user and her "current roles" are checked. The user can be any user that the HiveServer2 authentication mode supports.
To provide security through this option, the client has to be secured. This can be done by allowing users access only through HiveServer2, and by restricting the user code and non-SQL commands that can be run. The checks happen against the user who submits the request, but the query runs as the Hive server user.
Most users such as business analysts tend to use SQL and
ODBC/JDBC through HiveServer2 and their access can be controlled using this SQL
Standard Based Hive Authorization model.
Commands such as dfs, add, delete, compile, and reset are disabled when this authorization is enabled.
The set commands used to change Hive configuration are restricted to a smaller safe set. This is controlled using the hive.security.authorization.sqlstd.confwhitelist configuration parameter in hive-site.xml.
Privileges to add or drop functions and macros are restricted to the admin role.
To enable users to use functions, the ability to create permanent functions has been added. A user in the admin role can run commands to create these functions, which all users can then use.
The Hive transform clause is also disabled when this authorization is enabled.
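As a minimal sketch of the permanent-function workflow described above, a user in the admin role could register a function once so that other users can call it. The function name, class name, JAR path, table, and column below are hypothetical placeholders, and the sketch assumes the query runs in the database where the function was created.

```sql
-- Switch to the admin role first; it is not among the current roles by default.
SET ROLE ADMIN;

-- Register a permanent function (hypothetical class and JAR location).
CREATE FUNCTION mask_email
  AS 'com.example.hive.udf.MaskEmail'
  USING JAR 'hdfs:///apps/hive/udfs/example-udfs.jar';

-- Other users can then call it in their queries (hypothetical table and column).
SELECT mask_email(email) FROM customers LIMIT 10;
```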
The privileges (SELECT, INSERT, UPDATE, DELETE, ALL PRIVILEGES) apply to tables and views. These privileges are not supported on databases.
Database ownership is considered for certain actions.
URI is another object in Hive, as Hive allows the use of URIs in SQL syntax. The above privileges are not applicable to URI objects. The URIs used are expected to point to a file/directory in a file system. Authorization is done based on the permissions the user has on the file/directory.
Object Ownership
For certain actions, the ownership of the object (table/view/database) determines if you are authorized to perform the action.
The user who creates the table, view or database becomes its owner. In the case of tables and views, the owner gets all the privileges with grant option.
A role can also be the owner of a database. The "alter database" command can be used to set the owner of a database to a role.
Users and Roles
Privileges can be granted to users as well as
roles.
Users can belong to one or more roles.
There are two roles with special meaning: public and admin.
All users belong to the public role. You use this role in your grant statement to grant a privilege to all users.
When a user runs a Hive query or command, the privileges granted to the user and her "current roles" are checked. The current roles can be seen using the "show current roles;" command. All of the user's roles except for the admin role will be in the current roles by default, although you can use the "set role" command to set a specific role as the current role.
Users who do the work of a database administrator are expected to be added to the admin role. They have privileges for running additional commands such as "create role" and "drop role". They can also access objects that they haven’t been given explicit access to. However, a user who belongs to the admin role needs to run the "set role" command before getting the privileges of the admin role, as this role is not in the current roles by default.
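The role behaviour described above can be sketched as a short session run by a user who belongs to the admin role. The role, user, and database names (analysts, user1, reporting) are hypothetical and only illustrate the flow.

```sql
-- The admin role is not active by default, so list roles before and after.
SHOW CURRENT ROLES;
SET ROLE ADMIN;
SHOW CURRENT ROLES;

-- Role management requires the admin role.
CREATE ROLE analysts;
GRANT analysts TO USER user1;

-- A role can own a database (hypothetical database name).
ALTER DATABASE reporting SET OWNER ROLE analysts;

-- Return to the default set of current roles.
SET ROLE ALL;
```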
IMPORTANT HINT: SQL Standards Based Authorization is disabled for the Hive CLI. Secure access control is not possible for the Hive command line using an access control policy in Hive, because users have direct access to HDFS and so they can easily bypass the SQL standards based authorization checks or even disable them altogether. Disabling this avoids giving a false sense of security to users.
COMMANDS:
>CREATE ROLE role_name;
>DROP ROLE role_name;
>SHOW CURRENT ROLES;
>SET ROLE (role_name|ALL|NONE);
>SHOW ROLES;
>GRANT role_name [, role_name] ...
TO principal_specification [, principal_specification] ...
[ WITH ADMIN OPTION ];
principal_specification
  : USER user
  | ROLE role
>REVOKE [ADMIN OPTION FOR] role_name [, role_name] ...
FROM principal_specification [, principal_specification] ... ;
principal_specification
  : USER user
  | ROLE role
>SHOW ROLE GRANT (USER|ROLE) principal_name;
>0: jdbc:hive2://localhost:10000> GRANT role1 TO USER user1;
No rows affected (0.058 seconds)
>SHOW PRINCIPALS role_name;
>0: jdbc:hive2://localhost:10000> SHOW PRINCIPALS role1;
>GRANT priv_type [, priv_type ] ...
ON table_or_view_name
TO principal_specification [, principal_specification] ...
[WITH GRANT OPTION];
>REVOKE [GRANT OPTION FOR] priv_type [, priv_type ] ...
ON table_or_view_name
FROM principal_specification [, principal_specification] ... ;
principal_specification
  : USER user
  | ROLE role
priv_type
  : INSERT | SELECT | UPDATE | DELETE | ALL
>SHOW GRANT [principal_specification] ON (ALL | [TABLE] table_or_view_name);
>0: jdbc:hive2://localhost:10000> show grant user ashutosh on table hivejiratable;
>0: jdbc:hive2://localhost:10000> show grant user ashutosh on all;
>0: jdbc:hive2://localhost:10000> show grant on table hivejiratable;
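Putting the privilege commands above together, a grant and revoke round trip might look like the following sketch. It reuses the table and user names from the examples above; the choice of privileges is only illustrative.

```sql
-- Give the user read and write access, and allow them to pass it on.
GRANT SELECT, INSERT ON TABLE hivejiratable TO USER ashutosh WITH GRANT OPTION;

-- Inspect what the user now holds on the table.
SHOW GRANT USER ashutosh ON TABLE hivejiratable;

-- Take back only the ability to re-grant INSERT; the INSERT privilege itself stays.
REVOKE GRANT OPTION FOR INSERT ON TABLE hivejiratable FROM USER ashutosh;

-- Remove the INSERT privilege entirely.
REVOKE INSERT ON TABLE hivejiratable FROM USER ashutosh;
```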
Actions:
CREATE TABLE | ALTER TABLE DROP PARTITION | ALTER INDEX PROPERTIES | CREATE MACRO | SHOW COLUMNS
DROP TABLE | ALTER TABLE (all of them except the ones above) | SELECT | DROP MACRO | SHOW TABLE STATUS
DESCRIBE TABLE | TRUNCATE TABLE | INSERT | MSCK (metastore check) | SHOW TABLE PROPERTIES
SHOW PARTITIONS | CREATE VIEW | UPDATE | ALTER DATABASE | CREATE TABLE AS SELECT
ALTER TABLE LOCATION | ALTER VIEW PROPERTIES | DELETE | CREATE DATABASE | CREATE INDEX
ALTER PARTITION LOCATION | ALTER VIEW RENAME | LOAD | EXPLAIN | DROP INDEX
ALTER TABLE ADD PARTITION | DROP VIEW PROPERTIES | SHOW CREATE TABLE | DROP DATABASE | ALTER INDEX REBUILD
DROP VIEW | CREATE FUNCTION
ANALYZE TABLE | DROP FUNCTION
>EXPLAIN [EXTENDED|DEPENDENCY|AUTHORIZATION] query;
EXPLAIN AUTHORIZATION shows all entities that need to be authorized to execute a query, as well as any authorization failures.
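For instance, to check what a query would need before running it, one could prefix it with EXPLAIN AUTHORIZATION. This is a sketch that reuses the hivejiratable table from the earlier examples; the exact output layout depends on the Hive version, so none is shown here.

```sql
-- Lists the entities the query reads and writes, the current user,
-- the operation type, and any authorization failures.
EXPLAIN AUTHORIZATION
SELECT * FROM hivejiratable;
```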