Knowledge Builders

which component of hive maintains a session handle and any session statistics

by Jeanie Kirlin Published 2 years ago Updated 2 years ago

Hive Driver
The Driver component manages the lifecycle of a HiveQL statement as it moves through Hive. It maintains a session handle and any session statistics. The Query Compiler compiles HiveQL queries into a DAG of MapReduce tasks.
Jul 5, 2020

Full Answer

How does hive execute a query?

A Hive interface, such as the command line or the web UI, delivers the query to the driver for execution. The UI calls the driver's execute interface, for example over ODBC or JDBC. The driver creates a session handle for the query and passes the query to the compiler, which produces an execution plan.
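The flow described above (UI to driver, driver to compiler, then on to execution) can be sketched as a toy model. This is only an illustration under stated assumptions: the class names SessionHandle, Compiler, ExecutionEngine and Driver, and the contents of the stats dictionary, are hypothetical and are not Hive's actual classes or API.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class SessionHandle:
    """Hypothetical stand-in for the handle the driver keeps per query."""
    session_id: int
    stats: dict = field(default_factory=dict)


class Compiler:
    """Toy compiler: turns a query string into an ordered list of plan steps."""

    def compile(self, query: str) -> list:
        return ["parse: " + query, "build execution plan"]


class ExecutionEngine:
    """Toy engine: 'runs' the plan and reports how many steps it ran."""

    def execute(self, plan: list) -> int:
        return len(plan)


class Driver:
    """Receives a query from a UI, tracks a session, and coordinates the rest."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.compiler = Compiler()
        self.engine = ExecutionEngine()

    def execute(self, query: str) -> SessionHandle:
        # The driver creates the session handle before anything else happens.
        handle = SessionHandle(session_id=next(self._ids))
        plan = self.compiler.compile(query)  # driver -> compiler
        handle.stats["plan_steps"] = len(plan)
        handle.stats["steps_run"] = self.engine.execute(plan)  # driver -> engine
        return handle


driver = Driver()
handle = driver.execute("SELECT * FROM t")
print(handle.session_id, handle.stats)
```

The point of the sketch is only the ordering: the handle exists before compilation starts, and per-session statistics accumulate on it as the query moves through each stage.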

What is the interface of hive?

Hive's interfaces, such as the command line and the web UI, deliver queries to the driver for execution. The UI calls the driver's execute interface, for example over ODBC or JDBC.

What is the job execution flow in hive with Hadoop?

The job execution flow in Hive with Hadoop proceeds step by step. First, a Hive interface such as the command line or the web UI delivers the query to the driver for execution. The UI calls the driver's execute interface, for example over ODBC or JDBC.

What is the use of a database in a hive?

Hive uses a database server to store the schema, or metadata, of its databases and tables: the columns in each table, their data types, and the mapping to HDFS. The execution engine then runs the execution plan produced by the compiler.

What are the main components of Hive?

The major components of Hive that interact with Hadoop are: User Interface (UI) – ... Hive Server – also referred to as the Apache Thrift Server. ... Driver – ... Compiler – ... Metastore – ... Execution Engine – ...

What are the three components of Hive architecture?

The major components of Apache Hive are: Hive Client, Hive Services, Processing and Resource Management, and Distributed Storage.

Which Hive component is responsible for execution and optimization of queries?

The Hive Execution Engine is the link between the HiveQL process engine and MapReduce. It processes the query and generates the same results that MapReduce would, using MapReduce-style processing.

Which of the following component is present in Hive user interface?

These Hive clients are the Hive Thrift client, the Hive JDBC driver, and the Hive ODBC driver. b) Hive Web User Interface: the user interfaces provided by Hive are the Hive Web UI and Hive HDInsight. Hive queries can be submitted directly to the Hive server through these web UIs.

What is Hive used as Mcq?

Hive is a platform used to develop SQL-type scripts that run as MapReduce operations. It is a data warehouse infrastructure tool for processing structured data in Hadoop. It sits on top of Hadoop to summarize Big Data and makes querying and analysis easy.

Which of the following is the key components of Hive architecture Mcq?

The different components of Hive architecture are: User Interface: It provides an interface between the user and Hive and allows users to submit queries to the system. Driver: It creates a session handle for the query and sends the query to the compiler, which generates an execution plan for it.

Which component is responsible for compilation optimization and execution?

Hive Execution Engine: this component is responsible for executing the execution plan created by the compiler.

What are the components used in Hive query processor?

The components of a Hive Query Processor include: Parse and Semantic Analysis (ql/parse); Metadata Layer (ql/metadata); Type Interfaces (ql/typeinfo); Sessions (ql/session); Map/Reduce Execution Engine (ql/exec); Plan Components (ql/plan); Hive Function Framework (ql/udf); Tools (ql/tools).

Which of the following components provides a way of integrating Hive with other applications?

Hive Server: The component that provides a Thrift interface and a JDBC/ODBC server, and so provides a way of integrating Hive with other applications.

What task does Hive perform Mcq?

Hive is a platform that runs SQL-like commands as MapReduce operations.

What is used to manage Hive views?

When a query references a view, the information in its definition is combined with the rest of the query by Hive's query planner. Logically, you can imagine that Hive executes the view and then uses the results in the rest of the query.

Which of the following are the two default table properties in Hive?

Hive automatically adds two table properties: last_modified_by holds the username of the last user to modify the table, and last_modified_time holds the epoch time in seconds of that modification.

What are the three different modes in which Hive can be run?

Hadoop can run in 3 different modes. Standalone (Local) Mode: by default, Hadoop is configured to run in a non-distributed mode, as a single Java process. ... Pseudo-Distributed Mode (single node): Hadoop can also run on a single node in a pseudo-distributed mode. ... Fully Distributed Mode.

What are the components of Hadoop?

There are three components of Hadoop. Hadoop HDFS - Hadoop Distributed File System (HDFS) is the storage unit of Hadoop. Hadoop MapReduce - Hadoop MapReduce is the processing unit of Hadoop. Hadoop YARN - Hadoop YARN is a resource management unit of Hadoop.

What are the components of pig execution environment?

Let us take a look at the major components. Parser: initially, the Pig scripts are handled by the Parser. ... Optimizer: the logical plan (DAG) is passed to the logical optimizer, which carries out logical optimizations such as projection and pushdown. Compiler. ... Execution engine. ... Atom. ... Tuple. ... Bag. ... Map. ...

Solved: how to find the hive-sitexml and hdfs-site.xml fil ...

I am trying to enable the Sentry service with Cloudera Hadoop, and there are some XML properties that need to be plugged into the hive-site.xml and hdfs-site.xml files along with the sentry.xml file. We have three master nodes and 4 data nodes. I did find the files in the following on every single ...

hadoop - HiveServer2 - Hue connections - Stack Overflow

On the other hand, "queries getting stuck" because the query processing engine had crashed and burned (OOM) while the Thrift server was still alive and accepting connections is something that rings a bell. Not only did we restart HS2 every day, but we also ran a "canary" every 15 min to check that the primary HS2 was responding (i.e. accepting and processing new connections).

How to set idle Hive jdbc connection out from java code using hive jdbc

The hive.server2.idle.session.timeout property causes a session to be terminated when it has not been accessed for the specified duration. However, hive.server2.idle.session.timeout only takes effect when hive.server2.session.check.interval is set to a positive value. Basically, a number of session checks need to occur within the timeout interval for the session to be closed.
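The interaction between the two settings can be illustrated with a small, simplified model. It is a sketch under assumptions, not HiveServer2's actual implementation: it assumes checks fire at exact multiples of the check interval, and that a non-positive interval means no check ever runs. The function name first_close_time is hypothetical; the example values (60000 and 2400000 milliseconds) are the ones quoted in the forum thread further down this page.

```python
def first_close_time(last_access_ms, idle_timeout_ms, check_interval_ms):
    """Return the time of the first periodic check at which an idle session
    would be closed, or None if checking is disabled (interval <= 0)."""
    if check_interval_ms <= 0:
        return None  # no periodic check, so the timeout is never enforced
    expires_at = last_access_ms + idle_timeout_ms
    # Checks happen at multiples of the interval; find the first one at or
    # after the expiry time (ceiling division on integers).
    ticks = -(-expires_at // check_interval_ms)
    return ticks * check_interval_ms


# Session last touched at t=30s, 40-minute idle timeout, 60-second checks.
print(first_close_time(30_000, 2_400_000, 60_000))
# Same timeout but no positive check interval: the session is never reaped.
print(first_close_time(30_000, 2_400_000, 0))
```

In this model the session outlives its nominal timeout by up to one check interval, which is why a shorter interval gives tighter enforcement.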

Solved: How to get number of live connections with HiveSer ... - Cloudera

@nyakkanti. I think there is no direct way to do this, but you can try to grep some log text, or take a heap dump of HiveServer2 and use Object Query Language ( SELECT toString(s.username) FROM INSTANCEOF org.apache.hive.service.cli.session.HiveSessionImpl s ) to find the live sessions at that moment. To run Object Query Language you can use jvisualvm or Eclipse MAT.

Using the HiveServer2 JDBC new connections to Hadoop may hang but ... - IBM

Cause. In BigInsights, the default maximum number of HiveServer2 worker threads is 100; when this threshold has been reached a new connection attempt will hang.

Hive - Start HiveServer2 and Beeline - Spark by {Examples}

HiveServer2 is the second generation of the Hive server, the first being HiveServer1 which has been deprecated and will be removed in future versions of Hive, so let's start using HiveServer2. In this Hive article, I will explain what is HiveServer2, how to start, accessing Web UI, benefits using HiveServer2, and finally using the Beeline Command Interface. Prerequisites: Have Hive installed ...

What database is used in Apache Hive?

Apache Hive uses the Derby database by default. However, this database has limitations, such as a lack of multi-user access. Any JDBC-compliant database, such as MySQL or Oracle, can be used for the Metastore.
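As a rough illustration of pointing the Metastore at a JDBC database, the sketch below renders a hive-site.xml-style fragment from a dictionary. The four javax.jdo.option.* property names are the standard Metastore connection settings; everything else is an assumption made for the example: the host name, database name, user and password are hypothetical placeholders, the driver class shown is the older MySQL Connector/J name and depends on the connector version in use, and render_hive_site is a helper invented here, not a Hive tool.

```python
import xml.etree.ElementTree as ET


def render_hive_site(properties: dict) -> str:
    """Render a dict of property names/values as hive-site.xml-style XML."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values only: replace host, database, user and password.
metastore_properties = {
    "javax.jdo.option.ConnectionURL":
        "jdbc:mysql://metastore-db.example.com:3306/hive_metastore",
    "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
    "javax.jdo.option.ConnectionUserName": "hive",
    "javax.jdo.option.ConnectionPassword": "change-me",
}

hive_site_xml = render_hive_site(metastore_properties)
print(hive_site_xml)
```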

What is the conjunction part of HiveQL?

The Hive Execution Engine is the link between the HiveQL process engine and MapReduce. It processes the query and generates the same results that MapReduce would, using MapReduce-style processing.

What is a user interface?

The user interface lets users submit queries and other operations to the system. Hive offers three main ways to communicate with the Hive driver.

What are hive clients?

a) Hive Clients: different applications can interact with Hive through the clients it provides. These Hive clients are the Hive Thrift client, the Hive JDBC driver, and the Hive ODBC driver.

What is a hive server?

Hive Server: It runs Hive as a server exposing a thrift service, which enables access from a range of clients written in different languages.

What is the default shell provided by hive to execute the hive queries or commands directly?

c) CLI- This is the default shell provided by hive to execute the hive queries or commands directly.

Which layer contains HDFS?

i) Storage Layer- This contains HDFS or HBASE for the storage of data in the tables.

What is a shell in Hadoop?

Shell: A shell is the command line interface, which allows interactive queries similar to a MySQL shell connected to a database. It also supports web and JDBC clients. The driver, compiler and execution engine take the HiveQL scripts and run them in the Hadoop environment.

Indicating Progress and Statistics

Visual progress indications and statistics for the previously described Message Analyzer operations are displayed in the Session Explorer window. To provide these indications when loading data, capturing data, or manipulating data, Session Explorer utilizes the following progress and statistics indicator components:

Observing Progress Indicator Behaviors

The behavior of Session Explorer progress indicators varies depending on the tasks you are performing, as follows:

Manipulating data

Progress indications are also displayed whenever you apply or remove view Filters, Groups, Viewpoints, Pattern Matching, view Layouts, and Time Filters to or from a set of displayed messages, respectively, or when you toggle Operations from the Viewpoints drop-down list on the Message Analyzer Filtering toolbar.

What is hive server?

HiveServer2 is a Thrift server that acts as a thin service layer for interacting with the HDP cluster.
It supports both JDBC and ODBC drivers to provide a SQL layer for querying the data.
An incoming SQL query is converted to either a Tez or an MR job, and the results are fetched and sent back to the client. No heavy lifting is done inside HS2. It just acts as a place to host the Tez/MR driver, scan metadata information, and apply Ranger policies for authorization.
HiveServer2 maintains two types of pools:
1. Connection pool: any incoming connection is handled by a HiveServer2-handler thread and is kept in this pool, which is unlimited in size but restricted by the number of threads available to service incoming requests. An incoming request is not accepted if no HiveServer2-handler thread is free to service it. The total number of threads that HS2 can spawn is controlled by the parameter hive.server2.thrift.max.worker.threads.
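The way a worker-thread limit caps the connection pool can be sketched with a non-blocking semaphore. This is a toy model, not HiveServer2 code: HandlerPool and its methods are hypothetical names, and it assumes each open connection holds exactly one handler slot until it is closed.

```python
import threading


class HandlerPool:
    """Toy model: each open connection occupies one handler thread slot."""

    def __init__(self, max_worker_threads: int):
        self._slots = threading.BoundedSemaphore(max_worker_threads)

    def try_accept(self) -> bool:
        """Accept a connection only if a handler slot is free."""
        return self._slots.acquire(blocking=False)

    def close_connection(self) -> None:
        """Release the slot held by a closed connection."""
        self._slots.release()


pool = HandlerPool(max_worker_threads=2)
results = [pool.try_accept() for _ in range(3)]  # third attempt finds no slot
pool.close_connection()                          # one client disconnects
results.append(pool.try_accept())                # now a slot is free again
print(results)  # [True, True, False, True]
```

The sketch refuses the extra connection immediately. A real server may instead leave the attempt waiting, which would match the hanging connections described in the IBM note above.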

What is hiveserver2 metastore?

HiveServer2 has an embedded metastore, which interacts with the RDBMS to store table schema information. The database is available to any other service through HiveMetaStore. HiveMetaStore is not used by HiveServer2 directly.

How long is a timeout in HS2?

1. Clients time out frequently: HS2 never denies a connection, so the timeout is a parameter set on the client, not on HS2. Most tools use a 30-second timeout; increase it to 90-150 seconds depending on your cluster usage pattern.

How much GB does HS2 need?

40 connections need somewhere around 16 GB of HS2 heap. Anything more than this needs fine-tuning of GC and horizontal scaling.

What is a session pool?

2. Session pool (per queue): this is the number of concurrent sessions that can be active. Whenever a connection in the connection pool executes a SQL query, an empty slot in the session pool is found and the SQL statement is executed. A session pool is maintained for each queue defined in "Default query queue". In this example only the Default Queue is defined, with sessions per queue = 3. More than 3 concurrent queries (not connections) will have to wait until one of the slots in the session pool is empty.
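The waiting behaviour of a session pool with 3 slots can be sketched with a blocking semaphore. Again this is a simplified model under assumptions, not how HiveServer2 is implemented: SessionPool and run_query are hypothetical names, and time.sleep stands in for executing a SQL statement.

```python
import threading
import time


class SessionPool:
    """Toy per-queue session pool: at most N queries run at the same time."""

    def __init__(self, sessions_per_queue: int):
        self._slots = threading.Semaphore(sessions_per_queue)
        self._lock = threading.Lock()
        self.active = 0
        self.peak_active = 0
        self.completed = 0

    def run_query(self, duration_s: float) -> None:
        with self._slots:  # blocks (waits) while all slots are busy
            with self._lock:
                self.active += 1
                self.peak_active = max(self.peak_active, self.active)
            time.sleep(duration_s)  # stand-in for executing the SQL
            with self._lock:
                self.active -= 1
                self.completed += 1


session_pool = SessionPool(sessions_per_queue=3)
threads = [
    threading.Thread(target=session_pool.run_query, args=(0.05,))
    for _ in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All 5 queries finish, but never more than 3 were running at once.
print(session_pool.completed, session_pool.peak_active)
```

Five connections can all be open, yet the fourth and fifth queries only start once an earlier one releases its slot, which is the distinction between connections and concurrent queries made above.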

Does HS2 connect to DataNodes?

3. HS2 also connects to DataNodes directly from time to time to service requests such as "Select * from Table Limit N".

Greg Hart

What values do you have for the following properties in your Hiveserver2 configuration? I've added the values from the Kylo sandbox for reference.

Binh Nguyen Van

I am having this issue too, and it is happening quite often now. In my cluster, I set hive.server2.session.check.interval=60000 and hive.server2.idle.session.timeout=2400000. I took a look at the code and I think it may be because of this method in the RefreshableDataSource class

Greg Hart

The validation query that Kylo uses can only test if the connection is open but not the session. To test if the session is valid, the validation query needs to select from a table in Hive. Could you try changing your validation query to select from an existing Hive table to see if that fixes your issue?

Binh Nguyen Van

I am using SELECT 1 in the test query and when I run it in Beeline I do get Invalid Session exception so I think Hive validates session when it runs that SQL but I will change to SELECT 1 FROM [existing table] to see if it changes anything.

Binh Nguyen Van

After I changed the test query, the issue is still happening and it happens more often when I increase maximum timer-driven thread count from 10 to 20.

Binh Nguyen Van

I made a change and created a pull request for this so please take a look.

