Setting JDBC Fetch Size in Hibernate

In my personal experience, far more database performance problems are actually application side problems, particularly with Java and Hibernate.

A key goal is to minimize network traffic between Oracle and the application by setting appropriate fetch size..

What are your needs and expectations?

When working with a query, some basic questions that should be asked:

  1. How many rows do I expect to be returned to the application? 1 row, 10 rows, 1000 rows, 1,000,000 rows?
  2. For DML, (INSERT/UPDATE/DELETE/MERGE), how many rows do I expect to be affected?
  3. How often do I expect my query to run?

Fetch size and Round Trips

When an application is performing a SELECT and the number of rows to be returned is large, the application “fetches” rows in batches.  Each batch is a “round trip” to Oracle to “fetch” the next batch of rows.

Depending on how the application code is written, the code may either a) fetch all of the rows or b) process each batch before requesting the next batch.

The number of rows retrieved by each fetch is called the “fetch size”.   The Oracle JDBC driver default is 10 rows.

The amount of time required for each round trip is included in the total elapsed time for the query.   As such, a query may appear to be slow when in fact Oracle is simply waiting for the next fetch.  Or the total number of fetches may be excessively large, e.g. 20M rows = 2M round trips across the network if the fetch size = 10.

In Hibernate, you can specify the desired fetch size in the DB model class via:

org.hibernate.fetchSize (Long – number of records)

Hibernate provides the value of this hint to the JDBC driver to define the number of rows the driver shall receive in one batch. This can improve the communication between the JDBC driver and the database, if it’s supported by the driver

However, be aware that you should not automatically specify a large fetchSize as this results in pre-allocation of memory for the total possible number of rows in each fetch.  This is less of a problem in 12c.

Developing with Oracle Advanced Queuing – Part 8, Structures, Performance Tips and Misc.

AQ Table Structures

A multi-consumer AQ table has 7 underlying tables, both heap and index organized.  The main table with message data for all queues has the same name as specified in CREATE_QUEUE_TABLE, e.g. ORDERS_QUEUETABLE.  Other tables have names beginning with AQ$, e.g. AQ$_ORDERS_QUEUETABLE_H

A single consumer AQ table creates a single table with main table name; the index structure may vary.

Performance Tips for Dequeuing

Using certain features, such as correlation id or transaction grouping, may require additional indexes on the main table.  To change the behavior of the queries used by AQ, statistics need to be gathered, as AQ tables are exempt from automatic statistics gathering.  However, generating appropriate statistics in a production environment can be problematic due to the volatile nature of queues; stopping the queues to  allow messages to build up in order to gather statistics is probably not acceptable to the DBAs.  Statistics can either be created manually, or, in a development or QA environment, messages can be enqueued without dequeuing.  The statistics can then be imported into production for the table.  It’s also a good idea to lock the statistics, just to be safe.

 Query to be Tuned

Finding the underlying dequeuing query for tuning is not immediately obvious.  Look in appropriate V$ or GV$ views or AWR report for the following pattern:

SELECT     /*+ FIRST_ROWS(1) */
tab.ROWID,

tab.user_data
FROM -- the name of the main queue table
WHERE q_name = :1 AND (state = :2 and …
ORDER BY q_name, …
FOR UPDATE SKIP LOCKED;

FOR UPDATED SKIP LOCKED is the “secret sauce” for AQ’s performance.  It performs SELECT FOR UPDATE only on rows that are not currently locked!!!  It also apparently only locks rows when they are fetched, but this has been difficult to confirm.

More Stuff

It is not possible to cover all of the capabilities and functionality of Oracle AQ in this series.  Some other features of potential interested include:

  • AQ automatically manages space, perform COALESCE as well as removing messages that have passed their retention periods.
  • There are numerous APIs for managing all aspects of AQ.
  • AQ can propagate messages via external protocols and gateways
  • AQ can be accessed via SOAP

AQ can retain the entire history of a message for non-repudiation, logging, etc

Developing with Oracle Advanced Queuing – Part 7, Message Propagation

Propagation

Messages may be pushed to other queues via propagation.  Those queues typically, but not always, exist in another database or an external messaging system; the latter is beyond the scope of this paper.  Propagation may also be to queues in the same database.  The messages are ultimately processed by consumers of the destination queue(s); propagated messages are considered processed upon completion of propagation.  Propagation may push messages to multiple queues in multiple targets (fan out).  Messages may also be propagated from multiple sources into a single queue.  The destination queue may be single or multi-consumer, but must be of the same payload type.  Propagation is performed by scheduled jobs.  A propagation window is a period of time in which propagation can occur, i.e. effectively scheduling the job.

There are two basic modes for propagation between databases:

  • Queue to dblink – Effectively deprecated.
  • Queue to queue – the target queues are specified.

The API to schedule propagation is:

DBMS_AQADM.SCHEDULE_PROPAGATION (
queue_name IN VARCHAR2,
destination IN VARCHAR2 DEFAULT NULL,
start_time IN DATE DEFAULT SYSDATE,
duration IN NUMBER DEFAULT NULL,
next_time IN VARCHAR2 DEFAULT NULL,
latency IN NUMBER DEFAULT 60,
destination_queue IN VARCHAR2 DEFAULT NULL);

The parameters are:

  • queue_name – the name of the queue to be propagated.
  • destination – destination dblinks.
  • start_time – the start time for the propagation, i.e. the time when the job will first be schedule.
  • duration – how long propagation lasts in seconds. NULL means the propagation lasts forever (or until stopped or altered).
  • next_time – a calendar expression (as used by DBMS_SCHEDULER) for the next propagation window.
  • latency – if no messages, how many seconds to wait until checking the queue for message to be propagated. 0 results in propagation as soon as a message is available.

Other APIs to manage propagation are:

  • ALTER_PROPAGATION_SCHEDULE
  • DISABLE_PROPAGATION_SCHEDULE
  • ENABLE_PROPAGATION_SCHEDULE
  • SCHEDULE_PROPAGATION
  • VERIFY_QUEUE_TYPES