Developing with Oracle Advanced Queuing – Part 6, Dequeuing Messages

Now that we know how to create tables and queues and enqueue messages, let’s look at how to dequeue them!

Dequeuing Features

Oracle AQ provides high-performance dequeuing with a rich set of options.  Key features include:

  • Concurrent dequeues
  • Multiple dequeue methods and options
  • Array dequeue
  • Message navigation
  • Waiting for messages
  • Retries with delays
  • Transaction protection
  • Exception queues

DBMS_AQ.DEQUEUE

The PL/SQL API is:

DBMS_AQ.DEQUEUE(
queue_name IN VARCHAR2,
dequeue_options IN dequeue_options_t,
message_properties OUT message_properties_t,
payload OUT "type_name",
msgid OUT RAW);

Note that message_properties_t is used for both enqueue and dequeue operations.
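A minimal dequeue might look like the following sketch.  The queue name MSG_QUEUE and the payload object type MSG_TYPE are hypothetical stand-ins; substitute the names from your own queue table.

```sql
DECLARE
   l_dequeue_options    DBMS_AQ.DEQUEUE_OPTIONS_T;    -- defaults: REMOVE, ON_COMMIT, wait FOREVER
   l_message_properties DBMS_AQ.MESSAGE_PROPERTIES_T;
   l_payload            MSG_TYPE;                     -- hypothetical payload object type
   l_msgid              RAW (16);
BEGIN
   DBMS_AQ.DEQUEUE (
      queue_name         => 'MSG_QUEUE',
      dequeue_options    => l_dequeue_options,
      message_properties => l_message_properties,
      payload            => l_payload,
      msgid              => l_msgid);

   -- process l_payload here ...

   COMMIT;   -- under the default ON_COMMIT visibility, this completes the dequeue
END;
/
```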

DEQUEUE_OPTIONS_T

TYPE DEQUEUE_OPTIONS_T IS RECORD (
consumer_name VARCHAR2(30) DEFAULT NULL,
dequeue_mode BINARY_INTEGER DEFAULT REMOVE,
navigation BINARY_INTEGER DEFAULT NEXT_MESSAGE,
visibility BINARY_INTEGER DEFAULT ON_COMMIT,
wait BINARY_INTEGER DEFAULT FOREVER,
msgid RAW(16) DEFAULT NULL,
correlation VARCHAR2(128) DEFAULT NULL,
deq_condition VARCHAR2(4000) DEFAULT NULL,
signature aq$_sig_prop DEFAULT NULL,
transformation VARCHAR2(61) DEFAULT NULL,
delivery_mode PLS_INTEGER DEFAULT PERSISTENT);

DBMS_AQ.DEQUEUE_OPTIONS_T specifies the dequeuing options as described below:

  • consumer_name – the name of the subscriber.
  • dequeue_mode – modes include:
    • REMOVE (with data) – this is the typical dequeuing method.  The message may remain in the queue table for history, based on the retention period, but it is not eligible for future dequeuing (except via msgid).
    • REMOVE_NODATA – no data is returned, but the message is removed from the queue. This may be used for selective cleanup.
    • BROWSE – reads the message data, but does not actually dequeue the message. The message remains available for future processing (unless dequeued by another process).  Browsing may not be repeatable, and as such there are numerous “gotchas” to be aware of.
  • navigation – there are two methods for navigation when dequeuing.
    • FIRST_MESSAGE – This creates a “snapshot” (effectively a cursor); note that this only retrieves messages that were enqueued at the time of the dequeue call.
    • NEXT_MESSAGE – If FIRST_MESSAGE was used, this retrieves the next message in the snapshot. See Dequeue Methods below.
  • wait – if no messages are available, the consumer may wait for the next message. The options are:
    • FOREVER – waits forever, which is the default.  Typically used for high frequency queues.  Note that this blocks the process.
    • NO_WAIT – don’t wait for next message. Typically used for deferred or batch operations, which are initiated by jobs scheduled at regular intervals.
    • Number – the wait time in seconds. Process is blocked while waiting.

The next message is dequeued on wake up.
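A bounded wait is sketched below.  When no message arrives within the wait period, Oracle raises ORA-25228 (“timeout or end-of-fetch during message dequeue”); the queue and payload names are hypothetical.

```sql
DECLARE
   l_dequeue_options    DBMS_AQ.DEQUEUE_OPTIONS_T;
   l_message_properties DBMS_AQ.MESSAGE_PROPERTIES_T;
   l_payload            MSG_TYPE;    -- hypothetical payload type
   l_msgid              RAW (16);
   e_timeout            EXCEPTION;
   PRAGMA EXCEPTION_INIT (e_timeout, -25228);   -- timeout or end-of-fetch
BEGIN
   l_dequeue_options.wait := 5;                 -- block for at most 5 seconds
   -- or: l_dequeue_options.wait := DBMS_AQ.NO_WAIT;

   DBMS_AQ.DEQUEUE ('MSG_QUEUE', l_dequeue_options,
                    l_message_properties, l_payload, l_msgid);
   COMMIT;
EXCEPTION
   WHEN e_timeout THEN
      NULL;   -- no message arrived within the wait period
END;
/
```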

NOTA BENE: Oracle AQ also offers the ability for a process to listen on multiple queues.

Dequeue Methods

There are several methods for dequeuing messages.  The default is to dequeue individual messages based on the sort order specified when the AQ table was created.

NOTA BENE: the most efficient navigation method for dequeuing in sort order is to use NEXT_MESSAGE without FIRST_MESSAGE.  FIRST_MESSAGE always performs a query.  However, if NEXT_MESSAGE is used without FIRST_MESSAGE, only one SELECT is performed in the session; subsequent calls are simple fetches.

Other methods are:

  • Correlation ID – dequeue a series of messages based on correlation as follows: get the correlation ID by dequeuing with FIRST_MESSAGE, then dequeue additional messages via NEXT_MESSAGE using the value of correlation until no more messages remain.

The specification for correlation may use pattern matching (%,_).

This method typically requires the addition of an index and generation of statistics to force the underlying queries to use the index on the correlation column.

  • Transaction group – similar to correlation, but uses the transaction_group set by the producer. This should use array dequeuing, but may use the same loop as Correlation ID above, specifying the transaction_group instead.  Pattern matching may also be used.
  • deq_condition – similar to a SQL WHERE clause; accesses the contents of payload object elements or other columns. See the documentation for more details about specifying columns and payload elements.  Note that using this method supersedes all other methods.
  • msgid – dequeue a single message by its system-assigned RAW value. This typically requires browsing the queue(s), and is usually used for cleanup and corrections.
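The correlation loop described above can be sketched as follows.  The queue name, payload type and correlation pattern are hypothetical; note the switch from FIRST_MESSAGE to NEXT_MESSAGE after the first dequeue.

```sql
DECLARE
   l_opts    DBMS_AQ.DEQUEUE_OPTIONS_T;
   l_props   DBMS_AQ.MESSAGE_PROPERTIES_T;
   l_payload MSG_TYPE;                    -- hypothetical payload type
   l_msgid   RAW (16);
   e_end     EXCEPTION;
   PRAGMA EXCEPTION_INIT (e_end, -25228); -- timeout or end-of-fetch
BEGIN
   l_opts.wait        := DBMS_AQ.NO_WAIT;
   l_opts.correlation := 'ORDER%';        -- pattern matching with % and _
   l_opts.navigation  := DBMS_AQ.FIRST_MESSAGE;
   LOOP
      DBMS_AQ.DEQUEUE ('MSG_QUEUE', l_opts, l_props, l_payload, l_msgid);
      -- process l_payload here ...
      l_opts.navigation := DBMS_AQ.NEXT_MESSAGE;   -- subsequent calls walk the snapshot
   END LOOP;
EXCEPTION
   WHEN e_end THEN
      COMMIT;   -- no more matching messages
END;
/
```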

Dequeue visibility

Messages may be dequeued in the following modes:

  • IMMEDIATE – Messages are removed from the queue in an autonomous transaction. If the application does not have retry capabilities, this will typically offer better performance and scalability.
  • ON_COMMIT (transaction protection) – Messages are removed from the queue on COMMIT of the transaction. The dequeue operation is treated in the same manner as an INSERT/UPDATE/DELETE.  If the transaction fails, either due to ROLLBACK, system failure or shutdown, the retry count is incremented.  If the retry count is exceeded, the message is moved to the exception queue, otherwise it remains in the original queue.  Note that a system failure or shutdown may not increment the retry count.  If retry_delay was specified when the queue was created, the message will not be available for dequeuing for the specified number of seconds.

Message Expiration

If expiration is specified in message_properties_t.expiration, all consumers must dequeue the message before the expiration time. Otherwise, the message is moved to the exception queue.  It is generally a good practice to specify expiration for multi-consumer queues, as not all consumers may be active, which would result in the message remaining in the queue indefinitely.

Developing with Oracle Advanced Queuing – Part 5, Enqueuing Messages

There is a wide range of options for enqueuing messages.  These options include, but are not limited to:

  • Enqueuing a single message.
  • Enqueuing an array of messages.
  • Message Grouping, which treats all messages enqueued in a single transaction as a group.
  • Sender Identification.
  • Time Specification and Scheduling of message delivery.
  • Correlation Identifier, which allows multiple messages queued with a user defined identifier to be dequeued together.

Enqueuing Messages

The following PL/SQL API is used to enqueue messages:

DBMS_AQ.ENQUEUE(
queue_name IN VARCHAR2,
enqueue_options IN enqueue_options_t,
message_properties IN message_properties_t,
payload IN "type_name",
msgid OUT RAW);

where:

  • queue_name – the name of the queue in which the message is to be enqueued.
  • payload – the payload itself, typically, but not limited to, a PL/SQL abstract data type.
  • msgid – the unique identifier of the message.
  • enqueue_options (a record).

The DBMS_AQ.ENQUEUE_OPTIONS_T record contains the options for enqueuing the message as described below:

TYPE SYS.ENQUEUE_OPTIONS_T IS RECORD (
visibility BINARY_INTEGER DEFAULT ON_COMMIT,
relative_msgid RAW(16) DEFAULT NULL,
sequence_deviation BINARY_INTEGER DEFAULT NULL,
transformation VARCHAR2(61) DEFAULT NULL,
delivery_mode PLS_INTEGER NOT NULL DEFAULT PERSISTENT);

The attributes are:

  • visibility, two options:
    • ON_COMMIT – the message is enqueued as part of the transaction, i.e. enqueuing the message is completed by COMMIT.
    • IMMEDIATE – the message is enqueued immediately in an autonomous transaction.
  • transformation – Specifies a transformation function to be performed before enqueuing.
  • delivery_mode, two options
    • PERSISTENT – the message is stored in the queue table.
    • BUFFERED – the message is only maintained in memory, and may be lost in the event of system failure or database shutdown.
  • relative_msgid – effectively deprecated.
  • sequence_deviation – deprecated as of 10.2.
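A minimal enqueue, using the defaults (ON_COMMIT visibility, PERSISTENT delivery), might look like this.  MSG_QUEUE and MSG_TYPE are hypothetical names; MSG_TYPE is assumed to have a one-argument constructor.

```sql
DECLARE
   l_enqueue_options    DBMS_AQ.ENQUEUE_OPTIONS_T;    -- defaults: ON_COMMIT, PERSISTENT
   l_message_properties DBMS_AQ.MESSAGE_PROPERTIES_T;
   l_payload            MSG_TYPE;                     -- hypothetical payload type
   l_msgid              RAW (16);
BEGIN
   l_payload := MSG_TYPE ('some message text');       -- hypothetical constructor

   DBMS_AQ.ENQUEUE (
      queue_name         => 'MSG_QUEUE',
      enqueue_options    => l_enqueue_options,
      message_properties => l_message_properties,
      payload            => l_payload,
      msgid              => l_msgid);

   COMMIT;   -- ON_COMMIT visibility: the message becomes visible to consumers here
END;
/
```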

DBMS_AQ.MESSAGE_PROPERTIES_T

The DBMS_AQ.MESSAGE_PROPERTIES_T record is used for both enqueuing and dequeuing operations

TYPE message_properties_t IS RECORD (
priority BINARY_INTEGER NOT NULL DEFAULT 1,
delay BINARY_INTEGER NOT NULL DEFAULT NO_DELAY,
expiration BINARY_INTEGER NOT NULL DEFAULT NEVER,
correlation VARCHAR2(128) DEFAULT NULL,
attempts BINARY_INTEGER,
recipient_list AQ$_RECIPIENT_LIST_T,
exception_queue VARCHAR2(61) DEFAULT NULL,
enqueue_time DATE,
state BINARY_INTEGER,
sender_id SYS.AQ$_AGENT DEFAULT NULL,
original_msgid RAW(16) DEFAULT NULL,
signature aq$_sig_prop DEFAULT NULL,
transaction_group VARCHAR2(30) DEFAULT NULL,
user_property SYS.ANYDATA DEFAULT NULL,
delivery_mode PLS_INTEGER NOT NULL DEFAULT DBMS_AQ.PERSISTENT);

The relevant enqueue attributes are:

  • priority – the priority of the message. This is only relevant if the sorting method specified for the table includes the priority.
  • delay – specifies number of seconds before a message is available for dequeuing. Default is 0 (NO_DELAY)
  • expiration – the number of seconds a message is available for dequeuing (after delay). If the message is not dequeued by all subscribers, it will be moved to the exception queue with a status of EXPIRED.  This is necessary for multi-consumer queues, as not all subscribers may be able to dequeue the message.  Default is the constant NEVER.
  • delivery_mode – DBMS_AQ.BUFFERED or DBMS_AQ.PERSISTENT, determines whether the message is buffered or persistent.  The default is persistent.
  • correlation – the ID used for dequeuing by correlation ID. This is a producer supplied value, which allows a logical grouping of messages.  Unlike a transaction group, the messages need not be enqueued in a single transaction or by the same producer.
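Setting the relevant properties before enqueuing might look like this sketch; the queue name, payload type and correlation value are hypothetical.

```sql
DECLARE
   l_opts    DBMS_AQ.ENQUEUE_OPTIONS_T;
   l_props   DBMS_AQ.MESSAGE_PROPERTIES_T;
   l_payload MSG_TYPE;                        -- hypothetical payload type
   l_msgid   RAW (16);
BEGIN
   l_props.priority    := 2;                  -- only honored with a priority-based sort order
   l_props.delay       := 60;                 -- not dequeuable for 60 seconds
   l_props.expiration  := 3600;               -- expires 1 hour after becoming available
   l_props.correlation := 'ORDER_12345';      -- producer-supplied grouping id

   l_payload := MSG_TYPE ('some message text');
   DBMS_AQ.ENQUEUE ('MSG_QUEUE', l_opts, l_props, l_payload, l_msgid);
   COMMIT;
END;
/
```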

Oracle TIMESTAMP Internal Formats

I’ve encountered (and suffered from) a great deal of confusion regarding the internal format of TIMESTAMP.  Is it variable, like NUMBER, totally dependent on precision and time zone specification, or some combination?

The answer is the classic, “it depends”.

If “WITH TIME ZONE” is specified, the length is 13, independent of precision and data.

If the specification is simply “TIMESTAMP(0)”, the length is 7, either with no time zone specified or WITH LOCAL TIME ZONE specified.

Otherwise, the length is dependent on the data.

Let’s create a table with various combinations.
CREATE TABLE TEST_TS_SIZE
(
SOURCE_DATA VARCHAR2 (15),
TS TIMESTAMP,
TS_1 TIMESTAMP (1),
TS_0 TIMESTAMP (0),
TS_3 TIMESTAMP (3),
TS_6 TIMESTAMP (6),
TS_0_LOCAL TIMESTAMP (0) WITH LOCAL TIME ZONE,
TS_1_LOCAL TIMESTAMP (1) WITH LOCAL TIME ZONE,
TS_3_LOCAL TIMESTAMP (3) WITH LOCAL TIME ZONE,
TS_6_LOCAL TIMESTAMP (6) WITH LOCAL TIME ZONE,
TS_0_TZ TIMESTAMP (0) WITH TIME ZONE,
TS_1_TZ TIMESTAMP (1) WITH TIME ZONE,
TS_3_TZ TIMESTAMP (3) WITH TIME ZONE,
TS_6_TZ TIMESTAMP (6) WITH TIME ZONE
);

Populate the table with SOURCE_DATA of SYSTIMESTAMP, which may have fractional seconds, and SYSDATE, with precision of seconds.
INSERT INTO TEST_TS_SIZE
(SELECT 'SYSTIMESTAMP',
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP,
SYSTIMESTAMP
FROM DUAL);
INSERT INTO test_ts_size
(SELECT 'SYSDATE',
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE,
SYSDATE
FROM DUAL);
COMMIT;

For all other combinations, the length is dependent on the data.   If the data does not contain fractional seconds, the length is 7, otherwise it’s 11.

Query the table for columns without precision or timezone specification:
SELECT SOURCE_DATA, SUBSTR (DUMP (TS), 1, 14) AS "DUMP TS" FROM TEST_TS_SIZE;

Note that when we provide data with non-zero fractional seconds, e.g. SYSTIMESTAMP as SOURCE_DATA, the length jumps to 11.

SOURCE_DATA DUMP TS
SYSTIMESTAMP Typ=180 Len=11
SYSDATE Typ=180 Len=7:

Query the table for columns with precision, but no timezone specification:

SELECT SOURCE_DATA,
SUBSTR (DUMP (TS_0), 1, 14) "DUMP TS_0",
SUBSTR (DUMP (TS_1), 1, 14) "DUMP TS_1",
SUBSTR (DUMP (TS_3), 1, 14) "DUMP TS_3",
SUBSTR (DUMP (TS_6), 1, 14) "DUMP TS_6"
FROM TEST_TS_SIZE;

Note that when we go from TIMESTAMP(0) to TIMESTAMP(1) and provide data with non-zero fractional second, e.g. SYSTIMESTAMP as SOURCE_DATA, the length jumps to 11.

SOURCE_DATA DUMP TS_0 DUMP TS_1 DUMP TS_3 DUMP TS_6
SYSTIMESTAMP Typ=180 Len=7: Typ=180 Len=11 Typ=180 Len=11 Typ=180 Len=11
SYSDATE Typ=180 Len=7: Typ=180 Len=7: Typ=180 Len=7: Typ=180 Len=7:

Now let’s query the WITH LOCAL TIME ZONE columns (which store values normalized to the database time zone and present them in the session time zone).  Note that the lengths are the same as when no time zone specifier was used.

SELECT SOURCE_DATA,
SUBSTR (DUMP (TS_0_LOCAL), 1, 14) "DUMP TS_0_LOCAL",
SUBSTR (DUMP (TS_1_LOCAL), 1, 14) "DUMP TS_1_LOCAL",
SUBSTR (DUMP (TS_3_LOCAL), 1, 14) "DUMP TS_3_LOCAL",
SUBSTR (DUMP (TS_6_LOCAL), 1, 14) "DUMP TS_6_LOCAL"
FROM TEST_TS_SIZE;

SOURCE_DATA DUMP TS_0_LOCAL DUMP TS_1_LOCAL DUMP TS_3_LOCAL DUMP TS_6_LOCAL
SYSTIMESTAMP Typ=231 Len=7: Typ=231 Len=11 Typ=231 Len=11 Typ=231 Len=11
SYSDATE Typ=231 Len=7: Typ=231 Len=7: Typ=231 Len=7: Typ=231 Len=7:

Finally, query using WITH TIME ZONE.

SELECT SOURCE_DATA,
SUBSTR (DUMP (TS_0_TZ), 1, 14) "DUMP TS_0_TZ",
SUBSTR (DUMP (TS_1_TZ), 1, 14) "DUMP TS_1_TZ",
SUBSTR (DUMP (TS_3_TZ), 1, 14) "DUMP TS_3_TZ",
SUBSTR (DUMP (TS_6_TZ), 1, 14) "DUMP TS_6_TZ"
FROM TEST_TS_SIZE;

The length is always 13 WITH TIME ZONE:

SOURCE_DATA DUMP TS_0_TZ DUMP TS_1_TZ DUMP TS_3_TZ DUMP TS_6_TZ
SYSTIMESTAMP Typ=181 Len=13 Typ=181 Len=13 Typ=181 Len=13 Typ=181 Len=13
SYSDATE Typ=181 Len=13 Typ=181 Len=13 Typ=181 Len=13 Typ=181 Len=13


Developing with Oracle Advanced Queuing – Part 4, Creating Queues

Queues are created via:

DBMS_AQADM.CREATE_QUEUE (
queue_name IN VARCHAR2,
queue_table IN VARCHAR2,
queue_type IN BINARY_INTEGER DEFAULT NORMAL_QUEUE,
max_retries IN NUMBER DEFAULT NULL,
retry_delay IN NUMBER DEFAULT 0,
retention_time IN NUMBER DEFAULT 0,
dependency_tracking IN BOOLEAN DEFAULT FALSE,
comment IN VARCHAR2 DEFAULT NULL,
auto_commit IN BOOLEAN DEFAULT TRUE);

The parameters are described below:

  • queue_name – the name of the queue.
  • queue_table – the name of the AQ table holding queue.
  • queue_type – NORMAL_QUEUE or EXCEPTION_QUEUE.
  • max_retries – the maximum number of dequeue retries before the message is moved to the exception queue.
  • retry_delay – after a failure (usually ROLLBACK), the number of seconds before message will be available for dequeuing again.
  • retention_time – the time the message remains in the queue table after dequeuing.
  • dependency_tracking – not currently implemented.
  • comment – queue documentation, which is kept in the data dictionary.
  • auto_commit – deprecated.
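Putting it together, creating and starting a queue might look like this sketch.  The queue and queue table names are hypothetical, and the queue table is assumed to have been created earlier (as covered in a previous part of this series); DBMS_AQADM.START_QUEUE is needed before messages can be enqueued or dequeued.

```sql
BEGIN
   DBMS_AQADM.CREATE_QUEUE (
      queue_name     => 'MSG_QUEUE',
      queue_table    => 'MSG_QUEUE_TABLE',   -- created beforehand via CREATE_QUEUE_TABLE
      max_retries    => 3,
      retry_delay    => 30,                  -- seconds before a failed message is retried
      retention_time => 86400,               -- keep dequeued messages for one day
      comment        => 'Demonstration queue');

   DBMS_AQADM.START_QUEUE (queue_name => 'MSG_QUEUE');
END;
/
```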

Understanding the Developer Inexperienced in Oracle Technologies (DIO)

An anti-pattern is a common design, SQL or application development practice that results in poor performance.   This post discusses why anti-patterns are so common in today’s information technology.

Even though OO theory is in fact older than relational theory, originating with Simula-67 in 1967, versus the early ‘70s for Codd’s and Date’s works, OO didn’t really become the dominant force until the ‘90s, resulting in the common view that RDBMS are old and outdated.

The majority of today’s developers receive little, if any, training in relational theory or practice in college. Their training is almost totally in procedural, object oriented (OO) languages, primarily Java, possibly with a smattering of functional languages and techniques.

Further, most of today’s developers have had no real training in RDBMS theory, SQL, Oracle or good design practices. They typically pick up Oracle and SQL by a little bit of reading, but never delve deeply into the techniques. They pick up random bits of mis-information, which they take as gospel. We will refer to them as Developers Inexperienced in Oracle technologies (DIO).

This results in a common set of anti-patterns, which I’ve seen in many different environments, and which result in both poor performance in the database and complex, poorly performing application code.

Common anti-patterns include:

  • Overly Generic Data Models – In particular, variants of the OBJECT/INSTANCE and ATTRIBUTE/ATTRIBUTE_VALUE table approach.   Very often, this will involve poorly managed data, e.g., surrogate keys for the same meta-data that is not consistent across environments, overloaded columns requiring data conversion, lack of integrity constraints and inappropriate indexing.
  • Fat, Unnormalized Tables – Normalization is seldom taught these days, and, when it is, it is all too often perceived as a purely theoretical exercise. Tables with dozens, and even hundreds, of columns are all too common. Making matters worse is the overuse of CLOBs. This is further exacerbated by making such tables generic, e.g., a table named DATA_OBJECT with 15 number columns named “NUMBER_COL_1” thru “NUMBER_COL_15”, 15 such columns for various sizes of VARCHAR2, DATE, etc.  In fact, this model was even worse! There were 6 more tables, DATA_OBJECT_1 through DATA_OBJECT_6, all of which had 6 recursive foreign keys and 6 foreign keys back to DATA_OBJECT. The data model diagram looks like a tarantula or chip diagram.
  • Fear of Joins – It’s amazing, and a bit scary, how often I’ve heard the justification “I didn’t want to do any joins” as the reason for poor design.  Or “joins are bad”, so “we avoid them at all costs”. One of my favorite jokes: “A DBA asked a developer why he wrote a Cartesian join that brought the system to its knees. The developer replied ‘I wanted to be sure Oracle could handle something simple before I gave it anything more complex’”!
  • SQL Query Cost – While it seems a bit contradictory, one of the most common causes of poor performance is the DIO’s lack of understanding of the relative cost of a SQL query, typically resulting in issuing far more queries than are necessary. While the individual query may be very efficient, the sheer volume results in a severe performance problem.
  • Iterative vs. Set World View – The procedural developer thinks in terms of iterating over a collection, e.g., performing the same operation on each individual element of the collection. The SQL performance expert thinks in terms of (result-) sets. The net effect is that the DIO all too often performs operations, such as de-facto joins, in application code that should be performed in the database, resulting in executing far more queries than are necessary. My favorite, real world example is the DIO that performed 80,000 SELECTs taking almost 4 hours. The same operation performed with a single query took approximately 30 seconds.   All of the processing that was performed in the application was performed by a CASE clause in the SELECT. (Even sadder, the development manager wanted to schedule a meeting to determine which approach was better, as the CASE clause was “complicated”.)
  • Unmanaged Surrogate Keys – One of the fundamental tenets of relational design is that the PK value for the same “row” is the same everywhere, e.g. the value of COMPANY_ID for a company is the same in all environments. However, since so much data is generated by application code on the fly, this tenet is often violated, leading to a wide variety of performance and consistency issues. Imagine a BUSINESS_SERVICE reference table with 30 rows where the PK value is different across the development, QA, integration and production environments (except this is a real example, not imaginary)! Typically, there will be another lookup column(s) with a unique index/key that is supposed to be consistent across environments. Be aware that such lookup columns all too frequently use mixed case strings.
  • Widespread use of Dummy Values instead of NULL – DIOs are typically very uncomfortable with 3-valued logic and unsure when it is appropriate to use NULL.   They are also unsure about the interactions between NULL and indexes. As such, they typically use dummy values where NULL would be a better choice.
  • Lack of Documentation
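The iterative-vs-set contrast above can be sketched in SQL.  The table and column names below are hypothetical, invented purely for illustration; the point is that the per-row classification the application loop performed can be expressed as a CASE expression in a single query.

```sql
-- Anti-pattern: one query per element, with the decision made in application code.
-- The application loops over ids, issuing this N times:
--    SELECT status, paid FROM orders WHERE order_id = :id;

-- Set-based alternative: one query, classification done by CASE
SELECT o.order_id,
       CASE
          WHEN o.status = 'SHIPPED' AND o.paid = 'Y' THEN 'COMPLETE'
          WHEN o.status = 'SHIPPED'                  THEN 'AWAITING_PAYMENT'
          ELSE                                            'IN_PROGRESS'
       END AS order_state
FROM   orders o
WHERE  o.order_date >= TRUNC (SYSDATE);
```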

I will cover specific instances and SQL Quick Fixes in subsequent posts.

SQL Quick Fix #1 – Never UPDATE Primary Key (or Unchanged) Columns

Never UPDATE primary key (PK) columns, not even to their current value.  Not only is it bad practice and bad design, it results in unnecessary checking for child rows if there are foreign keys (FK) referencing the primary key.

An UPDATE statement should only update the columns that are actually receiving new values.  When a column is updated, Oracle does not distinguish between a changed value and an unchanged value.  (This is by design, for reasons having to do with read consistency.)

Such UPDATEing is part of a common anti-pattern where all of the columns in a row are UPDATEd, even if the values of some columns aren’t changed.
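The contrast looks like this (the EMPLOYEES table and bind variables are just illustrative names):

```sql
-- Anti-pattern: every column updated, including the primary key
UPDATE employees
SET    employee_id = :employee_id,    -- PK "updated" to its current value
       last_name   = :last_name,
       salary      = :salary
WHERE  employee_id = :employee_id;

-- Better: only the columns that actually change
UPDATE employees
SET    salary = :salary
WHERE  employee_id = :employee_id;
```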

Hibernate does this by default: dynamic-update=false results in all of the columns being included in the UPDATE statement.  You can configure the dynamic-update property via annotation or XML mapping file.  It should be set to true.