Troubleshooting Quartz Trigger Jobs
In Verify Privilege Vault 10.7 and newer, all background operations have moved to Quartz. All scheduled background operations have become jobs registered in tables inside the Verify Privilege Vault database.
Quartz-Related Tables
All Quartz-related table names start with "QRTZ_".
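To see which Quartz tables exist in a particular database, a catalog query along these lines may help. It is a minimal sketch that assumes a Microsoft SQL Server database (as the TOP and GETUTCDATE() syntax used later in this article implies) and relies only on the standard INFORMATION_SCHEMA.TABLES view. The set of tables returned depends on the installed version.

```sql
-- List every base table whose name begins with the QRTZ_ prefix.
-- The underscore is wrapped in [] because a bare "_" in LIKE matches any single character.
SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
  AND TABLE_NAME LIKE 'QRTZ[_]%'
ORDER BY TABLE_NAME;
```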
Trigger Job Components
Each background job runs as a consumer in the background worker, similar to how a distributed engine (DE) works. Each job has a corresponding trigger job registered in Quartz. Quartz executes the trigger job at the configured time and interval. The trigger job puts a message on the backbone bus, and the background worker picks up and executes that message:
Simple Trigger Job
There are two kinds of registered trigger jobs: simple triggers and cron triggers. A simple trigger fires on an interval. For example, on premises the heartbeat job runs every 30 seconds:
In this case, you can ignore the TRIGGER_GROUP and REPEAT_COUNT columns. REPEAT_COUNT limits the number of repetitions, a limit that Verify Privilege Vault does not use. REPEAT_INTERVAL is in milliseconds, so 30000 is 30 seconds. TIMES_TRIGGERED counts how many times the trigger has fired. There are two trigger jobs for heartbeat: the local job handles secrets that do not run through a distributed engine, and the other job handles secrets that do.
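The columns described above can be inspected directly. The following is a minimal sketch that assumes the simple triggers live in dbo.QRTZ_SIMPLE_TRIGGERS, the table name used by the standard Quartz schema. This article does not name the table, so treat the table name and the TRIGGER_NAME column as assumptions to verify against your database.

```sql
-- Assumption: simple triggers are stored in dbo.QRTZ_SIMPLE_TRIGGERS
-- (the standard Quartz schema name; not stated in this article).
SELECT
    TRIGGER_NAME,
    REPEAT_INTERVAL,                                      -- milliseconds
    REPEAT_INTERVAL / 1000.0 AS REPEAT_INTERVAL_SECONDS,  -- 30000 ms is 30 seconds
    TIMES_TRIGGERED                                       -- number of firings so far
FROM dbo.QRTZ_SIMPLE_TRIGGERS
ORDER BY TRIGGER_NAME;
```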
Cron Trigger Job
Cron jobs run at a specific time using a variation of the well-known cron syntax. For example:
In this example, the trigger ExpiringLicenseTaskTriggerJob runs every day at 12:00 PM UTC.
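To look at the stored schedules, a query like the one below may be useful. It is a sketch that assumes the cron triggers are kept in dbo.QRTZ_CRON_TRIGGERS with CRON_EXPRESSION and TIME_ZONE_ID columns, as in the standard Quartz schema; none of these names are given in this article. In Quartz cron syntax the fields are seconds, minutes, hours, day of month, month, and day of week, so a daily 12:00 PM schedule would typically be written as 0 0 12 * * ? (shown only as an illustration, not as the value the product stores).

```sql
-- Assumption: cron triggers are stored in dbo.QRTZ_CRON_TRIGGERS
-- (standard Quartz schema; table and column names are not stated in this article).
SELECT
    TRIGGER_NAME,
    CRON_EXPRESSION,   -- for example, '0 0 12 * * ?' means every day at 12:00 PM
    TIME_ZONE_ID       -- time zone in which the expression is evaluated
FROM dbo.QRTZ_CRON_TRIGGERS
ORDER BY TRIGGER_NAME;
```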
The Quartz jobs run within the background scheduler role, which runs on every node that has the background-worker-enabled bit set to true. Quartz coordinates trigger job execution through the database to ensure that each trigger job runs only once, even if the Quartz scheduler is running on multiple nodes simultaneously.
Procedures
Viewing Trigger Job State (QRTZ_TRIGGERS Table)
The current state of each trigger job can be seen in the QRTZ_TRIGGERS table:
Viewing Trigger Firing Times
In the previous figure, the NEXT_FIRE_TIME and PREV_FIRE_TIME columns are not human-readable. To convert them to readable dates and times, run the following SQL query:
-- Convert the raw time columns to DATETIME; NEXT_FIRE_IN is the number of seconds until the next firing.
SELECT TOP 1000
    [TRIGGER_NAME]
    ,[TRIGGER_STATE]
    ,[TRIGGER_TYPE]
    ,NEXT_FIRE_IN = DATEDIFF(second, GETUTCDATE(), CAST(NEXT_FIRE_TIME/864000000000.0 - 693595.0 AS DATETIME))
    ,UTCNOW = GETUTCDATE()
    ,NEXT_FIRE_TIME2 = CAST(NEXT_FIRE_TIME/864000000000.0 - 693595.0 AS DATETIME)
    ,[PREV_FIRE_TIME2] = CAST(PREV_FIRE_TIME/864000000000.0 - 693595.0 AS DATETIME)
    ,[START_TIME2] = CAST(START_TIME/864000000000.0 - 693595.0 AS DATETIME)
    ,[END_TIME2] = CAST(END_TIME/864000000000.0 - 693595.0 AS DATETIME)
    ,[PRIORITY]
FROM [dbo].[QRTZ_TRIGGERS]
This query shows NEXT_FIRE_IN, the number of seconds until the next firing, and the other time fields as DATETIME values. It also returns the current UTC time (UTCNOW) to make comparison easier.
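The two constants in the conversion are consistent with times stored as ticks (100-nanosecond intervals counted from 0001-01-01), while a SQL Server DATETIME counts days from 1900-01-01. The following sketch checks that arithmetic under that assumption; the sample tick value is the tick count for 1900-01-01 and is used only for illustration.

```sql
-- Ticks per day: 24 h * 60 min * 60 s * 10,000,000 ticks per second.
SELECT CAST(24 AS BIGINT) * 60 * 60 * 10000000 AS TICKS_PER_DAY;   -- 864000000000

-- Days between the tick epoch (0001-01-01) and the DATETIME epoch (1900-01-01).
SELECT DATEDIFF(day, CAST('0001-01-01' AS DATE), CAST('1900-01-01' AS DATE)) AS EPOCH_OFFSET_DAYS;   -- 693595

-- Applying both constants to the tick value of 1900-01-01 yields day zero of DATETIME.
SELECT CAST(CAST(599266080000000000 AS BIGINT) / 864000000000.0 - 693595.0 AS DATETIME) AS CONVERTED;   -- 1900-01-01 00:00:00.000
```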
The TRIGGER_STATE is normally WAITING. If the trigger job is currently running, the state is ACQUIRED. If there was an error running the trigger job, the state is ERROR. Once a trigger job enters the ERROR state, it does not run again on its own. To address this, Verify Privilege Vault automatically looks for trigger jobs in the ERROR state every 10 minutes and changes their state to WAITING. If NEXT_FIRE_IN (the seconds value computed by the query above) is a large negative number, the scheduler role is not running. Check the Verify Privilege Vault-BSSR.log file to see its status.
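To narrow the output to the triggers that need attention, the same conversion can be used in a filter. This is a minimal sketch: the 300-second threshold for "overdue" is an arbitrary value chosen for illustration, not a product setting.

```sql
-- Triggers that are in the ERROR state or whose next firing is overdue.
-- Assumption: "overdue" means more than 300 seconds in the past (illustrative threshold only).
SELECT
    TRIGGER_NAME,
    TRIGGER_STATE,
    DATEDIFF(second, GETUTCDATE(),
             CAST(NEXT_FIRE_TIME / 864000000000.0 - 693595.0 AS DATETIME)) AS NEXT_FIRE_IN
FROM dbo.QRTZ_TRIGGERS
WHERE TRIGGER_STATE = 'ERROR'
   OR DATEDIFF(second, GETUTCDATE(),
               CAST(NEXT_FIRE_TIME / 864000000000.0 - 693595.0 AS DATETIME)) < -300
ORDER BY NEXT_FIRE_IN;
```

An empty result suggests that no trigger is stuck. Many rows with large negative NEXT_FIRE_IN values point to a scheduler role that is not running.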
Adjusting Trigger Job Frequency
Verify Privilege Vault populates the Quartz schedules when it starts up and creates them if they do not exist. Verify Privilege Vault does not update pre-existing schedules, so it is possible to change the frequency of a job in the table and have the system run with that frequency from that point on.
IBM Security controls those schedules, and future releases may not recognize changes you have made to them. Furthermore, we do not test with a variety of schedules and have no plans to do so. Adjusting the schedules is therefore risky and may cause issues within Verify Privilege Vault. If you must change the tables, restart IIS on every node where the scheduler runs after you change the table values.
To trigger an infrequent task just once, without altering its schedule, follow these steps:
- Update the NEXT_FIRE_TIME value in the QRTZ_TRIGGERS table to match the NEXT_FIRE_TIME value from a trigger that fires frequently, such as one of the heartbeat triggers.
- Recycle the scheduler role by resetting the background worker (inside the application pool) in one of the following ways:
  - Click Recycle Background Processes on the Diagnostics page. This shuts down and restarts the scheduler.
  - Recycle the application pool. This restarts Verify Privilege Vault in its entirety.
  - Reset IIS. From Verify Privilege Vault's perspective, this is nearly the same as recycling the application pool.
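The table update in the first step can be sketched as follows. The sketch assumes that the value to copy is the NEXT_FIRE_TIME column shown in the QRTZ_TRIGGERS query earlier and that trigger names are unique in the table. Both trigger names are placeholders: ExpiringLicenseTaskTriggerJob is taken from the cron example above, and the heartbeat trigger name must be looked up in your own QRTZ_TRIGGERS table.

```sql
-- Hypothetical names: replace both with real TRIGGER_NAME values from dbo.QRTZ_TRIGGERS.
DECLARE @InfrequentTrigger NVARCHAR(150) = N'ExpiringLicenseTaskTriggerJob';
DECLARE @FrequentTrigger   NVARCHAR(150) = N'<name of a heartbeat trigger>';

BEGIN TRANSACTION;

-- Copy the next fire time from the frequent trigger to the infrequent one.
UPDATE t
SET t.NEXT_FIRE_TIME = f.NEXT_FIRE_TIME
FROM dbo.QRTZ_TRIGGERS AS t
CROSS JOIN dbo.QRTZ_TRIGGERS AS f
WHERE t.TRIGGER_NAME = @InfrequentTrigger
  AND f.TRIGGER_NAME = @FrequentTrigger;

-- Review both rows; their NEXT_FIRE_TIME values should now match.
SELECT TRIGGER_NAME, NEXT_FIRE_TIME
FROM dbo.QRTZ_TRIGGERS
WHERE TRIGGER_NAME IN (@InfrequentTrigger, @FrequentTrigger);

COMMIT TRANSACTION;   -- use ROLLBACK TRANSACTION instead if the result is not what you expect
```

Running the statements one at a time, rather than as a single batch, leaves room to roll back before the commit. After the update, recycle the scheduler role as described in the second step.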