Monday 7 December 2015

Event Queues Part Uno

WHY SHOULD I KEEP AS LESS RECORDS AS POSSIBLE

CMS Tuning Guide [2.3 Chapter] recommends to keep less than 1000 records in Event Queue in ideal world.

The more records exist in EventQueue table, the more time and resources it takes for SQL Server to find fresh rows raised by other instances.

Taking into account that the aforementioned SQL would be executed every 2 seconds by every Sitecore instance, keeping obsolete entries in database could lead to MSSQL overheat.

A symptom would be SQL Timeout errors in Sitecore Logs:

Message: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.

Source: Sitecore.Kernel
   at Sitecore.Data.DataProviders.Sql.DataProviderCommand.ExecuteReader()
.....
   at Sitecore.Eventing.EventQueue.ProcessEvents(Action`2 handler)
   at Sitecore.Eventing.EventProvider.RaiseQueuedEvents()

WHO CLEANS UP ?

Sitecore ships with Sitecore.Tasks.CleanupEventQueue agent defined in web.config.

It can remove one day old records.

A more aggressive cleanup policy was implemented in 7.2 Update-4 [ref. #392673]. It allows to specify number of minutes to keep. Yippee!

Although one can always configure custom SQL Agent to manually cleanup Event Queue, but Sitecore stock mechanism does its job pretty well.

AM I AFFECTED?

A handy SQL Query would help to answer the question:

SELECT SUBSTRING(p.[Key],9,100) AS [Instance], CONVERT(BINARY(8), CAST(CAST(p.[Value] AS NVARCHAR(10)) AS int )) AS [LastProcessedStamp],
(SELECT COUNT(*) FROM [EventQueue] WHERE [Stamp] > CONVERT(INT, CAST(p.[Value] AS NVARCHAR(8)))) AS [TODO],
(CASE WHEN (q.[Created] is null) THEN
(
CONVERT(VARCHAR(24),(SELECT MAX([Created])-MIN([Created]) FROM EventQueue),20)
)
ELSE
CONVERT(VARCHAR(24),(SELECT top(1) [Created] AS TopCreated FROM EventQueue order by [Stamp] desc) - (q.[Created]),20)
end) AS [ProcessingDelay],
 SUBSTRING(q.[EventType],0, CHARINDEX(',',q.[EventType])) AS [LastEventType],
 q.[InstanceName] as [RaisedByInstance],
 q.[UserName] as [RaisedByUser],
 q.[Created] as [RaisedTime],
 q.[InstanceData] as [LastEventData],
 q.[Id] as [LastEqID]

FROM  Properties p 
FULL join EventQueue q
ON q.[Stamp] = CONVERT(BINARY(8), CAST(CAST(p.[Value] AS NVARCHAR(10)) AS int ))
WHERE p.[Key] LIKE 'EQStamp%'
order by TODO

Sitecore keeps last processed event stamp in Properties table as number in decimal, whereas SQL stores stamp as TimeStamp (shown as hex)

Given SQL query would try to find last processed event details, and number of events that were created later ( pending events ).

ProcessingDelay column is a good measure showing how much time would it take for a published item to appear in live site. It is calculated as timespan between now, and last processed event creation time.

WHY COULD DELAY APPEAR ?

Common sense rule - events are raised faster then processed.

Publishing, or content editing can be performed in a few threads, whereas events processing is done in a single Heartbeat thread.


It could also take some time to find & remove modified item entry from large Sitecore caches that are accessed by many threads simultaneously.

It is expected that some time could be required for Sitecore servers to process all the events during massive publish operations.


HOW TO FIGHT BACK ?

Raise less events, write less data, stop storing legacy data in EventQueue.

In next articles I will show basic techniques on how to eliminate processing delays.

No comments:

Post a Comment