Replaying workloads with the RML Utilities

In today’s weblog posting I want to talk about replaying SQL Server workloads with the RML Utilities. RML Utilities stands for Replay Markup Language Utilities and are provided by Microsoft as a free download. You can download them here: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=7EDFA95A-A32F-440F-A3A8-5160C8DBE926. I have first encountered the RML Utilities when I read the book “Professional SQL Server 2008 Internals and Troubleshooting” (ISBN 978-0470484289) written by Christian Bolton, Justin Langford, Brent Ozar, James Rowland-Jones, and Steven Wort.

The RML Utilities contains a replaying engine that replays SQL Server workloads in real-time. The above mentioned book states the following on page 462 about those tools: “RML Utilities allow you to replay traces at the same rate they were captured. That means you get a better representation of true user activity and server workload than you do with other tools that condense the statements and execute them in sequence“.

That is also the real advantage of the RML Utilities: Replaying traces at the same rate they were captured! I have several clients that have really hard problems when they want to replay workloads, e.g. in a test environment when they have applied new indexes for performance optimization. SQL Server Profiler just replays workloads, but you don’t get the identical workload at the same rate replayed, so you are not able to reproduce some scenarios like locking/blocking.

For a recent customer engagement I’ve prepared a demo where I wanted to show how you can replay a workload that was generated from a .NET application. This sounds very easy, especially when you have read chapter 12 of the above mentioned book, but there are really some awful pitfalls that you have to know when you are working with the RML Utilities and SQL Server workloads generated by .NET applications. With this weblog posting I want to show you those pitfalls and what workarounds you have to do to get everything working fine.

Let’s start by the configuration needed to capture a SQL Trace for the RML Utilities. Everything that you have to do is to setup a server-side SQL Trace and capture everything in a .TRC file. The biggest problem here is the fact that you have to capture a lot of different events, which means that your .TRC files are getting really large. I’ve seen traces with several 100 GB of data in it. The RML Utilities are coming with a SQL script file that you can use as a basis for capturing a SQL Trace (TraceCaptureDef.sql):

CREATE PROCEDURE #tmpPPEventEnable @TraceID int, @iEventID int
AS
BEGIN

    set nocount on


    declare @iColID        int
    declare @iColIDMax    int
    declare @on bit

    set @on= 1
    set @iColID = 1
    set @iColIDMax = 64

    while(@iColID <= @iColIDMax)
    begin
        exec sp_trace_setevent @TraceID, @iEventID, @iColID, @on
        set @iColID = @iColID + 1
    end
END
GO

-- Create a Queue
declare @rc int
declare @TraceID int
declare @maxfilesize bigint
set @maxfilesize = 250-- An optimal size for tracing and handling the files 

-- Please replace the text InsertFileNameHere, with an appropriate
-- file name prefixed by a path, e.g., c:\MyFolder\MyTrace. The .trc extension
-- will be appended to the filename automatically. 

exec @rc = sp_trace_create @TraceID output, 2 /* rollover*/, N'InsertFileNameHere', @maxfilesize, NULL
if (@rc != 0) goto error 

declare @off bit

set @off = 0 

-- Set the events
exec #tmpPPEventEnable @TraceID, 10 -- RPC Completed
exec #tmpPPEventEnable @TraceID, 11 -- RPC Started
 
declare @strVersion varchar(10)

set @strVersion = cast(SERVERPROPERTY('ProductVersion') as varchar(10))
if( (select cast( substring(@strVersion, 0, charindex('.', @strVersion)) as int)) >= 9)
begin
    exec sp_trace_setevent @TraceID, 10, 1, @off        --        No Text for RPC, only Binary for performance
    exec sp_trace_setevent @TraceID, 11, 1, @off        --        No Text for RPC, only Binary for performance
end

exec #tmpPPEventEnable @TraceID, 44 -- SP:StmtStarting
exec #tmpPPEventEnable @TraceID, 45 -- SP:StmtCompleted
exec #tmpPPEventEnable @TraceID, 100 -- RPC Output Parameter
exec #tmpPPEventEnable @TraceID, 12 -- SQL Batch Completed
exec #tmpPPEventEnable @TraceID, 13 -- SQL Batch Starting
exec #tmpPPEventEnable @TraceID, 40 -- SQL:StmtStarting
exec #tmpPPEventEnable @TraceID, 41 -- SQL:StmtCompleted
exec #tmpPPEventEnable @TraceID, 17 -- Existing Connection
exec #tmpPPEventEnable @TraceID, 14 -- Audit Login
exec #tmpPPEventEnable @TraceID, 15 -- Audit Logout
exec #tmpPPEventEnable @TraceID, 16 -- Attention
exec #tmpPPEventEnable @TraceID, 19 -- DTC Transaction
exec #tmpPPEventEnable @TraceID, 50 -- SQL Transaction
exec #tmpPPEventEnable @TraceID, 50 -- SQL Transaction
exec #tmpPPEventEnable @TraceID, 181 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 182 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 183 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 184 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 185 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 186 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 187 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 188 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 191 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 192 -- Tran Man Event
exec #tmpPPEventEnable @TraceID, 98 -- Stats Profile
exec #tmpPPEventEnable @TraceID, 53 -- Cursor Open
exec #tmpPPEventEnable @TraceID, 70 -- Cursor Prepare
exec #tmpPPEventEnable @TraceID, 71 -- Prepare SQL
exec #tmpPPEventEnable @TraceID, 73 -- Unprepare SQL
exec #tmpPPEventEnable @TraceID, 74 -- Cursor Execute
exec #tmpPPEventEnable @TraceID, 76 -- Cursor Implicit Conversion
exec #tmpPPEventEnable @TraceID, 77 -- Cursor Unprepare
exec #tmpPPEventEnable @TraceID, 78 -- Cursor Close
exec #tmpPPEventEnable @TraceID, 22 -- Error Log
exec #tmpPPEventEnable @TraceID, 25 -- Deadlock
exec #tmpPPEventEnable @TraceID, 27 -- Lock Timeout
exec #tmpPPEventEnable @TraceID, 60 -- Lock Escalation
exec #tmpPPEventEnable @TraceID, 28 -- MAX DOP
exec #tmpPPEventEnable @TraceID, 33 -- Exceptions
exec #tmpPPEventEnable @TraceID, 34 -- Cache Miss
exec #tmpPPEventEnable @TraceID, 37 -- Recompile
exec #tmpPPEventEnable @TraceID, 39 -- Deprocated Events
exec #tmpPPEventEnable @TraceID, 55 -- Hash Warning
exec #tmpPPEventEnable @TraceID, 58 -- Auto Stats
exec #tmpPPEventEnable @TraceID, 67 -- Execution Warnings
exec #tmpPPEventEnable @TraceID, 69 -- Sort Warnings
exec #tmpPPEventEnable @TraceID, 79 -- Missing Col Stats
exec #tmpPPEventEnable @TraceID, 80 -- Missing Join Pred
exec #tmpPPEventEnable @TraceID, 81 -- Memory change event
exec #tmpPPEventEnable @TraceID, 92 -- Data File Auto Grow
exec #tmpPPEventEnable @TraceID, 93 -- Log File Auto Grow
exec #tmpPPEventEnable @TraceID, 116 -- DBCC Event
exec #tmpPPEventEnable @TraceID, 125 -- Deprocation Events
exec #tmpPPEventEnable @TraceID, 126 -- Deprocation Final
exec #tmpPPEventEnable @TraceID, 127 -- Spills
exec #tmpPPEventEnable @TraceID, 137 -- Blocked Process Threshold
exec #tmpPPEventEnable @TraceID, 150 -- Trace file closed
exec #tmpPPEventEnable @TraceID, 166 -- Statement Recompile
exec #tmpPPEventEnable @TraceID, 196 -- CLR Assembly Load

--    Filter out all sp_trace based commands to the replay does not start this trace
--    Text filters can be expensive so you may want to avoid the filtering and just
--    remote the sp_trace commands from the RML files once processed.
exec sp_trace_setfilter @TraceID, 1, 1, 7, N'%sp_trace%'

-- Set the trace status to start
exec sp_trace_setstatus @TraceID, 1

/*
exec sp_trace_setstatus 2, 0
exec sp_trace_setstatus 2, 2
*/

print 'Issue the following command(s) when you are ready to stop the tracing activity'
print 'exec sp_trace_setstatus ' + cast(@TraceID as varchar) + ', 0'
print 'exec sp_trace_setstatus ' + cast(@TraceID as varchar) + ', 2'

goto finish

error:
select ErrorCode=@rc

finish:
--select * from ::fn_trace_geteventinfo(@TraceID)
select * from sys.traces
go

As you can see from the previous listing, you have to collect a lot of different events from SQL Trace that are needed for the replaying functionality. If you don’t capture them, you are not able to replay the captured workload. A good practice here is to use rollover files, to keep the size of the .TRC files manageable.

After you have configured and started your SQL Trace, you can use your .NET application that interacts with SQL Server. As soon as you have captured the relevant workload you can stop and delete the SQL Trace from SQL Server. Inside your file system you have now your .TRC files with the captured workload.

And now the real fun begins. If you follow the descriptions from chapter 14, you will fail immediately in replaying the captured workload. The problem is that the RML Utilities have several bugs/problems, when they have to replay workloads that were captured from .NET applications that are connecting to SQL Server through MARS (Multiple Active Result Sets). I have encountered those problems, because for my demonstration I have used a .NET application (which uses MARS) that I have written in the year 2006. I just wanted to be realistic as possible J. All those problems are described in the following weblog posting from the CSS SQL Server Engineers: http://blogs.msdn.com/b/psssql/archive/2009/01/21/prb-rml-utilities-readtrace-and-how-to-workaround-mars.aspx.

The first thing that you have to do with your SQL Trace workload is to load it into a trace table in a SQL Server database through the following command:

SELECT * INTO TraceTable FROM fn_trace_gettable(‘D:\Workload.trc’, 0)

Please keep in mind that this process could also take some time when you are working with large .TRC files. After you have loaded the trace to your trace table, you have to change it as described in the above mentioned weblog posting:

UPDATE TraceTable
SET SPID =
CASE
    WHEN RequestID = 0 THEN
        SPID
    ELSE
        32768 + (SPID * 20) + (RequestID % 20)
END,
RequestID = 0
GO

Finally you can open the trace table in SQL Server Profiler and export everything to a new .TRC file in the file system. Again this could take a large amount of time when you are working with a large trace table.

In the next step you can now convert the changed .TRC file to RML files that can be replayed against a SQL Server instance. For that purpose you can use the readtrace.exe utility along with the trace flag T28. See the following command line:

readtrace.exe -Id:\MyNewTrace.trc -od:\PerfTesting_Output -T28

The problem that I encountered with this approach is the fact that I get RML files that can’t be replayed against SQL Server. Here is the reason: each RML file includes the SPID in the file name, and I got RML files like “SQL33333.rml” – and 33333 is an invalid SPID in SQL Server, because SQL Server can “only” handle 32767 connections. For that reason I have changed the file names to SPIDs that are supported by SQL Server. If you have a large amount of RML files, it would also make sense to write a simple program which renames the RML files to valid file names.

After I have changed each file name to a valid SPID, you are now able to replay the workload with the ostress.exe utility:

ostress.exe -id:\PerfTesting_Output\*.rml -mreplay -csample.ini

As you can see you have to do a lot of things when you want to replay workloads from .NET applications against SQL Server that are using MARS. With this approach it is very easy to check in a test environment what the impacts of some configuration changes are or how your application reacts when you are adding or deleting indexes.

-Klaus

1 thought on “Replaying workloads with the RML Utilities”

  1. Nice post. I tried experimenting with OStress and the RML tools this weekend and am having issues getting it to replay the trace with real timings. I set up my trace and manually run 4 queries (pausing a few seconds between each) and then convert to .RML via ReadTrace.EXE. When I replay the trace with ostress and monitor via profiler there is no delay between the 4 queries.

    I’ve tried running them from seperate connections as well as from the one. I have also verified that the batch times are recorded and displayed correctly when Reporter opens after ReadTrace. I am using -T89 to enable exact sequencing mode, but there is no difference in the results

    Do you have you any suggestions?

Comments are closed.