CPU affinity in Oracle comparison to SQL Server

 
OS: REL7 64 (24 core), DB version: Oracle 12.1.0.2.0
OS: WINDOWS SERVER 2016 64 (4 core), DB version: SQL Server 2016

In the current topic I am going to explain CPU affinity in Oracle with comparison to SQL Server.

It is known that we can use SQL Server Management Studio to make CPU affinity in SQL Server .Please see below.

Even by using query like below, we can find out on which core a process is running on in SQL Server.

SELECT	ipir.command, ipir.scheduler_id, os.cpu_id, .........
 FROM sys.dm_exec_requests ipir  INNER JOIN sys.dm_os_schedulers os
  ON ipir.scheduler_id = os.scheduler_id

But what about Oracle? Which approach should we use in Oracle? Does Oracle have such abilities to do CPU affinity?

Lets make some test and get answer for questions mentioned above.

LSCPU command shows 24 core in the system.

Lets set CPU_COUNT parameter for the database to 1.

The question is: If we set CPU_COUNT to 1 which of 24 processors will be used by process and can we make CPU affinity by Oracle itself?

Lets make a test by using steps below.

  1. Open SQLPLUS and run SELECT * FROM DBA_OBJECTS for emulation some activity.
  2. Find the PID of SQLPLUS process and use it in the TOP command.
  3. In the TOP command activate P column if it is not activated (press F then J then ENTER to activate CPU P column).
  4. Watch the CPU usage by process in TOP GUI.

From the last image we see (P column) that the process uses CPU 7 but after a while it will change to CPU 6 and etc…

As I understood which PROCESSOR will be used by process considers randomly (might be by using system internal algorithm).

Even TASKSET command shows us that no CPU affinity is configured for the process:  pid 23033’s current affinity list: 0-23

Conclusion: By setting CPU_COUNT we just set the number of using CPU (as expected) and which CPU will be used by process considers randomly (might be by using system internal algorithm).

There are two approaches for CPU caging “partition” and “over-subscribe“.

For maximum isolation one can use “partition” approach. With the partition approach, the sum of the CPU_COUNTs is less than or equal to the number of CPUs.
Example: suppose the total number of CPUs (i.e. CPU threads/cores) is 24.
We could set CPU_COUNT=8 for database , CPU_COUNT=8 for database B, and CPU_COUNT=8 for database C.
The disadvantage of the partition approach is that any CPU unused by one database instance cannot be used by another.

For better CPU utilization efficiency, use the “over-subscribe” approach.
Example: for a server with 24 CPUs, you could set CPU_COUNT=24 for database A, database B, and database C.
The sum of the CPU_COUNTs is 72, which is greater than the number of CPUs.
Therefore, if all databases are using their full CPU allocation, there will be some CPU contention.

However, by using TASKSET utility we can configure CPU affinity for Oracle process. Thanks for you attention!

Appendix: MOS: Doc ID 1585184.1. Using PROCESSOR_GROUP_NAME to bind a database instance to CPUs or NUMA nodes on Linux.

By using this MOS DOC you can achieve your target by creating Linux CGROUP group. This is an another solution.

 

TRANSACTION_MUTEX wait in MS SQL Server

A couple of days ago, I was informed about performance degradation which periodically happened  in our production system of MS SQL Server 2012 instance running under Windows Server 2012R2. QUERYSTORE would be good solution for monitoring performance for the situation, however it comes  in SQL Server 2014 version.  Because there is not QUERYSTORE technology in SQL Server 2012, lets create manual sampling by using  BRENTOZAR  script sp_BlitzFirst. I have created JOB which runs every 15 minute and calls the sp_BlitzFirst. Please see below:

USE [msdb]
GO
DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'DBAtools_Sampling',
@enabled=0,
@notify_level_eventlog=0,
@notify_level_email=0,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N'No description available.',
@category_name=N'[Uncategorized (Local)]',
@owner_login_name=N'SA', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode<> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Step1',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'TSQL',
@command=N'EXEC sp_BlitzFirst
@Seconds = 30,
@OutputDatabaseName = ''DBAtools'',
@OutputSchemaName = ''dbo'',
@OutputTableName = ''BlitzFirst'',
@OutputTableNameFileStats = ''BlitzFirst_FileStats'',
@OutputTableNamePerfmonStats = ''BlitzFirst_PerfmonStats'',
@OutputTableNameWaitStats = ''BlitzFirst_WaitStats'',
@OutputTableNameBlitzCache = ''BlitzCache''',
@database_name=N'DBAtools',
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'Schedule1',
@enabled=1,
@freq_type=4,
@freq_interval=1,
@freq_subday_type=4,
@freq_subday_interval=15,
@freq_relative_interval=0,
@freq_recurrence_factor=0,
@active_start_date=20190620,
@active_end_date=99991231,
@active_start_time=0,
@active_end_time=235959,
@schedule_uid=N'1d27b7fb-ae33-4334-94f2-7b9101692760'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO

About  sp_BlitzFirst  you can read from BRENTOZR  blog. It creates a couple of tables and views in the given database for periodically sampling (30 seconds in our situation) from SQL Server internal performance views like  dm_os_wait_stats, dm_io_virtual_file_stats and etc.

After sampling we can select the blitzfirst_waitstats  table and see which are the most occurred waits.

select [wait_type],
 sum([wait_time_ms]) / 1000 / 3600 wt
  from [dbatools].[dbo].[blitzfirst_waitstats] group by [wait_type] 
    order by wt desc

wait_type	         wt
TRANSACTION_MUTEX	30024
BACKUPIO	        2504
TRACEWRITE	        1792
ASYNC_NETWORK_IO	1179
SOS_SCHEDULER_YIELD	906
CXPACKET	        753
LCK_M_IS	        696
LCK_M_SCH_M	        630
WRITELOG	        513
LCK_M_SCH_S	        452
LCK_M_IX	        319
...........

As we see the TRANSACTION_MUTEX is the top wait. The Books Online definition of TRANSACTION_MUTEX is that it “occurs during synchronization of access to a transaction by multiple batches. Lets go deeper and catch the query which generates TRANSACTION_MUTEX wait. For doing that I am going to use Extended Events technology.  

First of all I got the code of TRANSACTION_MUTEX wait 

SELECT
[map_key]
FROM sys.dm_xe_map_values
WHERE [name] = N'wait_types'
AND [map_value] = N'TRANSACTION_MUTEX';

map_key
---------
210

Then created Extended Event for catching Query which generate TRANSACTION_MUTEX wait.

CREATE EVENT SESSION [InvestigateWaits] ON SERVER 
ADD EVENT sqlos.wait_info(
    ACTION(sqlserver.database_id,sqlserver.plan_handle,sqlserver.session_id,sqlserver.sql_text,sqlserver.tsql_stack,sqlserver.username)
    WHERE ([wait_type]=(210) AND [opcode]=(1))) 
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY=51200 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=5 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO

Activate created Extended Event and open Watch Live Data page.


We should see window like below.

After monitoring some period of time, At last I have got that query. The query and its plan shown below.

SELECT 
    [GroupBy1].[A1] AS [C1]
    FROM ( SELECT 
        COUNT(1) AS [A1]
        FROM  [dbo].[VW_DOCS] AS [Extent1]
        LEFT OUTER JOIN [dbo].[DOCS_READSTATUS] AS [Extent2] ON ([Extent1].[DocId] = [Extent2].[DocId]) AND ([Extent2].[WorkplaceId] = 24)
        WHERE ([Extent1].[DocDoctypeId] = 1) AND ([Extent1].[DocPeriodId] = 3) AND (([Extent1].[DocOrganizationId] = 24) OR ( EXISTS (SELECT 
            1 AS [C1]
            FROM  [dbo].[DOCS_EXECUTOR] AS [Extent3]
            INNER JOIN [dbo].[DOCS_DIRECTIONS] AS [Extent4] ON [Extent3].[ExecutorDirectionId] = [Extent4].[DirectionId]
            WHERE ([Extent1].[DocId] = [Extent3].[ExecutorDocId]) AND ((([Extent3].[ExecutorWorkplaceId] = 24) AND (1 = [Extent4].[DirectionConfirmed])) OR (([Extent3].[ExecutorOrganizationId] = 24) AND (1 = [Extent4].[DirectionConfirmed])))
        ))) AND (( NOT ((2 =  CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocDeleted] IS NOT NULL))) OR ((2 =  CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocInsertedById] = 24))) AND (1 = 1)
    )  AS [GroupBy1] 

It is one of the most usable query and it takes about 3 seconds to complete. From the query plan we see expensive Nested Loop join which will generate concurrency in case of many query like this will be running. However, If we change join method to Hash Join by using SQL Server hint OPTION(HASH JOIN), we will decrease concurrency and query run time (less then one second). Corrected version of query and its plan shown below.

 
SELECT 
    [GroupBy1].[A1] AS [C1]
    FROM ( SELECT 
        COUNT(1) AS [A1]
        FROM  [dbo].[VW_DOCS] AS [Extent1]
        LEFT OUTER JOIN [dbo].[DOCS_READSTATUS] AS [Extent2] ON ([Extent1].[DocId] = [Extent2].[DocId]) AND ([Extent2].[WorkplaceId] = 24)
        WHERE ([Extent1].[DocDoctypeId] = 1) AND ([Extent1].[DocPeriodId] = 3) AND (([Extent1].[DocOrganizationId] = 24) OR ( EXISTS (SELECT 
            1 AS [C1]
            FROM  [dbo].[DOCS_EXECUTOR] AS [Extent3]
            INNER JOIN [dbo].[DOCS_DIRECTIONS] AS [Extent4] ON [Extent3].[ExecutorDirectionId] = [Extent4].[DirectionId]
            WHERE ([Extent1].[DocId] = [Extent3].[ExecutorDocId]) AND ((([Extent3].[ExecutorWorkplaceId] = 24) AND (1 = [Extent4].[DirectionConfirmed])) OR (([Extent3].[ExecutorOrganizationId] = 24) AND (1 = [Extent4].[DirectionConfirmed])))
        ))) AND (( NOT ((2 =  CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocDeleted] IS NOT NULL))) OR ((2 =  CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocInsertedById] = 24))) AND (1 = 1)
    )  AS [GroupBy1] 
	OPTION (HASH JOIN)


Problem has been fixed. Thank you for attention 🙂


 

A Real life performance issue due to Fetch Next N rows in MS SQL Server

A couple of days ago, I was asked to improve the performance of a particular SQL query which took, in average, 10 second to complete. Here’s below the corresponding SQL and It’s  plan taken from a MS SQL Server 2017 instance running under Windows Server 2016 Standart.

SELECT [x].*
FROM [NotA] AS [x]
WHERE [x].[IsDeleted] = 0
ORDER BY [x].[Id]
OFFSET 5144530 ROWS FETCH NEXT 10 ROWS ONLY


NotA table has primary key Clustered Index on ID column. From the SQL Plan statistics we can observe that more then 5 million Index rows were scanned. But why almost all Index rows were scanned, despite that we read just 10 rows from the Table. Each Index leaf node has address for the next leaf node, the next one for the next leaf node and etc. The more value of OFFSET parameter  the more Index rows were scanned. Let’s see the SQL Plan if we set smaller value for OFFSET parameter.

SELECT [x].*
FROM [NotA] AS [x]
WHERE [x].[IsDeleted] = 0
ORDER BY [x].[Id]
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY

Just 20 rows were scanned and SQL takes less then 1 second to complete. But what should we do If we need to set big value for OFFSET parameter.

Solution for the situation will be using the exact value by BETWEEN statement in the query.  The SQL query and It’s Plan shown below.

SELECT [x].*
FROM [NotA] AS [x]
WHERE [x].[IsDeleted] = 0 and
[x].[Id] between 5144530 and 5144530 + 10
ORDER BY [x].[Id]
--OFFSET 5144530 ROWS FETCH NEXT 10 ROWS ONLY 

SQL query takes less then 1 second to complete.

In case of data is not sequential using approach like 5144530 and 5144530 + 10 is not correct of course. In that case you should use approach like below:

 
SELECT [x].* FROM [NotA] AS [x] WHERE [x].[IsDeleted] = 0 and [x].[Id] in 
(SELECT [x].id FROM [NotA] AS [x] WHERE [x].[IsDeleted] = 0 ORDER BY [x].[Id] 
OFFSET 5144530 ROWS FETCH NEXT 10 ROWS ONLY)

That’s all thanks for attention 🙂

Divide the mutex contention for function and procedure

This is a well-known from Andrey Nikolayev blog, how to divide mutex contention for packages and cursors.

For package:
 alter system set “_kgl_debug”=”name=’DBMS_APPLICATION_INFO’ schema=’SYS’ namespace=1 debug=33554432”, “name=’DBMS_APPLICATION_INFO’ schema=’SYS’ namespace=2 debug=33554432” scope=spfile;
 alter system set “_kgl_hot_object_copies”= 10 scope=spfile;

For cursor:
 alter system set "_kgl_debug" = "hash='4d1ef2753f3bb11043fd2f61c011abac' debug=33554432" scope=spfile;
 alter system set “_kgl_hot_object_copies”= 10 scope=spfile;

But what about Functions and Procedures? What syntax should we use ?
Let’s run the query and see what is happening in our system.

OS: REL7 64
DB version: 12.1.0.2.0

select * from (
   select case when (kglhdadr =  kglhdpar) then 'Parent' else 'Child '||kglobt09 end cursor,
             kglhdadr ADDRESS,substr(kglnaobj,1,20) name, kglnahsh hash_value,kglobtyd type,kglobt23 LOCKED_TOTAL,kglobt24 PINNED_TOTAL,kglhdexc EXECUTIONS,kglhdnsp NAMESPACE 
               from x$kglob  
               order by kglobt24 desc) where type like 'FUNC%' or type like 'PROC%'  

CURSOR  ADDRESS           NAME         HASH_VALUE  TYPE    LOCKED_TOTAL PINNED_TOTAL EXECUTIONS  NAMESPACE
Parent  000000007B69E810  FORMAT_DATE  3111443796  FUNCTION      119         187       99371       1
Parent  000000007BEFE460  SYS_IXMLAGG  4008213881  FUNCTION      45          45        0           1
Parent  000000007E882E38  GET_ADDRESS  3903151007  FUNCTION      28          37        21          1
..........

In all cases we see 1 for NAMESPACE. I am going to clone the FORMAT_DATE function.
Let’s make sure that there is only one instance of FORMAT_DATE function is in the memory.

select * from (
   select case when (kglhdadr =  kglhdpar) then 'Parent' else 'Child '||kglobt09 end cursor,
             kglhdadr ADDRESS,substr(kglnaobj,1,20) name, kglnahsh hash_value,kglobtyd type,kglobt23 LOCKED_TOTAL,kglobt24 PINNED_TOTAL,kglhdexc EXECUTIONS,kglhdnsp NAMESPACE 
               from x$kglob order by kglobt24 desc) 
               where name='FORMAT_DATE'

CURSOR  ADDRESS           NAME         HASH_VALUE  TYPE    LOCKED_TOTAL PINNED_TOTAL EXECUTIONS  NAMESPACE
Parent  000000007B69E810  FORMAT_DATE  3111443796  FUNCTION      119         187       99371       1

We see just one instance of FORMAT_DATE function. Let’s clone the FORMAT_DATE function.

alter system set "_kgl_debug"="name='FORMAT_DATE' schema='CMS' namespace=1 debug=33554432" scope=spfile;
alter system set "_kgl_hot_object_copies"= 10 scope=spfile;
restart instance ....

By default, Oracle created approximately “number of CPU Cores” hot objects copies. You can adjust this number by _kgl_hot_object_copies parameter.
After all check again.

select * from (
   select case when (kglhdadr =  kglhdpar) then 'Parent' else 'Child '||kglobt09 end cursor,
             kglhdadr ADDRESS,substr(kglnaobj,1,20) name, kglnahsh hash_value,kglobtyd type,kglobt23 LOCKED_TOTAL,kglobt24 PINNED_TOTAL,kglhdexc EXECUTIONS,kglhdnsp NAMESPACE 
               from x$kglob order by kglobt24 desc) 
               where name='FORMAT_DATE'

CURSOR  ADDRESS           NAME         HASH_VALUE TYPE     LOCKED_TOTAL PINNED_TOTAL EXECUTIONS NAMESPACE
Parent	000000007BF1B3C8  FORMAT_DATE  2206514549 FUNCTION     2	 2	       0	 1
Parent	000000007E1F4310  FORMAT_DATE  4244497168 FUNCTION     2	 2	       0	 1
Parent	000000007B489908  FORMAT_DATE  3951634923 FUNCTION     2	 2	       0	 1
Parent	000000007E8C90D0  FORMAT_DATE  3111443796 FUNCTION     1	 0	       0	 1

There are 4 instances of FORMAT_DATE function and this count increases (up to 10 which was set above) / decreases depends on load.

Troubleshooting third party SQL query in SQL Server

During pick time on our production system (2-Node Cluster of SQL Server 2012(11.0.6020.0) on Windows Server 2012R2)  SQL Server Profiler showed  SQL query which took, in average, 10 seconds to complete. It is one of the most used SQL query and its completion in 10 seconds causes many problems.

exec sp_executesql N'SELECT [GroupBy1].[A1] AS [C1] FROM ( SELECT
COUNT(1) AS [A1] FROM [dbo].[V_DOCS] AS [Extent1]
LEFT OUTER JOIN [dbo].[DOCS_R] AS [Extent2] ON ([Extent1].[DocId] = [Extent2].[DocId]) AND ([Extent2].[WorkplaceId] = @p__linq__5)
WHERE (( EXISTS (SELECT 1 AS [C1] FROM ( SELECT
[Extent3].[DirectionId] AS [DirectionId],
[Extent3].[DirectionTypeId] AS [DirectionTypeId],
[Extent3].[DirectionConfirmed] AS [DirectionConfirmed]
FROM [dbo].[DOCS_D] AS [Extent3]
WHERE [Extent1].[DocId] = [Extent3].[DirectionDocId]
) AS [Project1]
WHERE (1 = [Project1].[DirectionConfirmed]) AND ( EXISTS (SELECT
1 AS [C1] FROM [dbo].[DOCS_E] AS [Extent4]
WHERE ([Project1].[DirectionId] = [Extent4].[ExecutorDirectionId]) AND ([Extent4].[ExecutorWorkplaceId] = @p__linq__0)
)) AND (4 &amp;amp;amp;lt;&amp;amp;amp;gt; [Project1].[DirectionTypeId])
)) OR ([Extent1].[DocInsertedById] = @p__linq__1)) AND ([Extent1].[DocDoctypeId] = @p__linq__2) AND ([Extent1].[DocPeriodId] = @p__linq__3)
AND (( NOT ((2 = CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocDeleted] IS NOT NULL))) OR ((2 = CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocInsertedById] = @p__linq__4))) AND (1 = 1)
) AS [GroupBy1]',N'@p__linq__5 int,@p__linq__0 int,@p__linq__1 int,@p__linq__2 int,@p__linq__3 int,@p__linq__4 int',@p__linq__5=162,@p__linq__0=162,@p__linq__1=162,@p__linq__2=18,@p__linq__3=3,@p__linq__4=162

The plan of the query shown below.

Let’s change expensive Nested Loop to the Hash Join by using OPTION(HASH JOIN) hint. The query and its plan shown below.

exec sp_executesql N'SELECT [GroupBy1].[A1] AS [C1] FROM ( SELECT
COUNT(1) AS [A1] FROM [dbo].[V_DOCS] AS [Extent1]
LEFT OUTER JOIN [dbo].[DOCS_R] AS [Extent2] ON ([Extent1].[DocId] = [Extent2].[DocId]) AND ([Extent2].[WorkplaceId] = @p__linq__5)
WHERE (( EXISTS (SELECT 1 AS [C1] FROM ( SELECT
[Extent3].[DirectionId] AS [DirectionId],
[Extent3].[DirectionTypeId] AS [DirectionTypeId],
[Extent3].[DirectionConfirmed] AS [DirectionConfirmed]
FROM [dbo].[DOCS_D] AS [Extent3]
WHERE [Extent1].[DocId] = [Extent3].[DirectionDocId]
) AS [Project1]
WHERE (1 = [Project1].[DirectionConfirmed]) AND ( EXISTS (SELECT
1 AS [C1] FROM [dbo].[DOCS_E] AS [Extent4]
WHERE ([Project1].[DirectionId] = [Extent4].[ExecutorDirectionId]) AND ([Extent4].[ExecutorWorkplaceId] = @p__linq__0)
)) AND (4 &amp;amp;amp;amp;lt;&amp;amp;amp;amp;gt; [Project1].[DirectionTypeId])
)) OR ([Extent1].[DocInsertedById] = @p__linq__1)) AND ([Extent1].[DocDoctypeId] = @p__linq__2) AND ([Extent1].[DocPeriodId] = @p__linq__3)
AND (( NOT ((2 = CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocDeleted] IS NOT NULL))) OR ((2 = CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocInsertedById] = @p__linq__4))) AND (1 = 1)
) AS [GroupBy1] OPTION(HASH JOIN)',N'@p__linq__5 int,@p__linq__0 int,@p__linq__1 int,@p__linq__2 int,@p__linq__3 int,@p__linq__4 int',@p__linq__5=162,@p__linq__0=162,@p__linq__1=162,@p__linq__2=18,@p__linq__3=3,@p__linq__4=162

SQL query execution time dropped from 10 seconds to less than one second. But this SQL query comes from third party product and we are not able to change it directly. Solution is creation of Plan Guide in SQL Server.

EXEC sys.sp_create_plan_guide @name = 'ForceHashJoin',
@stmt = N'SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
COUNT(1) AS [A1]
FROM [dbo].[V_DOCS] AS [Extent1]
LEFT OUTER JOIN [dbo].[DOCS_R] AS [Extent2] ON ([Extent1].[DocId] = [Extent2].[DocId]) AND ([Extent2].[WorkplaceId] = @p__linq__5)
WHERE (( EXISTS (SELECT
1 AS [C1]
FROM ( SELECT
[Extent3].[DirectionId] AS [DirectionId],
[Extent3].[DirectionTypeId] AS [DirectionTypeId],
[Extent3].[DirectionConfirmed] AS [DirectionConfirmed]
FROM [dbo].[DOCS_D] AS [Extent3]
WHERE [Extent1].[DocId] = [Extent3].[DirectionDocId]
) AS [Project1]
WHERE (1 = [Project1].[DirectionConfirmed]) AND ( EXISTS (SELECT
1 AS [C1]
FROM [dbo].[DOCS_E] AS [Extent4]
WHERE ([Project1].[DirectionId] = [Extent4].[ExecutorDirectionId]) AND ([Extent4].[ExecutorWorkplaceId] = @p__linq__0)
)) AND (4 <> [Project1].[DirectionTypeId])
)) OR ([Extent1].[DocInsertedById] = @p__linq__1)) AND ([Extent1].[DocDoctypeId] = @p__linq__2) AND ([Extent1].[DocPeriodId] = @p__linq__3)
AND (( NOT ((2 = CAST( [Extent1].[DocDeleted] AS int)) AND ([Extent1].[DocDeleted] IS NOT NULL))) OR ((2 = CAST( [Extent1].[DocDeleted] AS int))
AND ([Extent1].[DocInsertedById] = @p__linq__4))) AND (1 = 1)
) AS [GroupBy1]',
@type = N'SQL',
@module_or_batch = NULL,
@params = N'@p__linq__5 int,@p__linq__0 int,@p__linq__1 int,@p__linq__2 int,@p__linq__3 int,@p__linq__4 int',
@hints = N'OPTION(HASH JOIN)';

Thats all thank you for attention 🙂

Same query runs slowly in Insert as select statement

A couple of days ago, I was asked to improve the performance of a particular SQL query which took, in average, 30 to 36 minutes to complete. Here’s below the corresponding SQL  plan taken from a 12.1.0.2 Oracle instance running under RED hat EL7. The SQL text of the problematic query is a little bit big, therefore I am not going to post it here. However, it is worth mentioning that it is insert as select statement.

Plan hash value: 3778189745
-------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name                       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                               |                            |   120K|   479M|       |  7043K  (1)| 00:04:36 |
|   1 |  LOAD TABLE CONVENTIONAL                       | REPORT                     |       |       |       |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED          | T1                         |     1 |    21 |       |     9   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                            | CAT_IDX                    |     8 |       |       |     1   (0)| 00:00:01 |
|   4 |   NESTED LOOPS OUTER                           |                            |   120K|   479M|       |  7043K  (1)| 00:04:36 |
|   5 |    VIEW                                        |                            |    81 |   319K|       |   414   (1)| 00:00:01 |
|*  6 |     CONNECT BY NO FILTERING WITH START-WITH    |                            |       |       |       |            |          |
|*  7 |      TABLE ACCESS BY INDEX ROWID BATCHED       | T1                         |   140 |  4900 |       |    48   (0)| 00:00:01 |
|*  8 |       INDEX RANGE SCAN                         | CTID_IDX                   |  1656 |       |       |     8   (0)| 00:00:01 |
|   9 |    VIEW                                        | VW_LAT_5D3B31E1            |  1484 |   207K|       | 86956   (1)| 00:00:04 |
|* 10 |     VIEW                                       |                            |  1484 |   244K|       | 86956   (1)| 00:00:04 |
|  11 |      SORT GROUP BY                             |                            |  1484 |   128K|       | 86956   (1)| 00:00:04 |
|  12 |       GENERATE CUBE                            |                            |  1484 |   128K|       | 86956   (1)| 00:00:04 |
|  13 |        SORT GROUP BY                           |                            |  1484 |   128K|  2824K| 86956   (1)| 00:00:04 |
|* 14 |         FILTER                                 |                            |       |       |       |            |          |
|* 15 |          FILTER                                |                            |       |       |       |            |          |
|  16 |           NESTED LOOPS OUTER                   |                            | 27414 |  2382K|       | 86393   (1)| 00:00:04 |
|  17 |            NESTED LOOPS ANTI                   |                            | 27420 |  1097K|       |  4116   (1)| 00:00:01 |
|  18 |             TABLE ACCESS BY INDEX ROWID BATCHED| T2                         | 27423 |   964K|       |  4115   (1)| 00:00:01 |
|  19 |              BITMAP CONVERSION TO ROWIDS       |                            |       |       |       |            |          |
|  20 |               BITMAP AND                       |                            |       |       |       |            |          |
|  21 |                BITMAP MINUS                    |                            |       |       |       |            |          |
|  22 |                 BITMAP MINUS                   |                            |       |       |       |            |          |
|  23 |                  BITMAP MINUS                  |                            |       |       |       |            |          |
|  24 |                   BITMAP MINUS                 |                            |       |       |       |            |          |
|* 25 |                    BITMAP INDEX SINGLE VALUE   | T2_TYPE_INDX               |       |       |       |            |          |
|* 26 |                    BITMAP INDEX SINGLE VALUE   | T2_STATUS_INDX             |       |       |       |            |          |
|* 27 |                   BITMAP INDEX SINGLE VALUE    | T2_STATUS_INDX             |       |       |       |            |          |
|* 28 |                  BITMAP INDEX SINGLE VALUE     | T2_STATUS_INDX             |       |       |       |            |          |
|* 29 |                 BITMAP INDEX SINGLE VALUE      | T2_STATUS_INDX             |       |       |       |            |          |
|  30 |                BITMAP CONVERSION FROM ROWIDS   |                            |       |       |       |            |          |
|* 31 |                 INDEX RANGE SCAN               | T2_ORGAN_ID_INDX           |       |       |       |   125   (0)| 00:00:01 |
|* 32 |             INDEX UNIQUE SCAN                  | RPT_PK                     |     1 |     5 |       |     0   (0)| 00:00:01 |
|  33 |            VIEW                                | VW_LAT_E5ABEE75            |     1 |    48 |       |     3   (0)| 00:00:01 |
|* 34 |             FILTER                             |                            |       |       |       |            |          |
|  35 |              TABLE ACCESS BY INDEX ROWID       | T3                         |     1 |    24 |       |     3   (0)| 00:00:01 |
|* 36 |               INDEX UNIQUE SCAN                | T3_PK                      |     1 |       |       |     2   (0)| 00:00:01 |
|* 37 |          FILTER                                |                            |       |       |       |            |          |
|* 38 |           TABLE ACCESS BY INDEX ROWID BATCHED  | T4                         |     1 |    29 |       |     8   (0)| 00:00:01 |
|* 39 |            INDEX RANGE SCAN                    | T4_CASE_ID_IDX             |     4 |       |       |     3   (0)| 00:00:01 |
|* 40 |          FILTER                                |                            |       |       |       |            |          |
|* 41 |           TABLE ACCESS BY INDEX ROWID BATCHED  | T3                         |     1 |    23 |       |     5   (0)| 00:00:01 |
|* 42 |            INDEX RANGE SCAN                    | INDX_T3_CASE_ID            |     1 |       |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------------------------

We see expensive NESTED LOOPS OUTER in the SQL Plan. However, SQL text itself (out of insert statement) uses more cheaper HASH JOIN RIGHT OUTER.

Plan hash value: 2043906871
-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name                       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                            |    81 |   333K|       | 87370   (1)| 00:00:04 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED         | T1                         |     1 |    21 |       |     9   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                           | CAT_IDX                    |     8 |       |       |     1   (0)| 00:00:01 |
|*  3 |  HASH JOIN RIGHT OUTER                       |                            |    81 |   333K|       | 87370   (1)| 00:00:04 |
|   4 |   VIEW                                       |                            |  1484 |   244K|       | 86956   (1)| 00:00:04 |
|   5 |    SORT GROUP BY                             |                            |  1484 |   128K|       | 86956   (1)| 00:00:04 |
|   6 |     GENERATE CUBE                            |                            |  1484 |   128K|       | 86956   (1)| 00:00:04 |
|   7 |      SORT GROUP BY                           |                            |  1484 |   128K|  2824K| 86956   (1)| 00:00:04 |
|*  8 |       FILTER                                 |                            |       |       |       |            |          |
|*  9 |        FILTER                                |                            |       |       |       |            |          |
|  10 |         NESTED LOOPS OUTER                   |                            | 27414 |  2382K|       | 86393   (1)| 00:00:04 |
|  11 |          NESTED LOOPS ANTI                   |                            | 27420 |  1097K|       |  4116   (1)| 00:00:01 |
|  12 |           TABLE ACCESS BY INDEX ROWID BATCHED| T2                         | 27423 |   964K|       |  4115   (1)| 00:00:01 |
|  13 |            BITMAP CONVERSION TO ROWIDS       |                            |       |       |       |            |          |
|  14 |             BITMAP AND                       |                            |       |       |       |            |          |
|  15 |              BITMAP MINUS                    |                            |       |       |       |            |          |
|  16 |               BITMAP MINUS                   |                            |       |       |       |            |          |
|  17 |                BITMAP MINUS                  |                            |       |       |       |            |          |
|  18 |                 BITMAP MINUS                 |                            |       |       |       |            |          |
|* 19 |                  BITMAP INDEX SINGLE VALUE   | T2_TYPE_INDX               |       |       |       |            |          |
|* 20 |                  BITMAP INDEX SINGLE VALUE   | T2_STATUS_INDX             |       |       |       |            |          |
|* 21 |                 BITMAP INDEX SINGLE VALUE    | T2_STATUS_INDX             |       |       |       |            |          |
|* 22 |                BITMAP INDEX SINGLE VALUE     | T2_STATUS_INDX             |       |       |       |            |          |
|* 23 |               BITMAP INDEX SINGLE VALUE      | T2_STATUS_INDX             |       |       |       |            |          |
|  24 |              BITMAP CONVERSION FROM ROWIDS   |                            |       |       |       |            |          |
|* 25 |               INDEX RANGE SCAN               | T2_ORGAN_ID_INDX           |       |       |       |   125   (0)| 00:00:01 |
|* 26 |           INDEX UNIQUE SCAN                  | RPT_PK                     |     1 |     5 |       |     0   (0)| 00:00:01 |
|  27 |          VIEW                                | VW_LAT_A2C3E886            |     1 |    48 |       |     3   (0)| 00:00:01 |
|* 28 |           FILTER                             |                            |       |       |       |            |          |
|  29 |            TABLE ACCESS BY INDEX ROWID       | T3                         |     1 |   104 |       |     3   (0)| 00:00:01 |
|* 30 |             INDEX UNIQUE SCAN                | T3_PK                      |     1 |       |       |     2   (0)| 00:00:01 |
|* 31 |        FILTER                                |                            |       |       |       |            |          |
|* 32 |         TABLE ACCESS BY INDEX ROWID BATCHED  | T4                         |     1 |    29 |       |     8   (0)| 00:00:01 |
|* 33 |          INDEX RANGE SCAN                    | T4_CASE_ID_IDX             |     4 |       |       |     3   (0)| 00:00:01 |
|* 34 |        FILTER                                |                            |       |       |       |            |          |
|* 35 |         TABLE ACCESS BY INDEX ROWID BATCHED  | T3                         |     1 |    23 |       |     5   (0)| 00:00:01 |
|* 36 |          INDEX RANGE SCAN                    | INDX_T3_CASE_ID            |     1 |       |       |     3   (0)| 00:00:01 |
|  37 |   VIEW                                       |                            |    81 |   319K|       |   414   (1)| 00:00:01 |
|* 38 |    CONNECT BY NO FILTERING WITH START-WITH   |                            |       |       |       |            |          |
|* 39 |     TABLE ACCESS BY INDEX ROWID BATCHED      | T1                         |   140 |  4900 |       |    48   (0)| 00:00:01 |
|* 40 |      INDEX RANGE SCAN                        | CTID_IDX                   |  1656 |       |       |     8   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------------------

Lets get the OUTLINE  section of that query and try to change expensive NESTED LOOPS OUTER join to more cheaper HASH JOIN RIGHT OUTER. 

select * from table(dbms_xplan.display(null,null,’+OUTLINE’))

BEGIN_OUTLINE_DATA
...................
USE_NL(@"SEL$D9E73049" "from$_subquery$_032"@"SEL$11") 
LEADING(@"SEL$D9E73049" "CTG"@"SEL$11" "from$_subquery$_032"@"SEL$11")
NO_ACCESS(@"SEL$D9E73049" "from$_subquery$_032"@"SEL$11")
NO_ACCESS(@"SEL$D9E73049" "CTG"@"SEL$11")
FULL(@"INS$1" "REPORT"@"INS$1")
...................
DB_VERSION('12.1.0.2')
OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA

INSERT /*+ USE_HASH(@"SEL$D9E73049" "from$_subquery$_032"@"SEL$11") */ SELECT ... FROM ...

After many attempts, it was not possible to change the plan. 😦

Thanks to Chinar Aliyev who informed me about BUG below.

Bug 22077191 SubOptimal Execution Plan
(Costly Nested Loop / Can’t force Hash Join) on Create Table As Select / INSERT SELECT with ANSI Joins