"If you want sense, you'll have to make it yourself."
- Norton Juster, The Phantom Tollbooth

In a recent post to an Oracle forum a query was presented and a tuning request was made. It appears that the query was taking 20 hours to complete. Through further questions and responses it was discovered that the dates were being stored in a VARCHAR2 column and implicit date conversion was being used. To show how much of a problem this can cause, the following example was created; notice the results returned and the execution plans generated for each run of the query, once with the table defined in the manner the original poster described and once with the date column using the DATE datatype. We begin:

SQL>
SQL> create table datetst(
  2  myid number,
  3  res_value varchar2(3),
  4  mydt varchar2(20));

Table created.

SQL>
SQL> create index datetest_idx on datetst(res_value);

Index created.

SQL>
SQL> begin
  2  for i in 1..10000 loop
  3  if mod(i,23) = 0 then
  4  insert into datetst
  5  values(i, 'PUR', to_char(sysdate+i, 'MM/DD/RRRR'));
  6  else
  7  insert into datetst
  8  values(i, 'BID', to_char(sysdate+i, 'MM/DD/RRRR'));
  9  end if;
 10  end loop;
 11
 12  commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL>

Let’s now run a query using conditions similar to the posted query and see what Oracle returns:

SQL>
SQL>
SQL> select *
  2  from datetst
  3  where mydt

As expected the implicit date conversion failed; modifying the query to explicitly convert the strings to dates produces ‘interesting’ results:

SQL>
SQL> select *
  2  from datetst
  3  where res_value = 'PUR'
  4  and mydt

The query should have returned no more than 10 rows that met the criteria, and it returned 222.
Looking at the plan we see:

SQL>
SQL> select * From table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6qatrtphp5wjt, child number 0
-------------------------------------
select * from datetst where res_value = 'PUR' and mydt

Oracle found 222 rows that ‘matched’ the conditions, illustrating a problem of using an incorrect datatype; Oracle can’t know these are dates and compares them as ASCII strings, creating a result set that is much larger than it should be. We drop the table and start over:

SQL>
SQL> drop table datetst purge;

Table dropped.

SQL>
SQL> create table datetst(
  2  myid number,
  3  res_value varchar2(3),
  4  mydt date);

Table created.

SQL>
SQL> create index datetest_idx on datetst(res_value);

Index created.

SQL>
SQL> begin
  2  for i in 1..10000 loop
  3  if mod(i,23) = 0 then
  4  insert into datetst
  5  values(i, 'PUR', sysdate+i);
  6  else
  7  insert into datetst
  8  values(i, 'BID', sysdate+i);
  9  end if;
 10  end loop;
 11
 12  commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL>

We now run the original query (that didn’t have explicit date conversion, since we no longer need it) and examine the results:

SQL>
SQL> select *
  2  from datetst
  3  where res_value = 'PUR'
  4  and mydt

SQL> select * From table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0m2c2mv7zhx49, child number 0
-------------------------------------
select * from datetst where res_value = 'PUR' and mydt

Oracle now found the 10 rows we sought using the conditions we specified because the date data was correctly stored as a DATE datatype.
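The string-versus-date behavior described above can be reproduced in isolation, without the test table. The following is a minimal sketch, not the query from the forum post: the two literal dates are hypothetical values chosen for illustration, and it assumes the default binary comparison of character data.

```sql
-- Hypothetical literal dates, chosen only to illustrate the comparison;
-- they do not come from the original forum post.
select case
         -- Character comparison: '1' sorts after '0', so the December
         -- string is treated as greater than the January string.
         when '12/31/2012' < '01/01/2013' then 'earlier'
         else 'not earlier'
       end as string_compare,
       case
         -- Date comparison: the values are compared chronologically.
         when to_date('12/31/2012', 'MM/DD/YYYY')
              < to_date('01/01/2013', 'MM/DD/YYYY') then 'earlier'
         else 'not earlier'
       end as date_compare
from dual;
```

Under those assumptions the first column reports 'not earlier' and the second reports 'earlier'. A string in MM/DD/YYYY format sorts by month first, then day, then year, so a range predicate on such a column matches rows from every year whose month and day happen to fall inside the character range.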
Using VARCHAR2 made the result set 22.2 times its correct size (222 rows instead of 10, a 2,120 percent increase), and that was for a 10,000-row table. Let’s re-run the example with 1,000,000 rows and see what numbers Oracle produces:

SQL>
SQL> create table datetst(
  2  myid number,
  3  res_value varchar2(3),
  4  mydt varchar2(20));

Table created.

SQL>
SQL> create index datetest_idx on datetst(res_value);

Index created.

SQL>
SQL> begin
  2  for i in 1..1000000 loop
  3  if mod(i,23) = 0 then
  4  insert into datetst
  5  values(i, 'PUR', to_char(sysdate+i, 'MM/DD/RRRR'));
  6  else
  7  insert into datetst
  8  values(i, 'BID', to_char(sysdate+i, 'MM/DD/RRRR'));
  9  end if;
 10  end loop;
 11
 12  commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

SQL>
SQL> select *
  2  from datetst
  3  where mydt

SQL> select *
  2  from datetst
  3  where res_value = 'PUR'
  4  and mydt

SQL> select * From table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6qatrtphp5wjt, child number 0
-------------------------------------
select * from datetst where res_value = 'PUR' and mydt

SQL> drop table datetst purge;

Table dropped.

SQL>
SQL> create table datetst(
  2  myid number,
  3  res_value varchar2(3),
  4  mydt date);

Table created.

SQL>
SQL> create index datetest_idx on datetst(res_value);

Index created.

SQL>
SQL> begin
  2  for i in 1..1000000 loop
  3  if mod(i,23) = 0 then
  4  insert into datetst
  5  values(i, 'PUR', sysdate+i);
  6  else
  7  insert into datetst
  8  values(i, 'BID', sysdate+i);
  9  end if;
 10  end loop;
 11
 12  commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.
SQL>
SQL> select *
  2  from datetst
  3  where res_value = 'PUR'
  4  and mydt

SQL> select * From table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0m2c2mv7zhx49, child number 0
-------------------------------------
select * from datetst where res_value = 'PUR' and mydt

With 1,000,000 rows of data Oracle inflated the original 10-row result set to 22,054 rows, a whopping 220,440 percent increase. As the data volumes increase this result set will grow even larger, resulting in extremely long query times and vast numbers of incorrect results, something I doubt the original poster had counted on.

Using the correct datatype is essential in ensuring Oracle can do its job and do it properly, returning result sets that are reliable. Storing data in a format that doesn’t reflect the actual data type can be disastrous, as illustrated here. It pays, when writing or purchasing applications, to ensure that the proper datatype is in force for the columns being used.

It only makes sense.