Thursday, 28 June 2012
SQL*Loader
SQL*Loader (sqlldr) is the utility to use for high-performance data loads. The data can be loaded from any text file and inserted into the database.
The SQL*Loader Environment
SQL*Loader reads a data file and a description of the data, which is defined in the control file. Using this information and any additional specified parameters (either on the command line or in the PARFILE), SQL*Loader loads the data into the database. During processing, SQL*Loader writes messages to the log file, bad rows to the bad file, and discarded rows to the discard file.
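For example, a minimal command-line invocation ties these pieces together. The control file, data file, and scott/tiger login below are hypothetical placeholders, not files from this post:
$ sqlldr userid=scott/tiger control=emp.ctl data=emp.dat log=emp.log bad=emp.bad discard=emp.dsc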
The Control File
The SQL*Loader control file contains information that describes how the data will be loaded. It contains the table name, column datatypes, field delimiters, etc. It simply provides the guts for all SQL*Loader processing.
Manually creating control files is an error-prone process. The following SQL script (controlfile.sql) can be used to generate an accurate control file for a given table. The script accepts a table name and a date format (to be used for date columns), and generates a valid control file to use with SQL*Loader for that table.
-- controlfile.sql
set echo off ver off feed off pages 0
accept tname prompt 'Enter Name of Table: '
accept dformat prompt 'Enter Format to Use for Date Columns: '
spool &tname..ctl

select 'LOAD DATA'                                  || chr(10) ||
       'INFILE ''' || lower(table_name) || '.dat''' || chr(10) ||
       'INTO TABLE ' || table_name                  || chr(10) ||
       'FIELDS TERMINATED BY '','''                 || chr(10) ||
       'TRAILING NULLCOLS'                          || chr(10) || '('
from   user_tables
where  table_name = upper('&tname');

select decode(rownum, 1, ' ', ' , ') ||
       rpad(column_name, 33, ' ') ||
       decode(data_type,
              'VARCHAR2', 'CHAR NULLIF ('||column_name||'=BLANKS)',
              'FLOAT',    'DECIMAL EXTERNAL NULLIF ('||column_name||'=BLANKS)',
              'NUMBER',   decode(data_precision, 0,
                                 'INTEGER EXTERNAL NULLIF ('||column_name||'=BLANKS)',
                                 decode(data_scale, 0,
                                        'INTEGER EXTERNAL NULLIF ('||column_name||'=BLANKS)',
                                        'DECIMAL EXTERNAL NULLIF ('||column_name||'=BLANKS)')),
              'DATE',     'DATE "&dformat" NULLIF ('||column_name||'=BLANKS)',
              null)
from   user_tab_columns
where  table_name = upper('&tname')
order  by column_id;

select ')'
from   dual;

spool off
Once executed and given a table name and date format, controlfile.sql will generate a control file with the following contents:
-- controlfile.par
LOAD DATA
INFILE 'table_with_one_million_rows.dat'
INTO TABLE TABLE_WITH_ONE_MILLION_ROWS
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(
 COL1                              DECIMAL EXTERNAL NULLIF (COL1=BLANKS)
 , COL2                            DECIMAL EXTERNAL NULLIF (COL2=BLANKS)
 , COL3                            CHAR NULLIF (COL3=BLANKS)
 , COL4                            CHAR NULLIF (COL4=BLANKS)
 , COL5                            CHAR NULLIF (COL5=BLANKS)
 , COL6                            DATE "MM-DD-YYYY" NULLIF (COL6=BLANKS)
)
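For reference, the matching data file (table_with_one_million_rows.dat) would simply contain one comma-separated record per line, in column order. The two lines below are made-up examples, not data from the original load:
101,2.5,ABC,DEF,GHI,03-10-2003
102,17.25,JKL,MNO,PQR,03-11-2003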
The control file can also specify that records are in fixed format. A file is in fixed record format when all records in a datafile are the same length. The control file specifies the specific starting and ending byte location of each field. This format is harder to create and less flexible, but it can yield performance benefits. A control file specifying a fixed format for the same table could look like the following:
LOAD DATA
INFILE 'table_with_one_million_rows.dat'
INTO TABLE TABLE_WITH_ONE_MILLION_ROWS
(
  COL1 POSITION(1:4)     INTEGER EXTERNAL
, COL2 POSITION(6:9)     INTEGER EXTERNAL
, COL3 POSITION(11:46)   CHAR
, COL4 POSITION(48:83)   CHAR
, COL5 POSITION(85:120)  CHAR
, COL6 POSITION(122:130) DATE "MMDDYYYY"
)
The Log File
The log file contains information about the SQL*Loader execution. It should be viewed after each SQL*Loader job is complete. Especially interesting is the summary information at the bottom of the log, including CPU time and elapsed time. The data below is a sample of the contents of the log file.
*****************************************************************************************
SQL*Loader: Release 9.2.0.1.0 - Production on Mon Mar 10 23:39:04 2003
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
*****************************************************************************************
Control File: sqlload.par
Data File: sqlload.txt
Bad File: sqlload.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation: none specified
Path used: Direct
Table TABLE_WITH_ONE_MILLION_ROWS, loaded from every logical record.
Insert option in effect for this table: INSERT
TRAILING NULLCOLS option in effect
Column Name                    Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
COL1                                FIRST     *   ,       CHARACTER
    NULL if COL1 = BLANKS
COL2                                 NEXT     *   ,       CHARACTER
    NULL if COL2 = BLANKS
COL3                                 NEXT     *   ,       CHARACTER
    NULL if COL3 = BLANKS
COL4                                 NEXT     *   ,       CHARACTER
    NULL if COL4 = BLANKS
COL5                                 NEXT     *   ,       CHARACTER
    NULL if COL5 = BLANKS
COL6                                 NEXT     *   ,       DATE MMDDYYYY
    NULL if COL6 = BLANKS
Table TABLE_WITH_ONE_MILLION_ROWS:
1000000 Rows successfully loaded.
0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Date cache:
  Max Size:      1000
  Entries :         1
  Hits    :    999999
  Misses  :         0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 512000
Read buffer bytes: 1048576
Total logical records skipped:          0
Total logical records read:       1000000
Total logical records rejected:         0
Total logical records discarded:        0

Total stream buffers loaded by SQL*Loader main thread:      255
Total stream buffers loaded by SQL*Loader load thread:      128
Run began on Mon Mar 10 23:39:04 2003
Run ended on Mon Mar 10 23:39:51 2003
Elapsed time was: 00:00:47.55
CPU time was: 00:00:12.85
SQL*Loader Options
SQL*Loader provides the following options, which can be specified either on the command line or within a parameter file:
· bad – A file that is created when at least one record from the input file is rejected. The rejected data records are placed in this file. A record could be rejected for many reasons, including a non-unique key or a required column being null.
· bindsize – [256000] The size of the bind array in bytes.
· columnarrayrows – [5000] Specifies the number of rows to allocate for direct path column arrays.
· control – The name of the control file. This file specifies the format of the data to be loaded.
· data – The name of the file that contains the data to load.
· direct – [FALSE] Specifies whether to use a direct path load instead of a conventional path load.
· discard – The name of the file that contains the discarded rows. Discarded rows are those that fail the WHEN clause condition when selectively loading records.
· discardmax – [ALL] The maximum number of discards to allow.
· errors – [50] The number of errors to allow on the load.
· external_table – [NOT_USED] Determines whether any data will be loaded using external tables. The other valid options are GENERATE_ONLY and EXECUTE.
· file – Used only with parallel loads, this parameter specifies the file to allocate extents from.
· load – [ALL] The number of logical records to load.
· log – The name of the file used by SQL*Loader to log results.
· multithreading – The default is TRUE on multiple-CPU systems and FALSE on single-CPU systems.
· parfile – The name of the file that contains the parameter options for SQL*Loader.
· parallel – [FALSE] Specifies whether multiple concurrent direct path loads are allowed against the same table.
· readsize – The size of the buffer used by SQL*Loader when reading data from the input file. This value should match that of bindsize.
· resumable – [N] Enables and disables resumable space allocation. When set to "Y", the parameters resumable_name and resumable_timeout are used.
· resumable_name – A user-defined string that helps identify a resumable statement that has been suspended. This parameter is ignored unless resumable = Y.
· resumable_timeout – [7200 seconds] The time period in which an error must be fixed. This parameter is ignored unless resumable = Y.
· rows – [64] The number of rows to load before a commit is issued (conventional path only). For direct path loads, rows is the number of rows to read from the data file before saving the data in the datafiles.
· silent – Suppresses messages during the data load. A value of ALL will suppress all load messages. Other options include DISCARDS, ERRORS, FEEDBACK, HEADER, and PARTITIONS.
· skip – [0] Allows the skipping of the specified number of logical records.
· skip_unusable_indexes – [FALSE] Determines whether SQL*Loader skips the building of indexes that are in an unusable state.
· skip_index_maintenance – [FALSE] Stops index maintenance for direct path loads only.
· streamsize – [256000] Specifies the size of direct path streams in bytes.
· userid – The Oracle username and password.
To check which options are available in any release of SQL*Loader, use this command:
sqlldr help=y
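As an illustration, several of these options can be gathered in a parameter file and passed with parfile. The file name and values below (load_emp.par, scott/tiger, emp.ctl, emp.log, emp.bad) are hypothetical, not taken from this post. A hypothetical load_emp.par:
userid=scott/tiger
control=emp.ctl
log=emp.log
bad=emp.bad
errors=100
direct=true
It would then be invoked as:
$ sqlldr parfile=load_emp.par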
Maximizing SQL*Loader Performance
SQL*Loader is flexible and offers many options that should be considered to maximize the speed of data loads. These include:
1. Use Direct Path Loads - The conventional path loader essentially loads the data by using standard insert statements. The direct path loader (direct=true) loads directly into the Oracle data files and creates blocks in Oracle database block format. The fact that SQL is not being issued makes the entire process much less taxing on the database. There are certain cases, however, in which direct path loads cannot be used (e.g., clustered tables). To prepare the database for direct path loads, the script $ORACLE_HOME/rdbms/admin/catldr.sql must be executed.
2. Disable Indexes and Constraints - For conventional data loads only, the disabling of indexes and constraints can greatly enhance the performance of SQL*Loader.
3. Use a Larger Bind Array - For conventional data loads only, larger bind arrays limit the number of calls to the database and increase performance. The size of the bind array is specified using the bindsize parameter. The bind array's size is equivalent to the number of rows it contains (rows=) times the maximum length of each row.
4. Use ROWS=n to Commit Less Frequently - For conventional data loads only, the rows parameter specifies the number of rows per commit. Issuing fewer commits will enhance performance.
5. Use Parallel Loads - Available with direct path data loads only, this option allows multiple SQL*Loader jobs to execute concurrently.
$ sqlldr control=first.ctl parallel=true direct=true
$ sqlldr control=second.ctl parallel=true direct=true
6. Use Fixed Width Data - Fixed width data format saves Oracle some processing when parsing the data. The savings can be tremendous, depending on the type of data and number of rows.
7. Disable Archiving During Load - While this may not be feasible in certain environments, disabling database archiving can increase performance considerably.
8. Use unrecoverable - The unrecoverable option (unrecoverable load data) disables the writing of the data to the redo logs. This option is available for direct path loads only. A combined sketch of tips 1, 2, and 8 follows this list.
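As a rough illustration of how tips 1, 2, and 8 combine, consider the sketch below. All of the names in it (emp, emp.dat, emp.ctl, emp_idx, scott/tiger) are hypothetical placeholders, and the column list is only illustrative:
-- emp.ctl: direct path load that bypasses redo logging (tip 8)
UNRECOVERABLE LOAD DATA
INFILE 'emp.dat'
INTO TABLE emp
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(empno, ename, hiredate DATE "MM-DD-YYYY")
The load itself runs with the direct path and skips index maintenance (tips 1 and 2); any indexes left unusable by skip_index_maintenance are rebuilt afterwards:
$ sqlldr userid=scott/tiger control=emp.ctl direct=true skip_index_maintenance=true
SQL> alter index emp_idx rebuild;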
Wednesday, 13 June 2012
Update data in one table with data from another table
This is a tricky question that is often asked in interviews:
create table a (col1 varchar2(10), col2 varchar2(10));
create table b (col1 varchar2(10), col2 varchar2(10));
insert into a VALUES ('1', 'FIELD2-1_A');
insert into B VALUES ('1', 'FIELD2-1_B');
insert into a VALUES ('2', 'FIELD2-2_A');
insert into B VALUES ('2', 'FIELD2-2_B');
SELECT * FROM a;
SELECT * FROM b;

UPDATE a
   SET col2 = (SELECT col2
                 FROM b
                WHERE b.col1 = a.col1)
 WHERE EXISTS (SELECT 1
                 FROM b
                WHERE b.col1 = a.col1);
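On Oracle 10g and later, where MERGE no longer requires a WHEN NOT MATCHED clause, the same update can be written as a MERGE. A minimal sketch against the same tables a and b:
MERGE INTO a
USING b
ON (a.col1 = b.col1)
WHEN MATCHED THEN
  UPDATE SET a.col2 = b.col2;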
Friday, 8 June 2012
UTL_FILE
-------------------------------------
--Get the Path, which is registered for UTL_FILE
select value from v$parameter where name like 'utl%file%';
---------------
/usr/tmp
--Create Table to Hold the Data
create table names
(
name varchar2(30)
);
--Create anonymous plsql block to insert data into table
SET SERVEROUTPUT ON;
DECLARE
  line      VARCHAR2(100);
  namesfile UTL_FILE.FILE_TYPE;
BEGIN
  -- Syntax : FOPEN ( directory, filename, open mode )
  namesfile := UTL_FILE.FOPEN('/usr/tmp','abc.txt','R'); -- open in read mode
  LOOP
    -- Read a line from the file.
    -- GET_LINE raises NO_DATA_FOUND when the end of the file is reached.
    -- Syntax : GET_LINE ( file handle, variable, max length )
    BEGIN
      UTL_FILE.GET_LINE(namesfile, line, 100);
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        EXIT;                             -- leave the loop at end of file
    END;
    INSERT INTO names VALUES (line);      -- insert into NAMES table
  END LOOP;
  UTL_FILE.FCLOSE(namesfile);             -- close file
  COMMIT;
EXCEPTION
  WHEN OTHERS THEN
    IF UTL_FILE.IS_OPEN(namesfile) THEN
      UTL_FILE.FCLOSE(namesfile);         -- close file on unexpected errors
    END IF;
    DBMS_OUTPUT.PUT_LINE(SQLERRM);
END;
/
SELECT * FROM names;
DECLARE
  JOBSFILE UTL_FILE.FILE_TYPE;
  -- TAKE THE FIRST TEN JOB TITLES FROM PER_JOBS
  CURSOR JOBSCUR IS
    SELECT name FROM per_jobs WHERE rownum <= 10;
BEGIN
  -- OPEN FILE FOR WRITING
  JOBSFILE := UTL_FILE.FOPEN('/usr/tmp','job.txt','W');
  -- WRITE ONE RECORD PER CURSOR ROW
  FOR REC IN JOBSCUR LOOP
    UTL_FILE.PUT_LINE(JOBSFILE, REC.name);   -- WRITE INTO FILE
  END LOOP;
  UTL_FILE.FCLOSE(JOBSFILE);                 -- CLOSE FILE
EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('ERROR --> ' || SQLERRM);
END;
/
----------------
For Windows with XE
grant execute on UTL_FILE to hr;
create directory filesdir as 'c:\oracle';
GRANT READ ON DIRECTORY FILESDIR TO HR;
grant write on directory filesdir to hr;
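To verify that the directory object exists and points at the expected path, a quick check (assuming the querying user can see the object) is:
SELECT directory_name, directory_path
FROM all_directories
WHERE directory_name = 'FILESDIR';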
create table names
(
name varchar2(30)
);
SET SERVEROUTPUT ON;
DECLARE
  line      VARCHAR2(100);
  namesfile UTL_FILE.FILE_TYPE;
BEGIN
  -- Syntax : FOPEN ( directory alias, filename, open mode )
  namesfile := UTL_FILE.FOPEN('FILESDIR','NAMES.TXT','R'); -- open in read mode
  LOOP
    -- Read a line from NAMES.TXT.
    -- GET_LINE raises NO_DATA_FOUND when the end of the file is reached.
    -- Syntax : GET_LINE ( file handle, variable, max length )
    UTL_FILE.GET_LINE(namesfile, line, 100);
    INSERT INTO names VALUES (line);        -- insert into NAMES table
  END LOOP;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    UTL_FILE.FCLOSE(namesfile);             -- close file at end of file
    COMMIT;
  WHEN OTHERS THEN
    IF UTL_FILE.IS_OPEN(namesfile) THEN
      UTL_FILE.FCLOSE(namesfile);           -- close file on unexpected errors
    END IF;
    DBMS_OUTPUT.PUT_LINE(SQLERRM);
END;
/
SELECT * FROM names;
DECLARE
  JOBSFILE UTL_FILE.FILE_TYPE;
  -- TAKE ALL JOB TITLES FROM JOBS
  CURSOR JOBSCUR IS
    SELECT JOB_TITLE FROM JOBS ORDER BY JOB_TITLE;
BEGIN
  -- OPEN FILE FOR WRITING
  JOBSFILE := UTL_FILE.FOPEN('FILESDIR','JOBS.TXT','W');
  -- WRITE ONE RECORD PER CURSOR ROW
  FOR REC IN JOBSCUR LOOP
    UTL_FILE.PUT_LINE(JOBSFILE, REC.JOB_TITLE);   -- WRITE INTO FILE
  END LOOP;
  UTL_FILE.FCLOSE(JOBSFILE);                      -- CLOSE FILE
EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('ERROR --> ' || SQLERRM);
END;
/