Thursday, May 30, 2013

How to setup ACFS manually


You can use ASMCA to set up ACFS; below are the steps to set it up manually:

Create Diskgroups

CREATE DISKGROUP ACFS_DATA EXTERNAL REDUNDANCY
DISK 'ORCL:ACFSVOL1' NAME ACFSVOL1
ATTRIBUTE 'au_size'='1M',
'compatible.asm' = '11.2',
'compatible.rdbms' = '11.2',
'compatible.advm' = '11.2';

Load ACFS Drivers

Validate the ACFS drivers:
#lsmod |grep oracle 
oracleacfs           1734520  0 
oracleadvm            242048  0 
oracleoks             295248  2 oracleacfs,oracleadvm 
oracleasm              84136  1 

If you don't see oracleacfs, oracleadvm and oracleoks, load the ACFS Driver:

$GRID_HOME/bin/acfsload start -s

Create the Dynamic volume on ACFS
sqlplus / as sysasm
alter diskgroup ACFS_DATA add volume acfsvol_01 size 100G;

select VOLUME_NAME, VOLUME_DEVICE from v$asm_volume;

VOLUME_NAME                    VOLUME_DEVICE
------------------------------ ------------------------------
ACFSVOL_01                     /dev/asm/acfsvol_01-335

Create Filesystem for ACFS Volume Device
As the root user, create a filesystem on Linux for the ACFS volume device:

mkfs -t acfs -b 4k /dev/asm/acfsvol_01-335
mkfs.acfs: version                   = 11.2.0.1.0.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/acfsvol_01-335
mkfs.acfs: volume size               = 268435456

Mount ACFS Filesystem

mkdir -p /datapump

mount -t acfs /dev/asm/acfsvol_01-335 /datapump

chown oracle:oinstall /datapump

Run df -h to verify.

Oracle ACFS Mount Registry
Register the ACFS mount point so the volume device is mounted automatically at startup:

acfsutil registry -a /dev/asm/acfsvol_01-335 /datapump
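To double-check the registration, the registry contents can be listed (the exact output format varies by version):

/sbin/acfsutil registry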


Wednesday, May 29, 2013

Change MTU size for a Network Interface in Redhat linux


During the 11.2.0.3 GI installation, Oracle recommended changing the MTU of the private network interface from 9000 to 1500.

Here are the steps I followed:

1. Verify the current MTU setting; bond1 is the private interface.

$ /bin/netstat -in                      
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
bond1      9000   0     2556      0      0      0     1629      0      0      0 BMmRU
eth0       1500   0  4246139      0      0      0  4779910      0      0      0 BMRU
eth2       9000   0     1269      0      0      0      816      0      0      0 BMsRU
eth3       9000   0     1287      0      0      0      813      0      0      0 BMsRU
lo        16436   0    99006      0      0      0    99006      0      0      0 LRU

2. Change mtu in ifcfg-bond1

$ vi /etc/sysconfig/network-scripts/ifcfg-bond1

DEVICE=bond1
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
NETMASK=255.255.0.0
IPADDR=10.10.10.170
MTU=9000
PEERDNS=yes
IPV6INIT=no
BONDING_OPTS="mode=0 miimon=100 primary=eth2"

Change MTU=9000 to MTU=1500.


3. Restart private network interface.

service network restart bond1
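If you prefer to bounce only the bonded private interface rather than the whole network service, the following should also work on RHEL (assuming no other interfaces depend on bond1):

ifdown bond1
ifup bond1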

4. Verify that the MTU size changed.

$ /bin/netstat -in
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
bond1      1500   0        0      0      0      0        9      0      0      0 BMmRU
eth0       1500   0      100      0      0      0       73      0      0      0 BMRU
eth2       1500   0        0      0      0      0        6      0      0      0 BMsRU
eth3       1500   0        0      0      0      0        3      0      0      0 BMsRU
lo        16436   0   118694      0      0      0   118694      0      0 

Tuesday, May 28, 2013

Define a Transparent Application Failover (TAF) policy with a service


Services simplify the deployment of TAF. You can define a TAF policy for a service, and all connections using this service will automatically have TAF enabled. This does not require any client-side changes. The TAF setting on a service overrides any TAF setting in the client connection definition.

To define a TAF policy for a service, use SRVCTL as in the following example, where the service name is RAC_S and the database name is RAC:

$ srvctl modify service -d RAC -s RAC_S -P BASIC -e SELECT -z 5 -w 120

$ srvctl config service -d RAC -s RAC_S 
Service name: RAC_S
Service is enabled
Server pool: RAC_RAC_S
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: NONE
TAF failover retries: 5
TAF failover delay: 120
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition: 
Preferred instances: RAC3
Available instances: RAC2,RAC1
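To confirm that connections coming in through the service really pick up the TAF settings, a query along these lines against GV$SESSION can be used (once clients reconnect, the failover columns should show SELECT and BASIC):

SQL> select inst_id, username, failover_type, failover_method, failed_over
     from gv$session
     where service_name = 'RAC_S';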

Dynamic SQL to drop all disks from an ASM disk group

set pagesize 0

select 'ALTER DISKGROUP '||  dg.name || ' DROP DISK '|| d.name || ';' 
from v$asm_disk d join v$asm_diskgroup dg
on (d.group_number = dg.group_number)
where dg.name = '&diskgroup_name';
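After running the generated ALTER DISKGROUP ... DROP DISK statements, ASM rebalances the data off the dropped disks; the progress can be watched with something like:

SQL> select group_number, operation, state, sofar, est_work, est_minutes
     from v$asm_operation;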

Saturday, May 25, 2013

recover copy of database failed with ORA-15012


Problem description:

During an incrementally updated backup, I got the following error message
while executing the command "recover copy of database with tag '$TAGAME';":

recovering datafile copy file number=00016 name=+RAC_FLASH/RACprd/datafile/index_medium.1854.813296975
channel oem_disk_backup: reading from backup piece +RAC_FLASH/RACprd/backupset/2013_05_25/nnndn1_RACprd_lvl0_0.2907.816333595
released channel: oem_disk_backup
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 05/25/2013 21:40:57
ORA-19870: error while restoring backup piece +RAC_FLASH/RACprd/backupset/2013_05_25/nnndn1_RACprd_lvl0_0.2907.816333595
ORA-19625: error identifying file +RAC_FLASH/RACprd/datafile/index_medium.1854.813296975
ORA-17503: ksfdopn:2 Failed to open file +RAC_FLASH/RACprd/datafile/index_medium.1854.813296975
ORA-15012: ASM file '+RAC_FLASH/RACprd/datafile/index_medium.1854.813296975' does not exist

Cause:
The copies of the data files are marked AVAILABLE in the RMAN repository, so during the incremental merge RMAN expects to apply the level 1 incremental backup to those datafile copies, even though one of them no longer exists.

Solution:
RMAN> crosscheck copy of database;
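The crosscheck marks the missing datafile copies EXPIRED in the RMAN repository. Optionally, the expired records can then be removed, so the next run of the backup script simply recreates the missing copy before merging:

RMAN> delete noprompt expired copy of database;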

Friday, May 24, 2013

Shutdown abort in a rac instance


Here is how instance recovery happens in a RAC environment:

After you issue the SHUTDOWN ABORT command, or after an instance terminates abnormally, an instance that is still running performs instance recovery for the instance that shut down. If no other instances are running, the next instance to open the database performs instance recovery for any instances that need it.

How to re-configure dataguard broker


Sometimes there is an issue with the Data Guard broker configuration and you may want to re-configure it. Here are the steps:

1. Stop the Data Guard broker on both the primary and standby databases:

alter system set dg_broker_start = false;

2. Remove the Data Guard broker configuration files on both the primary and standby databases from ASM or the filesystem.
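A minimal sketch of this step (the file names and diskgroup below are only examples - check the locations actually reported on your system):

SQL> show parameter dg_broker_config_file

Then remove the two files it reports, for example from ASM:

ASMCMD> rm +DATA/PRIMARY/dr1PRIMARY.dat
ASMCMD> rm +DATA/PRIMARY/dr2PRIMARY.dat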

3. Start the Data Guard broker on both the primary and standby databases:

alter system set dg_broker_start = true;

4. Re-create the Data Guard broker configuration:

DGMGRL> CONNECT sys;
Password: password
Connected.

DGMGRL> CREATE CONFIGURATION 'PRIMARY' AS
>  PRIMARY DATABASE IS 'PRIMARY'
>  CONNECT IDENTIFIER IS PRIMARY;

DGMGRL> ADD DATABASE 'STANDBY' AS
>  CONNECT IDENTIFIER IS STANDBY;

DGMGRL> ENABLE CONFIGURATION;
Enabled.

DGMGRL> SHOW CONFIGURATION;


RMAN-04014: startup failed: ORA-00439: feature not enabled: Real Application Clusters


Problem description:

Using RMAN duplicate to create a standby database, it failed with the following error message:


RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 05/24/2013 11:50:18
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-04014: startup failed: ORA-00439: feature not enabled: Real Application Clusters

Cause:

This error happens when the primary database is RAC and the standby database runs as a single instance. During the duplicate, RMAN reads the source control file and its RAC parameters and tries to duplicate as RAC as well, which is not possible because the standby runs as a single instance.

Solution:

Add SET CLUSTER_DATABASE='FALSE' to the RMAN duplicate command.
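For example, a duplicate-for-standby command could carry the override like this (only a sketch - the db_unique_name and the rest of the clauses depend on your environment):

RMAN> duplicate target database for standby
        from active database
        spfile
          set cluster_database='FALSE'
          set db_unique_name='RACSTBY';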

Tuesday, May 21, 2013

Config VIP on a newly added cluster node


After I added a new node to the cluster, somehow the VIP was not configured automatically.

$ srvctl status nodeapps -n rac3-lnx
Network is enabled
Network is running on node: rac3-lnx
GSD is enabled
GSD is not running on node: rac3-lnx
ONS is enabled
ONS daemon is running on node: rac3-lnx
PRKO-2165 : VIP does not exist on node(s) : rac3-lnx

Executed as root to add the VIP on node 3:

srvctl add vip -n rac3-lnx -A rac3-v/255.255.255.0 -k 1 -v


Start the VIP on node 3:


$ srvctl start vip -n rac3-lnx

$ srvctl config vip -n rac3-lnx
VIP exists: /rac3-v/172.24.194.203/172.24.194.0/255.255.255.0/bond0, hosting node rac3-lnx


$ srvctl status nodeapps -n rac3-lnx
VIP rac3-v is enabled
VIP rac3-v is running on node: rac3-lnx
Network is enabled
Network is running on node: rac3-lnx
GSD is enabled
GSD is not running on node: rac3-lnx
ONS is enabled
ONS daemon is running on node: rac3-lnx

Find all trace files covering a certain time period and copy them to another directory

Following is the command I used to copy all trace files covering the period between 1 p.m. and 2 p.m. on 2013-05-18 to a tmp directory:

cp $(grep '2013-05-18 13:' *trc|awk -F: '{print $1}'|uniq) /tmp
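An alternative, if your GNU find supports -newermt, is to select the files by modification time instead of grepping the timestamps inside them (a sketch; adjust the path and window as needed):

find . -name '*.trc' -newermt '2013-05-18 13:00' ! -newermt '2013-05-18 14:00' -exec cp {} /tmp \;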

Deleted trace files still held by Oracle processes

Problem description

Deleted a big trace file under bdump, but the storage was not released.

Symptoms
$ du -sh /u02/oracle/rac
31M

$ df -h /u02/oracle/rac
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/rac3--vg01-u02_rac
                       16G   15G  316M  98% /u02/oracle/rac

$ /usr/sbin/lsof /u02/oracle/rac
COMMAND   PID   USER   FD   TYPE  DEVICE        SIZE   NODE NAME
oracle   1177 oracle    2w   REG 253,239         798  81929 /u02/oracle/rac/logs/bdump/RAC3_ora_1177.trc
oracle   1177 oracle    5w   REG 253,239         893  17213 /u02/oracle/rac/logs/udump/RAC3_ora_25913.trc (deleted)
oracle   1177 oracle    6w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   1177 oracle    8w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   1949 oracle    5w   REG 253,239         893  17213 /u02/oracle/rac/logs/udump/RAC3_ora_25913.trc (deleted)
oracle   1949 oracle    6w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   1949 oracle    8w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   7147 oracle    2w   REG 253,239         743  89415 /u02/oracle/rac/logs/bdump/RAC3_nsv0_7147.trc (deleted)
oracle   7147 oracle    5w   REG 253,239         893  17213 /u02/oracle/rac/logs/udump/RAC3_ora_25913.trc (deleted)
oracle   7147 oracle    6w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   7147 oracle    8w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   7147 oracle    9w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   9497 oracle    2w   REG 253,239      229376  89437 /u02/oracle/rac/logs/bdump/RAC3_lns1_9497.trc
oracle   9497 oracle    5w   REG 253,239         893  17213 /u02/oracle/rac/logs/udump/RAC3_ora_25913.trc (deleted)
oracle   9497 oracle    6w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   9497 oracle    8w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle   9497 oracle    9w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
ksh     13250 oracle  cwd    DIR 253,239        4096      2 /u02/oracle/rac
oracle  14578 oracle    5w   REG 253,239         893  17213 /u02/oracle/rac/logs/udump/RAC3_ora_25913.trc (deleted)
oracle  14578 oracle    6w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle  14578 oracle    8w   REG 253,239        2565  81922 /u02/oracle/rac/logs/bdump/alert_RAC3.log
oracle  17116 oracle    2w   REG 253,239 15650070528  89291 /u02/oracle/rac/logs/bdump/RAC3_j000_17116.trc (deleted)
oracle  17116 oracle    5w   REG 253,239         893  17213 /u02/oracle/rac/logs/udump/RAC3_ora_25913.trc (deleted)

Cause:
There are still Oracle processes holding the deleted files open.


Solution:

Restart the instance:

srvctl stop instance -d RAC -i RAC3
srvctl start instance -d RAC -i RAC3
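Before restarting, you can also list only the deleted-but-still-open files to confirm which processes are holding them; lsof's +L1 option (files with a link count below 1) does exactly that:

/usr/sbin/lsof +L1 /u02/oracle/rac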

Monday, May 20, 2013

Add a new node: running root.sh failed with "Configuration of ASM ... failed"


Problem description

While adding a 3rd node to a 2-node cluster, I executed root.sh on the 3rd node.

$ sudo /u01/app/crs/11.2.0.3/root.sh
Performing root user operation for Oracle 11g 

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/crs/11.2.0.3

Enter the full pathname of the local bin directory: [/usr/local/bin]: 
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/crs/11.2.0.3/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac3-lnx'
CRS-2676: Start of 'ora.mdnsd' on 'rac3-lnx' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac3-lnx'
CRS-2676: Start of 'ora.gpnpd' on 'rac3-lnx' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac3-lnx'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac3-lnx'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac3-lnx' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac3-lnx' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac3-lnx'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac3-lnx'
CRS-2676: Start of 'ora.diskmon' on 'rac3-lnx' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac3-lnx' succeeded

Symptoms

Found the following errors in ocssd.log:


2013-05-20 18:28:17.654: [   SKGFD][1098938688]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2013-05-20 18:28:17.654: [   SKGFD][1098938688]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2013-05-20 18:28:17.654: [   SKGFD][1098938688]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2013-05-20 18:28:17.654: [   SKGFD][1098938688]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2013-05-20 18:28:17.654: [    CSSD][1098938688]clssnmvDiskVerify: Successful discovery of 0 disks
2013-05-20 18:28:17.655: [    CSSD][1098938688]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2013-05-20 18:28:17.655: [    CSSD][1098938688]clssnmvFindInitialConfigs: No voting files found
2013-05-20 18:28:17.655: [    CSSD][1098938688](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds


Cause
By default, the Oracle ASM instance parameter 'ASM_DISKSTRING' is unset and therefore uses a default value of '*'. This means that the ASM instance scans both the /dev/raw/ and /dev/oracleasm/disks/ directories when searching for Oracle ASM disks.

When using a combination of oracleasm disks (/dev/oracleasm/disks/*) and raw devices (/dev/raw/*), in particular where some raw devices are not owned by or accessible to the oracle user, together with an unset 'ASM_DISKSTRING' parameter, error ORA-15186 is likely to occur.

Solution
1. Either edit the file /etc/sysconfig/oracleasm-_dev_oracleasm and change the lines:
ORACLEASM_SCANORDER=""
ORACLEASM_SCANEXCLUDE=""
to
ORACLEASM_SCANORDER="dm"
ORACLEASM_SCANEXCLUDE="sd"
or alternatively run the following command as root:
/usr/sbin/oracleasm configure -i -e -u user -g group -o "dm" -x "sd"

2. Stop and restart ASMLib as the root user:
/usr/sbin/oracleasm exit
/usr/sbin/oracleasm init

3. Restart CRS or reboot the node.

The above steps need to be executed on all nodes; a quick verification is sketched below.
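To verify the change on each node before restarting CRS, something like the following can be used as root (configure without -i prints the current settings, and listdisks should show the ASM disks again):

/usr/sbin/oracleasm configure
/usr/sbin/oracleasm listdisks
ls -l /dev/oracleasm/disks/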

Relocate service in RAC environment

TEMPORARILY RELOCATE SERVICE FROM ONE NODE TO ANOTHER NODE IN CLUSTER

$ srvctl status service -d RAC                                       
Service RAC_S is running on instance(s) RAC1

$ srvctl relocate service -d RAC -s RAC_S -i RAC1 -t RAC2

$ srvctl status service -d RAC                                       
Service RAC_S is running on instance(s) RAC2
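Note that existing sessions stay connected to RAC1 until they disconnect; if they should be moved right away, the relocation can be forced to disconnect them (use with care):

$ srvctl relocate service -d RAC -s RAC_S -i RAC1 -t RAC2 -f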


PERMANENTLY RELOCATE SERVICE FROM ONE NODE OF THE CLUSTER TO ANOTHER

$ srvctl config service -d RAC
Service name: RAC_S
Service is enabled
Server pool: RAC_RAC_S
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition: 
Preferred instances: RAC1
Available instances: RAC2

$ srvctl modify service -d RAC -s RAC_S -n -i RAC2 -a RAC1

$ srvctl config service -d RAC
Service name: RAC_S
Service is enabled
Server pool: RAC_RAC_S
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition: 
Preferred instances: RAC2
Available instances: RAC1

Friday, May 17, 2013

Prerequisite check "CheckActiveFilesAndExecutables" failed while applying opatch

Problem description:

Applying the patch with OPatch failed:

$ /u01/app/crs/11.2.0.2/OPatch/opatch napply -oh /u01/app/oracle/product/11.2.0.2/db_2 -local /dba/staging/patches/16056267/
Oracle Interim Patch Installer version 11.2.0.3.4
Copyright (c) 2012, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/11.2.0.2/db_2
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.2.0.2/db_2/oraInst.loc
OPatch version    : 11.2.0.3.4
OUI version       : 11.2.0.2.0
Log file location : /u01/app/oracle/product/11.2.0.2/db_2/cfgtoollogs/opatch/opatch2013-05-16_12-23-56PM_1.log

Verifying environment and performing prerequisite checks...
Prerequisite check "CheckActiveFilesAndExecutables" failed.
The details are:


Following executables are active :
/u01/app/oracle/product/11.2.0.2/db_2/lib/libclntsh.so.11.1
UtilSession failed: Prerequisite check "CheckActiveFilesAndExecutables" failed.
Log file location: /u01/app/oracle/product/11.2.0.2/db_2/cfgtoollogs/opatch/opatch2013-05-16_12-23-56PM_1.log

OPatch failed with error code 73

Cause of problem:

Some process is still using the Oracle library.

Solution of problem:

Use fuser to find which process is using that library, then kill it.

/sbin/fuser /u01/app/oracle/product/11.2.0.2/db_2/lib/libclntsh.so.11.1
/u01/app/oracle/product/11.2.0/db_2/lib/libclntsh.so.11.1: 12195m

ps -ef|grep 12195
kill -9 12195
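fuser can also do the killing in one step - it sends SIGKILL to every process that has the file open, so make sure nothing important is still linked against it:

/sbin/fuser -k /u01/app/oracle/product/11.2.0.2/db_2/lib/libclntsh.so.11.1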


Monday, May 13, 2013

RMAN configure archivelog deletion policy in Data Guard

In Primary Database

RMAN>configure archivelog deletion policy to applied on all standby;

In Physical Standby Database where backup is not taken

RMAN>configure archivelog deletion policy to applied on standby;

In Physical Standby Database where backup is taken

RMAN>configure archivelog deletion policy to NONE;
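The policy in effect can be confirmed from RMAN afterwards:

RMAN> show archivelog deletion policy;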


ACFS not mounted after restart on standalone server

Problem description:

After a reboot, the ACFS filesystem is not mounted.


Cause of problem:

Oracle Restart does not support root-based Oracle ACFS resources. As a result, the following operations are not automatically performed:

    Loading Oracle ACFS drivers

    Mounting Oracle ACFS file systems listed in the Oracle ACFS mount registry

    Mounting resource-based Oracle ACFS database home file systems

Solution of problem:

1. Manually load the modules required for ACFS, with the command:

# lsmod |grep oracle 
oracleasm              84136  1 
#/u01/app/crs/11.2.0.2/bin/acfsload start -s 
ACFS-9327: Verifying ADVM/ACFS devices. 
ACFS-9322: completed 
#lsmod |grep oracle 
oracleacfs           1734520  0 
oracleadvm            242048  0 
oracleoks             295248  2 oracleacfs,oracleadvm 
oracleasm              84136  1 

2. Enable ACFS Volume:

ASMCMD> volinfo -a
Diskgroup Name: DATAPUMP_DATA

         Volume Name: DATAPUMPVOL
         Volume Device: /dev/asm/datapumpvol-438
         State: DISABLED
         Size (MB): 613376
         Resize Unit (MB): 256
         Redundancy: UNPROT
         Stripe Columns: 4
         Stripe Width (K): 128
         Usage: ACFS
         Mountpath: /datapump 

ASMCMD> volenable -G DATAPUMP_DATA DATAPUMPVOL
ASMCMD> volinfo -a
Diskgroup Name: DATAPUMP_DATA

         Volume Name: DATAPUMPVOL
         Volume Device: /dev/asm/datapumpvol-438
         State: ENABLED
         Size (MB): 613376
         Resize Unit (MB): 256
         Redundancy: UNPROT
         Stripe Columns: 4
         Stripe Width (K): 128
         Usage: ACFS
         Mountpath: /datapump 

ASMCMD> volstat -G DATAPUMP_DATA

3. Mount ACFS and change owner to oracle

/bin/mount -t acfs /dev/asm/datapumpvol-438 /datapump

chown oracle:oinstall /datapump
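A quick check that the filesystem is mounted and healthy (assuming the ACFS utilities are in the usual location):

df -h /datapump
/sbin/acfsutil info fs /datapump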


Create a new filesystem on Linux

Have the device ready:

[oracle@rac1 ~]$ ls -al /dev/sd*
brw-r----- 1 root disk 8,   0 May 13 08:42 /dev/sda
brw-r----- 1 root disk 8,   1 May 13 08:42 /dev/sda1
brw-r----- 1 root disk 8,   2 May 13 08:42 /dev/sda2
brw-r----- 1 root disk 8,  16 May 13 08:42 /dev/sdb
brw-r----- 1 root disk 8,  17 May 13 08:44 /dev/sdb1
brw-r----- 1 root disk 8,  32 May 13 08:42 /dev/sdc
brw-r----- 1 root disk 8,  33 May 13 08:44 /dev/sdc1
brw-r----- 1 root disk 8,  48 May 13 08:42 /dev/sdd
brw-r----- 1 root disk 8,  49 May 13 08:44 /dev/sdd1
brw-r----- 1 root disk 8,  64 May 13 08:42 /dev/sde
brw-r----- 1 root disk 8,  65 May 13 08:44 /dev/sde1
brw-r----- 1 root disk 8,  80 May 13 08:42 /dev/sdf
brw-r----- 1 root disk 8,  81 May 13 08:44 /dev/sdf1
brw-r----- 1 root disk 8,  96 May 13 08:42 /dev/sdg
brw-r----- 1 root disk 8,  97 May 13 08:42 /dev/sdg1
brw-r----- 1 root disk 8, 112 May 13 08:42 /dev/sdh


Partitioning with fdisk

[root@rac1 ~]# fdisk /dev/sdh
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.


The number of cylinders for this disk is set to 6527.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-6527, default 1): 
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-6527, default 6527): 
Using default value 6527

Command (m for help): p

Disk /dev/sdh: 53.6 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdh1               1        6527    52428096   83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Creating an ext3 filesystem

[root@rac1 ~]#  mkfs -t ext3 /dev/sdh1
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
6553600 inodes, 13107024 blocks
655351 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
400 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
        4096000, 7962624, 11239424

Writing inode tables: done                            
Creating journal (32768 blocks): 
done
Writing superblocks and filesystem accounting information: 
done

This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.


Create a mount point:
[root@rac1 u01]# mkdir /u01


Edit the /etc/fstab to include the new partition
# vi /etc/fstab
Append the new line as follows:
/dev/sdh1 /u01 ext3 defaults 1 2 

Mount the filesystem

[root@rac1 u01]# mount /u01
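Then confirm it is mounted:

[root@rac1 u01]# df -h /u01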

Friday, May 10, 2013

crsctl stop crs fails if there is an ACFS mount point still mounted

Problem description:

# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'oratest1-lnx'
CRS-2673: Attempting to stop 'ora.crsd' on 'oratest1-lnx'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'oratest1-lnx'
CRS-2673: Attempting to stop 'ora.DATAPUMP_DATA.dg' on 'oratest1-lnx'
CRS-2675: Stop of 'ora.DATAPUMP_DATA.dg' on 'oratest1-lnx' failed
CRS-2799: Failed to shut down resource 'ora.DATAPUMP_DATA.dg' on 'oratest1-lnx'
CRS-2794: Shutdown of Cluster Ready Services-managed resources on 'oratest1-lnx' has failed
CRS-2675: Stop of 'ora.crsd' on 'oratest1-lnx' failed
CRS-2799: Failed to shut down resource 'ora.crsd' on 'oratest1-lnx'
CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'oratest1-lnx' has failed
CRS-4687: Shutdown command has completed with error(s).
CRS-4000: Command Stop failed, or completed with errors.

Cause of problem:

There is still an ACFS filesystem mounted:

df -h

/dev/asm/datapumpvol-371
                     167772160    398368 167373792   1% /datapump

Solution of problem:

# umount /datapump
# crsctl stop crs
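If several ACFS filesystems are mounted, they can all be unmounted in one go before stopping CRS (standard Linux umount behaviour):

# umount -a -t acfs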

cp -rf asking for "cp: overwrite ... ?"

Problem description:

$ cp -rf OPatch/* OPatch_OLD/
cp: overwrite `OPatch_OLD/crs/auto_patch.pl'? 

Cause of problem:

$ alias |grep cp
cp='cp -i'

Solution of problem:

$ 'cp' -rf OPatch/* OPatch_OLD/
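Prefixing the command with a backslash bypasses the alias in the same way, and unalias removes it for the rest of the session:

$ \cp -rf OPatch/* OPatch_OLD/
$ unalias cp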

Tuesday, May 7, 2013

Merge statement performance issue in a RAC environment

Problem description:

The merge statement had fluctuating performance in the RAC database.


Cause of the Problem:

The merge statement requests blocks from different nodes.

Solution of the Problem:

It is recommended to run the merge on a single node; in RAC this can be done by creating a service that directs connections to one node only.

1. Add service RAC_S
srvctl add service -d RAC -s RAC_S -r RAC1 -a RAC2 

2. Start service RAC_S
srvctl start service -d RAC -s RAC_S

3. Check service status
srvctl status service -d RAC -s RAC_S

4. Verify the service is added to the RAC database
SQL> show parameter service
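Clients then connect through the service, for example with a tnsnames.ora entry like this (the SCAN host name below is just a placeholder):

RAC_S =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = RAC_S)
    )
  )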

Monday, May 6, 2013

PL/SQL package PRICE_RMAN.DBMS_RCVCAT version 11.02.00.02 in RCVCAT database is not current


Problem Description

While starting an Oracle RMAN backup using the catalog database as the repository, it fails with the following errors:
$ rman target / catalog catowner@rman_catdb

Recovery Manager: Release 11.2.0.3.0 - Production on Mon May 6 10:06:37 2013

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: METHDEV (DBID=2414256617)
connected to recovery catalog database
PL/SQL package PRICE_RMAN.DBMS_RCVCAT version 11.02.00.02 in RCVCAT database is not current
PL/SQL package PRICE_RMAN.DBMS_RCVMAN version 11.02.00.02 in RCVCAT database is not current

Trying to resync the catalog fails with the following errors:

RMAN> resync catalog;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of resync command on default channel at 05/06/2013 10:07:05
RMAN-10015: error compiling PL/SQL program
RMAN-10014: PL/SQL error 0 on line 1655 column 12: Statement ignored
RMAN-10014: PL/SQL error 306 on line 1655 column 12: wrong number or types of arguments in call to 'CHECKTABLESPACE'

RMAN> exit


Recent Changes

The target database has recently been upgraded to 11.2.0.3, so the RMAN executable is upgraded, but the catalog database is not aware of this upgrade.

Cause of the Problem
The problem happened due to version mismatch between the RMAN-executable and the RMAN-catalog SCHEMA.

Solution of the Problem
Solution 01:
Upgrade the RMAN catalog schema. Start the RMAN executable from the ORACLE_HOME which has been upgraded. Only a connection to the CATALOG is required;
a connection to the TARGET is optional.

For example issue,

$ rman catalog $RMAN_USERID/$RMAN_PASSWD@$RMAN_CONN_STR
RMAN> upgrade catalog;
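The catalog schema version can be checked before and after the upgrade by querying the RCVER table in the catalog owner's schema (it should report the new version, e.g. 11.02.00.03, once UPGRADE CATALOG has completed):

SQL> select * from rcver;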

Solution 02:
Don't use a catalog for the backup information, i.e. run the backup without connecting to the catalog database.

$ rman target /
RMAN> backup database;


Friday, May 3, 2013

Upgrade to 11.2.0.3 - OCM: ORA-12012 and ORA-29280

Symptoms:

After upgrading the database to 11.2.0.3, the following errors appear in the alert log:

Errors in file /opt/oracle/diag/rdbms/db/trace/db_j001_26027.trc:
ORA-12012: error on auto execute of job "ORACLE_OCM"."MGMT_CONFIG_JOB_2_2"
ORA-29280: invalid directory path
ORA-06512: at "ORACLE_OCM.MGMT_DB_LL_METRICS", line 2436
ORA-06512: at line 1

Cause:

OCM is the Oracle Configuration Manager, a tool that proactively monitors your Oracle environment and provides this information to Oracle Software Support.
OCM is trying to write to a local directory which does not exist. Besides that, the OCM version delivered with Oracle Database Patch Set 11.2.0.3 is older than the newest available OCM Collector 10.3.7 - the one which has this issue fixed.

Solutions:

Either drop OCM completely if you won't use it:

SQL> drop user ORACLE_OCM cascade;

or disable the collector jobs:

SQL> exec dbms_scheduler.disable('ORACLE_OCM.MGMT_CONFIG_JOB');
SQL> exec dbms_scheduler.disable('ORACLE_OCM.MGMT_STATS_CONFIG_JOB'); 

Refer to MOS Note: [ID 1453959.1]


Wednesday, May 1, 2013

ora.diskmon offline on 11.2.0.3


Because the Grid Infrastructure daemon diskmon.bin is used for Exadata fencing, starting with 11.2.0.3 the resource ora.diskmon will be offline in non-Exadata environments. This is an expected behaviour change.


$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       pricedev1-lnx            Started             
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.crf
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.crsd
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.cssd
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.cssdmonitor
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.ctssd
      1        ONLINE  ONLINE       pricedev1-lnx            OBSERVER            
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.drivers.acfs
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.evmd
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.gipcd
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.gpnpd
      1        ONLINE  ONLINE       pricedev1-lnx                                
ora.mdnsd
      1        ONLINE  ONLINE       pricedev1-lnx  



Modifying AWR Automatic Snapshot Settings


Verify Current setting:

SQL> set linesize 100
SQL> col snap_interval format a20
SQL> col retention format a20
SQL> col topnsql format a20

SQL> select * from dba_hist_wr_control;

      DBID SNAP_INTERVAL        RETENTION            TOPNSQL
---------- -------------------- -------------------- --------------------
2684842560 +00000 01:00:00.0    +00008 00:00:00.0    DEFAULT


Change the interval to 10 minutes and the retention to 10 days (both arguments are passed to the procedure in minutes, so 10 days = 14400):

SQL> execute dbms_workload_repository.modify_snapshot_settings( interval => 10,retention => 14400);

Verify change:


SQL> select * from dba_hist_wr_control;

      DBID SNAP_INTERVAL        RETENTION            TOPNSQL
---------- -------------------- -------------------- --------------------
2684842560 +00000 00:10:00.0    +00010 00:00:00.0    DEFAULT