Wiki Page: Managing & Troubleshooting Exadata - Part 1 - Upgrading & Patching Exadata

As part of the Managing & Troubleshooting Exadata article series, I shall be discussing various administrative and troubleshooting topics on Exadata. This part of the article focuses on the need for Exadata patching, the patching path and tools, and some real-world examples of how to patch cell servers and database servers. Be it Exadata or non-Exadata, patching remains one of the key responsibilities of a DBA or DMA to keep the system secure and stable. When it comes to patching, Exadata takes a different approach and has several software categories (layers) that need to be patched, in contrast to a non-Exadata system. In the very early versions of Exadata, patching was one of the toughest tasks and required significant skill and command of the patching procedure to make it successful. Oracle Platinum Support usually provides a service to patch the full Exadata stack regularly; however, it is still important for a DMA to understand the patching approach, the patching path and order, and which Oracle documents to review.

The main software categories on an Exadata system that need to be patched quarterly, semiannually or annually are listed below:

· Cell servers
  o OS updates
  o Firmware updates
  o Drivers
· Database servers
  o OS updates
  o Firmware updates
· InfiniBand switch
  o InfiniBand switch + OS updates
· Other components (Cisco Ethernet switch, PDU, KVM)
· Oracle GI/RDBMS homes
  o Standard GI/RDBMS binary updates

The first image below describes the various layers of an Exadata system that typically require maintenance and continuous patching to keep the system stable and secure. The second image depicts which patching tool is used for each part of the Exadata stack.

Image courtesy: http://uhesse.com/2014/12/20/exadata-patching-introduction/

Patching tools

Patching an Exadata Database Machine is automated and made easier with the latest patching tools. It is essential to understand the various patching utilities introduced by Oracle to patch individual components or the full Exadata stack. The following are the most common utilities used to patch the cell servers, database servers, InfiniBand switches and additional components (Cisco Ethernet switch, KVM, PDU) of the Database Machine.

The patchmgr utility is used to apply the latest patches to, or roll back patches from, the Exadata storage servers in a rolling or non-rolling fashion. The utility automates the patching operations and can also send email notifications on the status of a patch/rollback operation (succeeded, failed, waiting, not attempted, and so on):

./patchmgr -cells cell1_group -patch -rolling \
  -smtp_from "dba@domain.com" \
  -smtp_to "dbma@domain.com"

To update/patch the InfiniBand switches, the following syntax can be used:

./patchmgr -ibswitches [ibswitch_list_file] \
  -upgrade | -downgrade [-ibswitch_precheck] [-force]

The following tasks are performed during the course of storage server patching:

· The new OS image is pushed to the inactive partition
· Multiple cell reboots are performed for the various stages of the process
· The USB recovery media is recreated to keep a good backup

dbnodeupdate.sh utility: download dbnodeupdate.zip via patch 16486998, which contains the dbnodeupdate.sh utility. The utility replaces the manual steps and automates all of the steps and checks required to successfully patch the Exadata database servers.
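As a quick illustration, staging the utility on a database server might look like the sketch below. The staging directory is only an example and the zip file name varies by version; download the current dbnodeupdate.zip as described in MOS Note 1553103.1.

mkdir -p /u01/stage/dbnodeupdate                         # example staging directory, adjust to your environment
unzip -o dbnodeupdate.zip -d /u01/stage/dbnodeupdate     # zip file downloaded via patch 16486998
cd /u01/stage/dbnodeupdate
./dbnodeupdate.sh -h                                     # print the usage to confirm the utility is in place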
The following tasks are performed during the course of database server patching:

· Stop/start/disable CRS
· Perform a filesystem backup
· Apply OS updates
· Relink all Oracle homes
· Enable CRS for auto-restart

./dbnodeupdate.sh [ -u | -r | -c ] [ -l ] [-p] [-n] [-s] [-i] [-q] [-v] [-t] [-a] [-b] [-m] | [-V] | [-h]

 -u  Upgrade
 -r  Rollback
 -c  Complete post actions (verify image status, cleanup, apply fixes, relink all homes, enable GI to start/start all domU's)
 -l  Baseurl (http or zipped iso file for the repository)
 -s  Shutdown stack (domU's for VM) before upgrading/rolling back
 -p  Bootstrap phase (1 or 2), only to be used when instructed by dbnodeupdate.sh
 -q  Quiet mode (no prompting), only to be used in combination with -t
 -n  No backup will be created (option disabled for systems being updated from Oracle Linux 5 to Oracle Linux 6)
 -t  'to release' - used when in quiet mode or when updating to one-offs/releases via the 'latest' channel (requires 11.2.3.2.1)
 -v  Verify prereqs only; only to be used with the -u and -l options
 -b  Perform backup only
 -a  Full path to a shell script used for alert trapping
 -m  Install / update-to exadata-sun/hp-computenode-minimum only (11.2.3.3.0 and later)
 -i  Relinking of the stack will be disabled and the stack will not be started; only possible in combination with -c
 -V  Print version
 -h  Print usage

The OPatch utility is used to patch the GI and RDBMS homes.

The OPlan utility provides step-by-step patching instructions specific to your environment. OPlan automatically analyzes and collects the required configuration information, and generates a set of instructions and commands customized to your environment:

$ORACLE_HOME/oplan/oplan generateApplySteps

Patching order

The following presents the patching order for the complete Exadata stack:

· InfiniBand switches
  o Spine/leaf
· Exadata storage servers
· Exadata database servers
· Database bundle patches
  o Grid home
  o Oracle homes

Patching example

The following example demonstrates patching the storage servers and database servers using the different utilities.

Pre-patching tasks (all actions performed as the root user):

Step 1) Record the current image version on each server:

# imageinfo

Step 2) On a database server, connect to the ASM instance and note the current disk_repair_time value before changing it to sustain the patching downtime:

SELECT dg.name AS diskgroup, SUBSTR(a.name,1,18) AS name,
       SUBSTR(a.value,1,24) AS value, read_only
FROM   V$ASM_DISKGROUP dg, V$ASM_ATTRIBUTE a
WHERE  dg.group_number = a.group_number
AND    a.name = 'disk_repair_time';

To be on the safe side, change the value to 24 hours, as demonstrated below:

SQL> alter diskgroup DG_DBFS SET ATTRIBUTE 'disk_repair_time' = '24h';
SQL> alter diskgroup DG_DATA SET ATTRIBUTE 'disk_repair_time' = '24h';
SQL> alter diskgroup DG_FRA  SET ATTRIBUTE 'disk_repair_time' = '24h';

Step 3) Create a file on the OS containing the hostname of cell 1; maintain a separate file for each cell:

vi /root/cell1_group    (put the cell host name in the file)

Step 4) Inactivate all grid disks. Log in as the celladmin user and, using the CellCLI utility, run the following:

CellCLI> alter griddisk all inactive;

Verify that all grid disks are in the inactive state:

CellCLI> list griddisk attributes name, status;

Cell patching:

Step 5) Patch the cell nodes (start with the first cell node and patch the remaining cells from the first cell node). Change to the patch location and run the following commands:

# ./patchmgr -cells /root/cell1_group -reset_force
# ./patchmgr -cells /root/cell1_group -cleanup
# ./patchmgr -cells /root/cell1_group -patch_check_prereq
# ./patchmgr -cells /root/cell1_group -patch
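Before inactivating the grid disks on a cell (and again before moving on to each subsequent cell), it is worth confirming that ASM can tolerate the disks going offline. A minimal check, assuming the asmmodestatus and asmdeactivationoutcome grid disk attributes available on recent storage server releases:

CellCLI> list griddisk attributes name, asmmodestatus, asmdeactivationoutcome

Proceed with a cell only when asmdeactivationoutcome reports Yes for all of its grid disks; otherwise taking the cell offline could compromise ASM redundancy.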
Once patching completes successfully on the cell, run the following commands as celladmin through CellCLI:

CellCLI> list griddisk attributes name, status;
CellCLI> alter griddisk all active;

Continue with the other cells using the same set of commands. However, ensure that SSH equivalence is configured between the cell nodes and that you have a separate file with each cell's hostname, as mentioned in Step 3.

DB server patching

After completing the cell server patching, move on to the database servers and apply the patches with the dbnodeupdate.sh utility. Beforehand, you must copy the patch (zip file) to all database servers.

Step 1) Stop and disable the cluster on the database server:

./crsctl disable crs
./crsctl stop crs -f

Step 2) Run the prerequisite check, take a backup only, and then apply the update:

./dbnodeupdate.sh -v -u -l /u01/app/oracle/patch/p18876946_112331_Linux-x86-64.zip
./dbnodeupdate.sh -b -u -l /u01/app/oracle/patch/p18876946_112331_Linux-x86-64.zip
./dbnodeupdate.sh -u -l /u01/app/oracle/patch/p18876946_112331_Linux-x86-64.zip

Wait for the multiple reboots. Once the system is up and running, perform the following action:

./dbnodeupdate.sh -c

The -c option performs the post-patch tasks, such as enabling/starting CRS and relinking the binaries. Continue the same set of actions on the rest of the nodes. Once you have successfully completed patching on the remaining database servers, revert the disk_repair_time value (recorded in Step 2 of the pre-patching tasks) through ASM.

Tips

Here is a list of patching best-practice tips that a DMA should follow:

o Patch a non-production system first
o Use the standby-first apply approach if a standby system exists
o Read the documentation and have an appropriate plan in place
o Run the Exadata health check (exachk) before and after patching (a brief invocation sketch follows the references below)
o Ensure you patch when the workload is low
o Always refer to MOS Note 888828.1
o Ensure the disk_repair_time value for the existing diskgroups is increased from its default, preferably to 24 hours

References

The following MOS notes can be used as references when you plan to patch:

1. Exadata Database Machine and Exadata Storage Server Supported Versions (Doc ID 888828.1)
2. dbnodeupdate.sh: Exadata Database Server Patching using the DB Node Update Utility (Doc ID 1553103.1)
3. Exadata Patching Overview and Patch Testing Guidelines (Doc ID 1262380.1)
4. Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)
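As a brief sketch of the health-check tip above, run exachk as root on a database server before and after the patching cycle. The staging location and option shown here are assumptions that vary by exachk version, so check Doc ID 1070954.1 for the current bundle and syntax:

cd /opt/oracle.SupportTools/exachk    # example location where the exachk bundle was unzipped
./exachk -a                           # -a runs the full set of best-practice and patch checks

Compare the pre-patch and post-patch reports to spot any configuration drift introduced by the patching cycle.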
