Oracle Products: Enterprise Grade Or Just Enterprise Pricing?

I recently worked on a major project for a large organisation implementing over a hundred servers. These servers comprised the entire back-office infrastructure and would provide billing and financial services, order processing and CRM. The software stack chosen was a suite of Oracle products: Oracle eBusiness Suite, Oracle Billing and Revenue Management, Oracle Siebel, Oracle SOA Suite, Oracle OBIEE and Oracle DB Enterprise Edition. All very expensive software to both license and support.

I’ve worked heavily with Oracle DB before, and found it to be very stable and robust when sitting on Sun Clusters. However, this would be the first time I got to install the software on our new “Enterprise” Operating System, Oracle Enterprise Linux. The reasoning behind the sea of Oracle was that we’d have complete end-to-end support for the entire stack. We’d log one ticket about an issue and someone at Oracle would be able to help. That was the theory, anyway.

The logical choice for the database backend was to cluster the nodes using Oracle Grid Infrastructure, the clustering software that provides the backbone for Oracle RAC. ASM was also staring me in the face as the storage layer, rather than using another clustered filesystem. However, due to licensing constraints within the organisation, RAC was priced out of reach. I had to develop a solution to fail over a single-instance database between two nodes using Oracle Grid Infrastructure, and have the database work correctly. This in itself was a bit of a mission, and will be the subject of a later article (I have many scripts and some good notes around this that I’m keen to upload). The DB part of the project was implemented quickly and was automated: I built five two-node DB clusters in as many days.

Oracle eBusiness Suite is a strange beast: the installer unpacks, along with the application, its own Oracle DB install, complete with the pre-loaded database it requires. R12 of Oracle eBusiness Suite seems clunky and old from an administrative standpoint, but it is very customisable and robust and, with the right people working with it through the User Interface, perfect for most tasks.

Now we turn to the bad stuff: Oracle Billing and Revenue Management (hereafter BRM), Oracle Siebel CRM (hereafter Siebel) and Oracle SOA Suite, particularly the Fusion Middleware (hereafter Fusion). Because of the System Integrators and Developers working with the stack, quite old versions of the software were chosen. I won’t dwell on BRM and Siebel, but for Fusion, 10g was selected. This meant that Oracle Application Server (OAS) was the bundled J2EE middleware. My god. I am well versed in the ease of clustering WebLogic, and can do it blindfolded … almost. But clustering this dinosaur? My only previous experience with OAS was assisting hapless DBAs in not blowing it up on a daily basis as they ran reports in Discoverer (don’t get me started on that).

Anyway, I got started on reading the 10g Enterprise Deployment Guide and realised that clustering this was not an easy task. But I prevailed. Many Oracle support requests later, I had a two-node Oracle SOA cluster running: OAS 10g, ESB, WSM, BPEL, AIA Foundation Pack, Oracle Service Registry and a bunch of AIA PIPs. This was a huge chore, even for someone with 20 years of UNIX experience. The SOA suite installer would run its configuration “wizards” and, particularly with the AIA FP, installation would explode at random points. I found myself hacking files during the install to get it across the line, and to avoid stack traces, segmentation faults, and all kinds of other things I didn’t need in my day. After all that, the developers’ code didn’t work when clustered, they were too stupid to work out why, and we shut down a node. Anyway …

BRM and Siebel worked fairly well. I ended up clustering both using Grid Infrastructure: BRM ran active/passive failover across two nodes, while Siebel ran its Siebel Servers active/active (one on each node) with a failover Gateway Name Server service. These two components actually served us quite well.

Oracle certify Grid on Red Hat/OEL 6.x - their website just says “RHEL 6”, i.e. all minor versions. That’s what you’d presume. I built a two-node cluster on OEL 6 - i.e. Oracle’s OS, Oracle’s Clustering Software. I then upgraded using yum and brought my kernel to the latest security release - Oracle’s Unbreakable Enterprise Kernel, I hasten to add. I installed Grid, ran root.sh and got “ADVM/ACFS not supported on this release” or some such. WTF? I logged a job with Oracle, as to me, the certification says this should work. After a bit of argy bargy, it turned out that they add support for newer OS releases and bundle it in with patch sets, and even the latest Grid patch set wouldn’t fix this.

But what would happen if I was already running Grid on an older system, took it down for maintenance to patch my system, brought it up and boom! My ACFS filesystems no longer worked? Their solution: “roll back the kernel”. Yes, that’s what you need to do, but you shouldn’t have to. End-to-end supported, certified and broken.
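A guard along these lines could at least catch the mismatch before a reboot rather than after. It is only a minimal sketch, assuming you keep your own list of kernel releases that you have already confirmed work with your installed Grid version. The prefixes and version strings below are invented for illustration and are not taken from Oracle's certification matrix, and the function name is hypothetical.

```python
import platform

# Hypothetical allowlist: kernel release prefixes you have personally verified
# against your installed Grid version. These values are made up for illustration.
SUPPORTED_KERNEL_PREFIXES = ("2.6.32-100.", "2.6.32-200.")


def kernel_is_supported(release, supported_prefixes=SUPPORTED_KERNEL_PREFIXES):
    """Return True if the kernel release string starts with an allowlisted prefix."""
    return release.startswith(tuple(supported_prefixes))


if __name__ == "__main__":
    # platform.release() returns the running kernel release (as `uname -r` does).
    running = platform.release()
    if kernel_is_supported(running):
        print("kernel %s is on the verified list" % running)
    else:
        print("kernel %s is NOT on the verified list; check ACFS support first" % running)
```

In practice you would run the same check against the kernel package you are about to install, not just the running one, and hold the update back if it is not on the list.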

Oracle support seemed to be like this for me for the most part. I’d log a job, it’d go offshore somewhere and somebody would send me back a request for more information. This would bounce around a bit, and I’d give the same information a few times. Then somebody would look at the job. I didn’t measure it, but I’d estimate that 90% of the time, over about a hundred jobs, I found my own solution by either tinkering or Googling before Oracle came back with a response. The trouble is, Solaris support requests are now suffering the same fate. Sun support used to be absolutely awesome - great onshore support as soon as the job was logged, tight SLAs that were met, and knowledgeable field engineers. That’s all gone now, yet we’re still paying top dollar for it.

So what I want to ask is this: if the support for Oracle products has degraded, and the software itself needs patches upon patches to run, why are we still paying enterprise-grade prices for the software and the service?