Continuous availability benefits of pureScale now available in a new low cost DB2 offering

Kelly Schlamb
DB2 pureScale and PureData Systems Specialist, IBM

Today, IBM has announced a set of new add-on offerings for DB2, which includes the IBM DB2 Performance Management Offering, IBM DB2 BLU Acceleration In-Memory Offering, IBM DB2 Encryption Offering, and the IBM DB2 Business Application Continuity Offering. More details on these offerings can be found here. Generally speaking, the intention of these offerings is to make some of the significant capabilities and features of DB2 available as low-cost options for those not using the advanced editions of DB2, which already include these capabilities.

If you’ve read any of my past posts you know that I’m a big proponent of DB2’s pureScale technology. And staying true to form, the focus of my post here is on the IBM DB2 Business Application Continuity (BAC) offering, which is a new deployment and licensing model for pureScale. This applies to DB2 10.5 starting with fix pack 5 (the current fix pack level released in December 2014).

For more information on DB2 pureScale itself, I suggest taking a look here and here. But to boil it down to a few major points, it's an active/active, shared-data clustering solution that provides continuous availability in the event of both planned and unplanned outages. pureScale is available in the DB2 Advanced Workgroup Server Edition (AWSE) and Advanced Enterprise Server Edition (AESE). Its architecture consists of the Cluster Caching Facilities (CFs), which provide centralized locking and data page management for the cluster, and DB2 members, which service the database transaction requests from applications. This multi-member architecture allows workloads to scale out and balance across up to 128 members.

While that scale-out capability is attractive to many people, some have told me that they love the availability that pureScale provides but that they don’t have the scalability needs for it. And in this case they can’t justify the cost of the additional software licenses to have this active/active type of environment – or to even move from their current DB2 Workgroup Server Edition (WSE) or Enterprise Server Edition (ESE) licensing up to the corresponding advanced edition that contains pureScale.

This is where BAC comes in. With BAC – which is a purchasable option on top of WSE and ESE – you can create a two-member pureScale cluster. The difference, and what makes this offering interesting and attractive for some, is that the cluster can be used in an active/active way, but it's licensed as an active/passive cluster. Specifically, one member of the cluster is used to run your application workloads and the other member is available as a standby in case that primary member fails or has to be brought down for maintenance. But isn't that passive? No… and the reason is that this secondary member doesn't just sit idle waiting for that to happen. Under the BAC offering terms, you are also allowed to run administrative operations on this secondary "admin" member. In fact, you are allowed to do all of the following types of work on this member (see the sketch after the list for one way to drive such work):

  • Backup, Restore
  • Runstats
  • Reorg
  • Monitoring (including DB2 Explain and any diagnostic or problem determination activities)
  • Execution of DDL
  • Database Manager and database configuration updates
  • Log based capture utilities for the purpose of data capture
  • Security administration and setup
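
Much of this work can be driven through plain SQL, which makes it easy to schedule against the admin member specifically. Here's a minimal JDBC sketch of that idea, running RUNSTATS through the ADMIN_CMD procedure; the hostname, port, database, credentials, and table are hypothetical:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

public class AdminMemberRunstats {
    public static void main(String[] args) throws Exception {
        // Connect directly to the secondary ("admin") member, not the primary.
        String url = "jdbc:db2://admin-member.example.com:50000/MYDB";
        try (Connection con = DriverManager.getConnection(url, "dbadm", "secret")) {
            // SYSPROC.ADMIN_CMD lets utilities such as RUNSTATS be run via SQL.
            try (CallableStatement cs = con.prepareCall("CALL SYSPROC.ADMIN_CMD(?)")) {
                cs.setString(1, "RUNSTATS ON TABLE APP.ORDERS "
                        + "WITH DISTRIBUTION AND DETAILED INDEXES ALL");
                cs.execute();
            }
        }
    }
}
```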

By offloading this administrative work from the primary member, you leave it with more capacity to run your application workloads. And with BAC, you are only fully licensing the one primary member where your applications are running (for either WSE or ESE plus BAC). The licensing of the secondary member, on the other hand, falls under DB2's warm/idle standby licensing, which means a much reduced cost for it (e.g., for PVU pricing, the secondary member would only be 100 PVUs of WSE or ESE plus 100 PVUs of BAC). For more details on actual software costs, please talk to your friendly neighborhood IBM rep.

And because this is still pureScale at work here, if there's a failure of the primary member, the application workloads will automatically fail over to the secondary member. Likewise, the database will stay up and remain accessible to applications on the secondary member when the primary member undergoes maintenance – like during a DB2 fix pack update. In both of these cases the workload is allowed to run on the secondary member, and when the primary member is brought back up, the workloads will fail back to it. All of the great availability characteristics of pureScale at a lower cost!

Contrast this with something like Oracle RAC One Node, which has some characteristics similar to IBM DB2 BAC: only the primary node (instance) in Oracle RAC One Node is active, and the standby node is not. In fact, the standby isn't even started until the work has to fail over, so there's a period of time where the cluster is completely unavailable. That means a longer outage, slower recovery times, and no ability to run administrative work on this idle node like you can do with BAC.

Sounds great, right?

And for those of you that do want the additional scale-out capability, but like the idea of having that standby admin member at a reduced cost, IBM has thought of you too. Using AWSE or AESE (the BAC offering isn’t involved here), you can implement a pureScale cluster with multiple primary members with a single standby admin member. The multiple primary members are each fully licensed for AWSE or AESE, but the single standby admin member is only licensed as a passive server in the cluster (again, using the PVU example that would only be 100 PVUs of either AWSE or AESE). In this case, you can do any of that administrative work previously described on the standby member, and it’s also available for workloads to failover to if there are outages for one or more of the primary members in the cluster.

Happy clustering!

What is DB2ssh?

By Mihai Iacob
DB2 Security Development

The IBM DB2 pureScale Feature provides high levels of distributed availability, scalability, and transparency to the application. But why do you need to enable password-less SSH for the root user in your DB2 pureScale cluster? Well, you don't any longer, and this site explains how to use db2ssh to securely deploy and configure the DB2 pureScale Feature.

Both the DB2 installer and GPFS, the filesystem used by DB2 pureScale, have a requirement to run commands as root on a remote system. db2ssh provides an alternative to enabling password-less SSH as root by effectively SSH-ing as a regular user and then elevating privileges to root to run the required commands.

Wait, isn’t that asking for trouble? Can a non-root user run remote commands as root in my cluster ? Not at all, there are rigorous security checks put in place to make sure only the root user can run commands remotely as root. This is accomplished by having the root user digitally sign any message that is sent to the remote system and having the remote system verify this signature before executing any commands. SSH can also be configured in a secure way to prevent against replay attacks.

Take a look at the article to find out how to configure and troubleshoot DB2ssh.

Make Your Apps Highly Available and Scalable

By Vinayak Joshi
Senior Software Engineer, IBM

The IBM premium data-sharing technologies offer unmatched high availability and scalability to applications. If you are a JDBC application developer wanting to explore how these benefits accrue to your application, and whether you need to do anything special to exploit them, my article – "Increase scalability and failure resilience of applications with IBM Data Server Driver for JDBC and SQLJ" – is a great source of information.

In the article, I explain how turning on a single switch on the IBM Data Server Driver for JDBC and SQLJ opens up all the workload balancing and high availability benefits to your JDBC applications. There is very little required for an application to unlock the workload balancing and high availability features built into the DB2 server and driver technologies.
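
For reference, here is roughly what that single switch looks like when configuring the driver's data source in application code. The server name, port, database, and credentials are placeholders, and note that the property's "sysplex" name is historical – the same switch drives workload balancing against pureScale as well:

```java
import java.sql.Connection;
import com.ibm.db2.jcc.DB2SimpleDataSource;

public class WlbDataSource {
    public static Connection connect() throws Exception {
        DB2SimpleDataSource ds = new DB2SimpleDataSource();
        ds.setServerName("db2-cluster.example.com"); // placeholder host
        ds.setPortNumber(50000);
        ds.setDatabaseName("MYDB");
        ds.setDriverType(4); // type 4 (pure Java) connectivity

        // The "single switch": enable transaction-level workload balancing.
        ds.setEnableSysplexWLB(true);

        return ds.getConnection("appuser", "secret");
    }
}
```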

For those curious about how the driver achieves this in tandem with pureScale and sysplex server technologies, the article should provide a good end-to-end view. While all the nuts-and-bolts explanations are provided, it is stressed that all of it happens under the covers; beyond a bare minimum understanding, application developers and DBAs need not concern themselves with it too much if they do not wish to.

The aspects a developer needs to keep in mind are highlighted, and recommendations on configuring and tuning applications are provided. We've made efforts to keep the material technically accurate while keeping the language simple enough for a non-technical audience to grasp.

Any and all feedback will be much appreciated and taken into account. Take a look at the article by clicking here, and feel free to share your thoughts in the comment section below.

pureScale at the Beach – What's New in the DB2 "Cancun Release"

Kelly Schlamb
DB2 pureScale and PureData Systems Specialist, IBM

Today, I'm thinking about the beach. We're heading into the last long weekend of the summer, the weather is supposed to be nice, and later today I'll be going up to the lake with my family. But that's not really why the beach is on my mind. Today, the DB2 "Cancun Release" was announced and made available, and as somebody that works extensively with DB2 and pureScale, it's a pretty exciting day.

I can guarantee you that over the next little while, you're going to be hearing a lot about the various new features and capabilities in the "Cancun Release" (also referred to as Cancun Release 10.5.0.4 or DB2 10.5 FP4). For instance, the new Shadow Tables feature — which exploits DB2 BLU Acceleration — allows for real-time analytics processing and reporting on your transactional database system. Game-changing stuff. However, I'm going to leave those discussions up to others or for another time, and today I'm going to focus on what's new for pureScale.

As with any major new release, some things are flashy and exciting, while other things don't have that same flash but make a real difference in the everyday life of a DBA. Examples of the latter in Cancun include the ability to perform online table reorgs and incremental backups (along with support for DB2 Merge Backup) in a pureScale environment, additional Optim Performance Manager (OPM) monitoring metrics and alerts around the use of HADR with pureScale, and being able to take GPFS snapshot backups. All of this leads to improved administration and availability.

There’s a large DB2 pureScale community out there and over the last few years we’ve received a lot of great feedback on the up and running experience. Based on this, various enhancements have been made to provide faster time to value, with the improved ease of use and serviceability of installation, configuration, and updates. This includes improved installation documentation, enhanced prerequisite checking, beefing up some of the more common error and warning messages, improved usability for online fix pack updates, and the ability to perform version upgrades of DB2 members and CFs in parallel.

In my opinion, the biggest news (and yes, the flashiest stuff) is the addition of new deployment options for pureScale. Previously, the implementation of a DB2 pureScale cluster required specialized network adapters — RDMA-capable InfiniBand or RoCE (RDMA over Converged Ethernet) adapter cards. RDMA stands for Remote Direct Memory Access, and it allows for direct memory access from one computer into that of another without involving either one's kernel, so there's no interrupt handling and no context switching as part of sending a message via RDMA (unlike with TCP/IP-based communication). This allows for very high-throughput, low-latency message passing, which DB2 pureScale uniquely exploits for very fast performance and scalability. Great upside, but a downside is the requirement for these adapters and an environment that supports them.

Starting in the DB2 Cancun Release, a regular, commodity TCP/IP-based interconnect can be used instead (often referred to as using “TCP/IP sockets”). What this gives you is an environment that has all of the high availability aspects of an RDMA-based pureScale cluster, but it isn’t necessarily going to perform or scale as well as an RDMA-based cluster will. However, this is going to be perfectly fine for many scenarios. Think about your daily drive to work. While you’d like to have a fast sports car for the drive in, it isn’t necessary for that particular need (maybe that’s a bad example — I’m still trying to convince my wife of that one). With pureScale, there are cases where availability is the predominant motivator for using it and there might not be a need to drive through massive amounts of transactions per second or scale up to tens of nodes. Your performance and scalability needs will dictate whether RDMA is required or not for your environment. By the way, you might see this feature referred to as pureScale “lite”. I’m still slowly warming up to that term, but the important thing is people know that “lite” doesn’t imply lower levels of availability.

With the ability to do this TCP/IP sockets-based communication between nodes, it also opens up more virtualization options. For example, DB2 pureScale can be implemented using TCP/IP sockets in both VMware (Linux) and KVM (Linux) on Intel, as well as in AIX LPARs on Power boxes. These virtualized environments provide a lower cost of entry and are perfect for development, QA, production environments with moderate workloads, or just getting yourself some hands-on experience with pureScale.

It’s also worth pointing out that DB2 pureScale now supports and is optimized for IBM’s new POWER8 platform.

Having all of these new deployment options changes the economics of continuous availability, allowing broad infrastructure choices at every price point.

One thing that all of this should show you is the continued focus and investment in the DB2 pureScale technology by IBM research and development. With all of the press and fanfare around BLU, people often ask me if this is at the expense of IBM’s other technologies such as pureScale. You can see that this is definitely not the case. In fact, if you happen to be at Insight 2014 (formerly known as IOD) in Las Vegas in October, or at IDUG EMEA in Prague in November, I’ll be giving a presentation on everything new for pureScale in DB2 10.5, up to and including the “Cancun Release”. It’s an impressive amount of features that’s hard to squeeze into an hour. 🙂

For more information on what’s new for pureScale and DB2 in general with this new release, check out the fix pack summary page in the DB2 Information Center.

Simplifying Oracle Database Migrations

Danny Arnold, Worldwide Competitive Enablement Team

As part of my role in IBM Information Management as a technical advocate for our DB2 for LUW (Linux, UNIX, Windows) product set, I often enter into discussions with clients who are currently using Oracle Database.

With the unique technologies delivered in the DB2 10 releases (10.1 and 10.5), such as

  • temporal tables to allow queries against data at a specific point in time,
  • row and column access control (RCAC) to provide granular row- and column-level security that extends the traditional RDBMS table privileges for additional data security,
  • pureScale for near-continuous-availability database clusters,
  • the database partitioning feature (DPF) for parallel query processing against large data sets (100s of TBs), and
  • the revolutionary new BLU Acceleration technology to allow analytic workloads to use column-organized tables to deliver performance orders of magnitude faster than conventional row-organized tables,

many clients like the capabilities and technology that DB2 for LUW provides.

However, a key concern is the level of effort to migrate an existing Oracle Database environment to DB2. Although DB2 provides Oracle compatibility, and has had this capability built into the database engine since the DB2 9.7 release, there is still confusion on the part of clients as to what this Oracle compatibility means in terms of a migration effort. Today, DB2 provides a native Oracle PL/SQL procedural language compiler, support for Oracle-specific ANSI SQL language extensions, Oracle SQL functions, and Oracle-specific data types (such as NUMBER and VARCHAR2). This compatibility layer within DB2 allows many Oracle Database environments to be migrated to DB2 with minimal effort. Many stored procedures and much of the application SQL used against Oracle Database can run unchanged against DB2, reducing both the migration effort and the migration risk; because the application does not have to be modified, the testing phase takes much less effort than it would for changed or heavily modified application SQL and stored procedures. Although the migration effort is relatively straightforward, questions still come up with clients, and there is the need for a clear explanation of the Oracle Database to DB2 migration process.
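
As a small illustration of that compatibility layer, DDL written with Oracle data types can run against DB2 unchanged once the database has been created in Oracle compatibility mode (enabled through the DB2_COMPATIBILITY_VECTOR registry variable). This is only a sketch; the connection details and table are made up:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class OracleStyleDdl {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection; the database is assumed to have been
        // created with Oracle compatibility mode enabled.
        String url = "jdbc:db2://db2host.example.com:50000/ORADB";
        try (Connection con = DriverManager.getConnection(url, "dbadm", "secret");
             Statement st = con.createStatement()) {
            // NUMBER and VARCHAR2 are Oracle types; DB2's compatibility
            // layer accepts them as-is, so this DDL needs no rewriting.
            st.executeUpdate("CREATE TABLE accounts ("
                    + " id NUMBER(10) NOT NULL PRIMARY KEY,"
                    + " name VARCHAR2(100),"
                    + " balance NUMBER(12,2))");
        }
    }
}
```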

Recently, a new solution brief entitled "Simplify your Oracle database migrations," published by IBM Data Management, provides a clear explanation of how DB2 and the PureData System for Transactions appliance, built upon DB2 pureScale, can deliver a clustered database environment for migrating an Oracle database to DB2. This brief provides a clear and concise overview of what an Oracle to DB2 migration requires and the assistance and tooling available from IBM to make a migration straightforward for a client's environment. It also describes the IBM Database Conversion Workbench, tooling that is available to assist a client in moving their tables, stored procedures, and data from Oracle to DB2.

The fact that DB2 for LUW makes migrating from Oracle a task that takes minimal effort, due to the Oracle compatibility built into DB2, is complemented by the PureData System for Transactions. PureData System for Transactions provides an integrated, pre-built DB2 pureScale environment that allows a pureScale instance and a DB2 clustered database to be ready for use in a matter of hours, which helps simplify the implementation and configuration experience for the client. The ease of Oracle migration to DB2, together with the rapid implementation and configuration possible with PureData System for Transactions, is a winning combination for a client looking for a more cost-effective and available alternative to Oracle Database.

Achieving High Availability with PureData System for Transactions

Kelly Schlamb, DB2 pureScale and PureData Systems Specialist, IBM

A short time ago, I wrote about improving IT productivity with IBM PureData System for Transactions and I mentioned a couple of new white papers and solution briefs on that topic.  Today, I’d like to highlight another one of these new papers: Achieving high availability with PureData System for Transactions.

I’ve recently been meeting with a lot of different companies and organizations to talk about DB2 pureScale and PureData System for Transactions, and while there’s a lot of interest and discussion around performance and scalability, the primary reason that I’m usually there is to talk about high availability and how they can achieve higher levels than what they’re seeing today. One thing I’m finding is that there are a lot of different interpretations of what high availability means (and I’m not going to argue here over what the correct definition is). To some, it’s simply a matter of what happens when some sort of localized unplanned outage occurs, like a failure of their production server or a component of that server. How can downtime be minimized in that case?  Others extend this discussion out to include planned outages, such as maintenance operations or adding more capacity into the system. And others will include disaster recovery under the high availability umbrella as well (while many keep them as distinctly separate topics — but that’s just semantics). It’s not enough that they’re protected in the event of some sort of hardware component failure for their production system, but what would happen if the entire data center was to experience an outage? Finally (and I don’t mean to imply that this is an exhaustive list — when it comes to keeping the business available and running, there may be other things that come into the equation as well), availability could also include a discussion on performance. There is typically an expectation of performance and response time associated with transactions, especially those that are being executed on behalf of customers, users, and business processes. If a customer clicks on button on a website and it doesn’t come back quickly, it may not be distinguishable from an outage and the customer may leave that site, choosing to go to a competitor instead.

It should be pointed out that not every database requires the highest levels of availability. It might not be a big deal to an organization if a particular departmental database is offline for 20 minutes, or an hour, or even the entire day. But there are certainly some business-critical databases, considered "tier 1," that do require the highest availability possible. Therefore, it is important to understand the availability requirements that your organization has. But I'm likely already preaching to the choir here, and you're reading this because you do have a need and you understand the ramifications to your business if these needs aren't met. With respect to the companies I've been meeting with, just hearing about what kinds of systems they depend on, from both an internal and an external perspective, and what it means to them if there's an interruption in service, has been fascinating. Of course, I'm sympathetic to their plight, but as a consumer and a user I still have very high expectations around service. I get pretty mad when I can't make an online trade, check the status of my travel reward accounts, or even order a pizza online, especially when I know what those companies could be doing to provide better availability to their users. 🙂

Those things I mentioned above — high availability, disaster recovery, and performance (through autonomics) — are all discussed as part of the paper in the context of PureData System for Transactions. PureData System for Transactions is a reliable and resilient expert integrated system designed for high availability, high throughput online transaction processing (OLTP). It has built-in redundancies to continue operating in the event of a component failure, disaster recovery capabilities to handle complete system unavailability, and autonomic features to dynamically manage utilization and performance of the system. Redundancies include power, compute nodes, storage, and networking (including the switches and adapters). In the case of a component failure, a redundant component keeps the system available. And if there is some sort of data center outage (planned or unplanned), a standby system at another site can take over for the downed system. This can be accomplished via DB2’s HADR feature (remember that DB2 pureScale is the database environment within the system) or through replication technology such as Q Replication or Change Data Capture (CDC), part of IBM InfoSphere Data Replication (IIDR).

Just a reminder that the IDUG North America 2014 conference will be taking place in Phoenix next month from May 12-16. Being in a city that just got snowed on this morning, I’m very much looking forward to some hot weather for a change. Various DB2, pureScale, and PureData topics are on the agenda. And since I’m not above giving myself a shameless plug, come by and see me at my session: A DB2 DBA’s Guide to pureScale (session G05). Click here for more details on the conference. Also, check out Melanie Stopfer’s article on IDUG.  Hope to see you there!

Improve IT Productivity with IBM PureData System for Transactions

Kelly Schlamb, DB2 pureScale and PureData Systems Specialist, IBM

I’m a command line kind of guy, always have been. When I’m loading a presentation or a spreadsheet on my laptop, I don’t open the application or the file explorer and work my way through it to find the file in question and double click the icon to open it. Instead, I open a command line window (one of the few icons on my desktop), navigate to the directory I know where the file is (or will do a command line file search to find it) and I’ll execute/open the file directly from there. When up in front of a crowd, I can see the occasional look of wonder at that, and while I’d like to think it’s them thinking “wow, he’s really going deep there… very impressive skills”, in reality it’s probably more like “what is this caveman thinking… doesn’t he know there are easier, more intuitive ways of accomplishing that?!?”

The same goes for managing and monitoring the systems I've been responsible for in the past. Where possible, I've used command line interfaces, I've written scripts, and I've visually pored through raw data to investigate problems. But inevitably I'd end up doing something wrong, like miss a step, do something out of order, or miss some important output – leaving things not working or not performing as expected. Over the years, I've considered that part of the fun and challenge of the job. How do I fix this problem? But nowadays, I don't find it so fun. In fact, I find it extremely frustrating. Things have gotten more complex and there are more demands on my time. I have much more important things to do than figure out why the latest piece of software isn't interacting with the hardware or other software on my system in the way it is supposed to. When I try to do things on my own now, any problem is immediately met with an "argh!" followed by a Google search, hoping to find others who are trying to do what I'm doing and have a solution for it.

When I look at enterprise-class systems today, there’s just no way that some of the old techniques of implementation, configuration, tuning, and maintenance are going to be effective. Systems are getting larger and more complex. Can anybody tell me that they enjoy installing fix packs from a command line or ensuring that all of the software levels are at exactly the right level before proceeding with an installation of some modern piece of software (or multiple pieces that all need to work together, which is fairly typical today)? Or feel extremely confident in getting it all right? And you’ve all heard about the demands placed on IT today by “Big Data”. Most DBAs, system administrators, and other IT staff are just struggling to keep the current systems functioning, not able to give much thought to implementing new projects to handle the onslaught of all this new information. The thought of bringing a new application and database up, especially one that requires high availability and/or scalability, is pretty daunting. As is the work to grow out such a system when more demands are placed on it.

It’s for these reasons and others that IBM introduced PureSystems. Specifically, I’d like to talk here about IBM PureData System for Transactions. It’s an Expert Integrated System that is designed to ensure that the database environment is highly available, scalable, and flexible to meet today’s and tomorrow’s online transaction processing demands. These systems are a complete package and they include the hardware, storage, networking, operating system, database management software, cluster management software, and the tools. It is all pre-integrated, pre-configured, and pre-tested. If you’ve ever tried to manually stand up a new system, including all of the networking stuff that goes into a clustered database environment, you’ll greatly appreciate the simplicity that this brings.

The system is also optimized for transaction processing workloads, having been built to capture and automate what experts do when deploying, managing, monitoring, and maintaining these types of systems. System administration and maintenance is all done through an integrated systems console, which simplifies a lot of the operational work that system administrators and database administrators need to do on a day-to-day basis. What? Didn’t I just say above that I don’t like GUIs? No, I didn’t quite say that. Yeah, I still like those opportunities for hands-on, low-level interactions with a system, but it’s hard not to appreciate something that is going to streamline everything I need to do to manage a system and at the same time keep my “argh” moments down to a minimum. The fact that I can deploy a DB2 pureScale cluster within the system in about an hour and deploy a database in minutes (which, by the way, also automatically sets it up for performance monitoring) with just a few clicks is enough to make me love my mouse.

IBM has recently released some white papers and solution briefs around this system and a couple of them talk to these same points that I mentioned above. To see how the system can improve your productivity and efficiency, allowing your organization to focus on the more important matters at hand, I suggest you give them a read:

Improve IT productivity with IBM PureData System for Transactions solution brief
Four strategies to improve IT staff productivity white paper

The four strategies as described in these papers, that talk to the capabilities of PureData System for Transactions, are:

  • Simplify and accelerate deployment of high availability clusters and databases
  • Streamline systems management
  • Reduce maintenance time and risk
  • Scale capacity without incurring downtime

I suspect that I won’t be changing my command line and management/maintenance habits on my laptop and PCs any time soon, but when it comes to this system, I’m very happy to come out of my cave.

Disaster? What Disaster?

Bill Cole – Competitive Sales Specialist, Information Management, IBM

In a previous post, I wrote about our System/370 dangling from a crane. It was a simpler time, when loss of computing wasn't the business-collapsing event it can be today. If you believe one of the adverts regarding disaster recovery, a significant number of businesses don't recover. That's a scary thought for any of us who are responsible for building that capability. In case you're wondering, this isn't an academic exercise for me. In fact, you may be using a navigation service or buying gifts online through one of the systems that I architected and implemented.

From a database perspective, 24 x 7 x forever is an expensive and technically challenging proposition.  Whatever your reasons or choices, we’re simply talking about an insurance policy that’s written in hardware, software and processes rather than bits of paper.  In one of my stints as a Production DBA, the CEO would walk by and ask about my systems.  I told him they were still up so we were still in business.  He understood the logic and wasn’t comforted.  Neither was I since I was the entire DBA staff!

Let’s start with some basics.  You’re ensuring that your database survives intact and the application can continue to function, even if it’s in a degraded fashion.  I’ve had customers with exact replicas of their Production environment and others with a smaller version just to keep doing through the disaster.  It depends on how much you want to spend on that insurance policy.  If your business is really global and/or your customers & partners expect ubiquitous access, then your choice is made.

And you have to commit to testing.  It’s far too late to find a hole when the hurricane comes through and your machine is dangling from a crane.  One of my clients actually fails over, conducts business for a weekend and then fails back.  It’s not extreme since they’re out of business if the system ever fully fails.  They survived Hurricane Sandy because they were ready and knew how to fail over and keep going.

You really have three choices.  HADR, QRep and CDC.  CDC??  Yup.  CDC replicates changes from one database to another and that seems to be what we’re talking about, right?

DB2 with HADR (High Availability Disaster Recovery) is the simple choice. It works with pureScale 10.5, too. You can even tune the time delay so you have some idea of how many transactions might be in flight. The application should see an error and recover nicely. That's the theory, anyway. Failover and failback are supported. So you're good to go, as we say in NASCAR country. If you're using pureScale and HADR on a Power system, you've pretty much prepared for anything and everything from a database perspective. QRep is the likely variation, without some of the neat tuning knobs HADR brings.
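
On that "see an error and recover" point, here is a minimal sketch of the retry pattern an application might use with the IBM JDBC driver, assuming automatic client reroute is configured. The -4498 check reflects the driver's documented "connection re-established" error; the table and SQL are made up:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class RetryOnFailover {
    // SQLCODE -4498: the IBM JDBC driver re-established the connection
    // after a failover; the transaction was rolled back and must be retried.
    private static final int REROUTED = -4498;

    public static void debitWithRetry(Connection con) throws SQLException {
        con.setAutoCommit(false);
        for (int attempt = 1; attempt <= 2; attempt++) {
            try (Statement st = con.createStatement()) {
                st.executeUpdate(
                        "UPDATE accounts SET balance = balance - 100 WHERE id = 1");
                con.commit();
                return; // success
            } catch (SQLException e) {
                if (e.getErrorCode() == REROUTED && attempt == 1) {
                    continue; // connection is alive again; replay the transaction
                }
                throw e; // anything else is a real error
            }
        }
    }
}
```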

The two biggest issues are licensing and network costs, it seems to me.  Well, it’s expensive to have a pipe large enough to handle the volume of data in a large production environment.  Licensing isn’t an issue if you’re using PureData for Transactions (PDTx) since the relevant licenses (database, pureScale and HADR) are included.  Changes the whole debate.  You’ve got the first part of the insurance policy all wrapped up and paid for.

Choice two-A: Monthly or weekly cold backups and daily incrementals.  Or incrementals more often depending on your recovery options.  I’ve accomplished this one by having a process that watches for log file completions and then FTPs the completed log file to another system for backing up.  Or you can write a script that does a remote copy of the log files directly (less chance of random corruption in my experience) to the backup system.
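
As a sketch of what such a watcher might look like today (in Java NIO rather than a shell script), the paths below are hypothetical; pointing the watcher at a disk archive directory (the LOGARCHMETH1 target) ensures that only completed log files ever appear there:

```java
import java.nio.file.*;

public class LogShipper {
    public static void main(String[] args) throws Exception {
        // Hypothetical paths: DB2's disk archive target and a mount
        // (or staging area) that belongs to the backup system.
        Path archiveDir = Paths.get("/db2/archlogs");
        Path backupDir = Paths.get("/mnt/backup/logs");

        WatchService ws = FileSystems.getDefault().newWatchService();
        archiveDir.register(ws, StandardWatchEventKinds.ENTRY_CREATE);

        while (true) {
            WatchKey key = ws.take(); // block until a new file shows up
            for (WatchEvent<?> ev : key.pollEvents()) {
                Path name = (Path) ev.context();
                // DB2 log files are named S0000000.LOG, S0000001.LOG, ...
                if (name.toString().matches("S\\d{7}\\.LOG")) {
                    Files.copy(archiveDir.resolve(name), backupDir.resolve(name),
                            StandardCopyOption.REPLACE_EXISTING);
                }
            }
            key.reset();
        }
    }
}
```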

Choice two-B: Same as above but shipping the logs to a database that is ingesting the logs in recovery mode.  The variation would be to use Change Data Capture to accomplish this.  And there’s lots more CDC could do for you besides this chore.

Choice three: Write every transaction to a log so you can replay it for recovery.  Hmmm.  How do you sell that one to the application developers?  And won’t there be a performance hit?

There’s a fourth choice.  Read on!

Here’s a short synopsis of IBM’s products:

QREP – replicate transactions to a remote database using MQ to transmit transaction messages reliably.  Think of your backups going through the reliable messaging that MQ provides rather than taking a chance with ftp or simply losing some part of a transmission.

CDC (Change Data Capture) – replicates transactions to a remote database using a proprietary TCP/IP messaging.  More on CDC in another installment.  Another useful option within CDC is to build files for DataStage to use for reloading the database.  That seems a pretty interesting option.

HADR – replicate transactions to a remote database. HADR can be tuned to prevent loss of any transactions (which imposes an overhead on performance, of course). You can choose to risk losing a few transactions by configuring async replication, and you can tune how long that window is. One of the things you can also do with your HADR standby database is reporting. I'm a big fan of this option since I hate the thought of servers simply waiting for something to fail without providing any real business value.

One of the really esoteric HA/DR configurations I’ve seen is cascaded backup databases (backups of backups).   I’ve seen this done with HADR and log shipping.  Or HADR to HADR.  It works.  I’d consider using different methods such as HADR and then log shipping.

None of the products above require any changes to application code, so any and all applications should work without worrying you or the developers. So you can stay focused on adding value to the business rather than simply playing with the technology (no matter how much fun that may be).

Full-out paranoia: Combine several of the above into your strategy.  I had three different methods of backup for a very large online auction house (no, not the one you’re thinking about).  A daily full warm backup.  Remote copies of the logs. Log shipping with recovery.  One of the backups had to be right!

Trick question: Which method did we miss?  The cloud, of course.  Why not put your DR site in a Public cloud?  All of these methods could be implemented in a cloud.  I suggested Public because you don’t have to manage it and it’s not subject to the vicissitudes of your environment or budget.  You can scale it to meet your requirements and pay only for what you use.  It’s a valid variation on our theme.

Finally, I keep talking about testing your recovery methods because it seems to be the weak spot that every system has.  It’s messy and time-consuming to no real business purpose.  Not to mention fraught with the possibility of a major malfunction.  We all have stories in our pocket about the backup that was never tested or the plug being pulled.  I’ve got lots of them, too.  I designed and then managed a data center.  I had a diesel generator that would kick in whenever the building lost power.  I tested it weekly.  It seemed the rational thing to do having spent all that money on it.  I was sitting in my office a few weeks later and the building lights went out.  Someone doing construction had cut the power.  I ran to my computer room (twelve feet) and found everything humming along.  Of course, no one in the building knew that because their PCs were dead.  The CEO knocked on the door after walking down the stairs.  My data center was still up.  We were still in business.  Smiles all around.  Ah, paranoia pays off!

Learn more about the new version of DB2 for LUW

Read about the PureData System for Transactions, which is optimized exclusively for transactional workloads.

Follow Bill Cole on Twitter: @billcole_ibm

New to IBM PureData System for Transactions: DB2 10.5 and HADR

Kelly Schlamb, DB2 pureScale and PureData Systems Specialist, IBM

In a comment on my previous blog post, somebody recently asked about when the PureData System for Transactions was going to be updated to include DB2 10.5, the latest and greatest version of DB2, which was released on June 14th, 2013. At the time, I hinted that it would be coming soon but I couldn't share any details. The curtain can now be lifted.

PureData System for Transactions Fix pack 3 was made available for download on July 31st (and any new deployments of the system will automatically have this level as well).  This fix pack adds DB2 10.5 to the software stack of the system.  So, when you go to deploy a cluster you can now choose to deploy either DB2 10.1 or 10.5, depending on your needs.

As with every major release of DB2, this new version is jam-packed with countless features and enhancements. There's a lot of great information out there about 10.5 if you're interested in reading more, including some entries from fellow blogger Bill Cole and the What's New section in the Information Center.

While many things will be of general interest to people – such as performance enhancements, new SQL capabilities, and further Oracle compatibility updates – I did want to specifically call out something that will be of great interest to those interested in the DB2 pureScale feature and PureData System for Transactions (which has pureScale built into it).  This is the addition of HADR support.  HADR is DB2’s High Availability / Disaster Recovery feature.  With HADR, changes to a primary database are replicated via transaction log shipping to one or more standbys, allowing for both local high availability and longer distance disaster recovery.  There are many reasons why DB2 users have embraced HADR, but the one I hear all the time is that it’s built right into DB2 which makes it very easy to setup and maintain.

In the case of a pureScale environment, you're already getting the highest levels of availability. For instance, with pureScale in the PureData System for Transactions box, a compute node failure wouldn't result in the database going offline; other compute nodes associated with the cluster would still be available to process database transactions. So, with HA already accounted for, the value of using HADR in this type of environment is in setting up a disaster recovery system. Now, you do have other options for disaster recovery of PureData System for Transactions, such as Q Replication, and there is functionality within QRep that might make it a more suitable choice for a particular database (such as a need to replicate only a subset of the data in the database). But with HADR you have another option, and those that use it and love it today in their traditional DB2 environments can now use it here as well. For example, you can have a PureData System for Transactions at your primary site and another one at your disaster recovery site. HADR is enabled at the database level, and for one or more databases on the primary system you can create a standby version at the other site. If there is ever a need to fail over to that standby site, it's a relatively simple process to perform the takeover of the databases on the standby.
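
On the application side, the standby site can be registered with the driver as an alternate server so that connections follow a takeover automatically. A small sketch of that client configuration, with placeholder hostnames and port:

```java
import com.ibm.db2.jcc.DB2SimpleDataSource;

public class HadrAwareDataSource {
    public static DB2SimpleDataSource create() {
        DB2SimpleDataSource ds = new DB2SimpleDataSource();
        ds.setServerName("primary-site.example.com"); // HADR primary (placeholder)
        ds.setPortNumber(50000);
        ds.setDatabaseName("MYDB");
        ds.setDriverType(4);

        // If the primary becomes unreachable, the driver retries against
        // the standby named here (automatic client reroute).
        ds.setClientRerouteAlternateServerName("dr-site.example.com");
        ds.setClientRerouteAlternatePortNumber("50000");
        return ds;
    }
}
```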

Rather than getting into the specifics about how this works, how to configure the environment, etc. I’m going to take the easy route and point you to this comprehensive document on this topic.

Just learning about the product? Check out the PureData for Transactions page.

Complexity Isn’t Necessary, or Occam was Right

Bill Cole – Competitive Sales Specialist, Information Management, IBM

I have spent a lifetime in IT hiding the complexity of environments from my users. When they're sitting comfortably in front of their screens, they don't need to know what we've done to build that environment, any more than they need to know just how their car works. In fact, it's in their best interest that they don't.

I feel the same way about clustering systems. Over the years, there have been numerous forms of clustering available to us. Think about the System/360 and all its successors, including the System z. They're all just clusters of processors doing specialized tasks. It's that model we should emulate when we talk about clustering distributed systems. While it can seem to be magic, it's not. It may be very complex at some level, but we need not expose that complexity to the world.

And that’s why I think DB2 pureScale is the best clustering solution available.  It is simple, easy to use and you can understand it without spending two weeks with the development team.  No special hardware considerations.  No asking the experts for the best software, hardware and network combinations.  No additional complex software.  Just create your database with the pureScale option and you are in business.

For all that, you can have a pureScale cluster that's exactly one system wide. Why, you ask? Easy. You're not paying any penalty (or dollars) for the option, and it gives you that comfy cushion of knowing you can add a second or third node at any time for patching, platform maintenance, or to handle some ad hoc load. Just start up that second node, make sure you're in sync, and you're ready to go. Simple! Makes you look like a genius to management with all that forethought. It's not a choice of active-active or active-passive; it's a choice you can make when you need to. You don't worry about the directions to Dallas unless you're going to Dallas, right? But you've got that map app handy just in case. Same thing here.

I spent more than a decade building and tuning clusters with competing software. That was a struggle. The complexity was never hidden. Instead, it was – and is – viewed as a litmus test of your technical acumen. Can't understand it? That's your fault! Only the cognoscenti need apply. It requires that you understand not only the database, but how the operating system deals with the database, network, and disk, how the disk subsystem works, and how the network is configured and what that means to the overall environment. Phew! And the required system setup is now considered a gigantic security risk, without an answer.

And then there was tuning and troubleshooting. Let me tell you, it wasn't easy. Those activities will turn your hair white. And the answers most often lie with the applications using the cluster. Change the application to be cluster-aware. Really?! Then the application isn't actually portable. Why did a node crash? It was a network or disk blip within one of the nodes. Really? How do you know? I just know. Trust me… See what I mean? (I've had those very conversations. It's uncomfortable for everyone.)

pureScale doesn’t require any of that magic.  It’s built on the proven principles we’ve used for years on System Z.  Even Larry Ellison thinks it’s cool technology.  Interesting endorsement, eh?   pureScale simply requires that you create the database with the proper option and DB2 takes care of the rest.  There’s no magical heartbeat to worry about.  No disk communications issues.  Better yet, there’s no additional software to install, manage and patch.  It’s just DB2.  What could be easier?

What about performance tuning?  DB2 pureScale for LUW scales almost linearly.  First, there are no application changes required.  If the application runs well on a single node, it will probably run just as well on multiple nodes.  Unlike other systems, pureScale is aware that it’s running in a cluster environment so it makes intelligent choices about optimizing queries and using buffer pages.  The autonomics built into the database monitor the activity and make adjustments on the fly.  Better adjustments than we’ll ever be able to make, too.

What about node failures?  Clusters aren’t immune to nodes failing – or being failed.  It’s how the environment prepares for and handles the failure that counts.  pureScale won’t lose transactions due to most types of node failures.  It won’t take hours to re-establish connections and recover the database state.  It knows the state of transactions and locks and blocks globally so the recovery is within a few heartbeats.  The world doesn’t stop while the system recovers.

So what does Occam have to do with this? Remember what Occam's razor stands for (I'm paraphrasing here): the simplest solution is the best. I like that. It appeals to the couch potato in me. The geek in me argues about it. After four decades in IT, I've come to treasure simplicity where I can find it. Or make it. Or hide it. Being the "little man behind the curtain" isn't comfortable. DB2 pureScale for LUW gives you Occam's simplicity. After all, you've got an enterprise to keep running. Let the database manage itself so you can make those contributions to your enterprise. It's much more rewarding than reading through countless screens of pointless performance stats.

Follow Bill Cole on Twitter: @billcole_ibm