Just prior to Oracle OpenWorld 2013, Oracle announced version 12c of its database. Among the many architectural and technical advances of this release, a new feature was introduced, known as the ‘In-Memory Option’. Oracle was not the first to offer such a capability – other vendors got there much earlier – but Oracle’s In-Memory option offers a considerably stronger feature set.
What is an “In-Memory Database” anyway?
Simply put, this option allows the entire database, or a part of it, to be loaded into memory and run from there. This is the latest trend in RDBMS technology. As the name suggests, the database (in full or in part) resides in memory, and DML operations happen in-memory. Since memory is blazingly fast compared to disk, one gets far greater throughput – speed-ups of 10x to 100x have been claimed. The basic idea is to avoid (or rather, drastically reduce) disk input/output and achieve better results through in-memory computing.
Row-store vs. Column-store
Traditional databases store data in row format. A database row is a collection of columns that hold the various attributes of a record, and these rows are stored in tables. Such a database is well suited to OLTP-style operations (e.g. ERP), in which users insert, update or delete rows in the database tables. Rows are stored on disk in blocks; one block can hold one or many rows. This organization presents no problem for typical day-to-day operations.
The problems begin with reporting and dashboarding. Users inevitably need reports and dashboards for their daily functions, essentially in the form of analytical queries. An analytical query fetches only a few columns, but it fetches them from many tables and traverses thousands – sometimes millions – of rows. This results in a huge number of disk seeks, which slow the operation down. It cannot be avoided: a typical OLTP database has always served, and will continue to serve, two purposes – standard OLTP as well as analytical reporting.
To solve this analytics problem, the column-oriented database was designed and developed (the fact/dimension model used in data warehousing). A column-oriented database stores data in column format. Imagine it as a transpose of a typical row-based table – columns become rows and rows become columns (of course, the implementation is not as trivial as it sounds). Querying is very fast in this format, and it is how data warehouses are organized and built.
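To make the contrast concrete, consider a hypothetical `sales` table (the table and column names here are illustrative only, not from any particular schema). A typical analytical query touches just a couple of columns:

```sql
-- Hypothetical analytical query: only two columns are needed,
-- but a row store must read entire rows (every column) block by
-- block to retrieve them.
SELECT region, SUM(amount)
FROM   sales
GROUP  BY region;
```

In a row store, every block holds complete rows, so this scan drags in all the other columns as well. A column store reads only the `region` and `amount` columns, which is why the column format is so much faster for analytics.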
But data warehouses are expensive to build and maintain. They require periodic refreshing, since the freshly updated data still resides in the OLTP database, and this refresh – even when incremental – takes a significant amount of time. Finally, report cubes have to be created on top of the data warehouse, and any new report may require building a fresh cube.
Oracle’s In-Memory Database Option
As stated earlier, many database vendors came up with the in-memory database much before Oracle. But Oracle had three major goals in mind:
- 100x Faster Queries, Real-Time Analytics, and Instantaneous Queries on OLTP DB or Data Warehouse: This was the primary goal. Other in-memory databases offered fast queries while sacrificing OLTP performance.
- Insert Rows 3x to 4x Faster: This was the first major challenge – how could OLTP be sped up? Traditional wisdom held that it was one way or the other: faster OLTP with the row format, or faster analytics with the column format, but not both.
- Transparent: No application changes – minutes to implement, or ‘just throw the switch’. This was the second major challenge. It was decided that users could not be asked to redesign and rewrite their huge inventory of applications just to gain performance – no one would ‘buy’ into such a proposition.
Oracle solved the problem by designing a dual-format database. An entire table or partition is held in memory (assuming memory is available), organized in both row format and column format. Both formats are simultaneously active and transactionally consistent. This resulted in:
- 100x Faster Analytics using in-memory column format
- 2x-3x faster OLTP using row format
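As a rough sketch of how the option is enabled – the parameter value and table names below are illustrative, while `INMEMORY_SIZE` and the `INMEMORY` clause are the documented 12c syntax:

```sql
-- Reserve a portion of the SGA for the in-memory column store
-- (takes effect after a restart; 16G is only an example value).
ALTER SYSTEM SET inmemory_size = 16G SCOPE = SPFILE;

-- Mark a (hypothetical) table for in-memory population,
-- with a priority that populates it eagerly after startup
ALTER TABLE sales INMEMORY PRIORITY HIGH;

-- Exclude a table that would not benefit from the column store
ALTER TABLE audit_log NO INMEMORY;
```

This is the ‘switch’ the article refers to: the column-store copy is maintained by the database itself, so applications keep issuing exactly the same SQL as before.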
Why is this a Game Changer?
The magic is the ‘dual format’. Row store facilitates faster OLTP while column store speeds up analytics. As the database column store is in memory, analytical cubes can be created on the fly in memory.
It has been observed that a typical OLTP table carries 2 to 4 indexes required for fast OLTP, plus a whopping 12 to 25 additional indexes that exist purely for analytical processing and reporting. Database architects and administrators spend considerable effort determining which indexes are necessary and which are not. Once created, these indexes slow down OLTP and are expensive to maintain. What’s more, they serve only existing queries and reports. With the in-memory option, the only indexes needed are those that enforce referential integrity – those involved in primary- and foreign-key relationships; the analytical indexes become redundant and can be dropped, which further improves OLTP.
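Under these assumptions (the index and table names are hypothetical), the clean-up might look like the sketch below; `V$IM_SEGMENTS` is Oracle's documented view for confirming that a segment has actually been populated into the column store:

```sql
-- Confirm the table is populated in the in-memory column store
SELECT segment_name, populate_status, bytes_not_populated
FROM   v$im_segments
WHERE  segment_name = 'SALES';

-- Analytics-only indexes become redundant and can be dropped,
-- which in turn speeds up OLTP inserts, updates and deletes
DROP INDEX sales_region_ix;
DROP INDEX sales_channel_amount_ix;
```

Only after verifying population (and re-testing the affected reports) would one actually drop the analytical indexes in a production system.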
Other in-memory databases use the column structure only and hence require an extensive (and expensive) application redesign/rewrite. In the case of Oracle’s In-Memory option, it is simply a matter of ‘throwing a switch’, according to Oracle.
The ROI
The in-memory database option cannot realistically be implemented on traditional servers with few sockets/cores and limited memory. It calls for serious hardware, with plenty of CPUs and a large amount of RAM, and such hardware is expensive. The 12c In-Memory option is itself a separately licensed (paid) option. But I feel this expense justifies itself:
- The first big advantage is that no redesign is required – applications run exactly as before; backup/recovery, replication and maintenance continue unchanged; and the option is completely backward-compatible
- OLTP will run 2x to 4x faster
- Users will be able to run dashboards and reports that finish at least 100x faster. Faster output means more and better questions, faster answers, quicker resolutions and better business
- Enterprises can retire reporting databases, operational data stores (ODS) and perhaps the data warehouse as well
Conclusion
Oracle Database In-Memory speeds up both OLTP and analytic queries – the latter by orders of magnitude hitherto unheard of – and enables real-time decisions. It deploys easily under any existing application that is compatible with the Oracle Database, since no application changes are required. Over and above this, the option scales to any level. Extreme performance for both analytics and transactions lets organizations continuously optimize processes and make rapid, data-driven decisions, transforming them into Real-Time Enterprises that are extremely agile and efficient.