Using Oracle GoldenGate for Trickle-Feeding RDBMS Transactions into Hive and HDFS

A few months ago I wrote a post on the blog around using Apache Flume to trickle-feed log data into HDFS and Hive, using the Rittman Mead website as the source for the log entries. Flume is a good technology to use for this type of capture requirement as it captures log entries, HTTP calls, JMS queue entries and other “event” sources easily, has a resilient architecture and integrates well with HDFS and Hive. (more...)

Analyzing Twitter Data using Datasift, MongoDB, Hive and ODI12c

Last week I posted an article on the blog around analysing Twitter data using Datasift, MongoDB and Pig, where I used the Datasift service to stream tweets about Rittman Mead into a MongoDB NoSQL database, and then queried the dataset using Pig. The context for this is the idea of a “data reservoir”, where we supplement the more traditional file and relational datasets we find in data warehouses with other data, typically machine generated, (more...)

Analyzing Twitter Data using Datasift, MongoDB and Pig

If you followed our recent postings on the updated Oracle Information Management Reference Architecture, one of the key concepts we talk about is the “data reservoir”. This is a pool of additional data that you can add to your data warehouse, typically stored on Hadoop or NoSQL databases, where you store unstructured, semi-structured or unprocessed structured data in large volume and at low cost. Adding a data reservoir gives you the ability to leverage (more...)

Taking a Look at the Oracle Database 12c In-Memory Option

The In-Memory Option for Oracle Database 12c became available a few weeks ago with the 12.1.0.2 database patchset, adding column-store and in-memory capabilities to the Oracle Database. Unlike pure in-memory databases such as Oracle TimesTen, the in-memory option adds an in-memory column-store feature to the regular row-based storage in the Oracle database, creating in-memory copies of selected row-store tables in a compressed column-based storage format, with the whole process being automatic and (more...)