Combining Spark Streaming and Data Frames for Near-Real Time Log Analysis & Enrichment

A few months ago I posted an article on the blog about using Apache Spark to analyse activity on our website, joining the site activity to some reference tables for some one-off analysis. In this article I’ll be taking an initial look at Spark Streaming, a component within the overall Spark platform that allows you to ingest and process data in near real-time whilst keeping the same overall codebase as your (more...)

Taking a Closer Look at Knowledge Modules in ODI12c – Component-Style and Multi-Connect KMs

Another question that came up from the ODI12c Bootcamp Course I’m delivering for a client in London at the moment is how to choose between the different knowledge modules that come with ODI12c. What with the choice now between template-style KMs and the new component-style KMs, the new option of multi-connect KMs, and the general question around which KM you pick within a KM type when building a table mapping, I thought it’d be interesting (more...)

Using Streams with ODI12c for Oracle-to-Oracle Change Data Capture

Although Oracle GoldenGate replaced Oracle Streams a couple of years ago as the recommended data replication and change data capture technology for Oracle databases, many customers on Oracle Database 11gR2 or earlier still use Streams for Oracle-to-Oracle change data capture because it works and, unlike GoldenGate, doesn’t require any additional licensing. Oracle’s GoldenGate Statement of Direction paper from 2014 states that Streams in Oracle 11gR2 will continue to be supported but no future (more...)

Options for Enabling ODI11g+12c Standalone Agents for High-Availability (or … Why JEE Agents are the Best Option)

A few years ago we posted some articles on the blog about Oracle Data Integrator ETL restartability and resilience, and making ODI’s agents highly available. The context for those posts was a large data integration project we were doing for a client who was deploying Oracle Fusion Middleware across the entire enterprise, and they wanted us to use the new ability within ODI11g to deploy agents within WebLogic Server Managed Servers to provide true high-availability (more...)