OEID 3.0 First Look - Update/Delete Data Improvements

Data Modeling, Endeca, OEID, Oracle Endeca 3.0 Getting Started, Tips And Tricks

For almost a decade, the core Endeca MDEX engine that underpins Oracle Endeca Information Discovery (OEID) has supported one-time indexing (often referred to as a Baseline Update) as well as incremental updates (often referred to as partials). Through all of the incarnations of this functionality, from "partial update pipelines" to "continuous query", there was one common limitation: update operations always had to be expressed on a "per-record" basis.

If you come from a SQL/RDBMS background, this was a huge limitation and forced a conceptual change in the way you think about data. Obviously, Endeca is not (and never was) a relational system, but lacking the freedom that SQL provides to update data whenever and wherever you please was often a pretty big limitation, especially at scale. Building an index nightly for 100,000 E-Commerce products is no big deal. Running a daily process to feed 1 million updated records into a 30 million record Endeca Server instance just so that a set of warranty claims could be "aged" from current month to prior month is something completely different.

Thankfully, with the release of the latest set of components for the ETL layer of OEID (called OEID Integrator), huge changes have been made to the interactions available for modifying an Endeca Server instance (now called a "Data Domain").  If you've longed for a "SQL-style experience" where records can be updated or deleted from a data store by almost any criteria imaginable, OEID Integrator v3.0 delivers.
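
To make the contrast with per-record updates concrete, here is a minimal sketch in plain Python of what a criteria-based update or delete looks like, using the warranty claim "aging" case from above. This is not the OEID Integrator API: the helper names (`update_where`, `delete_where`) and the record fields (`claim_id`, `period`, `status`) are hypothetical and exist only for illustration.

```python
def update_where(records, predicate, changes):
    """Apply `changes` to every record matching `predicate`; return how many matched."""
    touched = 0
    for record in records:
        if predicate(record):
            record.update(changes)
            touched += 1
    return touched


def delete_where(records, predicate):
    """Remove every record matching `predicate` in place; return how many were removed."""
    kept = [r for r in records if not predicate(r)]
    removed = len(records) - len(kept)
    records[:] = kept
    return removed


# Hypothetical warranty claims; field names are assumptions for this sketch.
claims = [
    {"claim_id": 1, "period": "current_month", "status": "open"},
    {"claim_id": 2, "period": "current_month", "status": "closed"},
    {"claim_id": 3, "period": "prior_month", "status": "open"},
    {"claim_id": 4, "period": "older", "status": "void"},
]

# One statement "ages" every matching claim, instead of re-feeding each record.
aged = update_where(
    claims,
    lambda r: r["period"] == "current_month",
    {"period": "prior_month"},
)

# Likewise, one statement deletes by criteria rather than by record key.
removed = delete_where(claims, lambda r: r["status"] == "void")

print(aged, removed, len(claims))
```

The point of the sketch is the shape of the operation: the caller states a condition and a change, and never has to enumerate or resend the affected records.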

OEID Incremental Updates

BigData, Data Modeling, Endeca, Integration, OEID, Tips And Tricks

A fairly common approach...

More often than not, when pulling data from a database into OEID, we need to employ incremental updates. To introduce incremental updates, we need a way to identify which records have been added, updated or deleted since our last load. This change identification is commonly referred to as change data capture, or CDC. There is no single way to accomplish CDC, and often the best approach is dictated by the mechanisms in place in the source system. Usually the database we're pulling from isn't leveraging any explicit CDC mechanism.
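
When the source offers no explicit CDC mechanism, one generic fallback is to compare the current extract against fingerprints saved from the previous load. The sketch below shows that idea in plain Python; it is an assumption-laden illustration, not necessarily the approach this post goes on to describe, and the function names and sample fields are hypothetical.

```python
import hashlib


def row_fingerprint(row):
    """Hash a row's fields in a stable (sorted) order so equal rows hash equally."""
    payload = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def detect_changes(previous_fp, current_rows):
    """Classify record keys as added, updated or deleted since the last load.

    previous_fp:  dict of record key -> fingerprint saved after the last load
    current_rows: dict of record key -> row dict from the current extract
    Returns (added, updated, deleted, current_fp); persist current_fp for next time.
    """
    current_fp = {key: row_fingerprint(row) for key, row in current_rows.items()}
    added = sorted(k for k in current_fp if k not in previous_fp)
    deleted = sorted(k for k in previous_fp if k not in current_fp)
    updated = sorted(
        k for k in current_fp if k in previous_fp and current_fp[k] != previous_fp[k]
    )
    return added, updated, deleted, current_fp


# Hypothetical extracts from two consecutive loads.
last_load = {
    "C-1": {"status": "open", "amount": 100},
    "C-2": {"status": "open", "amount": 250},
    "C-3": {"status": "closed", "amount": 75},
}
this_load = {
    "C-1": {"status": "open", "amount": 100},
    "C-2": {"status": "closed", "amount": 250},
    "C-4": {"status": "open", "amount": 40},
}

previous_fp = {key: row_fingerprint(row) for key, row in last_load.items()}
added, updated, deleted, new_fp = detect_changes(previous_fp, this_load)
print(added, updated, deleted)  # ['C-4'] ['C-2'] ['C-3']
```

The trade-off is that every load still reads the full source table; only the write into the target becomes incremental. A last-modified timestamp or trigger-populated audit table, where the source system has one, avoids that full scan.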

Pivoting in Endeca

BigData, Data Modeling, Endeca, OEID, Tips And Tricks

The impetus for Ranzal's SmartStateManager

The prologue.

Pivoting across entities in your organization's data is a central feature of any data discovery application. For example, understanding which parts in your organization's supply chain have the highest number of quality issues, and then pivoting to the discrete list of suppliers that provide those parts, is a powerful, yet expected, data discovery capability.

About Edgewater Ranzal

Edgewater Ranzal is an integrated Business Analytics solution provider deeply rooted in Enterprise Performance Management (EPM). Coupled with our Business Intelligence (BI) and Big Data (BD) expertise, we provide holistic solutions that help organizations define, measure, and innovate their business, provide a clear vision, and drive business value.
