Archive for the ‘data virtualization’ Category

in memory computing

February 14, 2012

Approximately two years back I wrote a post on Enterprise Data Fabric technology. The aim of a data grid, or "in memory data store", is to eliminate the movement of data in and out of slower disk storage during processing. Instead, the data is kept in "pooled main memory" while it is being processed.

To get past the physical limits on the amount of main memory in a single machine, data grid technologies create a pooled memory cluster, with the data distributed over multiple nodes connected by a high-bandwidth network.
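
As a rough sketch of the idea (not any particular vendor's API, and with node names invented for the example), a grid spreads keys across node-local stores so the pool behaves like one large memory space:

import hashlib

class PooledMemoryGrid:
    """Toy in-memory data grid: keys are spread across node-local dicts."""

    def __init__(self, nodes):
        self.nodes = nodes                      # e.g. ["node-a", "node-b", "node-c"]
        self.stores = {n: {} for n in nodes}    # each node's local main-memory store

    def _owner(self, key):
        # Hash the key and map it onto one of the nodes (simple modulo placement;
        # real grids use consistent hashing plus replication for fail-over).
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[digest % len(self.nodes)]

    def put(self, key, value):
        self.stores[self._owner(key)][key] = value

    def get(self, key):
        return self.stores[self._owner(key)].get(key)

grid = PooledMemoryGrid(["node-a", "node-b", "node-c"])
grid.put("customer:42", {"name": "Acme Corp", "region": "EMEA"})
print(grid.get("customer:42"))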

With SAP bringing out HANA, an in-memory database that offers both a traditional row store and a column store (that is, data laid out by rows or by columns) within one in-memory technology, and Oracle bringing out the Exalytics appliance, in-memory computing is getting more attention.
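
To illustrate the difference between the two layouts (the table and figures below are made up for the example), the same records can be held row-wise for record-at-a-time access or column-wise for attribute-at-a-time scans:

# The same three orders held in a row layout and a column layout.
rows = [
    {"order_id": 1, "customer": "A", "amount": 100.0},
    {"order_id": 2, "customer": "B", "amount": 250.0},
    {"order_id": 3, "customer": "A", "amount": 75.0},
]

columns = {
    "order_id": [1, 2, 3],
    "customer": ["A", "B", "A"],
    "amount":   [100.0, 250.0, 75.0],
}

# OLTP-style access: fetch one whole record -- the row layout is natural.
order_2 = next(r for r in rows if r["order_id"] == 2)

# OLAP-style access: aggregate a single attribute -- the column layout lets
# the engine scan one contiguous array instead of touching every record.
total_amount = sum(columns["amount"])
print(order_2, total_amount)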

So, the claim is that in-memory technology will boost performance many times over. But the truth is that it can only remove the time taken to move data from disk into main memory. If a query processes the data using the wrong access method, then even with all the data in a memory store the processing will still take just as long to produce the answer!
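
A small, purely illustrative example of that point: with every record already in memory, a query that scans the whole data set is still far slower than one that can use a suitable access structure such as a hash index.

import time

# One million in-memory records, keyed by id (figures are illustrative only).
records = [{"id": i, "value": i * 2} for i in range(1_000_000)]
index = {r["id"]: r for r in records}          # hash index built once

start = time.perf_counter()
hit_scan = next(r for r in records if r["id"] == 999_999)   # full scan in memory
scan_time = time.perf_counter() - start

start = time.perf_counter()
hit_index = index[999_999]                                   # direct lookup in memory
index_time = time.perf_counter() - start

print(f"scan: {scan_time:.4f}s, index: {index_time:.6f}s")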

In-memory computing needs applications to be redesigned to take advantage of the technology for better information processing. OLTP workloads will certainly see a performance improvement from memory caching, but the consistency of the data then needs to be managed by the application, typically by moving to an event-based architecture.
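
One way to picture that: writes land in the in-memory store first and a change event is published so the slower system of record can catch up. This is only a sketch of the pattern, not any product's API; the queue and store names are invented for the example.

import queue

events = queue.Queue()          # stand-in for an event bus / message broker
cache = {}                      # in-memory store serving the OLTP reads and writes
database = {}                   # slower system of record, updated asynchronously

def write(key, value):
    cache[key] = value                           # fast in-memory write
    events.put(("upsert", key, value))           # publish a change event

def apply_events():
    # A background consumer would normally do this; here we drain synchronously.
    while not events.empty():
        _, key, value = events.get()
        database[key] = value                    # bring the durable store in line

write("account:7", {"balance": 120})
apply_events()
print(cache["account:7"], database["account:7"])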

OLAP and analytical workloads will also see performance gains from memory-based column stores, provided the underlying data structures are designed to suit the processing requirements.

Overall, in-memory computing is promising, but without the right design to exploit the new technology, old systems will not get a performance boost simply by moving their data store into main memory.

Let us wait and see how the technology shapes up in the future…

data consolidation (ETL) and data federation (EII)

June 16, 2011

Operational IT systems support the day-to-day business operations and enable the capture, validation, storage and presentation of transactional data during the normal running of those operations. They hold the latest view of the organization's operational state.

Traditionally, data from the various operational systems is extracted, transformed and loaded into a central warehouse for historical trending and analytical purposes. This ETL process needs separate IT infrastructure to hold the data, and it introduces a time lag before information in the OLTP systems becomes available in the central data warehouse.
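
The classic pipeline in miniature (the source and target below are plain in-memory lists standing in for the operational database and the warehouse, and the records are made up):

# Extract: pull today's transactions from the operational system.
operational_orders = [
    {"id": 1, "amount": "100.50", "region": "emea"},
    {"id": 2, "amount": "75.00",  "region": "apac"},
]

# Transform: cleanse types and conform codes to the warehouse standard.
transformed = [
    {"order_id": o["id"], "amount": float(o["amount"]), "region": o["region"].upper()}
    for o in operational_orders
]

# Load: append into the warehouse fact table (a list standing in for it here).
warehouse_fact_orders = []
warehouse_fact_orders.extend(transformed)
print(warehouse_fact_orders)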

When the costs and resources required to consolidate data in the traditional way are not justifiable, for example in the wake of mergers and acquisitions, a different mechanism of data integration is needed. A different way of looking at the problem is to provide a semantic layer that gives access to data across heterogeneous sources for analytical purposes. This approach is called "Data Federation", "Data Virtualization" or EII – Enterprise Information Integration.
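
A toy sketch of the idea: the "semantic layer" below is just a function that resolves one logical view by querying two separate sources at request time, instead of copying their data into a warehouse first. The source structures and names are invented for the illustration.

# Two heterogeneous sources, queried in place at request time.
crm_system = {"C1": {"name": "Acme"}, "C2": {"name": "Globex"}}       # e.g. a CRM
billing_system = [("C1", 1200.0), ("C2", 430.0), ("C1", 85.5)]        # e.g. billing records

def customer_revenue_view():
    """Logical (virtual) view: joins the sources on the fly, no data copied."""
    totals = {}
    for customer_id, amount in billing_system:
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return [
        {"customer": crm_system[cid]["name"], "revenue": total}
        for cid, total in totals.items()
    ]

print(customer_revenue_view())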

The key advantages of EII are quick delivery and lower costs. The key disadvantages are the performance of the solution and its dependence on the source systems.

A good use case for data virtualization, in my view, is consolidating the separate enterprise data warehouses that result from mergers and acquisitions.

Traditional ETL and data warehouse technology vendors are coming up with data federation tools. Informatica Data Services follows a consolidated data integration philosophy, whereas Business Objects Data Federator uses virtual tables in the BO universes to provide the same functionality. Composite Information Server is an independent technology provider in this area.

Key considerations in selecting data federation and associated technologies are:
1. native access to the heterogeneous source systems
2. capabilities for access method optimization
3. caching capabilities of the federation platform (a sketch of this follows the list)
4. metadata discovery capabilities across the various sources
5. ease of development
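
On point 3, a simple picture of what result caching in the federation layer buys: repeated requests for the same virtual view can be answered from memory instead of hitting the source systems every time. The time-to-live policy shown is just one illustrative option, and the class and view names are invented for the example.

import time

class FederationCache:
    """Caches federated query results for a fixed time-to-live (TTL)."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._entries = {}          # view name -> (timestamp, result)

    def fetch(self, view_name, run_query):
        entry = self._entries.get(view_name)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]                      # fresh enough: skip the sources
        result = run_query()                     # otherwise hit the source systems
        self._entries[view_name] = (time.time(), result)
        return result

cache = FederationCache(ttl_seconds=300)
result = cache.fetch("customer_revenue", lambda: [{"customer": "Acme", "revenue": 1285.5}])
print(result)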

A carefully chosen hybrid approach, combining data consolidation and data federation, is what a successful enterprise needs in the modern world.