From monolithic databases through distributed systems to real-time data: stakes, opportunities, use cases and technologies.
In our data-driven age, the ability to ingest all types of data in real
time is now critical for unlocking valuable and actionable insights
from data sets.
As you may already know, the IT industry is moving from monolithic
databases to distributed systems.
For those who are unfamiliar, Connectikpeople.co recalls that a monolithic database is the primary
place where people store and process their most interesting data. It is also the place
where features accumulate over time, so the database becomes more complicated
and it gets harder to add new features while still maintaining all the legacy
ones.
Distributed systems such as HDFS and MapReduce overcome these limitations by providing,
respectively, a distributed file system and a computation engine for
storing and processing data in batches.
By using HDFS, companies can now afford to collect additional data sets that are
valuable but too expensive to store in databases.
By using MapReduce,
people can generate reports and perform analytics.
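
To make the batch style of processing concrete, here is a minimal sketch of the map, shuffle and reduce steps in plain Python. It is only an in-memory illustration of the programming model, not Hadoop's actual API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_batch`) and the page-view records are assumptions made up for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit (key, value) pairs for one input record: one (page, 1) per page view."""
    _user, page = record
    yield (page, 1)


def shuffle(pairs):
    """Group all emitted values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result: the view count."""
    return key, sum(values)


def run_batch(records):
    """Run the whole job over a complete, already-stored data set (a batch)."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical data set: (user, page) records collected earlier.
page_views = [("alice", "/home"), ("bob", "/home"), ("alice", "/pricing")]
report = run_batch(page_views)
print(report)
```

Note that the job only starts once the whole data set is available, which is why this model suits reports better than real-time ingestion.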
But one problem remains: ingesting
all types of data in real time.
Apache Kafka comes into play with the following features:
- It can store high volumes of data on commodity hardware,
- It is a multi-subscription system,
- The same published data set can be consumed multiple times,
- It can deliver messages to both real-time and batch consumers at the same time without performance degradation,
- It can provide the reliability needed for mission-critical data, and more.
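
The multi-subscription idea in the list above can be sketched with a toy in-memory log in which every subscriber keeps its own read position. This is a simplified model for illustration only, not Kafka's client API; the class `ToyLog`, its methods, and the subscriber names are hypothetical.

```python
class ToyLog:
    """Append-only log where each subscriber tracks its own offset."""

    def __init__(self):
        self._records = []   # published records, never removed on read
        self._offsets = {}   # subscriber name -> next offset to read

    def publish(self, record):
        """Append a record and return its offset in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, subscriber, max_records=10):
        """Return the next records for this subscriber and advance only its offset."""
        start = self._offsets.get(subscriber, 0)
        batch = self._records[start:start + max_records]
        self._offsets[subscriber] = start + len(batch)
        return batch

    def seek(self, subscriber, offset):
        """Move a subscriber back (or forward) so it can re-read records."""
        self._offsets[subscriber] = offset


log = ToyLog()
for event in ["click", "view", "purchase"]:
    log.publish(event)

# A real-time consumer reads one record at a time.
realtime_first = log.poll("realtime-dashboard", max_records=1)

# A batch consumer reads everything at once, unaffected by the other reader.
batch_all = log.poll("nightly-batch")

# The same data can be consumed again by rewinding the offset.
log.seek("nightly-batch", 0)
batch_replay = log.poll("nightly-batch")
```

Because reading does not remove records, the real-time and batch subscribers do not compete for the same data, and either one can rewind to consume it again.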
Uber, Twitter, Netflix, LinkedIn, Yahoo, Cisco, and Goldman Sachs use Kafka
as a central place to ingest all types of data in real time.
Connectikpeople.co recalls that a data platform powered by distributed
pub/sub systems like Kafka will play an important role in the Big Data
ecosystem as more companies move towards real-time processing.
The following specialized systems enable companies to derive new insights
and build streamlined applications.