by Diego Calvo | Jul 5, 2018 | Big Data
Messaging systems provide a communication channel between applications in the big data ecosystem. These systems usually implement queues, for example: Apache Kafka: message broker based on the publisher/subscriber model. RabbitMQ: Message Queuing...
by Diego Calvo | Jul 5, 2018 | Apache Hadoop, Big Data
Data processing frameworks in the big data ecosystem fall into the following groups. Batch processing: Hadoop MapReduce, a batch processing engine. Real-time processing: Apache Storm, Apache Samza, IBM InfoSphere, Apache S4 (Yahoo), Apache Tez. Hybrid...
by Diego Calvo | Jul 5, 2018 | Big Data
Storm definition: Apache Storm is a low-latency, high-availability, real-time distributed computing system based on a master-slave architecture. Storm is ideal for data that needs to be analyzed in real time, where latency is a factor to take into account, an...
by Diego Calvo | Jul 3, 2018 | Apache Hadoop, Big Data
RabbitMQ definition: RabbitMQ is a message queuing (MQ) system that lets a large number of actors communicate in a fast, secure, asynchronous and reliable way. RabbitMQ acts as middleware between message producers and consumers. Features: Guarantees the...
by Diego Calvo | Jul 2, 2018 | Big Data
Flume definition: Apache Flume is a distributed service that reliably and efficiently moves large amounts of data, especially logs. It is well suited to online analytics applications in Hadoop environments. Flume has a simple and flexible architecture based on streaming data,...