The main data ingestion tools in Big Data ecosystems are the following:
- Apache NiFi: An ETL tool that loads data from different sources, passes it through a processing flow, and writes it to a destination.
- Apache Sqoop: Bidirectional data transfer between Hadoop and SQL databases (structured data).
- Apache Flume: A system for ingesting semi-structured or unstructured streaming data into HDFS or HBase.
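The source, processing flow, and destination stages that these tools share can be sketched in a few lines of plain Python. This is only an illustration of the extract/transform/load pattern under simple assumptions, not the API of NiFi, Sqoop, or Flume; the function names, the `SAMPLE` data, and the JSON-lines output format are all hypothetical.

```python
import csv
import io
import json

def extract(csv_text):
    """Read records from a CSV source (here, an in-memory string)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(records):
    """Clean each record: trim the name and cast the amount to float."""
    return [
        {"name": r["name"].strip(), "amount": float(r["amount"])}
        for r in records
    ]

def load(records):
    """Serialize records as JSON lines, standing in for a real destination."""
    return "\n".join(json.dumps(r) for r in records)

# Hypothetical sample input; a real pipeline would read from files, databases, or streams.
SAMPLE = "name,amount\n alice ,10.5\nbob,3\n"
output = load(transform(extract(SAMPLE)))
print(output)
```

A real ingestion tool adds what this sketch leaves out, such as scheduling, back-pressure, retries, and connectors for many source and destination types.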
In addition, there are messaging systems with their own ingestion capabilities, such as:
- Apache Kafka: Message broker based on the publisher/subscriber model.
- RabbitMQ: Message queuing (MQ) system that acts as middleware between producers and consumers.
- Amazon Kinesis: Kafka’s counterpart on the Amazon Web Services infrastructure.
- Microsoft Azure Event Hubs: Kafka’s counterpart on the Microsoft Azure infrastructure.
- Google Cloud Pub/Sub: Kafka’s counterpart on the Google Cloud infrastructure.
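The publisher/subscriber model behind these systems can be illustrated with a toy in-memory broker. This is a minimal sketch, not the client API of Kafka or any of the services listed; the `Broker` class, the topic names, and the message contents are hypothetical.

```python
from collections import defaultdict

class Broker:
    """Toy in-memory broker; real systems add persistence, partitioning, and networking."""

    def __init__(self):
        # Maps each topic name to the list of callbacks subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to be invoked for every message on the topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber of the topic."""
        for callback in self._subscribers[topic]:
            callback(message)

broker = Broker()
received_a, received_b = [], []
broker.subscribe("sensor-readings", received_a.append)
broker.subscribe("sensor-readings", received_b.append)

broker.publish("sensor-readings", {"sensor": "t1", "value": 21.5})
broker.publish("other-topic", {"ignored": True})  # no subscribers, so nothing is delivered

print(received_a)
print(received_b)
```

Publishers and subscribers only know the topic name, never each other, which is what lets the broker decouple data producers from the systems that ingest the data.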