Big Data Archives - Page 6 of 8

Apache Kafka

by Diego Calvo | Jun 27, 2018 | Big Data

Kafka definition: Apache Kafka is a message broker based on the publish/subscribe model. Kafka is considered a persistent, scalable, replicated, and fault-tolerant system. Added to these features is the speed of its reads and writes, which makes it an...
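To make the publish/subscribe and persistence ideas in the excerpt concrete, here is a minimal in-memory sketch. It is a toy model, not the Kafka client API: `TopicLog`, `publish`, and `poll` are hypothetical names chosen for illustration, and it ignores replication, partitioning, and disk storage entirely.

```scala
import scala.collection.mutable

// Toy in-memory model of a publish/subscribe log. This is NOT the Kafka API;
// every name here (TopicLog, publish, poll) is hypothetical.
object PubSubSketch {

  final class TopicLog {
    // Append-only log: messages are kept after being read.
    private val messages = mutable.ArrayBuffer.empty[String]
    // Each subscriber tracks its own read position (offset).
    private val offsets = mutable.Map.empty[String, Int]

    // Appends a message and returns the offset it was stored at.
    def publish(message: String): Int = {
      messages += message
      messages.size - 1
    }

    // Returns the messages this subscriber has not seen yet
    // and advances its offset to the end of the log.
    def poll(subscriber: String): List[String] = {
      val from = offsets.getOrElse(subscriber, 0)
      val unread = messages.drop(from).toList
      offsets(subscriber) = messages.size
      unread
    }
  }

  def main(args: Array[String]): Unit = {
    val topic = new TopicLog
    topic.publish("order-created")
    topic.publish("order-paid")

    println(topic.poll("billing"))   // both messages published so far
    topic.publish("order-shipped")
    println(topic.poll("billing"))   // only the new message
    println(topic.poll("analytics")) // all three: the log is retained for late subscribers
  }
}
```

Because the log is retained and each subscriber keeps its own offset, a subscriber that joins late can still read everything from the beginning, which is the behaviour the word "persistent" refers to here.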

Apache NiFi

by Diego Calvo | Jun 27, 2018 | Big Data

Definition of NiFi: Apache NiFi is an integrated real-time data processing and logistics platform that automates data movement between different systems quickly, easily and securely. Apache NiFi is an ETL tool that is responsible for loading data from different sources,...

if, for, yield, foreach & while in Scala

by Diego Calvo | Jun 22, 2018 | Apache Spark, Big Data, Scala-example

IF
Example of a conditional that determines whether a grade is a pass or a fail:
var x = 6
if (x >= 5) { println("approved") } else { println("failed") }
Output:
x: Int = 6
approved
FOR
Example of using "for" where...
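The constructs named in the post title can be sketched together as follows. This is an illustrative example rather than the code from the original post; the object and the helper names `grade`, `squares`, and `sumTo` are assumptions made for this sketch.

```scala
// Illustrative sketch of if, for, yield, foreach and while.
// The names (ControlFlowSketch, grade, squares, sumTo) are hypothetical.
object ControlFlowSketch {

  // if/else is an expression in Scala, so it can return a value directly.
  def grade(x: Int): String =
    if (x >= 5) "approved" else "failed"

  // for ... yield builds a new collection from the values it iterates over.
  def squares(n: Int): Seq[Int] =
    for (i <- 1 to n) yield i * i

  // while repeats its body until the condition becomes false; here it sums 1..n.
  def sumTo(n: Int): Int = {
    var i = 1
    var total = 0
    while (i <= n) {
      total += i
      i += 1
    }
    total
  }

  def main(args: Array[String]): Unit = {
    println(grade(6))                        // approved
    for (i <- 1 to 3) println(i)             // for without yield: runs only for its side effects
    println(squares(3))                      // the squares 1, 4, 9 as a collection
    List("a", "b").foreach(s => println(s))  // foreach applies a function to each element
    println(sumTo(4))                        // 10
  }
}
```

Returning the result of `if` directly, instead of printing inside each branch, keeps the decision testable on its own.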

Elasticsearch

by Diego Calvo | Jun 22, 2018 | Big Data

Elasticsearch definition: Elasticsearch is an open-source real-time search server that provides indexed and distributed storage based on Lucene. It provides all the Lucene search power for full-text searches, but simplifies queries through its RESTful web interface....

Comparison of Scala, Java, Python and R in Apache Spark

by Diego Calvo | Jun 20, 2018 | Big Data

Metric           Scala     Java      Python       R
Type             Compiled  Compiled  Interpreted  Interpreted
Based on JVM     Yes       Yes       No           No
Cumbersome       (-)       (+)       (-)          (-)
Length of code   (-)       (+)       (-)          (-)
Productivity     (+)       (-)       (+)          (+)
Scalability      (+)       (+)       (-)...

Apache Spark

by Diego Calvo | Jun 20, 2018 | Apache Spark, Big Data

Spark definition: Apache Spark is an open-source distributed computing system that processes large data sets across a cluster of machines in parallel, providing horizontal scalability and fault tolerance. To deliver these features, it provides a program...
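The idea of splitting a data set, processing each part independently, and combining the partial results can be sketched in plain Scala. This example does not use Spark and runs on a single machine with no parallelism or fault tolerance; `partition`, `countWords`, `merge`, and `wordCount` are hypothetical helpers that only illustrate the shape of the computation.

```scala
// Plain-Scala illustration of the split / process / combine idea.
// This does NOT use Spark; all names below are hypothetical.
object PartitionedCountSketch {

  // Splits the input into chunks, as a cluster would spread data across machines.
  def partition(lines: Seq[String], parts: Int): Seq[Seq[String]] = {
    val size = math.max(1, math.ceil(lines.size.toDouble / parts).toInt)
    lines.grouped(size).toSeq
  }

  // Work done independently on one partition: count the words it contains.
  def countWords(part: Seq[String]): Map[String, Int] =
    part.flatMap(_.split("\\s+")).filter(_.nonEmpty)
      .groupBy(identity).map { case (w, ws) => w -> ws.size }

  // Merges two partial results by adding the counts of shared words.
  def merge(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a) { case (acc, (w, n)) => acc.updated(w, acc.getOrElse(w, 0) + n) }

  // Counts words per partition, then combines the partial counts.
  def wordCount(lines: Seq[String], parts: Int): Map[String, Int] =
    partition(lines, parts).map(countWords).foldLeft(Map.empty[String, Int])(merge)

  def main(args: Array[String]): Unit = {
    val lines = Seq("big data", "spark big", "data data")
    println(wordCount(lines, 2))
  }
}
```

Because `merge` only adds counts, the final result does not depend on how the lines were split, which is the property that lets this kind of work scale horizontally.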