by Diego Calvo | Jul 20, 2018 | Apache Spark, Big Data, Scala-example
Create lists
Examples that define the lists used in the rest of the sections of the post:
val list1 = 1::2::3::4::5::Nil
val list2 = List(1,2,3,4,5)
val list3 = List.range(1,6)
val list4 = List.range(1,6,2)
val list5 = List.fill(5)(1)
val list6 = ...
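The constructors in the excerpt can be checked directly in a Scala REPL. A minimal sketch of the ones shown (list6 is truncated in the excerpt, so it is omitted here):

```scala
// Different ways of building lists in Scala
val list1 = 1 :: 2 :: 3 :: 4 :: 5 :: Nil   // cons cells ending in Nil
val list2 = List(1, 2, 3, 4, 5)            // companion-object apply
val list3 = List.range(1, 6)               // 1 until 6 (end exclusive)
val list4 = List.range(1, 6, 2)            // step of 2: 1, 3, 5
val list5 = List.fill(5)(1)                // five copies of the value 1

println(list1 == list2)  // true: same elements in the same order
println(list4)           // List(1, 3, 5)
println(list5)           // List(1, 1, 1, 1, 1)
```

Note that `::` and `List(...)` build structurally equal lists, so they compare equal with `==`.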
Shows a number of examples of compressing and decompressing files in different formats, both for reading and for compression.
Compress Json Files
val rdd = sc.parallelize( Array(1, 2, 3, 4, 5) ) // Define RDD
val df = rdd.toDF() // df transform...
by Diego Calvo | Jul 5, 2018 | Apache Spark, Big Data
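The Spark code in the excerpt is truncated. As a framework-free illustration of the same round-trip idea, here is a plain-JVM Scala sketch that gzips and un-gzips a small JSON string using `java.util.zip` (no Spark required; the JSON content is just an example value):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.StandardCharsets.UTF_8
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

val json = """{"values":[1,2,3,4,5]}"""

// Compress the JSON text with gzip into an in-memory buffer
val buffer = new ByteArrayOutputStream()
val gzOut = new GZIPOutputStream(buffer)
gzOut.write(json.getBytes(UTF_8))
gzOut.close()
val compressed = buffer.toByteArray

// Decompress and recover the original text
val gzIn = new GZIPInputStream(new ByteArrayInputStream(compressed))
val restored = new String(gzIn.readAllBytes(), UTF_8)

println(restored == json)  // true: the round trip is lossless
```

In Spark itself the same effect is achieved by passing a compression codec when writing a DataFrame, but the post's exact code is cut off in this excerpt.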
Spark Streaming definition
Apache Spark Streaming is an extension of the Spark core API that supports real-time data processing in a scalable, high-performance, fault-tolerant manner. Spark Streaming was developed at the University of California, Berkeley, ...
by Diego Calvo | Jun 27, 2018 | Apache Spark
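Spark Streaming processes a live stream as a sequence of small batches (micro-batches). A toy pure-Scala sketch of that idea, with no Spark API involved and the event values invented for illustration:

```scala
// Toy micro-batch model: split an incoming event stream into
// fixed-size batches and aggregate each batch independently,
// the way Spark Streaming aggregates per batch interval.
val events = (1 to 10).toList   // stand-in for a live event stream
val batchSize = 3

val batches = events.grouped(batchSize).toList
// batches: List(List(1,2,3), List(4,5,6), List(7,8,9), List(10))

val batchSums = batches.map(_.sum)  // one result per micro-batch
println(batchSums)                  // List(6, 15, 24, 10)
```

Each micro-batch is processed on its own, which is what lets the real system keep up with an unbounded stream while reusing Spark's batch machinery.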
RDD definition
An RDD (Resilient Distributed Dataset) represents an immutable, partitioned collection of elements that can be operated on in parallel. An RDD can be created either by parallelizing a collection of data (list, dictionary, ...) or by loading it from an external storage...
by Diego Calvo | Jun 22, 2018 | Apache Spark, Big Data, Scala-example
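A minimal pure-Scala sketch of the two properties the definition names, immutability and partitioning. This illustrates the concept only; it is not Spark's RDD API:

```scala
// A toy "RDD": an immutable collection stored as partitions.
final case class ToyRDD[A](partitions: Vector[Vector[A]]) {
  // map returns a NEW ToyRDD; the original is never mutated
  def map[B](f: A => B): ToyRDD[B] = ToyRDD(partitions.map(_.map(f)))
  // reduce combines per-partition results, as a cluster would
  def reduce(op: (A, A) => A): A =
    partitions.filter(_.nonEmpty).map(_.reduce(op)).reduce(op)
}

val rdd = ToyRDD(Vector(Vector(1, 2), Vector(3, 4), Vector(5)))
val doubled = rdd.map(_ * 2)
println(doubled.reduce(_ + _))  // 30
println(rdd.reduce(_ + _))      // 15: the original is unchanged
```

In real Spark, `sc.parallelize(Seq(1, 2, 3, 4, 5))` plays the role of the constructor above, and the partitions live on different machines.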
IF
Example of conditional use that determines whether a grade is a pass ("Approved") or a fail:
var x = 6
if (x >= 5) {
  println("Approved")
} else {
  println("Failed")
}
x: Int = 6
Approved
FOR
Example of using "for" where...
by Diego Calvo | Jun 20, 2018 | Apache Spark, Big Data
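The if/else snippet, plus a simple "for" along the lines the truncated excerpt suggests, runs like this (the for body here is an assumption, since the excerpt cuts off before the post's own example):

```scala
// Pass/fail decision with if/else; in Scala, if is an expression
val grade = 6
val result = if (grade >= 5) "Approved" else "Failed"
println(result)  // Approved

// A basic for loop over an inclusive range (illustrative only;
// the post's own for example is truncated in this excerpt)
for (i <- 1 to 3) println(s"iteration $i")
```

Using `if` as an expression that yields a value, rather than only for side effects, is the more idiomatic Scala form of the same check.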
Spark definition
Apache Spark is a free-software distributed computing system that can process large data sets across a set of machines simultaneously, providing horizontal scalability and fault tolerance. To deliver these features, it provides a program...