by Diego Calvo | Aug 10, 2018 | Apache Hadoop, Apache Spark, Big Data
Write data to HDFS Example of how to write RDD data to HDFS in Hadoop. Delete the file if it exists: import scala.sys.process._ "hdfs dfs -rm -r /pruebas" ! Record an RDD in HDFS: val rdd = sc.parallelize(List( (0, 60), (0, 56), (0, 54), (0,... by Diego Calvo | Jul 23, 2018 | Apache Spark, Big Data, Scala-example
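A fuller sketch of that write, assuming a running SparkContext (sc) on a cluster with HDFS; the /pruebas path and the sample pairs come from the excerpt, while the call to saveAsTextFile is an assumption about how the post completes the write:

import scala.sys.process._

// Remove the target directory if it already exists, so the write does not fail
"hdfs dfs -rm -r /pruebas".!

// Build a small pair RDD and store it as text files under /pruebas
val rdd = sc.parallelize(List((0, 60), (0, 56), (0, 54)))
rdd.saveAsTextFile("/pruebas")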
Create DataFrames Example of how to create a DataFrame in Scala. import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}; val data = List( Row("Peter","Garcia",24,24000),... by Diego Calvo | Jul 21, 2018 | Apache Spark, Big Data, Scala-example
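A self-contained sketch of the same idea, assuming a SparkSession named spark; the column names (name, surname, age, salary) are illustrative guesses, since the excerpt is cut off before the schema:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

// Rows matching the excerpt's first record
val data = List(Row("Peter", "Garcia", 24, 24000))

// One StructField per column (the column names are assumed, not from the excerpt)
val schema = StructType(List(
  StructField("name",    StringType,  nullable = true),
  StructField("surname", StringType,  nullable = true),
  StructField("age",     IntegerType, nullable = true),
  StructField("salary",  IntegerType, nullable = true)
))

val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
df.show()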
Creating datasets Simple RDD to Dataset Example of creating a Dataset from an RDD val rdd = sc.parallelize(List(1,2,3,4,5)) val ds = spark.createDataset(rdd) ds.show() +-----+ |value| +-----+ | 1| | 2| | 3| | 4| | 5| +-----+ Classes to... by Diego Calvo | Jul 20, 2018 | Apache Spark, Big Data, Scala-example
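A runnable version of that RDD-to-Dataset step, assuming a spark-shell session where sc and spark are predefined; the implicits import supplies the Encoder[Int] that createDataset needs:

import spark.implicits._

val rdd = sc.parallelize(List(1, 2, 3, 4, 5))
val ds  = spark.createDataset(rdd)   // Dataset[Int] with a single "value" column
ds.show()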
Create lists Examples defining the lists used in the remaining sections of the post val list1 = 1::2::3::4::5::Nil val list2 = List(1,2,3,4,5) val list3 = List.range(1,6) val list4 = List.range(1,6,2) val list5 = List.fill(5)(1) val list6 =... by Diego Calvo | Jul 20, 2018 | Apache Spark, Big Data, Scala-example
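For reference, the constructors from the excerpt with the value each one produces noted as a comment (list6 is truncated in the excerpt, so it is omitted here):

val list1 = 1 :: 2 :: 3 :: 4 :: 5 :: Nil  // List(1, 2, 3, 4, 5), built by prepending onto Nil
val list2 = List(1, 2, 3, 4, 5)           // List(1, 2, 3, 4, 5), literal constructor
val list3 = List.range(1, 6)              // List(1, 2, 3, 4, 5), upper bound is exclusive
val list4 = List.range(1, 6, 2)           // List(1, 3, 5), step of 2
val list5 = List.fill(5)(1)               // List(1, 1, 1, 1, 1), five copies of the value 1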
Shows a number of examples of compressing and decompressing files across different file formats and compression codecs. Compress JSON files val rdd = sc.parallelize( Array(1, 2, 3, 4, 5) ) // Define RDD val df = rdd.toDF() // df transform...
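A minimal sketch of one such case, writing the DataFrame from the excerpt as gzip-compressed JSON; the output path and the choice of gzip are assumptions, not taken from the post:

import spark.implicits._   // needed for rdd.toDF()

val rdd = sc.parallelize(Array(1, 2, 3, 4, 5))   // define the RDD
val df  = rdd.toDF()                             // transform it into a DataFrame

df.write
  .option("compression", "gzip")                 // gzip-compressed JSON part files
  .json("/tmp/json_gzip")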