by Diego Calvo | Aug 10, 2018 | Apache Hadoop, Apache Spark, Big Data
Write data to HDFS Example of how to write RDD data to HDFS in Hadoop. Delete the file if it exists: import scala.sys.process._ "hdfs dfs -rm -r /pruebas" ! Record an RDD in HDFS: val rdd = sc.parallelize(List( (0, 60), (0, 56), (0, 54), (0,... by Diego Calvo | Jul 23, 2018 | Apache Spark, Big Data, Scala-example
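A fuller sketch of that write, assuming a running SparkContext (sc) on a cluster with HDFS; the /pruebas path and the sample pairs come from the excerpt, while the call to saveAsTextFile is an assumption about how the post completes the write:

import scala.sys.process._

// Remove the target directory if it already exists, so the write does not fail
"hdfs dfs -rm -r /pruebas".!

// Build a small pair RDD and store it as text files under /pruebas
val rdd = sc.parallelize(List((0, 60), (0, 56), (0, 54)))
rdd.saveAsTextFile("/pruebas")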
Create DataFrames Example of how to create a DataFrame in Scala. import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}; val data = List( Row("Peter","Garcia",24,24000),... by Diego Calvo | Jul 21, 2018 | Apache Spark, Big Data, Scala-example
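A self-contained sketch of the same idea, assuming a SparkSession named spark; the column names (name, surname, age, salary) are illustrative guesses, since the excerpt is cut off before the schema:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

// Rows matching the excerpt's first record
val data = List(Row("Peter", "Garcia", 24, 24000))

// One StructField per column (the column names are assumed, not from the excerpt)
val schema = StructType(List(
  StructField("name",    StringType,  nullable = true),
  StructField("surname", StringType,  nullable = true),
  StructField("age",     IntegerType, nullable = true),
  StructField("salary",  IntegerType, nullable = true)
))

val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
df.show()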
Creating datasets Simple RDD to Dataset Example of creating a Dataset from an RDD val rdd = sc.parallelize(List(1,2,3,4,5)) val ds = spark.createDataset(rdd) ds.show() +-----+ |value| +-----+ | 1| | 2| | 3| | 4| | 5| +-----+ Classes to... by Diego Calvo | Jul 20, 2018 | Apache Spark, Big Data, Scala-example
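A runnable version of that RDD-to-Dataset step, assuming a spark-shell session where sc and spark are predefined; the implicits import supplies the Encoder[Int] that createDataset needs:

import spark.implicits._

val rdd = sc.parallelize(List(1, 2, 3, 4, 5))
val ds  = spark.createDataset(rdd)   // Dataset[Int] with a single "value" column
ds.show()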
Create lists Examples defining the lists used in the remaining sections of the post val list1 = 1::2::3::4::5::Nil val list2 = List(1,2,3,4,5) val list3 = List.range(1,6) val list4 = List.range(1,6,2) val list5 = List.fill(5)(1) val list6 =... by Diego Calvo | Jul 20, 2018 | Apache Spark, Big Data, Scala-example
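For reference, the constructors from the excerpt with the value each one produces noted as a comment (list6 is truncated in the excerpt, so it is omitted here):

val list1 = 1 :: 2 :: 3 :: 4 :: 5 :: Nil  // List(1, 2, 3, 4, 5), built by prepending onto Nil
val list2 = List(1, 2, 3, 4, 5)           // List(1, 2, 3, 4, 5), literal constructor
val list3 = List.range(1, 6)              // List(1, 2, 3, 4, 5), upper bound is exclusive
val list4 = List.range(1, 6, 2)           // List(1, 3, 5), step of 2
val list5 = List.fill(5)(1)               // List(1, 1, 1, 1, 1), five copies of the value 1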
Shows a number of examples of compressing and decompressing files across different file formats and compression codecs. Compress JSON files val rdd = sc.parallelize( Array(1, 2, 3, 4, 5) ) // Define RDD val df = rdd.toDF() // df transform...
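A minimal sketch of one such case, writing the DataFrame from the excerpt as gzip-compressed JSON; the output path and the choice of gzip are assumptions, not taken from the post:

import spark.implicits._   // needed for rdd.toDF()

val rdd = sc.parallelize(Array(1, 2, 3, 4, 5))   // define the RDD
val df  = rdd.toDF()                             // transform it into a DataFrame

df.write
  .option("compression", "gzip")                 // gzip-compressed JSON part files
  .json("/tmp/json_gzip")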