by Diego Calvo | Oct 11, 2018 | Big Data, Trick
Change the password in a simple way
To change the password on Hortonworks, just: Access by SSH. Access the machine by SSH; by default the credentials are (user: root, password: hadoop). To do this you can use WinSCP, FileZilla, PuTTY,… Use the...
by Diego Calvo | Oct 10, 2018 | Big Data, Python-example
Generate data to use for reading and writing in Parquet format
Example of random data to use in the following sections:
import random
data = []
for x in range(5):
    data.append((random.randint(0, 9), random.randint(0, 9)))
df = spark.createDataFrame(data, ("label",...
by Diego Calvo | Oct 9, 2018 | Big Data, Python-example
Generate data to use to read & write JSON
Example of random data to use in the following sections:
import random
data = []
for x in range(5):
    data.append((random.randint(0, 9), random.randint(0, 9)))
df = spark.createDataFrame(data, ("label", "data"))...
by Diego Calvo | Sep 12, 2018 | Apache Hadoop, Big Data
YARN definition
YARN (Yet Another Resource Negotiator) is a data operating system and distributed resource manager, introduced in Hadoop 2 as the evolution of Hadoop MapReduce. The most significant change of Hadoop 2 over Hadoop 1 is that the thread technology...
by Diego Calvo | Sep 5, 2018 | Apache Spark, Big Data, Scala-example
Example: Grouping data in a simple way
Example where the people table is grouped by last name.
df.groupBy("surname").count().show()
+-------+-----+
|surname|count|
+-------+-----+
| Martin|    1|
| Garcia|    3|
...
by Diego Calvo | Sep 4, 2018 | Apache Hadoop, Big Data
Kerberos definition
Kerberos is an authentication protocol that allows two computers to prove their identity to each other securely. It is implemented on a client-server architecture and works on the basis of tickets that serve to demonstrate the identity of the...