Union of sets
group1 = sc.parallelize(['A','B','C','D'])
group2 = sc.parallelize(['C','D','E','F'])
rdd_aux = group1.union(group2)
print(rdd_aux.collect())
['A', 'B', 'C', 'D', 'C', 'D', 'E', 'F']
Intersection of sets
rdd_aux = group1.intersection(group2)
rdd_aux.collect()
['C', 'D']