org.apache.spark.mllib.rdd
Class MLPairRDDFunctions<K,V>

Object
  extended by org.apache.spark.mllib.rdd.MLPairRDDFunctions<K,V>
All Implemented Interfaces:
java.io.Serializable

public class MLPairRDDFunctions<K,V>
extends Object
implements scala.Serializable

Machine-learning-specific functions for pair RDDs.

See Also:
Serialized Form

Constructor Summary
MLPairRDDFunctions(RDD<scala.Tuple2<K,V>> self, scala.reflect.ClassTag<K> evidence$1, scala.reflect.ClassTag<V> evidence$2)
           
 
Method Summary
static
<K,V> MLPairRDDFunctions<K,V>
fromPairRDD(RDD<scala.Tuple2<K,V>> rdd, scala.reflect.ClassTag<K> evidence$3, scala.reflect.ClassTag<V> evidence$4)
          Implicit conversion from a pair RDD to MLPairRDDFunctions.
 RDD<scala.Tuple2<K,Object>> topByKey(int num, scala.math.Ordering<V> ord)
          Returns the top k (largest) elements for each key from this RDD, as defined by the specified implicit Ordering[V].
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MLPairRDDFunctions

public MLPairRDDFunctions(RDD<scala.Tuple2<K,V>> self,
                          scala.reflect.ClassTag<K> evidence$1,
                          scala.reflect.ClassTag<V> evidence$2)
Method Detail

fromPairRDD

public static <K,V> MLPairRDDFunctions<K,V> fromPairRDD(RDD<scala.Tuple2<K,V>> rdd,
                                                        scala.reflect.ClassTag<K> evidence$3,
                                                        scala.reflect.ClassTag<V> evidence$4)
Implicit conversion from a pair RDD to MLPairRDDFunctions.


topByKey

public RDD<scala.Tuple2<K,Object>> topByKey(int num,
                                            scala.math.Ordering<V> ord)
Returns the top k (largest) elements for each key from this RDD, as defined by the specified implicit Ordering[V]. If a key has fewer than k elements, all of them are returned.

Parameters:
num - k, the maximum number of top elements to return for each key
ord - the implicit ordering for V
Returns:
an RDD that pairs each key with its top k values (the Object in the Java signature stands for the Scala type Array[V])
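
The per-key behaviour described above can be illustrated without a Spark cluster. The following is a minimal sketch in plain Java, not Spark's implementation: the class name TopByKeySketch, the in-memory List of entries standing in for the pair RDD, and the Comparator standing in for the Scala Ordering are all assumptions made for illustration. It keeps a bounded min-heap per key so that at most num values are retained, and returns each key's values largest first (the descending order is also an assumption of this sketch, since the documentation above does not state an order).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stand-in for topByKey on an in-memory list of (key, value) pairs.
class TopByKeySketch {

    static <K, V> Map<K, List<V>> topByKey(List<Map.Entry<K, V>> pairs,
                                           int num,
                                           Comparator<V> ord) {
        // One min-heap per key: the head is the smallest value kept so far.
        Map<K, PriorityQueue<V>> heaps = new HashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            PriorityQueue<V> heap =
                heaps.computeIfAbsent(pair.getKey(), k -> new PriorityQueue<>(ord));
            heap.add(pair.getValue());
            if (heap.size() > num) {
                heap.poll(); // evict the smallest so only the top num remain
            }
        }

        // Copy each heap into a list sorted from largest to smallest.
        Map<K, List<V>> result = new HashMap<>();
        for (Map.Entry<K, PriorityQueue<V>> e : heaps.entrySet()) {
            List<V> values = new ArrayList<>(e.getValue());
            values.sort(ord.reversed());
            result.put(e.getKey(), values);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
            Map.entry("a", 1), Map.entry("a", 5), Map.entry("a", 9),
            Map.entry("b", 2));
        Map<String, List<Integer>> top = topByKey(pairs, 2, Comparator.naturalOrder());
        System.out.println(top.get("a")); // [9, 5]
        System.out.println(top.get("b")); // [2]
    }
}
```

Key "b" has only one value, so that single value is returned even though num is 2, which matches the "fewer than k elements" rule in the method description.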