org.apache.spark.sql

DataFrameNaFunctions

final class DataFrameNaFunctions extends AnyRef

:: Experimental :: Functionality for working with missing data in DataFrames.

Annotations
@Experimental()
Since

1.3.1

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def drop(minNonNulls: Int, cols: Seq[String]): DataFrame

    (Scala-specific) Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.

    Since

    1.3.1

  9. def drop(minNonNulls: Int, cols: Array[String]): DataFrame

    Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.

    Since

    1.3.1

  10. def drop(minNonNulls: Int): DataFrame

    Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values.
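
    The threshold rule can be modeled in plain Scala without Spark. This is only an illustrative sketch: `Row`, `validCount`, and `dropSketch` are hypothetical names, `None` stands in for a null value, and nothing here reflects how Spark implements the method.

    ```scala
    object MinNonNullsSketch {
      // Hypothetical row model: None stands in for a SQL null.
      type Row = Seq[Option[Double]]

      // A value counts only if it is present and is not NaN.
      def validCount(row: Row): Int = row.count(_.exists(v => !v.isNaN))

      // Keep rows that have at least minNonNulls valid values.
      def dropSketch(rows: Seq[Row], minNonNulls: Int): Seq[Row] =
        rows.filter(row => validCount(row) >= minNonNulls)

      val rows: Seq[Row] = Seq(
        Seq(Some(1.0), Some(2.0), Some(3.0)),   // 3 valid values
        Seq(Some(1.0), None, Some(Double.NaN)), // 1 valid value
        Seq(None, None, None)                   // 0 valid values
      )
    }
    ```

    In this model, `dropSketch(rows, 2)` keeps only the first row, while a threshold of 1 also keeps the second.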

    Since

    1.3.1

  11. def drop(how: String, cols: Seq[String]): DataFrame

    (Scala-specific) Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.

    If how is "any", then drop rows containing any null or NaN values in the specified columns. If how is "all", then drop rows only if every specified column is null or NaN for that row.

    Since

    1.3.1

  12. def drop(how: String, cols: Array[String]): DataFrame

    Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.

    If how is "any", then drop rows containing any null or NaN values in the specified columns. If how is "all", then drop rows only if every specified column is null or NaN for that row.

    Since

    1.3.1

  13. def drop(cols: Seq[String]): DataFrame

    (Scala-specific) Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.

    Since

    1.3.1

  14. def drop(cols: Array[String]): DataFrame

    Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.

    Since

    1.3.1

  15. def drop(how: String): DataFrame

    Returns a new DataFrame that drops rows containing null or NaN values.

    If how is "any", then drop rows containing any null or NaN values. If how is "all", then drop rows only if every column is null or NaN for that row.
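
    The difference between the two modes can be sketched in plain Scala. The names `Row`, `isMissing`, and `dropSketch` are hypothetical, `None` stands in for null, and the exception thrown for an unrecognized mode is an assumption of this sketch, not documented behavior.

    ```scala
    object DropHowSketch {
      // Hypothetical row model: None stands in for a SQL null.
      type Row = Seq[Option[Double]]

      // A cell is missing if it is null (None) or NaN.
      def isMissing(cell: Option[Double]): Boolean = cell.forall(_.isNaN)

      def dropSketch(rows: Seq[Row], how: String): Seq[Row] = how match {
        // "any": drop a row as soon as one cell is missing.
        case "any" => rows.filter(row => !row.exists(isMissing))
        // "all": drop a row only when every cell is missing.
        case "all" => rows.filter(row => !row.forall(isMissing))
        // Assumption of this sketch: reject any other mode.
        case other => throw new IllegalArgumentException("unsupported how: " + other)
      }

      val rows: Seq[Row] = Seq(
        Seq(Some(1.0), Some(2.0)),   // complete
        Seq(Some(1.0), None),        // partially missing
        Seq(None, Some(Double.NaN))  // entirely missing
      )
    }
    ```

    With these rows, "any" keeps only the complete row, and "all" keeps the complete and the partially missing rows.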

    Since

    1.3.1

  16. def drop(): DataFrame

    Returns a new DataFrame that drops rows containing any null or NaN values.

    Since

    1.3.1

  17. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  19. def fill(valueMap: Map[String, Any]): DataFrame

    (Scala-specific) Returns a new DataFrame that replaces null values.

    The key of the map is the column name, and the value of the map is the replacement value. The value must be one of the following types: Int, Long, Float, Double, String.

    For example, the following replaces null values in column "A" with string "unknown", and null values in column "B" with numeric value 1.0.

    df.na.fill(Map(
      "A" -> "unknown",
      "B" -> 1.0
    ))
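
    The per-column lookup described above can also be modeled without Spark. In this sketch `Row` and `fillSketch` are hypothetical names and `None` stands in for null; it only covers null cells and makes no claim about Spark's internals.

    ```scala
    object FillMapSketch {
      // Hypothetical row model: column name -> cell, where None stands in for null.
      type Row = Map[String, Option[Any]]

      // Replace a null cell with the map entry for its column, if there is one.
      // Columns that have no entry in valueMap are left untouched.
      def fillSketch(rows: Seq[Row], valueMap: Map[String, Any]): Seq[Row] =
        rows.map { row =>
          row.map { case (col, cell) => col -> cell.orElse(valueMap.get(col)) }
        }

      val rows: Seq[Row] = Seq(
        Map("A" -> Some("x"), "B" -> None, "C" -> None),
        Map("A" -> None, "B" -> Some(2.5), "C" -> None)
      )

      val filled: Seq[Row] = fillSketch(rows, Map("A" -> "unknown", "B" -> 1.0))
    }
    ```

    Column "C" has no entry in the map, so its null cells stay null in `filled`.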
    Since

    1.3.1

  20. def fill(valueMap: java.util.Map[String, Any]): DataFrame

    Returns a new DataFrame that replaces null values.

    The key of the map is the column name, and the value of the map is the replacement value. The value must be one of the following types: Integer, Long, Float, Double, String.

    For example, the following replaces null values in column "A" with string "unknown", and null values in column "B" with numeric value 1.0.

    import com.google.common.collect.ImmutableMap;
    df.na().fill(ImmutableMap.of("A", "unknown", "B", 1.0));
    Since

    1.3.1

  21. def fill(value: String, cols: Seq[String]): DataFrame

    (Scala-specific) Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.

    Since

    1.3.1

  22. def fill(value: String, cols: Array[String]): DataFrame

    Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.

    Since

    1.3.1

  23. def fill(value: Double, cols: Seq[String]): DataFrame

    (Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
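
    The "ignored if not numeric" rule can be illustrated with a small typed model. `Column`, `NumericColumn`, `StringColumn`, and `fillSketch` are hypothetical names invented for this sketch, and `None` stands in for null; Spark's own column types are not modeled.

    ```scala
    object TypedFillSketch {
      // Hypothetical column model: None stands in for a SQL null.
      sealed trait Column
      case class NumericColumn(values: Seq[Option[Double]]) extends Column
      case class StringColumn(values: Seq[Option[String]]) extends Column

      // Fill null and NaN cells, but only in listed columns that are numeric.
      def fillSketch(table: Map[String, Column], value: Double, cols: Seq[String]): Map[String, Column] =
        table.map {
          case (name, NumericColumn(values)) if cols.contains(name) =>
            name -> NumericColumn(values.map(v => Some(v.filterNot(_.isNaN).getOrElse(value))))
          // Unlisted columns and non-numeric columns pass through unchanged.
          case other => other
        }

      val table: Map[String, Column] = Map(
        "height" -> NumericColumn(Seq(Some(1.0), None, Some(Double.NaN))),
        "name"   -> StringColumn(Seq(Some("Ann"), None, Some("Cy")))
      )
    }
    ```

    Calling `fillSketch(table, 0.0, Seq("height", "name"))` fills both gaps in "height" and leaves the null in "name" as it is, because "name" is a string column.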

    Since

    1.3.1

  24. def fill(value: Double, cols: Array[String]): DataFrame

    Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.

    Since

    1.3.1

  25. def fill(value: String): DataFrame

    Returns a new DataFrame that replaces null values in string columns with value.

    Since

    1.3.1

  26. def fill(value: Double): DataFrame

    Returns a new DataFrame that replaces null or NaN values in numeric columns with value.

    Since

    1.3.1

  27. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  28. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  29. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  30. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  31. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  32. final def notify(): Unit

    Definition Classes
    AnyRef
  33. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  34. def replace[T](cols: Seq[String], replacement: Map[T, T]): DataFrame

    (Scala-specific) Replaces values matching keys in replacement map with the corresponding values. Key and value of replacement map must have the same type, and can only be doubles or strings.

    // Replaces all occurrences of 1.0 with 2.0 in columns "height" and "weight".
    df.na.replace("height" :: "weight" :: Nil, Map(1.0 -> 2.0))
    
    // Replaces all occurrences of "UNKNOWN" with "unnamed" in columns "firstname" and "lastname".
    df.na.replace("firstname" :: "lastname" :: Nil, Map("UNKNOWN" -> "unnamed"))
    cols

    list of columns to apply the value replacement

    replacement

    value replacement map, as explained above
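
    The key-to-value lookup can be sketched on a plain Scala sequence. `replaceSketch` is a hypothetical helper that models a single column as a `Seq[T]`; it illustrates the matching rule only and is not Spark code.

    ```scala
    object ReplaceSketch {
      // Each value is looked up once: keys of the map are swapped for their
      // mapped values, and every other value passes through unchanged.
      def replaceSketch[T](column: Seq[T], replacement: Map[T, T]): Seq[T] =
        column.map(v => replacement.getOrElse(v, v))

      val heights: Seq[Double] = replaceSketch(Seq(1.0, 3.0, 1.0), Map(1.0 -> 2.0))
      val names: Seq[String] = replaceSketch(Seq("UNKNOWN", "Ann"), Map("UNKNOWN" -> "unnamed"))
    }
    ```

    The single type parameter `T` mirrors the requirement that keys and values of the replacement map share one type.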

    Since

    1.3.1

  35. def replace[T](col: String, replacement: Map[T, T]): DataFrame

    (Scala-specific) Replaces values matching keys in replacement map with the corresponding values. Key and value of replacement map must have the same type, and can only be doubles or strings. If col is "*", then the replacement is applied to all string columns or numeric columns.

    // Replaces all occurrences of 1.0 with 2.0 in column "height".
    df.na.replace("height", Map(1.0 -> 2.0))
    
    // Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
    df.na.replace("name", Map("UNKNOWN" -> "unnamed"))
    
    // Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
    df.na.replace("*", Map("UNKNOWN" -> "unnamed"))
    col

    name of the column to apply the value replacement

    replacement

    value replacement map, as explained above

    Since

    1.3.1

  36. def replace[T](cols: Array[String], replacement: java.util.Map[T, T]): DataFrame

    Replaces values matching keys in replacement map with the corresponding values. Key and value of replacement map must have the same type, and can only be doubles or strings.

    import com.google.common.collect.ImmutableMap;
    
    // Replaces all occurrences of 1.0 with 2.0 in columns "height" and "weight".
    df.na().replace(new String[] {"height", "weight"}, ImmutableMap.of(1.0, 2.0));
    
    // Replaces all occurrences of "UNKNOWN" with "unnamed" in columns "firstname" and "lastname".
    df.na().replace(new String[] {"firstname", "lastname"}, ImmutableMap.of("UNKNOWN", "unnamed"));
    cols

    list of columns to apply the value replacement

    replacement

    value replacement map, as explained above

    Since

    1.3.1

  37. def replace[T](col: String, replacement: java.util.Map[T, T]): DataFrame

    Replaces values matching keys in replacement map with the corresponding values. Key and value of replacement map must have the same type, and can only be doubles or strings. If col is "*", then the replacement is applied to all string columns or numeric columns.

    import com.google.common.collect.ImmutableMap;
    
    // Replaces all occurrences of 1.0 with 2.0 in column "height".
    df.na().replace("height", ImmutableMap.of(1.0, 2.0));
    
    // Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
    df.na().replace("name", ImmutableMap.of("UNKNOWN", "unnamed"));
    
    // Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
    df.na().replace("*", ImmutableMap.of("UNKNOWN", "unnamed"));
    col

    name of the column to apply the value replacement

    replacement

    value replacement map, as explained above

    Since

    1.3.1

  38. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  39. def toString(): String

    Definition Classes
    AnyRef → Any
  40. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  41. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  42. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
