Class DataFrameNaFunctions
Functionality for working with missing data in DataFrames.
- Since:
- 1.3.1
-
Method Summary
Method and Description
drop(): Returns a new DataFrame that drops rows containing any null or NaN values.
drop(int minNonNulls): Returns a new DataFrame that drops rows containing less than minNonNulls non-null and non-NaN values.
drop: Returns a new DataFrame that drops rows containing less than minNonNulls non-null and non-NaN values in the specified columns.
drop: (Scala-specific) Returns a new DataFrame that drops rows containing less than minNonNulls non-null and non-NaN values in the specified columns.
drop: Returns a new DataFrame that drops rows containing null or NaN values.
drop: Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
drop: Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
drop: (Scala-specific) Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
drop: (Scala-specific) Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
fill(boolean value): Returns a new DataFrame that replaces null values in boolean columns with value.
fill: Returns a new DataFrame that replaces null values in specified boolean columns.
fill: (Scala-specific) Returns a new DataFrame that replaces null values in specified boolean columns.
fill(double value): Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
fill: Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill: (Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill(long value): Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
fill: Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill: (Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill: Returns a new DataFrame that replaces null values in string columns with value.
fill: Returns a new DataFrame that replaces null values in specified string columns.
fill: (Scala-specific) Returns a new DataFrame that replaces null values in specified string columns.
fill: Returns a new DataFrame that replaces null values.
fill: (Scala-specific) Returns a new DataFrame that replaces null values.
replace: Replaces values matching keys in replacement map with the corresponding values.
replace: Replaces values matching keys in replacement map with the corresponding values.
replace: (Scala-specific) Replaces values matching keys in replacement map.
replace(scala.collection.immutable.Seq<String> cols, scala.collection.immutable.Map<T, T> replacement): (Scala-specific) Replaces values matching keys in replacement map.
-
Method Details
-
drop
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that drops rows containing any null or NaN values.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Returns:
(undocumented)
-
drop
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Parameters:
cols - (undocumented)
- Returns:
(undocumented)
-
drop
Description copied from class: DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Parameters:
cols - (undocumented)
- Returns:
(undocumented)
-
drop
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
If how is "any", then drop rows containing any null or NaN values in the specified columns. If how is "all", then drop rows only if every specified column is null or NaN for that row.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Parameters:
how - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
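The difference between "any" and "all" can be illustrated with a small plain-Java model. This is only a sketch of the rule as described above, not Spark's implementation: the class DropHowSketch, its methods, and the use of column positions instead of column names are all hypothetical, and rejecting other how values with an exception is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the "any" / "all" rule; it does not call Spark.
final class DropHowSketch {

    // A cell counts as missing when it is null or a floating-point NaN.
    static boolean isMissing(Object cell) {
        return cell == null
            || (cell instanceof Double && ((Double) cell).isNaN())
            || (cell instanceof Float && ((Float) cell).isNaN());
    }

    // Returns the rows that survive. colIdx holds the positions of the
    // "specified columns" (a hypothetical stand-in for column names).
    static List<Object[]> drop(String how, int[] colIdx, List<Object[]> rows) {
        if (!how.equals("any") && !how.equals("all")) {
            throw new IllegalArgumentException("how must be \"any\" or \"all\"");
        }
        List<Object[]> kept = new ArrayList<>();
        for (Object[] row : rows) {
            int missing = 0;
            for (int i : colIdx) {
                if (isMissing(row[i])) {
                    missing++;
                }
            }
            // "any": one missing cell is enough; "all": every listed cell must be missing.
            boolean dropRow = how.equals("any") ? missing > 0 : missing == colIdx.length;
            if (!dropRow) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Three sample rows: complete, fully missing, partly missing.
    static List<Object[]> sampleRows() {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"alice", 1.5});
        rows.add(new Object[] {null, Double.NaN});
        rows.add(new Object[] {"carol", null});
        return rows;
    }

    public static void main(String[] args) {
        int[] both = {0, 1};
        System.out.println(drop("any", both, sampleRows()).size()); // prints 1
        System.out.println(drop("all", both, sampleRows()).size()); // prints 2
    }
}
```

With "any", only the complete row survives; with "all", the partly missing row survives as well because one of its listed cells is present.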
-
drop
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that drops rows containing less than minNonNulls non-null and non-NaN values in the specified columns.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Parameters:
minNonNulls - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
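The minNonNulls threshold can be read as a simple count over the specified columns. The following is a minimal plain-Java sketch of that reading, assuming a row is an Object[] and columns are addressed by position; DropMinNonNullsSketch and its methods are hypothetical names and do not come from Spark.

```java
// Plain-Java model of the minNonNulls rule; it does not call Spark.
final class DropMinNonNullsSketch {

    // A cell is "present" when it is neither null nor a floating-point NaN.
    static boolean isPresent(Object cell) {
        if (cell == null) {
            return false;
        }
        if (cell instanceof Double) {
            return !((Double) cell).isNaN();
        }
        if (cell instanceof Float) {
            return !((Float) cell).isNaN();
        }
        return true;
    }

    // A row is kept when at least minNonNulls of the listed cells are present;
    // otherwise it would be dropped.
    static boolean keepRow(int minNonNulls, int[] colIdx, Object[] row) {
        int present = 0;
        for (int i : colIdx) {
            if (isPresent(row[i])) {
                present++;
            }
        }
        return present >= minNonNulls;
    }

    public static void main(String[] args) {
        Object[] row = {"dave", null, Double.NaN, 42L};
        int[] allCols = {0, 1, 2, 3};
        System.out.println(keepRow(2, allCols, row)); // prints true (two present cells)
        System.out.println(keepRow(3, allCols, row)); // prints false
    }
}
```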
-
drop
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that drops rows containing null or NaN values.
If how is "any", then drop rows containing any null or NaN values. If how is "all", then drop rows only if every column is null or NaN for that row.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Parameters:
how - (undocumented)
- Returns:
(undocumented)
-
drop
Description copied from class: DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
If how is "any", then drop rows containing any null or NaN values in the specified columns. If how is "all", then drop rows only if every specified column is null or NaN for that row.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Parameters:
how - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
drop
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that drops rows containing less than minNonNulls non-null and non-NaN values.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Parameters:
minNonNulls - (undocumented)
- Returns:
(undocumented)
-
drop
Description copied from class: DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that drops rows containing less than minNonNulls non-null and non-NaN values in the specified columns.
- Overrides:
drop in class DataFrameNaFunctions<Dataset>
- Parameters:
minNonNulls - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
- Specified by:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
- Specified by:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null values in string columns with value.
- Specified by:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
- Specified by:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
- Specified by:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.
- Specified by:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null values in boolean columns with value.
- Specified by:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that replaces null values in specified boolean columns. If a specified column is not a boolean column, it is ignored.
- Specified by:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
- Overrides:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
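The "non-numeric columns are ignored" behaviour can be sketched in plain Java. The model below is an assumption-laden illustration, not Spark code: FillNumericSketch is a hypothetical class, a row is a Map from column name to cell, and the schema is a Map from column name to a type label ("double" or "string"). Columns missing from this toy schema are simply skipped, which is a simplification of this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of filling numeric columns; it does not call Spark.
final class FillNumericSketch {

    // Returns a copy of the row in which null or NaN cells of the listed
    // "double" columns are replaced by value. Other columns are left alone.
    static Map<String, Object> fill(double value, List<String> cols,
                                    Map<String, String> schema, Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>(row);
        for (String col : cols) {
            if (!"double".equals(schema.get(col))) {
                continue; // not a numeric column in this toy schema: ignored
            }
            Object cell = row.get(col);
            if (cell == null || ((Double) cell).isNaN()) {
                out.put(col, value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new HashMap<>();
        schema.put("height", "double");
        schema.put("name", "string");

        Map<String, Object> row = new HashMap<>();
        row.put("height", Double.NaN);
        row.put("name", null);

        Map<String, Object> filled = fill(0.0, Arrays.asList("height", "name"), schema, row);
        System.out.println(filled.get("height")); // prints 0.0
        System.out.println(filled.get("name"));   // prints null (string column ignored)
    }
}
```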
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
- Overrides:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.
- Overrides:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null values in specified boolean columns. If a specified column is not a boolean column, it is ignored.
- Overrides:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
value - (undocumented)
cols - (undocumented)
- Returns:
(undocumented)
-
fill
Description copied from class: DataFrameNaFunctions
Returns a new DataFrame that replaces null values.
The key of the map is the column name, and the value of the map is the replacement value. The value must be of one of the following types: Integer, Long, Float, Double, String, Boolean. Replacement values are cast to the column data type.
For example, the following replaces null values in column "A" with string "unknown", and null values in column "B" with numeric value 1.0.
    import com.google.common.collect.ImmutableMap;
    df.na().fill(ImmutableMap.of("A", "unknown", "B", 1.0));
- Overrides:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
valueMap - (undocumented)
- Returns:
(undocumented)
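The per-column map and the cast to the column data type can be sketched without Spark. The model below is a rough illustration under stated assumptions: FillMapSketch is a hypothetical class, the schema is a Map from column name to a type label ("long", "double" or "string"), only null cells are replaced (following the description above), and only numeric-to-numeric and anything-to-string conversions are modelled. Columns absent from the toy schema are skipped, which is a simplification.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of fill(valueMap); it does not call Spark.
final class FillMapSketch {

    // Converts a replacement to the column's declared type in this toy model.
    static Object castTo(String type, Object replacement) {
        switch (type) {
            case "long":
                return ((Number) replacement).longValue();
            case "double":
                return ((Number) replacement).doubleValue();
            case "string":
                return String.valueOf(replacement);
            default:
                throw new IllegalArgumentException("unsupported type: " + type);
        }
    }

    // Returns a copy of the row in which null cells of the mapped columns
    // are replaced by the mapped value, converted to the column type.
    static Map<String, Object> fill(Map<String, Object> valueMap,
                                    Map<String, String> schema, Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>(row);
        for (Map.Entry<String, Object> entry : valueMap.entrySet()) {
            String col = entry.getKey();
            if (schema.containsKey(col) && row.get(col) == null) {
                out.put(col, castTo(schema.get(col), entry.getValue()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new HashMap<>();
        schema.put("A", "string");
        schema.put("B", "long");

        Map<String, Object> row = new HashMap<>();
        row.put("A", null);
        row.put("B", null);

        Map<String, Object> valueMap = new HashMap<>();
        valueMap.put("A", "unknown");
        valueMap.put("B", 1.0);

        Map<String, Object> filled = fill(valueMap, schema, row);
        System.out.println(filled.get("A")); // prints unknown
        System.out.println(filled.get("B")); // prints 1 (1.0 converted to the long column type)
    }
}
```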
-
fill
Description copied from class: DataFrameNaFunctions
(Scala-specific) Returns a new DataFrame that replaces null values.
The key of the map is the column name, and the value of the map is the replacement value. The value must be of one of the following types: Int, Long, Float, Double, String, Boolean. Replacement values are cast to the column data type.
For example, the following replaces null values in column "A" with string "unknown", and null values in column "B" with numeric value 1.0.
    df.na.fill(Map(
      "A" -> "unknown",
      "B" -> 1.0
    ))
- Overrides:
fill in class DataFrameNaFunctions<Dataset>
- Parameters:
valueMap - (undocumented)
- Returns:
(undocumented)
-
replace
Description copied from class: DataFrameNaFunctions
(Scala-specific) Replaces values matching keys in replacement map.
    // Replaces all occurrences of 1.0 with 2.0 in column "height".
    df.na.replace("height", Map(1.0 -> 2.0));

    // Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
    df.na.replace("name", Map("UNKNOWN" -> "unnamed"));

    // Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
    df.na.replace("*", Map("UNKNOWN" -> "unnamed"));
- Specified by:
replace in class DataFrameNaFunctions<Dataset>
- Parameters:
col - name of the column to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
replacement - value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
- Returns:
(undocumented)
-
replace
public <T> Dataset<Row> replace(scala.collection.immutable.Seq<String> cols, scala.collection.immutable.Map<T, T> replacement)
Description copied from class: DataFrameNaFunctions
(Scala-specific) Replaces values matching keys in replacement map.
    // Replaces all occurrences of 1.0 with 2.0 in column "height" and "weight".
    df.na.replace("height" :: "weight" :: Nil, Map(1.0 -> 2.0));

    // Replaces all occurrences of "UNKNOWN" with "unnamed" in column "firstname" and "lastname".
    df.na.replace("firstname" :: "lastname" :: Nil, Map("UNKNOWN" -> "unnamed"));
- Specified by:
replace in class DataFrameNaFunctions<Dataset>
- Parameters:
cols - list of columns to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
replacement - value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
- Returns:
(undocumented)
-
replace
Description copied from class: DataFrameNaFunctions
Replaces values matching keys in replacement map with the corresponding values.
    import com.google.common.collect.ImmutableMap;

    // Replaces all occurrences of 1.0 with 2.0 in column "height".
    df.na().replace("height", ImmutableMap.of(1.0, 2.0));

    // Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
    df.na().replace("name", ImmutableMap.of("UNKNOWN", "unnamed"));

    // Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
    df.na().replace("*", ImmutableMap.of("UNKNOWN", "unnamed"));
- Overrides:
replace in class DataFrameNaFunctions<Dataset>
- Parameters:
col - name of the column to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
replacement - value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
- Returns:
(undocumented)
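The key-to-value substitution described above can be modelled on a single column in plain Java. This is a minimal sketch, not Spark's implementation: ReplaceSketch and replaceColumn are hypothetical names, a column is represented as a List, and leaving null cells untouched is an assumption of this sketch. A java.util.HashMap is used for the replacement map because it accepts null values, matching the note that the map value can have nulls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of replace on one column; it does not call Spark.
final class ReplaceSketch {

    // Replaces each non-null cell that equals a key of replacement with the
    // mapped value (which may be null). Cells without a matching key are kept.
    static <T> List<T> replaceColumn(List<T> column, Map<T, T> replacement) {
        List<T> out = new ArrayList<>(column.size());
        for (T cell : column) {
            boolean matches = cell != null && replacement.containsKey(cell);
            out.add(matches ? replacement.get(cell) : cell);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> replacement = new HashMap<>();
        replacement.put("UNKNOWN", "unnamed");
        replacement.put("n/a", null); // a null replacement value is allowed

        List<String> names = Arrays.asList("UNKNOWN", "bob", "n/a");
        System.out.println(replaceColumn(names, replacement)); // prints [unnamed, bob, null]
    }
}
```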
-
replace
Description copied from class: DataFrameNaFunctions
Replaces values matching keys in replacement map with the corresponding values.
    import com.google.common.collect.ImmutableMap;

    // Replaces all occurrences of 1.0 with 2.0 in column "height" and "weight".
    df.na().replace(new String[] {"height", "weight"}, ImmutableMap.of(1.0, 2.0));

    // Replaces all occurrences of "UNKNOWN" with "unnamed" in column "firstname" and "lastname".
    df.na().replace(new String[] {"firstname", "lastname"}, ImmutableMap.of("UNKNOWN", "unnamed"));
- Overrides:
replace in class DataFrameNaFunctions<Dataset>
- Parameters:
cols - list of columns to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
replacement - value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
- Returns:
(undocumented)
-