withColumn {SparkR}    R Documentation

WithColumn

Description

Return a new SparkDataFrame by adding a column or replacing the existing column that has the same name.

Usage

withColumn(x, colName, col)

## S4 method for signature 'SparkDataFrame,character'
withColumn(x, colName, col)

Arguments

x

a SparkDataFrame.

colName

a column name.

col

a Column expression (which must refer only to this SparkDataFrame), or an atomic vector of length 1 to use as a literal value.

Details

Note: This method introduces a projection internally. Calling it repeatedly, for instance in a loop that adds one column per iteration, can therefore generate large query plans that cause performance issues and even a StackOverflowError. To avoid this, use select to add multiple columns at once.
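As a rough illustration of the advice above, the following sketch contrasts a loop of withColumn calls with a single select. The data, the column name col1, and the generated names col1_x1 to col1_x3 are hypothetical and chosen only for this example; it assumes a local Spark installation that sparkR.session() can start.

```r
library(SparkR)
sparkR.session()

# Hypothetical data: a small SparkDataFrame with one numeric column, col1.
df <- createDataFrame(data.frame(col1 = c(1, 2, 3)))

# Looping over withColumn: each iteration adds another projection to the plan.
looped <- df
for (k in 1:3) {
  looped <- withColumn(looped, paste0("col1_x", k), looped$col1 * k)
}

# Alternative: build every new Column expression first, then call select once.
existingCols <- lapply(columns(df), function(n) df[[n]])
newCols <- lapply(1:3, function(k) alias(df$col1 * k, paste0("col1_x", k)))
selected <- select(df, c(existingCols, newCols))

# Both results expose the same column names; explain() can be used to
# compare the plans Spark builds for each.
columns(looped)
columns(selected)
```

With only three columns the difference is negligible; the single select is mainly worth considering when the number of added columns is large or determined at run time.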

Value

A SparkDataFrame with the new column added or the existing column replaced.

Note

withColumn since 1.4.0

See Also

rename, mutate, subset

Other SparkDataFrame functions: SparkDataFrame-class, agg(), alias(), arrange(), as.data.frame(), attach,SparkDataFrame-method, broadcast(), cache(), checkpoint(), coalesce(), collect(), colnames(), coltypes(), createOrReplaceTempView(), crossJoin(), cube(), dapplyCollect(), dapply(), describe(), dim(), distinct(), dropDuplicates(), dropna(), drop(), dtypes(), exceptAll(), except(), explain(), filter(), first(), gapplyCollect(), gapply(), getNumPartitions(), group_by(), head(), hint(), histogram(), insertInto(), intersectAll(), intersect(), isLocal(), isStreaming(), join(), limit(), localCheckpoint(), merge(), mutate(), ncol(), nrow(), persist(), printSchema(), randomSplit(), rbind(), rename(), repartitionByRange(), repartition(), rollup(), sample(), saveAsTable(), schema(), selectExpr(), select(), showDF(), show(), storageLevel(), str(), subset(), summary(), take(), toJSON(), unionAll(), unionByName(), union(), unpersist(), withWatermark(), with(), write.df(), write.jdbc(), write.json(), write.orc(), write.parquet(), write.stream(), write.text()

Examples

## Not run: 
sparkR.session()
path <- "path/to/file.json"
df <- read.json(path)
# Add a new column computed from an existing one
newDF <- withColumn(df, "newCol", df$col1 * 5)
# Replace an existing column
newDF2 <- withColumn(newDF, "newCol", newDF$col1)
# Replace an existing column with a literal value
newDF3 <- withColumn(newDF, "newCol", 42)
# Use extract operator to set an existing or new column
df[["age"]] <- 23
df[[2]] <- df$col1
df[[2]] <- NULL # drop column
## End(Not run)

[Package SparkR version 3.1.1 Index]