R frontend for Spark


Documentation for package ‘SparkR’ version 2.0.0

Help Pages


-- A --

abs abs
acos acos
add_months add_months
AFTSurvivalRegressionModel-class S4 class that represents an AFTSurvivalRegressionModel
agg Count
agg summarize
alias alias
approxCountDistinct approxCountDistinct
approxQuantile crosstab
arrange Arrange
array_contains array_contains
as.data.frame Download data from a SparkDataFrame into a data.frame
as.data.frame-method Download data from a SparkDataFrame into a data.frame
as.DataFrame Create a SparkDataFrame
asc S4 class that represents a SparkDataFrame column
ascii ascii
asin asin
atan atan
atan2 atan2
attach Attach SparkDataFrame to R search path
attach-method Attach SparkDataFrame to R search path
avg avg
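
A minimal sketch of the aggregation and ordering entries above, assuming an initialized SparkR context (see sparkR.init / sparkRSQL.init under S); faithful is a base R dataset:

    df <- as.DataFrame(faithful)           # promote a local data.frame to a SparkDataFrame
    head(arrange(df, desc(df$waiting)))    # sort by a column, descending
    head(agg(df, mean(df$eruptions)))      # aggregate over the whole frame (alias of summarize)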

-- B --

base64 base64
between between
between S4 class that represents a SparkDataFrame column
bin bin
bitwiseNOT bitwiseNOT
bround bround

-- C --

cache Cache
cacheTable Cache Table
cancelJobGroup Cancel active jobs for the specified group
cast Casts the column to a different data type.
cast S4 class that represents a SparkDataFrame column
cbrt cbrt
ceil ceil
ceiling ceil
clearCache Clear Cache
clearJobGroup Clear current job group ID and its description
col Although Scala has a "col" function, it is not exposed in SparkR to avoid conflicting with the "col" function in the R base package; the exported "column" function is an alias of "col" (see the sketch at the end of this section).
collect Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
colnames Column names
colnames<- Column names
coltypes coltypes
coltypes<- coltypes
column Although Scala has a "col" function, it is not exposed in SparkR to avoid conflicting with the "col" function in the R base package; the exported "column" function is an alias of "col" (see the sketch at the end of this section).
Column-class S4 class that represents a SparkDataFrame column
columns Column names
columns Get schema object
concat concat
concat_ws concat_ws
contains S4 class that represents a SparkDataFrame column
conv conv
corr corr
corr crosstab
cos cos
cosh cosh
count count
count nrow
count-method Count
countDistinct Count Distinct
cov cov
cov crosstab
covar_pop covar_pop
covar_pop crosstab
covar_samp cov
covar_samp crosstab
crc32 crc32
createDataFrame Create a SparkDataFrame
createExternalTable Create an external table
crosstab crosstab
cume_dist cume_dist
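
As noted in the "col"/"column" entries above, SparkR exports "column" rather than "col"; a minimal sketch of column construction and casting, assuming a SparkDataFrame df with a string column age:

    c <- column("age")                                         # construct a Column by name
    df2 <- withColumn(df, "age_int", cast(df$age, "integer"))  # cast to a different data type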

-- D --

dapply dapply
dapplyCollect dapply
dataFrame S4 class that represents a SparkDataFrame
datediff datediff
date_add date_add
date_format date_format
date_sub date_sub
dayofmonth dayofmonth
dayofyear dayofyear
decode decode
dense_rank dense_rank
desc S4 class that represents a SparkDataFrame column
describe summary
dim Returns the dimensions (number of rows and columns) of a SparkDataFrame
distinct Distinct
drop drop
dropDuplicates dropDuplicates
dropna dropna
dropTempTable Drop Temporary Table
dtypes DataTypes
dtypes Get schema object
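
The dapply entries above run an R function over each partition of a SparkDataFrame; a minimal sketch, assuming an existing SparkDataFrame df:

    # identity function per partition; the output schema here equals the input schema
    df1 <- dapply(df, function(x) { x }, schema(df))
    # dapplyCollect applies the function and collects the result as a local data.frame
    ldf <- dapplyCollect(df, function(x) { x })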

-- E --

encode encode
endsWith S4 class that represents a SparkDataFrame column
except except
exp exp
explain Explain
explode explode
expm1 expm1
expr expr

-- F --

factorial factorial
fillna dropna
filter Filter
first Return the first row of a SparkDataFrame
fitted Get fitted result from a k-means model
fitted-method Get fitted result from a k-means model
floor floor
format_number format_number
format_string format_string
freqItems crosstab
from_unixtime from_unixtime
from_utc_timestamp from_utc_timestamp

-- G --

GeneralizedLinearRegressionModel-class S4 class that represents a generalized linear model
generateAliasesForIntersectedCols Creates a list of columns by replacing the intersected ones with aliases. The name of the alias column is formed by concatenating the original column name and a suffix.
getField S4 class that represents a SparkDataFrame column
getItem S4 class that represents a SparkDataFrame column
glm Fits a generalized linear model (R-compliant).
glm-method Fits a generalized linear model (R-compliant).
greatest greatest
groupBy GroupBy
groupedData S4 class that represents a GroupedData
GroupedData-class S4 class that represents a GroupedData
group_by GroupBy
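
A minimal sketch of the R-compliant glm interface and grouped aggregation listed above, assuming an initialized context (the iris-derived column names are illustrative; createDataFrame replaces dots in names with underscores):

    df <- createDataFrame(iris)
    model <- glm(Sepal_Length ~ Sepal_Width, data = df, family = "gaussian")
    summary(model)                                              # coefficients and fit statistics
    head(predict(model, df))                                    # adds a 'prediction' column
    head(agg(groupBy(df, df$Species), mean(df$Sepal_Length)))   # grouped aggregation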

-- H --

hash hash
hashCode Compute the hashCode of an object
head Head
hex hex
histogram Histogram
hour hour
hypot hypot

-- I --

ifelse ifelse
infer_type infer the SQL type
initcap initcap
insertInto insertInto
instr instr
intersect Intersect
is.nan is.nan
isLocal isLocal
isNaN S4 class that represents a SparkDataFrame column
isnan is.nan
isNotNull S4 class that represents a SparkDataFrame column
isNull S4 class that represents a SparkDataFrame column

-- J --

join Join
jsonFile Create a SparkDataFrame from a JSON file.
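
A minimal sketch of join, assuming two SparkDataFrames df1 and df2 sharing an id key:

    joined <- join(df1, df2, df1$id == df2$id, "inner")  # explicit join expression and join type
    head(joined)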

-- K --

KMeansModel-class S4 class that represents a KMeansModel
kurtosis kurtosis

-- L --

lag lag
last last
last_day last_day
lead lead
least least
length length
levenshtein levenshtein
like S4 class that represents a SparkDataFrame column
limit Limit
lit lit
loadDF Load a SparkDataFrame
locate locate
log log
log10 log10
log1p log1p
log2 log2
lower lower
lpad lpad
ltrim ltrim

-- M --

max max
md5 md5
mean mean
merge Merges two data frames
min min
minute minute
month month
months_between months_between
mutate Mutate

-- N --

n count
na.omit dropna
NaiveBayesModel-class S4 class that represents a NaiveBayesModel
names Column names
names<- Column names
nanvl nanvl
ncol Returns the number of columns in a SparkDataFrame
negate negate
next_day next_day
nrow nrow
ntile ntile
n_distinct Count Distinct

-- O --

orderBy Arrange
otherwise S4 class that represents a SparkDataFrame column
otherwise otherwise
over over

-- P --

parquetFile Create a SparkDataFrame from a Parquet file.
partitionBy partitionBy
percent_rank percent_rank
persist Persist
pmod pmod
predict Make predictions from a generalized linear model
predict-method Make predictions from a generalized linear model
print.jobj Print a JVM object reference.
print.structField Print a Spark StructField.
print.structType Print a Spark StructType.
print.summary.GeneralizedLinearRegressionModel Print the summary of GeneralizedLinearRegressionModel
printSchema Print Schema of a SparkDataFrame
printSchema Get schema object

-- Q --

quarter quarter

-- R --

rand rand
randn randn
rangeBetween rangeBetween
rank rank
rbind rbind
read.df Load a SparkDataFrame
read.jdbc Create a SparkDataFrame representing the database table accessible via JDBC URL
read.json Create a SparkDataFrame from a JSON file.
read.ml Load a fitted MLlib model from the input path.
read.parquet Create a SparkDataFrame from a Parquet file.
read.text Create a SparkDataFrame from a text file.
regexp_extract regexp_extract
regexp_replace regexp_replace
registerTempTable Register Temporary Table
rename rename
repartition Repartition
reverse reverse
rint rint
rlike S4 class that represents a SparkDataFrame column
round round
rowsBetween rowsBetween
row_number row_number
rpad rpad
rtrim rtrim
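
A minimal sketch of the read entries above; both file paths are illustrative:

    df <- read.json("examples/src/main/resources/people.json")   # hypothetical path
    printSchema(df)
    df2 <- read.df("data.csv", source = "csv", header = "true")  # hypothetical path and options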

-- S --

sample Sample
sampleBy crosstab
sample_frac Sample
saveAsParquetFile write.parquet
saveAsTable saveAsTable
saveDF Save the contents of the SparkDataFrame to a data source
schema Get schema object
sd sd
second second
select Select
select-method Select
selectExpr Select
selectExpr SelectExpr
setJobGroup Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setLogLevel Set new log level
sha1 sha1
sha2 sha2
shiftLeft shiftLeft
shiftRight shiftRight
shiftRightUnsigned shiftRightUnsigned
show show
show-method show
showDF showDF
sign signum
signum signum
sin sin
sinh sinh
size size
skewness skewness
sort_array sort_array
soundex soundex
spark.glm Fits a generalized linear model
spark.glm-method Fits a generalized linear model
spark.kmeans Fit a k-means model
spark.kmeans-method Fit a k-means model
spark.lapply Run a function over a list of elements, distributing the computations with Spark.
spark.naiveBayes Fit a Bernoulli naive Bayes model
spark.naiveBayes-method Fit a Bernoulli naive Bayes model
spark.survreg Fit an accelerated failure time (AFT) survival regression model.
spark.survreg-method Fit an accelerated failure time (AFT) survival regression model.
SparkDataFrame-class S4 class that represents a SparkDataFrame
sparkR.init Initialize a new Spark Context.
sparkR.stop Stop the Spark context.
sparkRHive.init Initialize a new HiveContext.
sparkRSQL.init Initialize a new SQLContext.
sql SQL Query
sqrt sqrt
startsWith S4 class that represents a SparkDataFrame column
stddev sd
stddev_pop stddev_pop
stddev_samp stddev_samp
str Compactly display the structure of a dataset
struct struct
structField structField
structType structType
subset Subset
substr substr
substring_index substring_index
sum sum
sumDistinct sumDistinct
summarize Count
summarize summarize
summary summary
summary-method summary
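
A minimal sketch of the MLlib wrappers listed above (spark.kmeans here; spark.glm, spark.naiveBayes, and spark.survreg follow the same fit/summary/predict pattern), with illustrative columns:

    df <- createDataFrame(iris)
    model <- spark.kmeans(df, ~ Sepal_Length + Sepal_Width, k = 3)  # fit a k-means model
    summary(model)            # cluster centers and sizes
    head(fitted(model))       # cluster assignments for the training data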

-- T --

tableNames Table Names
tables Tables
tableToDF Create a SparkDataFrame from a SparkSQL Table
take Take the first NUM rows of a SparkDataFrame and return the results as a data.frame
tan tan
tanh tanh
toDegrees toDegrees
toRadians toRadians
to_date to_date
to_utc_timestamp to_utc_timestamp
transform Mutate
translate translate
trim trim

-- U --

unbase64 unbase64
uncacheTable Uncache Table
unhex unhex
unionAll rbind
unique Distinct
unix_timestamp unix_timestamp
unpersist Unpersist
upper upper

-- V --

var var
variance var
var_pop var_pop
var_samp var_samp

-- W --

weekofyear weekofyear
when S4 class that represents a SparkDataFrame column
when when
where Filter
window window
window.orderBy window.orderBy
window.partitionBy window.partitionBy
WindowSpec-class S4 class that represents a WindowSpec
with Evaluate an R expression in an environment constructed from a SparkDataFrame
with-method Evaluate an R expression in an environment constructed from a SparkDataFrame
withColumn WithColumn
withColumnRenamed rename
write.df Save the contents of the SparkDataFrame to a data source
write.jdbc Saves the content of the SparkDataFrame to an external database table via JDBC
write.json write.json
write.ml Save the Bernoulli naive Bayes model to the input path.
write.parquet write.parquet
write.text write.text
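
A minimal sketch of the window entries above, assuming a SparkDataFrame df with columns dept and salary (both illustrative):

    ws <- orderBy(window.partitionBy("dept"), "salary")     # per-department ordering
    df2 <- select(df, df$dept, df$salary,
                  alias(over(rank(), ws), "rank_in_dept"))  # rank within each window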

-- Y --

year year

-- misc --

$ Select
$<- Select
%in% Match a column with given values.
[ Subset
[[ Subset