Package pyspark :: Package mllib :: Module linalg :: Class Vectors

Class Vectors


object --+
         |
        Vectors

Factory methods for working with vectors. Dense vectors are represented as plain NumPy array objects, so there is no need to convert them for use in MLlib. For sparse vectors, the factory methods in this class create an MLlib-compatible type; alternatively, users can pass in SciPy's scipy.sparse column vectors.

Instance Methods

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __init__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Static Methods
 
sparse(size, *args)
Create a sparse vector, using either a dictionary, a list of (index, value) pairs, or two separate arrays of indices and values (sorted by index).
 
dense(elements)
Create a dense vector of 64-bit floats from a Python list.
 
stringify(vector)
Converts a vector into a string, which can be recognized by Vectors.parse().

Properties

Inherited from object: __class__

Method Details

sparse(size, *args)
Static Method


Create a sparse vector, using either a dictionary, a list of
(index, value) pairs, or two separate arrays of indices and
values (sorted by index).

@param size: Size of the vector.
@param args: Non-zero entries, as a dictionary, list of tuples,
             or two sorted lists containing indices and values.

>>> print(Vectors.sparse(4, {1: 1.0, 3: 5.5}))
(4,[1,3],[1.0,5.5])
>>> print(Vectors.sparse(4, [(1, 1.0), (3, 5.5)]))
(4,[1,3],[1.0,5.5])
>>> print(Vectors.sparse(4, [1, 3], [1.0, 5.5]))
(4,[1,3],[1.0,5.5])
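
The three calling conventions above all describe the same non-zero entries. The following pure-Python sketch shows one way such arguments could be reduced to a single (size, indices, values) form. The helper name normalize_sparse_args and its range and length checks are assumptions made for illustration; this is not MLlib's implementation and the helper is not part of pyspark.

```python
def normalize_sparse_args(size, *args):
    """Return (size, indices, values) with indices in ascending order.

    Hypothetical helper for illustration only; not part of pyspark.mllib.
    """
    if len(args) == 1:
        pairs = args[0]
        # A dictionary maps index -> value; otherwise expect (index, value) pairs.
        if isinstance(pairs, dict):
            pairs = pairs.items()
        pairs = sorted(pairs)
        indices = [int(i) for i, _ in pairs]
        values = [float(v) for _, v in pairs]
    elif len(args) == 2:
        # Two parallel sequences that the caller has already sorted by index.
        indices = [int(i) for i in args[0]]
        values = [float(v) for v in args[1]]
        if len(indices) != len(values):
            raise ValueError("indices and values must have the same length")
    else:
        raise TypeError("expected a dict, a list of pairs, or two lists")
    # Assumed sanity check: every index must fit inside the declared size.
    if any(i < 0 or i >= size for i in indices):
        raise ValueError("index out of range for vector of size %d" % size)
    return size, indices, values


print(normalize_sparse_args(4, {3: 5.5, 1: 1.0}))        # (4, [1, 3], [1.0, 5.5])
print(normalize_sparse_args(4, [(1, 1.0), (3, 5.5)]))    # (4, [1, 3], [1.0, 5.5])
print(normalize_sparse_args(4, [1, 3], [1.0, 5.5]))      # (4, [1, 3], [1.0, 5.5])
```

In this sketch the dictionary and pair-list forms are sorted by the helper, while the two-list form is taken as given, which mirrors the "sorted by index" requirement stated above.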

dense(elements)
Static Method


Create a dense vector of 64-bit floats from a Python list. Always returns a NumPy array.

>>> Vectors.dense([1, 2, 3])
array([ 1.,  2.,  3.])

stringify(vector)
Static Method


Converts a vector into a string, which can be recognized by Vectors.parse().

>>> Vectors.stringify(Vectors.sparse(2, [1], [1.0]))
'(2,[1],[1.0])'
>>> Vectors.stringify(Vectors.dense([0.0, 1.0]))
'[0.0,1.0]'
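
The string layouts shown above can be reproduced with plain Python, which may help when generating or reading such strings outside of Spark. The sketch below is only an illustration of the two formats; the names format_sparse, format_dense and read_vector_string are hypothetical, and reading the text with ast.literal_eval is an assumption that happens to fit these layouts, not a description of how Vectors.parse() works.

```python
import ast


def format_dense(values):
    """Render a dense vector as '[v0,v1,...]' (hypothetical helper)."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def format_sparse(size, indices, values):
    """Render a sparse vector as '(size,[i0,...],[v0,...])' (hypothetical helper)."""
    idx = ",".join(str(int(i)) for i in indices)
    vals = ",".join(str(float(v)) for v in values)
    return "(%d,[%s],[%s])" % (size, idx, vals)


def read_vector_string(text):
    """Read either layout back into plain Python objects.

    Returns a list for the dense layout and a (size, indices, values)
    tuple for the sparse layout.
    """
    return ast.literal_eval(text)


print(format_sparse(2, [1], [1.0]))            # (2,[1],[1.0])
print(format_dense([0.0, 1.0]))                # [0.0,1.0]
print(read_vector_string("(2,[1],[1.0])"))     # (2, [1], [1.0])
print(read_vector_string("[0.0,1.0]"))         # [0.0, 1.0]
```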