org.apache.spark.graphx
Class GraphLoader

Object
  extended by org.apache.spark.graphx.GraphLoader
All Implemented Interfaces:
Logging

public class GraphLoader
extends Object
implements Logging

Provides utilities for loading Graphs from files.


Constructor Summary
GraphLoader()
           
 
Method Summary
static Graph<Object,Object> edgeListFile(SparkContext sc, String path, boolean canonicalOrientation, int numEdgePartitions, StorageLevel edgeStorageLevel, StorageLevel vertexStorageLevel)
          Loads a graph from an edge list formatted file where each line contains two integers: a source id and a target id.
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

GraphLoader

public GraphLoader()
Method Detail

edgeListFile

public static Graph<Object,Object> edgeListFile(SparkContext sc,
                                                String path,
                                                boolean canonicalOrientation,
                                                int numEdgePartitions,
                                                StorageLevel edgeStorageLevel,
                                                StorageLevel vertexStorageLevel)
Loads a graph from an edge list formatted file where each line contains two integers: a source id and a target id. Skips lines that begin with #.

If desired, the edges can be automatically oriented in the positive direction (source id < target id) by setting canonicalOrientation to true.
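
The line rules described above can be illustrated without Spark. The following is a minimal sketch in plain Java, not Spark's implementation: the class EdgeListSketch and its parse method are hypothetical names introduced only for this example. It assumes the two ids on a line are separated by whitespace and that blank lines are ignored; neither detail is stated in the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark: it mirrors only the documented line rules.
class EdgeListSketch {

    // Parses edge-list lines into {source, target} pairs.
    static List<long[]> parse(List<String> lines, boolean canonicalOrientation) {
        List<long[]> edges = new ArrayList<>();
        for (String line : lines) {
            // Documented rule: skip lines that begin with '#'.
            if (line.startsWith("#")) {
                continue;
            }
            // Assumption: blank lines are ignored.
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Assumption: the two ids are separated by whitespace.
            String[] fields = trimmed.split("\\s+");
            if (fields.length < 2) {
                throw new IllegalArgumentException("Invalid line: " + line);
            }
            long src = Long.parseLong(fields[0]);
            long dst = Long.parseLong(fields[1]);
            // Canonical orientation: store each edge so that source id < target id.
            if (canonicalOrientation && src > dst) {
                edges.add(new long[] {dst, src});
            } else {
                edges.add(new long[] {src, dst});
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("# comment line", "2 1", "1 3", "5 4");
        for (long[] edge : parse(lines, true)) {
            System.out.println(edge[0] + " -> " + edge[1]);
        }
    }
}
```

With canonicalOrientation set to true, the line "2 1" is stored as the edge (1, 2); with false, it is kept as (2, 1).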

Parameters:
sc - the SparkContext used to load the file
path - the path to the file (e.g., /home/data/file or hdfs://file)
canonicalOrientation - whether to orient edges in the positive direction
numEdgePartitions - the number of partitions for the edge RDD. Setting this value to -1 uses the default parallelism.
edgeStorageLevel - the desired storage level for the edge partitions
vertexStorageLevel - the desired storage level for the vertex partitions
Returns:
the graph built from the edges in the file, with every vertex and edge attribute set to 1