org.apache.spark.streaming.kinesis
Class KinesisUtils

Object
  extended by org.apache.spark.streaming.kinesis.KinesisUtils

public class KinesisUtils
extends Object


Constructor Summary
KinesisUtils()
           
 
Method Summary
static JavaReceiverInputDStream<byte[]> createStream(JavaStreamingContext jssc, String streamName, String endpointUrl, Duration checkpointInterval, com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream, StorageLevel storageLevel)
          Create an input stream that pulls messages from a Kinesis stream.
static JavaReceiverInputDStream<byte[]> createStream(JavaStreamingContext jssc, String kinesisAppName, String streamName, String endpointUrl, String regionName, com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream, Duration checkpointInterval, StorageLevel storageLevel)
          Create an input stream that pulls messages from a Kinesis stream.
static JavaReceiverInputDStream<byte[]> createStream(JavaStreamingContext jssc, String kinesisAppName, String streamName, String endpointUrl, String regionName, com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream, Duration checkpointInterval, StorageLevel storageLevel, String awsAccessKeyId, String awsSecretKey)
          Create an input stream that pulls messages from a Kinesis stream.
static ReceiverInputDStream<byte[]> createStream(StreamingContext ssc, String streamName, String endpointUrl, Duration checkpointInterval, com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream, StorageLevel storageLevel)
          Create an input stream that pulls messages from a Kinesis stream.
static ReceiverInputDStream<byte[]> createStream(StreamingContext ssc, String kinesisAppName, String streamName, String endpointUrl, String regionName, com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream, Duration checkpointInterval, StorageLevel storageLevel)
          Create an input stream that pulls messages from a Kinesis stream.
static ReceiverInputDStream<byte[]> createStream(StreamingContext ssc, String kinesisAppName, String streamName, String endpointUrl, String regionName, com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream, Duration checkpointInterval, StorageLevel storageLevel, String awsAccessKeyId, String awsSecretKey)
          Create an input stream that pulls messages from a Kinesis stream.
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KinesisUtils

public KinesisUtils()
Method Detail

createStream

public static ReceiverInputDStream<byte[]> createStream(StreamingContext ssc,
                                                        String kinesisAppName,
                                                        String streamName,
                                                        String endpointUrl,
                                                        String regionName,
                                                        com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream,
                                                        Duration checkpointInterval,
                                                        StorageLevel storageLevel)
Create an input stream that pulls messages from a Kinesis stream. This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.

Note: The AWS credentials will be discovered using the DefaultAWSCredentialsProviderChain on the workers. See AWS documentation to understand how DefaultAWSCredentialsProviderChain gets the AWS credentials.

Parameters:
ssc - StreamingContext object
kinesisAppName - Kinesis application name used by the Kinesis Client Library (KCL) to update DynamoDB
streamName - Kinesis stream name
endpointUrl - Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
regionName - Name of region used by the Kinesis Client Library (KCL) to update DynamoDB (lease coordination and checkpointing) and CloudWatch (metrics)
initialPositionInStream - In the absence of Kinesis checkpoint info, this is the worker's initial starting position in the stream. The values are either the beginning of the stream per Kinesis' limit of 24 hours (InitialPositionInStream.TRIM_HORIZON) or the tip of the stream (InitialPositionInStream.LATEST).
checkpointInterval - Checkpoint interval for Kinesis checkpointing. See the Kinesis Spark Streaming documentation for more details on the different types of checkpoints.
storageLevel - Storage level to use for storing the received objects. StorageLevel.MEMORY_AND_DISK_2 is recommended.
Returns:
(undocumented)

createStream

public static ReceiverInputDStream<byte[]> createStream(StreamingContext ssc,
                                                        String kinesisAppName,
                                                        String streamName,
                                                        String endpointUrl,
                                                        String regionName,
                                                        com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream,
                                                        Duration checkpointInterval,
                                                        StorageLevel storageLevel,
                                                        String awsAccessKeyId,
                                                        String awsSecretKey)
Create an input stream that pulls messages from a Kinesis stream. This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.

Note: The given AWS credentials will get saved in DStream checkpoints if checkpointing is enabled. Make sure that your checkpoint directory is secure.

Parameters:
ssc - StreamingContext object
kinesisAppName - Kinesis application name used by the Kinesis Client Library (KCL) to update DynamoDB
streamName - Kinesis stream name
endpointUrl - Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
regionName - Name of region used by the Kinesis Client Library (KCL) to update DynamoDB (lease coordination and checkpointing) and CloudWatch (metrics)
awsAccessKeyId - AWS AccessKeyId (if null, will use DefaultAWSCredentialsProviderChain)
awsSecretKey - AWS SecretKey (if null, will use DefaultAWSCredentialsProviderChain)
checkpointInterval - Checkpoint interval for Kinesis checkpointing. See the Kinesis Spark Streaming documentation for more details on the different types of checkpoints.
initialPositionInStream - In the absence of Kinesis checkpoint info, this is the worker's initial starting position in the stream. The values are either the beginning of the stream per Kinesis' limit of 24 hours (InitialPositionInStream.TRIM_HORIZON) or the tip of the stream (InitialPositionInStream.LATEST).
storageLevel - Storage level to use for storing the received objects. StorageLevel.MEMORY_AND_DISK_2 is recommended.
Returns:
(undocumented)

createStream

public static ReceiverInputDStream<byte[]> createStream(StreamingContext ssc,
                                                        String streamName,
                                                        String endpointUrl,
                                                        Duration checkpointInterval,
                                                        com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream,
                                                        StorageLevel storageLevel)
Create an input stream that pulls messages from a Kinesis stream. This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.

Note: - The AWS credentials will be discovered using the DefaultAWSCredentialsProviderChain on the workers. See AWS documentation to understand how DefaultAWSCredentialsProviderChain gets AWS credentials. - The region of the endpointUrl will be used for DynamoDB and CloudWatch. - The Kinesis application name used by the Kinesis Client Library (KCL) will be the app name in SparkConf.

Parameters:
ssc - Java StreamingContext object
streamName - Kinesis stream name
endpointUrl - Endpoint url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
checkpointInterval - Checkpoint interval for Kinesis checkpointing. See the Kinesis Spark Streaming documentation for more details on the different types of checkpoints.
initialPositionInStream - In the absence of Kinesis checkpoint info, this is the worker's initial starting position in the stream. The values are either the beginning of the stream per Kinesis' limit of 24 hours (InitialPositionInStream.TRIM_HORIZON) or the tip of the stream (InitialPositionInStream.LATEST).
storageLevel - Storage level to use for storing the received objects StorageLevel.MEMORY_AND_DISK_2 is recommended.
Returns:
(undocumented)

createStream

public static JavaReceiverInputDStream<byte[]> createStream(JavaStreamingContext jssc,
                                                            String kinesisAppName,
                                                            String streamName,
                                                            String endpointUrl,
                                                            String regionName,
                                                            com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream,
                                                            Duration checkpointInterval,
                                                            StorageLevel storageLevel)
Create an input stream that pulls messages from a Kinesis stream. This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.

Note: The AWS credentials will be discovered using the DefaultAWSCredentialsProviderChain on the workers. See AWS documentation to understand how DefaultAWSCredentialsProviderChain gets the AWS credentials.

Parameters:
jssc - Java StreamingContext object
kinesisAppName - Kinesis application name used by the Kinesis Client Library (KCL) to update DynamoDB
streamName - Kinesis stream name
endpointUrl - Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
regionName - Name of region used by the Kinesis Client Library (KCL) to update DynamoDB (lease coordination and checkpointing) and CloudWatch (metrics)
checkpointInterval - Checkpoint interval for Kinesis checkpointing. See the Kinesis Spark Streaming documentation for more details on the different types of checkpoints.
initialPositionInStream - In the absence of Kinesis checkpoint info, this is the worker's initial starting position in the stream. The values are either the beginning of the stream per Kinesis' limit of 24 hours (InitialPositionInStream.TRIM_HORIZON) or the tip of the stream (InitialPositionInStream.LATEST).
storageLevel - Storage level to use for storing the received objects. StorageLevel.MEMORY_AND_DISK_2 is recommended.
Returns:
(undocumented)

createStream

public static JavaReceiverInputDStream<byte[]> createStream(JavaStreamingContext jssc,
                                                            String kinesisAppName,
                                                            String streamName,
                                                            String endpointUrl,
                                                            String regionName,
                                                            com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream,
                                                            Duration checkpointInterval,
                                                            StorageLevel storageLevel,
                                                            String awsAccessKeyId,
                                                            String awsSecretKey)
Create an input stream that pulls messages from a Kinesis stream. This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.

Note: The given AWS credentials will get saved in DStream checkpoints if checkpointing is enabled. Make sure that your checkpoint directory is secure.

Parameters:
jssc - Java StreamingContext object
kinesisAppName - Kinesis application name used by the Kinesis Client Library (KCL) to update DynamoDB
streamName - Kinesis stream name
endpointUrl - Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
regionName - Name of region used by the Kinesis Client Library (KCL) to update DynamoDB (lease coordination and checkpointing) and CloudWatch (metrics)
awsAccessKeyId - AWS AccessKeyId (if null, will use DefaultAWSCredentialsProviderChain)
awsSecretKey - AWS SecretKey (if null, will use DefaultAWSCredentialsProviderChain)
checkpointInterval - Checkpoint interval for Kinesis checkpointing. See the Kinesis Spark Streaming documentation for more details on the different types of checkpoints.
initialPositionInStream - In the absence of Kinesis checkpoint info, this is the worker's initial starting position in the stream. The values are either the beginning of the stream per Kinesis' limit of 24 hours (InitialPositionInStream.TRIM_HORIZON) or the tip of the stream (InitialPositionInStream.LATEST).
storageLevel - Storage level to use for storing the received objects. StorageLevel.MEMORY_AND_DISK_2 is recommended.
Returns:
(undocumented)

createStream

public static JavaReceiverInputDStream<byte[]> createStream(JavaStreamingContext jssc,
                                                            String streamName,
                                                            String endpointUrl,
                                                            Duration checkpointInterval,
                                                            com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream initialPositionInStream,
                                                            StorageLevel storageLevel)
Create an input stream that pulls messages from a Kinesis stream. This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.

Note: - The AWS credentials will be discovered using the DefaultAWSCredentialsProviderChain on the workers. See AWS documentation to understand how DefaultAWSCredentialsProviderChain gets AWS credentials. - The region of the endpointUrl will be used for DynamoDB and CloudWatch. - The Kinesis application name used by the Kinesis Client Library (KCL) will be the app name in SparkConf.

Parameters:
jssc - Java StreamingContext object
streamName - Kinesis stream name
endpointUrl - Endpoint url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
checkpointInterval - Checkpoint interval for Kinesis checkpointing. See the Kinesis Spark Streaming documentation for more details on the different types of checkpoints.
initialPositionInStream - In the absence of Kinesis checkpoint info, this is the worker's initial starting position in the stream. The values are either the beginning of the stream per Kinesis' limit of 24 hours (InitialPositionInStream.TRIM_HORIZON) or the tip of the stream (InitialPositionInStream.LATEST).
storageLevel - Storage level to use for storing the received objects StorageLevel.MEMORY_AND_DISK_2 is recommended.
Returns:
(undocumented)