Installing Bigstream on AWS EMR

This document has 2 sections: 
1- SUPPORTED PLATFORMS
2- SOFTWARE INSTALLATION



1- SUPPORTED PLATFORMS

                      Supported in this release   Caveats
 Spark Release        Spark 2.1.1, Spark 2.2.0
 Storage / Streaming  HDFS, S3, Kafka
 Data types           AVRO, CSV, JSON             LZO compression is not supported in this release
 Operators            Dataframes/SQL              No RDD support


2- SOFTWARE INSTALLATION

Deploying Bigstream Hyper-acceleration software on your AWS EMR solution is quite simple.

If you have not already subscribed to our software, go to the AWS Marketplace product page at the URL below and follow the steps until your registration is complete.

http://bigstream.co/aws/hyper-acceleration


Once your subscription and registration are complete, you will be emailed an S3 bootstrap URI for deploying Bigstream software when you provision a new EMR cluster.  You can use the same URI every time you provision a new cluster.  If you do not add the URI when you provision a new EMR cluster, Bigstream software is not installed.


STEP 1- Start your EMR provisioning tool on AWS EMR.



STEP 2- After clicking "Create Cluster", select "Go to advanced options".


STEP 3- Set up your Software Configuration as usual.


STEP 4- Set up your Hardware Configuration and click "Next".

STEP 5- In the "Additional Options" field, select "Bootstrap Actions"

STEP 6- Then select "Custom Actions"

STEP 7- Click the "Configure and add" button.

STEP 8- Paste the deployment URI that was emailed to you in the "Script Location" field and click "Add".


STEP 9- Click the "Next" button.
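
For readers who script cluster creation instead of using the console, the steps above can be sketched in Python. This is a minimal sketch, not Bigstream's documented procedure: it only builds the keyword arguments that boto3's EMR `run_job_flow` call accepts. The function name, cluster name, instance types, instance count, and default role names are illustrative assumptions, and the check that the URI starts with `s3://` is also an assumption based on the description of the URI as an S3 bootstrap URI.

```python
from typing import Any, Dict


def build_emr_request(bootstrap_uri: str, release_label: str) -> Dict[str, Any]:
    """Build keyword arguments for boto3's EMR run_job_flow call.

    bootstrap_uri: the deployment URI emailed after registration.
    release_label: an EMR release that ships a supported Spark version.
    """
    # Assumption: the emailed deployment URI is an s3:// location.
    if not bootstrap_uri.startswith("s3://"):
        raise ValueError("bootstrap_uri must be an s3:// URI")

    return {
        "Name": "bigstream-cluster",  # hypothetical cluster name
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            # Hypothetical hardware configuration; size it for your workload.
            "MasterInstanceType": "m4.xlarge",
            "SlaveInstanceType": "m4.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Equivalent of steps 5 to 8: a custom bootstrap action whose
        # script location is the Bigstream deployment URI.
        "BootstrapActions": [
            {
                "Name": "Install Bigstream",
                "ScriptBootstrapAction": {"Path": bootstrap_uri, "Args": []},
            }
        ],
        # Assumption: the account uses the default EMR roles.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }
```

With AWS credentials configured, the result could be passed along as `boto3.client("emr").run_job_flow(**build_emr_request(uri, label))`. Keeping the request construction separate from the API call makes it possible to inspect the bootstrap action before any cluster is launched.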

Bigstream will be installed on your cluster automatically. You can then accelerate your Spark cluster with the following configuration:

--conf spark.bigstream.nsd.accelerate=true
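
As an illustration of where this flag goes, the sketch below assembles a `spark-submit` command line that includes it. The helper name `spark_submit_command` and the application file `my_job.py` are hypothetical; only the `--conf` setting itself comes from this document.

```python
import shlex
from typing import List, Optional

# The configuration setting given above.
ACCELERATE_CONF = "spark.bigstream.nsd.accelerate=true"


def spark_submit_command(app_path: str,
                         app_args: Optional[List[str]] = None) -> List[str]:
    """Return a spark-submit argument list with Bigstream acceleration enabled."""
    cmd = ["spark-submit", "--conf", ACCELERATE_CONF, app_path]
    # Application arguments, if any, follow the application path.
    cmd.extend(app_args or [])
    return cmd


if __name__ == "__main__":
    # my_job.py is a placeholder for your own Spark application.
    print(shlex.join(spark_submit_command("my_job.py")))
```

The list form can be handed directly to `subprocess.run` on the cluster's master node, which avoids shell-quoting problems when application arguments contain spaces.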


 
