Big Data ( Cassandra & Spark )

$250-750 USD

Closed
Posted about 9 years ago

Paid on delivery
We need to build a Cassandra cluster and an Apache Spark / Hadoop cluster in AWS, both to investigate the technology as a POC and to demo it to our clients.

Build the Cassandra cluster (3 nodes). It must be able to expand as needed: scaling the cluster should be a matter of minutes with zero or minimal configuration (you need to build the image for the x-node).

Build the initial app. This app will create the initial data structure and add the dummy data (randomly generated by you):
- Users table: [UserID, Username, FirstName, LastName]
- Accounts table (User --[one-to-many]--> Account): [AccountNo, Currency, Balance]
- Transaction table (Account --[one-to-many]--> Transaction): [TransactionID, time-stamp, details (String, 256 chars), category (String, 12 chars)]

Build the Spark cluster (same as Cassandra).

Build the update app. This app will continuously update the Transaction table with dummy transaction data (100-1000 transactions per second).

You need to install/configure a driver/pipe to make data from Cassandra available to Spark.

We will need a query app (Java + SQL) that connects to and executes some queries on both Cassandra and Spark.

TERMS & CONDITIONS
- Please bid only if you already have experience with Cassandra / Apache Spark / Hadoop clusters in AWS.
- Only SQL or Java are allowed (no Python or other scripts unless they are used for configuration, e.g. a bash script is welcome).
- Linux OS only.
- Provide documentation and screenshots of the steps taken to create the cluster, plus the scripts and any configuration on the nodes/AWS.
- We will provide an AWS account (for that we need a legal document from you, e.g. passport, ID, certificate, and a signed NDA).
- The use of "app" does not mean a mobile app; it means a small application. It will be used by skilled developers, so a command line is welcome (don't waste your time making a fancy UI).
- In this step Spark reads data from Cassandra, but in the near future Spark should be able to read from other sources, covering both structured and unstructured data (e.g. log files).
- The idea of this project is to show the speed and scalability of Cassandra / Spark. Speed is the key factor for a successful implementation, e.g. retrieving the last 3 years of transactions in a matter of 100-1000 milliseconds.
- This is a POC for us and for you (the contractor), so future work could follow.
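As a rough illustration of the update app described above, the following Java sketch generates dummy Transaction rows and paces them at a target rate. It is only a sketch under stated assumptions: the table name `transactions`, the column names, the category values, and the class `DummyTransactions` are all hypothetical, and the rows are printed as CQL text rather than sent through a Cassandra driver, which a real implementation would presumably do with prepared statements.

```java
import java.time.Instant;
import java.util.Locale;
import java.util.Random;
import java.util.UUID;
import java.util.concurrent.TimeUnit;

/** Sketch only: dummy rows for the Transaction table described in the posting. */
final class DummyTransactions {

    // Hypothetical category values; each fits the 12-character limit.
    static final String[] CATEGORIES = {"GROCERIES", "FUEL", "SALARY", "TRANSFER", "UTILITIES"};

    /** One dummy transaction. Field names mirror the posting; the Java types are assumptions. */
    static final class Txn {
        final long accountNo;
        final UUID transactionId;
        final Instant timestamp;
        final String details;   // at most 256 characters
        final String category;  // at most 12 characters

        Txn(long accountNo, UUID transactionId, Instant timestamp, String details, String category) {
            this.accountNo = accountNo;
            this.transactionId = transactionId;
            this.timestamp = timestamp;
            this.details = details;
            this.category = category;
        }
    }

    /** Cuts a string down to the given maximum length. */
    static String truncate(String s, int max) {
        return s.length() <= max ? s : s.substring(0, max);
    }

    /** Builds one random transaction for the given account. */
    static Txn randomTxn(Random rnd, long accountNo) {
        String category = CATEGORIES[rnd.nextInt(CATEGORIES.length)];
        String details = "Dummy " + category.toLowerCase(Locale.ROOT) + " entry #" + rnd.nextInt(1_000_000);
        return new Txn(accountNo, UUID.randomUUID(), Instant.now(),
                truncate(details, 256), truncate(category, 12));
    }

    /** Renders the row as a CQL INSERT; table and column names are hypothetical. */
    static String toCql(Txn t) {
        return String.format(Locale.ROOT,
                "INSERT INTO transactions (account_no, ts, transaction_id, details, category) "
                        + "VALUES (%d, %d, %s, '%s', '%s');",
                t.accountNo, t.timestamp.toEpochMilli(), t.transactionId,
                t.details.replace("'", "''"), t.category);
    }

    public static void main(String[] args) throws InterruptedException {
        int perSecond = 100;                       // low end of the 100-1000 range
        long intervalNanos = 1_000_000_000L / perSecond;
        Random rnd = new Random();
        long next = System.nanoTime();
        for (int i = 0; i < 10; i++) {             // bounded here; the real app would loop continuously
            System.out.println(toCql(randomTxn(rnd, 1 + rnd.nextInt(1000))));
            next += intervalNanos;                 // fixed schedule, so pacing does not drift
            long sleep = next - System.nanoTime();
            if (sleep > 0) {
                TimeUnit.NANOSECONDS.sleep(sleep);
            }
        }
    }
}
```

One possible design choice, again an assumption rather than part of the posting: keying the table by account number and ordering rows by timestamp within each account would let "the last 3 years of transactions for an account" be read as a single time-range query, which is the kind of access pattern the speed requirement seems to target.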
Project ID: 7261424

About the project

12 proposals
Remote project
Active 9 yrs ago

12 freelancers are bidding on average $1,926 USD for this job
George Bailey here from Los Angeles, USA. I have already done a similar project and am an expert in the skills this project requires. Please get back to me to discuss further and finalize the agreement. Regards,
$2,577 USD in 10 days
5.0 (10 reviews)
5.6
Hi, we are ready to help. Could you provide us with more details of your project? Best regards, Dmytro Usenko
$1,444 USD in 18 days
5.0 (1 review)
4.1
Hi, I am very interested in your project. I am a highly skilled Java developer with over 10 years of experience and am currently exploring big data technologies: Hadoop, Storm, Kafka, Spark, Cassandra, etc. I want to take this challenge to the next level.
$1,250 USD in 20 days
4.9 (2 reviews)
4.0
Hi, we are a team specialized in web development. We can do your project. For more information, visit our portfolio. Kind regards, William.
$1,250 USD in 20 days
5.0 (1 review)
0.7
A proposal has not yet been provided
$1,200 USD in 5 days
0.0 (0 reviews)
0.0
Hello sir, I have read your requirements carefully and am ready to start work now. I hope you will give me a chance to work with you. Waiting for your reply. Thank you, Bhadresh. Skype: bd_orbit
$2,500 USD in 25 days
0.0 (0 reviews)
0.0
I have 4+ years of working experience in the data mining and machine learning domain and a master's degree in computer science. I have worked on many projects, mainly in predictive analytics, natural language processing, text mining, web mining, etc. Expertise in R, Python, Hadoop, MapReduce, HBase, Hive, Pig, etc.
$1,500 USD in 30 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$5,555 USD in 5 days
0.0 (0 reviews)
0.0
Hi, I have a total of 5 years of experience in Hadoop and Big Data application development, as well as Hadoop administration, Hadoop cluster configuration, security configuration, HA cluster setup, disaster recovery cluster setup, YARN configuration, etc., using Cloudera, Hortonworks, and PivotalHD. I also have very good experience with cluster setup in the cloud (a different approach), cluster creation, and automatic deployment. Please contact me for more details about me, and let me know when we can schedule a demo session. I am open to answering your questions before taking the assignment. Thanks for your time!

I have done various industrial projects in Hadoop and Big Data using MapReduce, Pig, Hive, Sqoop, Oozie, Flume, Kafka, Cassandra, HBase, and Spark for reputed companies in the UK, France, and the USA. I have good knowledge of and experience with structured and unstructured data, and exposure to data migration from RDBMS to Hadoop and from Hadoop to RDBMS, as well as Big Data ETL frameworks. I am a certified Hadoop developer and consider myself a good candidate for the project. Looking forward to get in touch with you for further discus
$833 USD in 10 days
0.0 (0 reviews)
0.0
The installation/setup is trivial, as we recently went through this exercise to set up an environment similar to yours: AWS + Cassandra + Spark. We tried quite a few different configurations, including Mesos, but ended up with a standalone cluster managed by DataStax Max, which really simplifies the integration.

I have a few questions to clarify:
- You mentioned Java. Is Scala OK, given that it is Spark's native language and runs in a JVM?
- You mentioned SQL. Did you mean that SparkSQL needs to be used? In our setup, JDBC over SparkSQL was a pain and performance was not great, yet the Cassandra-Spark connector works quite nicely.
- Which version of Cassandra/Spark are you planning to use? Open source Apache, or DataStax Max, which includes Cassandra/Spark/Solr?
- Your last requirement is about performance/scalability. Spark is lightning fast once the data is in memory, but fetching it from disk/DB takes time, of course. So when you require "X rows to be processed in N seconds", do you mean the first access, or the performance once the data has been loaded?

Also, I must tell you that Spark is one of my "sweet spots". I'm working on a number of side projects related to Spark, in particular the Spark + Cassandra ecosystem. My goal is to expose the Spark/Spark Streaming API to analysts, not developers, so that they can create data flows easily. If you're interested, I could share more detail
$2,777 USD in 10 days
0.0 (0 reviews)
0.0
I have extensive experience in both cloud computing and Big Data. As you can see from my resume, I've been working in the industry for large corporate clients and governments, solving big data problems for massive organizations. I have experience with Hadoop: deploying the solution, configuring the infrastructure, and designing the overall solution architecture. I also have strong experience with AWS and deployment automation. I propose designing a mostly automated solution on AWS for your prototype/POC, as outlined in the requirements you have provided. Please see my proposed milestones for an outline of the key deliverables as the project progresses, with controlled risk for you throughout the process. I always strive towards a complete, holistic solution, so an early detailed assessment (under NDA and awarded contract) of your entire requirements and needs is important to me. I see this as a first step towards a potentially long-term, fruitful relationship between myself and your organization. Thank you for your time in reviewing my bid, and please don't hesitate to contact me with any further questions or clarifications. I look forward to hearing from you.
$1,222 USD in 15 days
0.0 (0 reviews)
0.0

About the client

Jeddah, Saudi Arabia
5.0
1
Payment method verified
Member since Jan 23, 2009

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)