Highly effective Big Data Engineer with over 25 years of experience. I specialise in Big Data and have extensive experience building ETL / ELT pipelines using various technologies, including
AWS (EMR, Athena, S3, Redshift etc.), Azure (ADF, Blob Storage, Databricks, HDInsight) and the Hadoop stack (Spark with PySpark and Scala, HDFS, Hive, YARN). I also have extensive data visualization
experience using industry-leading tools including Power BI, Tableau, DOMO and OBIEE.
TECHNICAL SKILLS
• Big Data (Hortonworks and Cloudera) – Spark (PySpark, Scala), Kafka, Hive, Impala, NiFi, HDFS, Sqoop, Ranger, YARN, Solr, SAM, Schema Registry, Superset, Druid
• Languages: Python, Scala, R, JavaScript
• Data Visualization – Tableau, Power BI, OBIEE, DOMO
• Splunk & ELK Stack – Elasticsearch, Logstash, Filebeat, Kibana
• AWS – S3, Redshift, DynamoDB, Athena, Kinesis, EMR, Aurora, Glue, Lambda, Presto
• Azure – ADF, Databricks, Azure Data Warehouse, Polybase, SQL Server, HDInsight, SSIS, SSAS
• ETL – Informatica PowerCenter, Informatica Big Data Management (BDM), SSIS
• Data Science & Engineering – R / RStudio / SparkR packages: dplyr, ggplot2, stringr, plyr, caret, SparkR, NLP, tibble, TensorFlow, curl; Python / PySpark, MLlib
• Libraries: MLlib, NumPy, SciPy, Pandas, Matplotlib, Seaborn, scikit-learn
• DBMS / OLAP – Oracle, SQL Server, Teradata, MySQL, PostgreSQL, Essbase, SSAS
• Data Modelling – Kimball, Data Vault