Cleared Databricks Spark Certification

Last week I cleared the Databricks Spark developer certification. The exam was not especially difficult, but it had some tricky questions and a number of Scala/Python Spark programming challenges.

1. Go through the ‘Learning Spark’ book and practice all the transformations and actions it covers.

2. The exam had 40 questions, all multiple choice (MCQ). I am not sure what the passing percentage is; I passed with a score of 70%.

3. About 90% of the questions were from Spark Core, Spark SQL and Spark Streaming. There were 2 questions on machine learning and 1 question on GraphX. For GraphX I referred to the MapR blog, which has a nice introductory explanation.


4. Make sure you get enough hands-on practice before appearing for the exam. Most of the questions were programming related, so you need a thorough understanding of how the API works in detail.


5. Try the examples in all three languages. You don’t need to be an expert, but you should be able to understand the code. In my case most of the questions were in Scala and Python.
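
For the hands-on practice recommended in the tips above, here is a minimal PySpark sketch that contrasts transformations (lazy) with actions (which trigger a job) on a basic RDD and a pair RDD. It is only an illustration, not a question from the exam: the function names `square`, `is_even` and `rdd_basics` and the sample data are made up, and it assumes pyspark is installed and runs in local mode.

```python
def square(x):
    """Plain Python function passed to map(); Spark calls it once per element."""
    return x * x


def is_even(x):
    """Predicate passed to filter()."""
    return x % 2 == 0


def rdd_basics():
    """Run a few transformations and actions in local mode (needs pyspark)."""
    # Imported inside the function so the helpers above work without Spark.
    from pyspark import SparkContext

    sc = SparkContext("local[2]", "rdd-basics")
    try:
        nums = sc.parallelize([1, 2, 3, 4, 5])

        # Transformations only describe the computation; nothing runs yet.
        even_squares = nums.map(square).filter(is_even)

        # Actions trigger the actual job and return values to the driver.
        collected = even_squares.collect()
        total = nums.reduce(lambda a, b: a + b)

        # Pair RDD: reduceByKey merges the values that share a key.
        pairs = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
        counts = dict(pairs.reduceByKey(lambda a, b: a + b).collect())

        return collected, total, counts
    finally:
        sc.stop()
```

Calling `rdd_basics()` in an environment with pyspark should return the even squares, the sum of the numbers and the per-key counts. It is worth rewriting the same steps in Scala, since many exam questions show Scala code.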

Topics one must cover:
  1. All transformations and actions on basic and pair RDDs.
  2. Accumulators and broadcast variables.
  3. Lazy evaluation model.
  4. Lineage graph in Spark.
  5. Word count in all three languages.
  6. Spark SQL: DataFrame examples with an inferred schema and a programmatically specified schema.
  7. Window operations in Spark Streaming, checkpointing, DStream recovery on failure and stateful transformations.
  8. Machine learning: K-means, regression, clustering.
  9. GraphX basics: vertex and edge RDDs, and triplets in graphs.
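
Several of the topics above can be practiced together in one small program. The following is a rough PySpark sketch, not an official example: it does a word count that uses a broadcast variable for stop words, an accumulator to count skipped words, and `toDebugString()` to look at the lineage before any job runs. The names `tokenize`, `word_count` and `update_count` are hypothetical helpers, and `sc` is assumed to be an existing SparkContext.

```python
import re


def tokenize(line):
    """Split a line into lowercase words; used with flatMap()."""
    return re.findall(r"[a-z']+", line.lower())


def word_count(sc, lines, stop_words):
    """Count words in `lines`, skipping `stop_words`; `sc` is a SparkContext."""
    stop = sc.broadcast(set(stop_words))  # read-only value shared with executors
    skipped = sc.accumulator(0)           # tasks add to it, the driver reads it

    def keep(word):
        if word in stop.value:
            skipped.add(1)
            return False
        return True

    counts = (sc.parallelize(lines)
                .flatMap(tokenize)
                .filter(keep)
                .map(lambda w: (w, 1))
                .reduceByKey(lambda a, b: a + b))

    lineage = counts.toDebugString()      # lineage description; no job has run yet
    result = dict(counts.collect())       # collect() is the action that runs the job
    return result, skipped.value, lineage


def update_count(new_values, running_count):
    """State update function of the kind passed to DStream.updateStateByKey()."""
    return sum(new_values) + (running_count or 0)
```

With a local context, `word_count(sc, ["to be or not to be"], ["to"])` should count `be` twice and `or` and `not` once each. Keep in mind that accumulator updates made inside transformations can be applied more than once if a task is re-executed, so the skipped count is best treated as a debugging aid. `update_count` only shows the shape of a stateful update function; using it with `updateStateByKey` also requires a checkpoint directory on the StreamingContext.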


Reference Links:

  • Apache Spark Documentation: http://spark.apache.org/
  • Databricks Knowledge Base: https://databricks.gitbooks.io/databricks-spark-knowledge-base/content/
  • Spark Training: https://spark-summit.org/2014/training





Comments

  1. Thanks for sharing the roadmap

  2. Hi Shashi,

    I have a few questions:

    1. What was your preparation strategy? Which books and material did you use?
    2. Is a distributed cluster required for hands-on practice, or is a standalone laptop (local mode) enough from the test perspective?
    3. Was the test a multiple-choice test, or a real-time whiteboard programming test?

    Thanks for sharing the details with us.

  3. Hi Anupam,

    I referred to the Learning Spark book and tried all the examples given in it; the Apache Spark documentation is also a good resource.

    I think a sandbox is good enough for practice.

    The test was MCQ based.

    Replies
    1. Thanks Shashi,
      I will appear for the exam next month and will let you know about my experience.

  4. Very well written. Good work, bro.
