Google’s Professional Data Engineer certification exam is said to be difficult. I didn’t quite believe it until I took the exam myself. Now I can confirm that it is indeed a difficult exam, and I don’t think it can be passed without thorough preparation and a solid engineering mindset.
I want to share my own preparation strategy as plainly as possible, with very little sugar-coating.
Make your plan
Think about how long you want to prepare for the exam. How exactly are you going to implement this plan, considering your daily and weekly schedule?
Set overall and weekly goals for your target stages. Review this plan as needed. As you prepare, you will come across new material that you want to go through. Fitting these new items into your plan will shift things.
Setting timed goals lets you set aside dedicated time for them. As you know, interesting new things to do always pop up in life. When you have scheduled goals, you evaluate how you use your time with those goals in mind. If doing that interesting new thing means postponing the time you had reserved for your goal, you can weigh the opportunity cost and make an informed decision.
Follow the courses
All the existing blog posts about how to prepare for this certification already mention the following courses. Here’s what I think about them:
Google Cloud’s own course set on Coursera
I watched the video courses and did the labs. Thinking back now, I could have progressed faster with these. Maybe do only a few labs to get a feeling for the Google Cloud platform, and skip most of the rest. Most labs contain repetitive tasks, and I don’t think there was a need to do them over and over for the purpose of exam preparation.
I did the quizzes too. Except for the last course’s longer quiz, which is supposed to serve as a practice exam, they were very easy. So they were useless for exam preparation; they could have been designed much better, to be more instructive.
Linux Academy
I didn’t watch most of the videos, as I felt I had done enough watching. I went through their data dossier. It is like the study notes of a hardworking student, so I found it quite useful.
I did all their quizzes and practice exams, and their questions were much better than Google’s. I did them over and over, because from time to time I got different questions. I kept going until I was comfortable with all the questions.
I took screenshots of the questions that I either got wrong or got right but felt I might want to redo. I pasted these screenshots into my study document.
Needless to say, whenever I didn’t understand why I had answered a question wrong, I went over the related documentation of that Google product. Deliberately learning from your mistakes is very important when you are trying to learn anything.
I went through the study cards of peer students on Linux Academy’s platform. Again, I took screenshots of the cards that I wanted to go over again and pasted them into my study document. By the way, not everything written on the study cards was correct. Don’t forget that these are prepared by peers who are also studying for the certification, so take them with a grain of salt. When you feel uneasy about something you read, double-check it against the documentation.
Dive deep into Google Documentation
This was by far my best source to study for the exam. Yes, it is a bottomless pit. It doesn’t have boundaries. You don’t know how deep you should dive. You don’t know when to stop. You don’t know which parts are important and which parts you can skip. So you are on your own, and you will have to find the answers to these questions using your critical mind.
Besides the product pages, here’s what I read and re-read, at a minimum, for each product:
- Best practices
- Scenarios
- Use cases
- Importing data into this product from other ecosystem products
- Exporting data out of this product into other ecosystem products
- Pricing comparison between related products
- Feature comparison of Google platform products against Hadoop ecosystem products
- The scenarios in which it makes sense to prefer the Google solution
- The scenarios in which it makes sense to prefer the alternative
Again, I took screenshots of the sections I deemed important and pasted them into my study document. Copy-pasting screenshots is much faster than copy-pasting text. It also preserves the formatting, which makes for a better reading experience.
Read Google Solutions Blog Posts
I read quite a few articles that I found interesting on Google’s Cloud Solutions Architecture Reference, covering the GCP products I was interested in. Some were migration stories, others were performance improvement stories or stories about integrating legacy and newer systems. I enjoy reading architecture articles in any case, so they broadened my horizons, and I’m sure they helped my understanding of the products in some way.
Listen to Google Cloud Platform Podcast
I either bike to work or take the bus. Either way, my commutes are a good opportunity to listen to podcasts. I also listen to them at home when my hands are busy but my ears are not.
I subscribed to the GCP podcast, and until I took the exam, whenever I had the chance to listen to something, I chose something related to GCP. This meant that for this period I couldn’t listen to my beloved Planet Money or Bullseye. For the sake of focus, I made this sacrifice.
Listen to Data Engineering Podcasts covering GCP products
I subscribe to a few podcasts covering data engineering and machine learning engineering topics. I listened to the episodes in which guests talked about their experience of using or migrating to a GCP product: how they decided to use it, how they incorporated it into their workflows, what they achieved, what problems they had along the way, and so on. Listening to these helped me a lot, in ways that are rather hard to describe. For instance, sometimes the host or the guest says something that answers a question on your mind, or something makes you curious about a topic; then you read up on it and learn something that in turn improves your understanding of that topic.
Collect and manage online material
I created a bookmark folder and kept everything that I wanted to read in it. Inside it, I created another folder for the items I had already read. As I read each piece, I moved it from the parent folder to this read-items folder, because I didn’t want to delete the bookmark. Moving items from one folder to another also gives you a sense of progress. You may feel like you are doing well with your preparation goals.
Review constantly
I exported my study document, with all its pasted screenshots, to a PDF and put it on my phone. This way, whenever I had some time, I could review it easily. Waiting in a queue, waiting for my commute, and the commute itself were all good opportunities.
This document was very large. I didn’t feel like going over all of its parts at a given time. Sometimes I felt like reviewing the practice questions, sometimes the study cards and sometimes the screenshots from the Google documentation. So making the study document navigable with links was a good idea.
Find free practice questions
You can find some sets of free practice questions here and there. Some of them are poor, but you can still benefit from some of the questions. The actual exam questions will be much more complex and harder, but these will still warm up your engine.
Prepare mentally for the difficulty
In my exam, around 10 of the 50 questions were super easy, the kind where you have the answer in two seconds. About 20 were of normal difficulty and pretty doable if you are comfortable with all the capabilities of the GCP products. The remaining 20 were quite tough and consumed more than half of my exam time.
Here, tough means that the question describes a complex case and you are not sure whether the products actually have the features claimed in the answer options. In these cases, an engineering perspective will be the best tool in your belt. For instance, when you are given an I/O-bound performance problem, you should know not to solve it as if it were a CPU-bound performance problem.
For these tough questions, I can’t even formulate how to prepare. You cannot know the details of every Hadoop product in action, as you mostly focus on the GCP products, and you cannot know every option of every command in the gcloud SDK. So the best thing to do is to stay calm and use your engineering mindset to make your best guess.
So, embrace the fact that this exam will be much harder than all the practice questions you’ve done, and that you will read many forum posts from people who failed it. Prepare yourself mentally, but don’t get demotivated. If you are properly prepared for the exam, focus on your inner strength and let your previous achievements be your spring of confidence for the things you will achieve next.
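To make the I/O-bound versus CPU-bound distinction a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not something taken from the exam: the names fake_io_task, run_sequential and run_threaded are made up for this example, and time.sleep stands in for a real network or disk call. The point is that when tasks mostly wait, overlapping the waits helps, and a faster CPU would not.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay: float) -> float:
    """Hypothetical stand-in for an I/O-bound call: it only waits."""
    time.sleep(delay)
    return delay


def run_sequential(delays):
    """Run the tasks one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for d in delays:
        fake_io_task(d)
    return time.perf_counter() - start


def run_threaded(delays):
    """Run the tasks concurrently in threads and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        # list() forces all results, so we wait for every task to finish.
        list(pool.map(fake_io_task, delays))
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.05] * 8  # eight simulated I/O calls of 50 ms each
    print(f"sequential: {run_sequential(delays):.2f}s")
    print(f"threaded:   {run_threaded(delays):.2f}s")
```

The threaded run finishes sooner because the waits overlap. For a CPU-bound task the same thread pool would help little in standard CPython, because of the global interpreter lock; there the remedy is more cores or separate processes. That is the kind of reasoning the tough questions ask for.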
Enjoy your swag!
One of the best parts of earning your certificate will be reading the email from Google asking you to pick your well-deserved swag from the Google Merchandise Store. I got myself a hoodie, and whenever I wear it, its cuddly comfort combines with the inner comfort of having achieved this.