I am planning on getting the AWS Certified Data Engineer certification. This post collects the notes I plan to take throughout my preparation. Feel free to use it for your own preparation. This is a work in progress, so check back soon for further additions.

DISCLAIMER: The layout of these notes and the ordering of the content is strictly based on the official study guide provided by AWS, which is an excellent resource and I highly recommend checking it out. These notes are by no means an exhaustive explanation of the exam content.

Fundamentals of Analytics

Data Analytics is the whole process of taking raw data through a series of steps to extract meaningful information that could prove valuable to a company. Data Analysis is one small part of that process, where we analyze a single dataset to learn its trends and interpret them so that they lead to meaningful decisions. There are four types of data analytics:

  • Descriptive
  • Diagnostic
  • Predictive
  • Prescriptive

Descriptive Analytics

Descriptive Analytics aims to answer the “what” behind a dataset. This kind of analytics lets us understand what happened in historical datasets, but it does not provide predictions or explanations of why it happened. Descriptive analytics relies heavily on visualizations such as bar graphs and pie charts. Someone has to process the data, plot it, and interpret the result, so of the four types it requires the greatest human intervention and effort.
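To make the “what” concrete, here is a minimal sketch of a descriptive summary in Python. The monthly sales figures and the `describe` helper are made up for illustration; they are not from the AWS study guide.

```python
from statistics import mean

# Hypothetical monthly sales (units sold); invented for illustration only.
monthly_sales = {"Jan": 120, "Feb": 95, "Mar": 140, "Apr": 110}

def describe(sales):
    """Summarize what happened: totals, averages, and extremes."""
    return {
        "total": sum(sales.values()),
        "average": mean(sales.values()),
        "best_month": max(sales, key=sales.get),
        "worst_month": min(sales, key=sales.get),
    }

summary = describe(monthly_sales)
print(summary)
```

Notice that the summary only reports what happened. Deciding what it means is still left to the person reading it.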

Diagnostic Analytics

Diagnostic Analytics, on the other hand, tries to understand the reasons behind the behavior of a dataset. For historical datasets, it answers questions about why certain trends are present. The most common diagnostic techniques include drill-down, data discovery, data mining, and correlations. Diagnostic analytics requires human judgement to take the insights from the dataset and turn them into actionable information.
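Of the techniques listed above, correlation is the easiest to sketch. The snippet below computes a Pearson correlation coefficient by hand between two made-up series (`ad_spend` and `sales` are hypothetical). A value close to 1 suggests the two move together, but it is still up to a human to judge whether one actually explains the other.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical weekly figures; invented for illustration only.
ad_spend = [10, 20, 30, 40, 50]
sales = [12, 24, 33, 46, 55]

r = pearson(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.3f}")
```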

Predictive Analytics

Predictive Analytics tries to predict future outcomes and what could happen if certain decisions are taken. This requires intelligent, automated techniques such as machine learning, forecasting, pattern matching, and predictive modeling. With these techniques, computers learn relationships in the historical data and use them to predict what can happen in the future. For example, an e-commerce company can analyze its historical data to figure out which promotional offers have been successful, allowing it to anticipate future customer engagement and improve it accordingly.
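As a rough illustration of forecasting, the sketch below fits a straight line to past order counts with ordinary least squares and extrapolates one month ahead. The data and the `fit_line` helper are assumptions for the example; real predictive models are usually far richer than a single trend line.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical order counts for months 1 to 5; invented for illustration.
months = [1, 2, 3, 4, 5]
orders = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, orders)
forecast = slope * 6 + intercept  # predicted orders for month 6
print(f"forecast for month 6: {forecast:.0f}")
```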

Prescriptive Analytics

Prescriptive Analytics builds upon predictive analytics by taking it one step further and recommending actionable steps. Techniques such as machine learning, graph analysis, simulation, and complex event processing can evaluate different courses of action and choose the best one.
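The jump from predicting to prescribing can be sketched as a tiny simulation: take a predictive model, run it over a set of candidate actions, and pick the one with the best expected outcome. Everything here is hypothetical, including the demand model in `predicted_units`, the price, and the unit cost.

```python
def predicted_units(discount_pct):
    """Hypothetical demand model: each discount point adds 8 units sold."""
    return 100 + 8 * discount_pct

def expected_profit(discount_pct, price=20.0, unit_cost=12.0):
    """Profit for a given discount, using the assumed demand model."""
    selling_price = price * (1 - discount_pct / 100)
    return (selling_price - unit_cost) * predicted_units(discount_pct)

# Candidate actions to evaluate: discount percentages.
candidates = [0, 5, 10, 15, 20]

# Prescribe the action with the highest simulated profit.
best_discount = max(candidates, key=expected_profit)
print(f"recommended discount: {best_discount}%")
```

The predictive part only estimates demand. The prescriptive part is the last step, which compares the options and recommends one.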