Oliver Kaus
Creator of this blog.
Oct 7, 2021 4 min read

📚 Online Courses for Software Engineers and Data Scientists


Image by pxhere.com

Introduction

The online courses that I have taken are the following:

  • AWS Certified Solutions Architect - Associate by Linux Academy
  • Machine Learning Engineer Nanodegree by Udacity
  • Machine Learning DevOps Engineer by Udacity (in progress)

I will summarise and evaluate each course and outline my key learnings through a Data Scientist's lens.

AWS Certified Solutions Architect - Associate by Linux Academy

When I completed the course, it was hosted on Linux Academy. Since Linux Academy was acquired by A Cloud Guru, the course instructor, Adrian Cantrill, has hosted the course on his personal website.

Summary & Evaluation

Overall, the course was great. It went into more detail than the exam requires, but it offered excellent preparation for the AWS certification exam and provided an in-depth understanding of AWS services. On the downside, it took me ~3-4 months to complete the course.

Key Takeaways

This course, in combination with the AWS certification exam, was the most useful course that I have taken in my Data Science career. Understanding the AWS services at such a low level provided great benefits and made me stand out as a Data Scientist.

The course covered all of the most important AWS concepts:

  • Identity and Access Management (IAM)
  • Simple Storage Service (S3)
  • Elastic Compute Cloud (EC2)
  • Virtual Private Cloud (VPC)
  • Autoscaling and Launch Templates
  • Database Solutions including DynamoDB, RDS and Aurora
  • Application Services (SNS, SQS, Kinesis, IoT, SES, Step Functions)
  • Serverless Architecture (Lambda & API Gateway)
  • Monitoring, Deployment and Security

Machine Learning Engineer Nanodegree by Udacity

The course has since been renamed to AWS Machine Learning Engineer to reflect that SageMaker is used throughout the whole course. SageMaker was already the main tool when I completed it. The new course can be found here.

Summary & Evaluation

Udacity stands out by requiring students to complete assignments within SageMaker in order to pass the course. This was why I decided to try my first Udacity course, and I was not disappointed. The quality was very good and the videos were very helpful. What made this course so useful was getting my hands dirty by completing four assignments in SageMaker.

Key Takeaways

The key takeaway for me was understanding how to fully productionise a model, from data cleaning and feature engineering to model training and model deployment using API endpoints. Additionally, I learned how to call these endpoints and build a dashboard on top of them.

In detail, the course covered:

  • Software Engineering Practices
    • Writing clean, modular and efficient code and good documentation
    • Testing code and logging outputs
    • Object-oriented programming
    • Creating my own PyPI package
  • Model Deployment
    • Building and deploying a model within SageMaker using the SageMaker API
    • Tuning the model's hyperparameters using the SageMaker API
    • Deploying the model as an API endpoint
  • ML use cases
    • Productionising various models ranging from unsupervised models (PCA & k-means) to supervised models (simple linear regression, deep learning models in PyTorch)
  • My own student project
    • My project was about predicting, based on several features, whether an error in a tennis match was forced or unforced
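
To make the "deploy as an API endpoint, then call it" step above more concrete, here is a minimal sketch in Python. It is not taken from the course material. It assumes the model is deployed behind a SageMaker endpoint that accepts CSV rows and replies with comma- or newline-separated numbers, and that the client passed in behaves like `boto3.client("sagemaker-runtime")`. The names `to_csv_payload`, `predict`, `FakeRuntimeClient` and the endpoint name `demo-endpoint` are hypothetical and only there for illustration.

```python
import io
import logging

logger = logging.getLogger(__name__)


def to_csv_payload(rows):
    """Serialise a list of feature rows into CSV text, one row per line."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def predict(runtime_client, endpoint_name, rows):
    """Send feature rows to a deployed endpoint and return float predictions.

    runtime_client is assumed to behave like boto3.client("sagemaker-runtime").
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_payload(rows),
    )
    raw = response["Body"].read().decode("utf-8")
    logger.info("Endpoint %s returned %d characters", endpoint_name, len(raw))
    # Assumption: the reply is comma- or newline-separated numbers.
    tokens = raw.replace("\n", ",").split(",")
    return [float(token) for token in tokens if token.strip()]


class FakeRuntimeClient:
    """Hypothetical stand-in so the sketch runs without AWS credentials."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        # Return one dummy score of 0.5 per input row.
        n_rows = len(Body.splitlines())
        reply = ",".join("0.5" for _ in range(n_rows))
        return {"Body": io.BytesIO(reply.encode("utf-8"))}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(predict(FakeRuntimeClient(), "demo-endpoint", [[1, 2], [3, 4]]))
```

Passing the client in as an argument keeps the function easy to test and log, which ties in with the software engineering practices listed above. Against a real deployment, the fake client would be swapped for an actual runtime client and the name of the deployed endpoint.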

Machine Learning DevOps Engineer by Udacity

I was very excited to see a Machine Learning DevOps course when it launched in July 2021. I have not fully completed the course yet, so I will update this page when I have. You can find the course here.

Summary & Evaluation

The course is all about putting Machine Learning models into production. I will fill in the evaluation after I have completed the course.

Key Takeaways

The curriculum of the course is as follows:

  • Apply best coding principles
  • Building a Reproducible Model Workflow
  • Deploying a Scalable ML Pipeline in Production
  • ML Model Scoring and Monitoring

Conclusion

I can’t emphasize enough how useful I think these online courses are. We live in a century in which knowledge is freely available. The number of resources available online in the Software Engineering and Data Science field is exploding. This abundance can make it difficult to know where to start, which makes it even more important to lay a solid foundation through a well-structured online course.