No Code, All Insight: SageMaker Canvas Connects Data Analysts to Machine Learning
What is No-Code ML?
As a data scientist, I was always skeptical of no-code solutions: they usually offer either so little flexibility that they are practically useless, or so much that their UI/UX becomes impossible to navigate. Honestly, I gave SageMaker Canvas a try with the same mindset, but a few minutes in I had solved a real-world machine learning problem and had a model ready to deploy! Tools like this can help close the gap between the data scientists and data analysts on your team.
What is SageMaker Canvas?
No-code machine learning tools like Canvas provide a web application that lets users train and deploy machine learning models entirely through the browser, without writing a single line of code. Canvas strikes a good balance between being customizable and being easy to navigate. Currently, Canvas can handle these problem types:
Tabular Data for Regression or Classification
Image Data for Classification
Text Data for NLP Problems
Canvas also lets you use pre-trained models.
How to access the Canvas?
Go to the SageMaker dashboard and choose Canvas
Create a SageMaker Domain
Create a UserProfile (wait until the Domain status is InService)
Open Canvas (this may take a few minutes)
Note: Ensure you meet any prerequisites, like having an AWS account.
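If you would rather script the UserProfile step than click through the console, here is a minimal sketch in Python. It assumes the Domain has already been created, that AWS credentials are configured, and that you pass in a boto3 SageMaker client. The function name create_canvas_user and its polling defaults are my own choices for illustration, not anything Canvas prescribes.

```python
import time


def create_canvas_user(sm_client, domain_id, user_profile_name,
                       poll_seconds=15.0, timeout_seconds=900.0):
    """Wait for the Domain to reach InService, then create a UserProfile.

    sm_client is assumed to be a boto3 SageMaker client, e.g.
    boto3.client("sagemaker"). The helper name and defaults are hypothetical.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        # describe_domain reports the Domain status (Pending, InService, Failed, ...)
        status = sm_client.describe_domain(DomainId=domain_id)["Status"]
        if status == "InService":
            break
        if status == "Failed":
            raise RuntimeError(f"Domain {domain_id} is in Failed status")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Domain {domain_id} still {status} after waiting")
        time.sleep(poll_seconds)

    # The Domain is ready, so the UserProfile can be created now.
    response = sm_client.create_user_profile(
        DomainId=domain_id,
        UserProfileName=user_profile_name,
    )
    return response["UserProfileArn"]
```

A call might look like create_canvas_user(boto3.client("sagemaker"), "d-xxxxxxxxxxxx", "analyst-1"), where the Domain ID and profile name are placeholders. Opening Canvas itself still happens from the console.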
An example of using Amazon SageMaker Canvas to create custom models
How to use the Canvas?
AWS provides some very useful demos and learning materials for Canvas. A good first step is this package-tracking example, which solves a regression problem on tabular data: SageMaker Canvas Demo (awsplayer.com), or this AWS hands-on lab: Amazon SageMaker Canvas | Hands-on lab (awsplayer.com)
Amazon SageMaker Canvas offers a 2-month free tier under the AWS Free Tier. The pricing model charges based on session duration ($1.90/hour) plus the cost of training and/or deploying models. For detailed pricing, visit the AWS pricing page.
Some Notes on This Pricing:
It can be very expensive for large datasets compared to a solution that a data scientist codes and runs, so keep that in mind. For example, training a classification model on 1M rows and ten columns will cost you about $300 (!), whereas you could do the same for a few dollars if you coded it yourself!
The pricing for NLP and Computer Vision tasks is more reasonable. My explanation is that for tabular data, SageMaker uses its AutoML service, which trains 250 models in parallel to find the right one, so it can be costly!
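To get a feel for these numbers before you start, here is a rough back-of-the-envelope estimator in Python. The session rate is the $1.90/hour quoted above. The training rate is an assumption I derived from the example above (about $300 for 1M rows x 10 columns, so roughly $30 per million cells), applied as a flat rate. Actual AWS pricing may be tiered or may change, so treat this as a sketch and check the AWS pricing page for real figures.

```python
SESSION_RATE_USD_PER_HOUR = 1.90  # session rate quoted in this article

# Assumption: flat rate implied by the article's example
# (about $300 for 1M rows x 10 columns = 10M cells).
ASSUMED_TRAINING_USD_PER_MILLION_CELLS = 30.0


def estimate_canvas_cost(session_hours, rows, columns):
    """Rough estimate: session time plus tabular training cost, in USD."""
    cells = rows * columns
    session_cost = session_hours * SESSION_RATE_USD_PER_HOUR
    training_cost = cells / 1_000_000 * ASSUMED_TRAINING_USD_PER_MILLION_CELLS
    return round(session_cost + training_cost, 2)


# Two hours in Canvas, training on 1M rows x 10 columns
print(estimate_canvas_cost(2, 1_000_000, 10))
# Same session length, but on a 100K-row dataset
print(estimate_canvas_cost(2, 100_000, 10))
```

Under these assumptions, the training cost dominates on large tabular datasets, while the session charge is what you mostly pay on small ones.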
When to use it?
You are a data analyst who understands the problem but doesn’t have the time/expertise to code it yourself.
The dataset size is not huge (i.e., <100K rows)
The model and algorithm you want to use are relatively standard and nothing new or fancy.
Your company has a group of data scientists and analysts who want to collaborate. You can create models in SageMaker Canvas and then share them with data scientists to use in SageMaker Studio.
When to avoid it?
You are an experienced data scientist who is comfortable coding. Canvas can still be an easy solution for you if cost is not a significant factor!
You want to build a custom model with a specific architecture.