Create your first project
You'll complete this simple workflow:
- Deploy the engine
- Connect to your data
- Run an analytics workload
- Suspend the project
Before you begin
- From your AI Unlimited admin, get these items:
  - AWS: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN`
  - Azure: `ARM_SUBSCRIPTION_ID`, `ARM_CLIENT_ID`, and `ARM_CLIENT_SECRET`
- Access the AI Unlimited manager and get your API key.
- Connect to JupyterLab, open a notebook, and select the AI Unlimited kernel.
  If you don't have JupyterLab or the AI Unlimited kernel, see Jupyter installation options.
Connect and run your first workload
Run `%help` for details on all magic commands, or `%help <command>` for details on a specific one. Or learn about the magic commands provided specifically by the AI Unlimited kernel.
- Connect to the AI Unlimited manager.
  This sample workflow connects without TLS, which means traffic between the notebook and the manager is not encrypted. The connection variables are explained in detail in the magic command reference.
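  A minimal example cell, assuming the `%workspaces_config` magic from the kernel reference; the host and API key values are placeholders, and you can confirm the exact syntax with `%help workspaces_config`:

  ```
  %workspaces_config host=<manager_hostname>, apikey=<your_API_key>, withtls=F
  ```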
- Create a new project.
  Name the project whatever you like, and specify the cloud service provider (CSP) where the engine will be deployed.
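  For example, a sketch assuming the `%project_create` magic from the kernel reference; the project name here is illustrative, and `%help project_create` lists the exact parameters:

  ```
  %project_create project=Sales_Demo, env=aws
  ```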
- Optionally, create an authorization object to store the CSP credentials.
  Normally you create a shared authorization, or one for a single user. An authorization is required only for external connectivity; this sample workload uses no external connection, so the step is optional here. Replace `ACCESS_KEY_ID`, `SECRET_ACCESS_KEY`, and `REGION` with your values.
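  A sketch assuming the `%project_auth_create` magic and AWS-style credentials; the authorization name is a placeholder, and an Azure project would use the corresponding ARM values instead:

  ```
  %project_auth_create name=sales_auth, project=Sales_Demo, key=ACCESS_KEY_ID, secret=SECRET_ACCESS_KEY, region=REGION
  ```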
- Deploy the engine.
  Replace `Project_Name` with the name of your project. The size can be small, medium, large, or extralarge; the default is small. The deployment process takes a few minutes, and it generates a password.
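  For example, assuming the `%project_engine_deploy` magic from the kernel reference (`%help project_engine_deploy` confirms the parameters):

  ```
  %project_engine_deploy name=Project_Name, size=small
  ```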
- Connect to the project.
  When prompted during the connection, enter the generated password.
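  A minimal example, assuming the `%connect` magic takes the project name as the connection name and prompts for the password:

  ```
  %connect Project_Name
  ```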
- Run the sample workload.
  The workload creates two new tables and loads them from CSV files included with the sample notebooks in JupyterLab.
  Note: Make sure you do not have tables named SalesCenter or SalesDemo in the selected database.
a. Create a table to store the sales center data.
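   An illustrative definition; the column names and types here are assumptions, so adjust them to match salescenter.csv:

   ```sql
   -- Assumed layout: an ID and a name for each sales center
   CREATE TABLE SalesCenter (
       Sales_Center_ID   INTEGER NOT NULL,
       Sales_Center_Name VARCHAR(255)
   )
   PRIMARY INDEX (Sales_Center_ID);
   ```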
b. Load data into the SalesCenter table using the `%dataload` magic command, then verify that the data was inserted.
   Note: Unable to locate the salescenter.csv file? Download the file from GitHub Demo: Charting and Visualization Data.
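   A sketch assuming the `%dataload` parameters from the kernel reference; replace the database name with yours, and confirm the syntax with `%help dataload`:

   ```
   %dataload DATABASE=<your_database>, TABLE=SalesCenter, FILEPATH=notebooks/sql/data/salescenter.csv
   ```

   A quick row count confirms the load:

   ```sql
   SELECT COUNT(*) FROM SalesCenter;
   ```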
c. Create a table with the sales demo data.
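   Again an illustrative definition; the columns are assumptions to adjust against salesdemo.csv:

   ```sql
   -- Assumed layout: sales figures keyed by sales center
   CREATE TABLE SalesDemo (
       Sales_Center_ID INTEGER NOT NULL,
       Units           DECIMAL(15, 4),
       Sales           DECIMAL(15, 2)
   )
   PRIMARY INDEX (Sales_Center_ID);
   ```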
d. Load data into the SalesDemo table using the `%dataload` magic command, then verify that the sales demo data was inserted successfully.
   Note: Unable to locate the salesdemo.csv file? Download the file from GitHub Demo: Charting and Visualization Data.
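   A sketch mirroring the previous load; the salesdemo.csv path is an assumption based on the sample repo layout:

   ```
   %dataload DATABASE=<your_database>, TABLE=SalesDemo, FILEPATH=notebooks/sql/data/salesdemo.csv
   ```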
Open the Navigator for your connection and verify that the tables were created. Run a row count on the tables to verify that the data was loaded.
e. Use charting magic to visualize the result.
Provide X and Y axes for your chart.
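   For example, a sketch assuming the `%chart` magic accepts the X and Y column names shown here; the column names come from the assumed table layout above, and the exact syntax is in the kernel's magic reference (`%help chart`):

   ```
   %chart Sales_Center_ID, Sales, title=Sales by center
   ```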
f. Drop the tables.
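   For example:

   ```sql
   DROP TABLE SalesCenter;
   DROP TABLE SalesDemo;
   ```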
- Back up your project metadata and object definitions in your Git repository.
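  A sketch assuming the `%project_backup` magic from the kernel reference:

  ```
  %project_backup project=Project_Name
  ```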
- Suspend the engine.
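  A sketch assuming the `%project_engine_suspend` magic from the kernel reference:

  ```
  %project_engine_suspend project=Project_Name
  ```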
You're done! You've run your first workload.