Introduction

  • Starting an AI project

    • Workflow of projects

    • Selecting AI projects

    • Organizing data and team for the projects

Workflow of a machine learning project

How do you build, say, a speech recognition engine?

  • Key Steps:

    1. Collect data: audio clips of people saying “Alexa” and other words

    2. Train model: the model learns an A to B mapping, from audio clip (A) to word (B)

      • many iterations

    3. Deploy the model: put it into a smart speaker

      • Collect new data from the field (get data back) to maintain and update the model
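
The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not how a real speech engine works: the two-number feature vectors stand in for audio clips, and the nearest-average classifier and the names `train` and `predict` are assumptions made up for this sketch.

```python
# Toy stand-in for the collect -> train -> deploy loop.
# Real audio is replaced by hypothetical 2-number feature vectors.

def train(examples):
    """Learn an A -> B mapping: average the features (A) seen for each word (B)."""
    sums, counts = {}, {}
    for features, word in examples:
        acc = sums.setdefault(word, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[word] = counts.get(word, 0) + 1
    return {word: [s / counts[word] for s in acc] for word, acc in sums.items()}

def predict(model, features):
    """Return the word whose average features are closest to the input."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda word: distance(model[word]))

# 1. Collect data: (features, word) pairs.
collected = [([0.9, 0.1], "alexa"), ([0.8, 0.2], "alexa"),
             ([0.1, 0.9], "hello"), ([0.2, 0.8], "hello")]

# 2. Train the model.
model = train(collected)

# 3. Deploy: use the model on new input, then fold data that comes back
#    from the field into the dataset and retrain to maintain the model.
first_guess = predict(model, [0.85, 0.15])
collected.append(([0.5, 0.6], "hello"))  # new labelled clip from deployment
model = train(collected)
second_guess = predict(model, [0.45, 0.6])
```

The point of the sketch is the loop itself: data that comes back after deployment is added to the training data, and the model is retrained.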

How do you build, say, a self-driving car?

  • Key steps:

    1. Collect data: images → positions of other cars (draw rectangles around the cars)

    2. Train model: iterate until the model identifies cars precisely

    3. Deploy model: you may learn that, for example, golf carts are not identified and positioned well. Get data back and keep iterating.

Workflow of a data science project

Output: actionable insights

Optimize a sales funnel

  • Key steps:

    1. Collect data: where people are coming from, time of day, machine type, etc.

    2. Analyze the data: iterate many times to get good insights from the data collected.

    3. Suggest hypotheses/actions: Deploy changes, re-analyze new data periodically.
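
As a rough sketch of the "analyze the data" step for a sales funnel, the snippet below groups visits by one attribute and compares checkout rates. The visit log, its column layout, and the helper `conversion_by` are all hypothetical, invented for illustration.

```python
from collections import defaultdict

# Hypothetical visit log: (traffic source, hour of day, device, checked out?)
visits = [
    ("search", 9, "mobile", True),
    ("search", 14, "desktop", False),
    ("email", 20, "mobile", True),
    ("email", 21, "desktop", True),
    ("ads", 13, "mobile", False),
    ("ads", 22, "mobile", False),
]

def conversion_by(visits, column):
    """Share of visits ending in a checkout, grouped by one column index."""
    totals = defaultdict(int)
    checkouts = defaultdict(int)
    for visit in visits:
        key = visit[column]
        totals[key] += 1
        checkouts[key] += int(visit[3])
    return {key: checkouts[key] / totals[key] for key in totals}

by_source = conversion_by(visits, 0)
by_device = conversion_by(visits, 2)

# The group with the lowest rate is a candidate for a hypothesis or action.
weakest_source = min(by_source, key=by_source.get)
```

With real data, the same grouping would be repeated over many attributes and re-run periodically after changes are deployed.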

Optimizing a manufacturing line

  • Key steps:

    1. Collect data: clay supplier, mixing time, ingredients, lead times, relative humidity, temperature, kiln duration, etc.

    2. Analyze the data: iterate many times to get good insights from the data collected.

    3. Suggest hypotheses/actions: Deploy changes, re-analyze new data periodically.

Every job function needs to learn how to use data

  • Use data to optimize workflows through data-science-based analysis, and to take on tasks with machine learning that map inputs (A) to outputs (B). Remember the rule of thumb: tasks a person can do with less than a second of thought are good candidates.

  • From sales, recruiting, and marketing to agriculture and beyond, DS and ML are having a huge impact.

How to choose an AI project

  • Bring together a cross-functional team knowledgeable in AI, plus domain experts.

  • Brainstorming framework:

    • Think about automating “tasks” rather than automating “jobs.”

    • What are the main drivers of business value?

    • What are the main pain points in your business?

    • Note: you can make progress without big data

      • Having more data almost never hurts

      • Data makes some businesses [Google, Facebook, Netflix, Amazon] defensible.

      • With small datasets, you can still make progress. The amount of data you need is problem dependent.

  • Due diligence on project

    • Look for projects that are both feasible (what AI can do) and valuable for your business

      • Technical diligence

        • Can the AI system meet the desired performance (e.g. accuracy, speed)?

        • How much data is needed to meet the performance goals?

        • Engineering timeline

      • Business diligence

        • Current business: Lower costs

        • Current business: Increase revenue (e.g. getting more people to check out)

        • New business: New product or business

      • *Ethical diligence*

        • money vs impact on society

  • Build vs. buy

    • ML projects can be in-house or outsourced

    • DS projects are more commonly in-house

    • Some things will be industry standard; avoid building those.

  • “Don’t sprint in front of a train.”

    • Sometimes it makes more sense to adopt another’s platform or approach than to build your own, given resource constraints, capability constraints…

Working with an AI team

  • Specify your acceptance criteria

    • Goal: detect defects with 95% accuracy. How do you measure accuracy?

      • Test set (n = 1000): a labelled dataset, kept separate from the training data, used to measure performance.

    • Training Set: Pictures with labels

      • Learn mapping from A to B

    • Test Set: Another dataset used to test the learned mapping. The AI team will often ask for more than one test set.

  • Pitfall: expecting 100% accuracy. Discuss with the AI engineers what is reasonable.

    • Limitations of ML

    • Insufficient data

    • Mislabeled data

    • Ambiguous labels
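
The acceptance criterion above (for example, 95% accuracy on a test set) can be checked with a few lines of code. This is a minimal sketch: the function names, the label strings, and the made-up predictions are assumptions for illustration, and a real project would compare a trained model's outputs against labels from its own held-out test set.

```python
def accuracy(predictions, labels):
    """Fraction of test examples where the prediction matches the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def meets_acceptance(predictions, labels, target=0.95):
    """True when measured test-set accuracy reaches the agreed target."""
    return accuracy(predictions, labels) >= target

# Hypothetical test set of 1000 labelled pictures; the model gets 40 wrong.
labels = ["defect"] * 300 + ["ok"] * 700
predictions = ["ok"] * 40 + ["defect"] * 260 + ["ok"] * 700

test_accuracy = accuracy(predictions, labels)
accepted = meets_acceptance(predictions, labels)
```

Agreeing on the target and on the test set before training starts gives both sides the same yardstick. Mislabeled or ambiguous test labels put a ceiling on the accuracy that can be measured, which is one reason a 100% target is rarely realistic.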

Technical Tools for AI teams

  • CPU vs. GPU: GPUs are great for deep learning / neural networks (Nvidia is a well-known GPU maker)

  • Cloud vs. on-prem vs. edge: edge deployment puts the processor where the data is collected