Python® for Data Science for Beginners

(PYTHON-DS.AE1) / ISBN: 978-1-64459-462-9

This course includes
Lessons
TestPrep
Hands-On Labs

Lessons

23+ Lessons | 40+ Exercises | 170+ Quizzes | 76+ Flashcards | 76+ Glossary Terms

Hands-On Labs

30+ LiveLabs | 15+ Video Tutorials | 24+ Minutes

Here's what you will learn

Lesson 1: Introduction

  • About This Course
  • False Assumptions
  • Icons Used in This Course
  • Where to Go from Here

Lesson 2: Discovering the Match between Data Science and Python

  • Defining the Sexiest Job of the 21st Century
  • Creating the Data Science Pipeline
  • Understanding Python’s Role in Data Science
  • Learning to Use Python Fast

Lesson 3: Introducing Python’s Capabilities and Wonders

  • Why Python?
  • Working with Python
  • Performing Rapid Prototyping and Experimentation
  • Considering Speed of Execution
  • Visualizing Power
  • Using the Python Ecosystem for Data Science

Lesson 4: Setting Up Python for Data Science

  • Considering the Off-the-Shelf Cross-Platform Scientific Distributions
  • Installing Anaconda on Windows
  • Installing Anaconda on Linux
  • Installing Anaconda on Mac OS X
  • Downloading the Datasets and Example Code

Lesson 5: Working with Google Colab

  • Defining Google Colab
  • Getting a Google Account
  • Working with Notebooks
  • Performing Common Tasks
  • Using Hardware Acceleration
  • Executing the Code
  • Viewing Your Notebook
  • Sharing Your Notebook
  • Getting Help

Lesson 6: Understanding the Tools

  • Using the Jupyter Console
  • Using Jupyter Notebook
  • Performing Multimedia and Graphic Integration

Lesson 7: Working with Real Data

  • Uploading, Streaming, and Sampling Data
  • Accessing Data in Structured Flat-File Form
  • Sending Data in Unstructured File Form
  • Managing Data from Relational Databases
  • Interacting with Data from NoSQL Databases
  • Accessing Data from the Web

Lesson 8: Conditioning Your Data

  • Juggling between NumPy and pandas
  • Validating Your Data
  • Manipulating Categorical Variables
  • Dealing with Dates in Your Data
  • Dealing with Missing Data
  • Slicing and Dicing: Filtering and Selecting Data
  • Concatenating and Transforming
  • Aggregating Data at Any Level

Lesson 9: Shaping Data

  • Working with HTML Pages
  • Working with Raw Text
  • Using the Bag of Words Model and Beyond
  • Working with Graph Data

Lesson 10: Putting What You Know in Action

  • Contextualizing Problems and Data
  • Considering the Art of Feature Creation
  • Performing Operations on Arrays

Lesson 11: Getting a Crash Course in Matplotlib

  • Starting with a Graph
  • Setting the Axis, Ticks, Grids
  • Defining the Line Appearance
  • Using Labels, Annotations, and Legends

Lesson 12: Visualizing the Data

  • Choosing the Right Graph
  • Creating Advanced Scatterplots
  • Plotting Time Series
  • Plotting Geographical Data
  • Visualizing Graphs

Lesson 13: Stretching Python’s Capabilities

  • Playing with Scikit-learn
  • Performing the Hashing Trick
  • Considering Timing and Performance
  • Running in Parallel on Multiple Cores

Lesson 14: Exploring Data Analysis

  • The EDA Approach
  • Defining Descriptive Statistics for Numeric Data
  • Counting for Categorical Data
  • Creating Applied Visualization for EDA
  • Understanding Correlation
  • Modifying Data Distributions

Lesson 15: Reducing Dimensionality

  • Understanding SVD
  • Performing Factor Analysis and PCA
  • Understanding Some Applications

Lesson 16: Clustering

  • Clustering with K-means
  • Performing Hierarchical Clustering
  • Discovering New Groups with DBSCAN

Lesson 17: Detecting Outliers in Data

  • Considering Outlier Detection
  • Examining a Simple Univariate Method
  • Developing a Multivariate Approach

Lesson 18: Exploring Four Simple and Effective Algorithms

  • Guessing the Number: Linear Regression
  • Moving to Logistic Regression
  • Making Things as Simple as Naïve Bayes
  • Learning Lazily with Nearest Neighbors

Lesson 19: Performing Cross-Validation, Selection, and Optimization

  • Pondering the Problem of Fitting a Model
  • Cross-Validating
  • Selecting Variables Like a Pro
  • Pumping Up Your Hyperparameters

Lesson 20: Increasing Complexity with Linear and Nonlinear Tricks

  • Using Nonlinear Transformations
  • Regularizing Linear Models
  • Fighting with Big Data Chunk by Chunk
  • Understanding Support Vector Machines
  • Playing with Neural Networks

Lesson 21: Understanding the Power of the Many

  • Starting with a Plain Decision Tree
  • Making Machine Learning Accessible
  • Boosting Predictions

Lesson 22: Ten Essential Data Resources

  • Discovering the News with Subreddit
  • Getting a Good Start with KDnuggets
  • Locating Free Learning Resources with Quora
  • Gaining Insights with Oracle’s Data Science Blog
  • Accessing the Huge List of Resources on Data Science Central
  • Learning New Tricks from the Aspirational Data Scientist
  • Obtaining the Most Authoritative Sources at Udacity
  • Receiving Help with Advanced Topics at Conductrics
  • Obtaining the Facts of Open Source Data Science from Masters
  • Zeroing In on Developer Resources with Jonathan Bower

Lesson 23: Ten Data Challenges You Should Take

  • Meeting the Data Science London + Scikit-learn Challenge
  • Predicting Survival on the Titanic
  • Finding a Kaggle Competition that Suits Your Needs
  • Honing Your Overfit Strategies
  • Trudging Through the MovieLens Dataset
  • Getting Rid of Spam E-mails
  • Working with Handwritten Information
  • Working with Pictures
  • Analyzing Amazon.com Reviews
  • Interacting with a Huge Graph

Hands-On Lab Activities

Conditioning Your Data

  • Checking the Version of pandas
  • Creating Categorical Variables
  • Finding the Missing Data
  • Encoding Missingness
  • Sorting and Shuffling
  • Creating n-grams
  • Calculating TF-IDF
  • Modifying Graphs Using NetworkX
  • Creating an Adjacency Matrix Using NetworkX
  • Defining a Plot
  • Creating a Line Plot
  • Creating a Legend
  • Creating a Pie Chart
  • Creating a Scatterplot
  • Creating an Undirected Graph
  • Using Parallel Coordinates
  • Calculating Descriptive Statistics
  • Visualizing the Validation Curve
  • Visualizing a Subset of Images
  • Adding New Cases and Variables

Shaping Data

  • Extracting a Telephone Number

Putting What You Know in Action

  • Using Vectorization
  • Performing Matrix Multiplication

Stretching Python’s Capabilities

  • Building a Predictor

Exploring Data Analysis

  • Loading the Iris Dataset

Reducing Dimensionality

  • Creating a NumPy Array

Clustering

  • Understanding Centroid-Based Algorithms

Exploring Four Simple and Effective Algorithms

  • Using K-Nearest Neighbors and PCA

Performing Cross-Validation, Selection, and Optimization

  • Loading the Boston Housing Dataset

Understanding the Power of the Many

  • Optimizing the Depth of a Decision Tree