DS102: Introduction to Data Science with Python

Start Date: Dec 1, 2017
Price: Free

COURSE OVERVIEW

This course is devoted to using the Python programming language in Data Science. Why Python? First of all, because it is a powerful programming language used for many different applications. In recent years, many tools have been built specifically for Data Science, so analyzing data with Python has never been easier.

Machine learning offers a way to look into the future. It is the art of predicting the outcome of a phenomenon or action from the factors behind it, with or without knowledge of the outcomes of previous events. Mastery of machine learning methods is considered the height of skill in Data Science. The course will introduce a range of model-based and algorithmic machine learning methods, including regression, several classification algorithms, decision trees, Naive Bayes, random forests, k-means clustering, etc. The course will cover the complete process of building prediction functions, including data collection, feature creation and extraction, training and testing of algorithms, and their evaluation and improvement. You will solve problems such as text (spam/ham) classification, house price regression, predicting whether a passenger survived the Titanic disaster, and many others.

WHAT YOU'LL LEARN

Perhaps you have worked with MS Excel and know how much can be done with it. There is a wonderful Python library that is more powerful and faster than Excel. It is called pandas. We will introduce you to this tool and show how to process large and small datasets stored in various formats; how to transform, quantize, clean, filter and aggregate them; and how to visualize data or the results of operations with its help.
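As a rough illustration of the clean, filter and aggregate steps mentioned above, here is a minimal pandas sketch. The dataset, its column names and the price threshold are invented for this example and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical toy dataset; the columns "city" and "price" are made up.
df = pd.DataFrame({
    "city": ["Kyiv", "Kyiv", "Lviv", "Lviv", "Odesa"],
    "price": [100.0, 120.0, 80.0, None, 90.0],
})

# Clean: drop rows where the price is missing.
clean = df.dropna(subset=["price"])

# Filter: keep only rows with a price of at least 90.
filtered = clean[clean["price"] >= 90]

# Aggregate: compute the mean price per city.
mean_price = filtered.groupby("city")["price"].mean()
print(mean_price)
```

The same three calls work unchanged on much larger tables loaded from files, which is where pandas tends to be more convenient than a spreadsheet.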

You will learn:

- the basics of the Python programming language, including its data structures, conditional statements, loops, generators, working with files and modules, OOP, etc.;

- the features of relational and so-called NoSQL databases (in particular Neo4j, MongoDB and Cassandra), their structure, and how to save, extract and process data in different databases;

- how to crawl Web pages, access Web APIs and interact with databases using Python.
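One building block of crawling Web pages is pulling the links out of a page's HTML. The sketch below does this with the standard library only. The class name `LinkCollector`, the helper `extract_links` and the sample HTML are hypothetical, chosen for illustration; the course itself may use different tools.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Made-up sample page; a real crawler would download the HTML first.
sample = '<p><a href="https://www.kaggle.com/">Kaggle</a> and <a href="/about">About</a></p>'
links = extract_links(sample)
print(links)
```

A crawler would repeat this step on each downloaded page and add the newly found links to a queue of pages to visit.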

SYLLABUS

WEEK 1. Crash course in Python

WEEK 2. Basic intro to pandas

WEEK 3. Basic intro to visualization with Python

WEEK 4. Intro to Machine Learning with scikit-learn

WEEK 5. Advanced topics of Machine Learning with scikit-learn

WEEK 6. SQL with Python. Relational databases

WEEK 7. Web Scraping with Python

WEEK 8. Web API with Python. MongoDB. Cassandra

WEEK 9. Final project on https://www.kaggle.com/

PREREQUISITES

This course is designed for users who have never learned Python and have no programming skills, as well as for those with beginner-level knowledge of Python. But even if you already have a good level of Python, you will find much of interest in this part of the course.

To complete the Week 8 assignment successfully, register for a developer account on Twitter. Visit https://developer.twitter.com/en/apply/user and follow the registration procedure step by step. Verification usually takes around 2 days.