Python Ecosystem for Machine Learning

The Python ecosystem is growing and may become the dominant platform for machine learning.

The primary rationale for adopting Python for machine learning is that it is a general-purpose programming language that you can use both for research and development and in production.

In this post you will discover the Python ecosystem for machine learning.

Photo by Stewart Black, some rights reserved.

Python

Python is a general-purpose, interpreted programming language. It is easy to learn and use, primarily because the language focuses on readability.

The philosophy of Python is captured in the Zen of Python which includes phrases like:

  • Beautiful is better than ugly.
  • Explicit is better than implicit.
  • Simple is better than complex.
  • Complex is better than complicated.
  • Flat is better than nested.
  • Sparse is better than dense.
  • Readability counts.

You can see the full Zen of Python in your Python environment by typing:

import this

It is a popular language in general, consistently appearing in the top 10 programming languages in surveys on StackOverflow (for example, the 2015 survey results). It is a dynamic language that is well suited to interactive development and quick prototyping, yet powerful enough to support the development of large applications.

It is also widely used for machine learning and data science because of its excellent library support and because it is a general-purpose programming language (unlike R or MATLAB). For example, see the 2011 Kaggle platform survey results and the 2015 KDnuggets tool survey results.

This is a simple and very important consideration.

It means that you can perform your research and development (figuring out what models to use) in the same programming language that you use in operations, which greatly simplifies the transition from development to operations.

SciPy

SciPy is an ecosystem of Python libraries for mathematics, science and engineering. It is an add-on to Python that you will need for machine learning.

The SciPy ecosystem is comprised of the following core modules relevant to machine learning:

  • NumPy: A foundation for SciPy that allows you to efficiently work with data in arrays.
  • Matplotlib: Allows you to create 2D charts and plots from data.
  • pandas: Tools and data structures to organize and analyze your data.

To be effective at machine learning in Python you must install and become familiar with SciPy. Specifically:

  • You will use pandas to load, explore and better understand your data.
  • You will use Matplotlib (and wrappers of Matplotlib in other frameworks) to create plots and charts of your data.
  • You will prepare your data as NumPy arrays for modeling in machine learning algorithms.
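As a rough sketch of the first and last of these steps, the example below loads a small table with pandas, summarizes it, and converts it to NumPy arrays ready for modeling. The inlined CSV data and the column names (height, weight, label) are hypothetical and exist only to keep the example self-contained. It also assumes a reasonably recent pandas release that provides to_numpy(); plotting is left out for brevity.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV data, inlined so the example needs no external file.
csv_text = """height,weight,label
1.50,52.0,0
1.62,61.5,0
1.75,72.3,1
1.81,80.1,1
"""

# Load the data with pandas and print summary statistics to explore it.
df = pd.read_csv(io.StringIO(csv_text))
print(df.describe())

# Prepare NumPy arrays for modeling: a feature matrix X and a target vector y.
X = df[["height", "weight"]].to_numpy()
y = df["label"].to_numpy()
print(X.shape, y.shape)
```

In a real project you would typically pass a file path to read_csv instead of an in-memory string.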

You can learn more about pandas in the posts “Prepare Data for Machine Learning in Python with Pandas” and “Quick and Dirty Data Analysis with Pandas”.

scikit-learn

The scikit-learn library is how you can develop and practice machine learning in Python.

It is built upon and requires the SciPy ecosystem. The name “scikit” suggests that it is a SciPy plugin or toolkit. You can review a full list of available SciKits.

The focus of the library is machine learning algorithms for classification, regression, clustering and more. It also provides tools for related tasks such as evaluating models, tuning parameters and pre-processing data.
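To give a feel for these tasks, the sketch below strings together pre-processing, a classifier and a simple evaluation. It assumes a recent scikit-learn release in which train_test_split lives in sklearn.model_selection (older releases such as 0.17 exposed it from sklearn.cross_validation). The iris dataset, logistic regression and the 70/30 split are illustrative choices, not recommendations from this post.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in classification dataset as NumPy arrays.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for evaluation; the fixed seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=7, stratify=y
)

# Pre-processing (standardization) and the classifier are chained in one pipeline,
# so the scaler is fitted on the training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Evaluate the fitted model on the held-out data.
accuracy = accuracy_score(y_test, model.predict(X_test))
print("accuracy: {:.3f}".format(accuracy))
```

Swapping LogisticRegression for another estimator leaves the rest of the code unchanged, because scikit-learn estimators share the same fit and predict interface.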

Like Python and SciPy, scikit-learn is open source and commercially usable; it is released under the BSD license. This means that you can learn about machine learning, develop models and put them into operations all with the same ecosystem and code, which is a powerful reason to use scikit-learn.

You can learn more about scikit-learn in the post “A Gentle Introduction to scikit-learn”.

Python Ecosystem Installation

There are multiple ways to install the Python ecosystem for machine learning. This section covers the main steps: installing Python, SciPy and scikit-learn.

How To Install Python

The first step is to install Python. I prefer to use and recommend Python 2.7.

This will be specific to your platform. For instructions, see “Downloading Python” in the Python Beginners Guide.

Once installed you can confirm the installation was successful. Open a command line and type:

python --version 

You should see a response like the following:

Python 2.7.11 
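You can also check the version from inside Python itself, which can help when more than one interpreter is installed on the machine. A minimal sketch using only the standard library:

```python
import sys

# sys.version_info holds (major, minor, micro, releaselevel, serial);
# the first three fields give the familiar dotted version number.
version = "{}.{}.{}".format(*sys.version_info[:3])
print("Python {}".format(version))
```

The number printed here should match the one reported by the command line check, provided both use the same interpreter.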

How To Install SciPy

There are many ways to install SciPy. For example, two popular ways are to use the package manager on your platform (e.g. yum on Red Hat or MacPorts on OS X) or to use a Python package management tool like pip.

The SciPy documentation is excellent and covers how-to instructions for many different platforms on the page “Installing the SciPy Stack”.

When installing SciPy, ensure that you install the following packages as a minimum:

  • scipy
  • numpy
  • matplotlib
  • pandas

Once installed, you can confirm that the installation was successful. Open the Python interactive environment by typing “python” at the command line, then type in and run the following Python code to print the versions of the installed libraries.

# scipy
import scipy
print('scipy: {}'.format(scipy.__version__))
# numpy
import numpy
print('numpy: {}'.format(numpy.__version__))
# matplotlib
import matplotlib
print('matplotlib: {}'.format(matplotlib.__version__))
# pandas
import pandas
print('pandas: {}'.format(pandas.__version__))

On my workstation at the time of posting I see the following output.

scipy: 0.17.0
numpy: 1.10.4
matplotlib: 1.5.1
pandas: 0.17.1

What output do you see? Post it in the comments.

If you have an error, you may need to consult the documentation for your platform.
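When one of the imports fails, a plain sequence of import statements stops at the first error, so you do not learn which of the remaining libraries are present. The sketch below is one possible way to check all four at once using only the standard library. The helper name report_versions is hypothetical and not part of any of these packages.

```python
import importlib


def report_versions(names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing (or broken badly enough that it cannot be imported).
            versions[name] = None
            continue
        # Most scientific packages expose __version__; fall back to a placeholder if not.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


versions = report_versions(["scipy", "numpy", "matplotlib", "pandas"])
for name, version in versions.items():
    print("{}: {}".format(name, version if version is not None else "NOT INSTALLED"))
```

Any library reported as not installed is the one to look up in the installation documentation for your platform.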

How To Install scikit-learn

I would suggest that you use the same method to install scikit-learn as you used to install SciPy.

There are instructions for installing scikit-learn, but they are limited to using the Python pip and conda package managers.

Like SciPy, you can confirm that scikit-learn was installed successfully. Start your Python interactive environment and type and run the following code.

# scikit-learn
import sklearn
print('sklearn: {}'.format(sklearn.__version__))

It will print the version of the scikit-learn library installed. On my workstation I see the following output:

sklearn: 0.17.1 

How To Install The Ecosystem: An Easier Way

If you are not confident installing software on your machine, there is an easier option for you.

There is a distribution called Anaconda that you can download and install for free.

It supports the three main platforms of Microsoft Windows, Mac OS X and Linux.

It includes Python, SciPy and scikit-learn: everything you need to learn, practice and use machine learning with the Python environment.

Summary

In this post you discovered the Python ecosystem for machine learning.

You learned about:

  • Python and its rising use for machine learning.
  • SciPy and the functionality it provides with NumPy, Matplotlib and pandas.
  • scikit-learn, which provides the machine learning algorithms.

You also learned how to install the Python ecosystem for machine learning on your workstation.

Do you have any questions about Python for machine learning or this post? Ask your question in the comments and I will do my best to answer.
