Editor's Recommendation
None available
Synopsis
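Python is a multi-paradigm programming language, well suited to both object-oriented application development and functional design patterns. Python has become the language of choice for data scientists working on data analysis, visualization, and machine learning, where it delivers high efficiency and productivity.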
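Idris's Python Data Analysis (reprint edition, in English) teaches beginners how to unlock Python's full potential for data analysis, covering everything from data retrieval, cleaning, manipulation, visualization, and storage to complex analysis and modeling. It focuses on a set of open source Python modules such as NumPy, SciPy, matplotlib, pandas, IPython, Cython, scikit-learn, and NLTK. Later chapters cover data visualization, signal processing and time series analysis, databases, predictive analytics, and machine learning. The book aims to turn you into a first-rate data analyst in no time.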
About the Author
None available
Table of Contents
Preface
Chapter 1: Getting Started with Python Libraries
  Software used in this book
    Installing software and setup
    On Windows
    On Linux
    On Mac OS X
  Building NumPy, SciPy, matplotlib, and IPython from source
  Installing with setuptools
  NumPy arrays
  A simple application
  Using IPython as a shell
  Reading manual pages
  IPython notebooks
  Where to find help and references
  Summary
Chapter 2: NumPy Arrays
  The NumPy array object
    The advantages of NumPy arrays
  Creating a multidimensional array
  Selecting NumPy array elements
  NumPy numerical types
    Data type objects
    Character codes
    The dtype constructors
    The dtype attributes
  One-dimensional slicing and indexing
  Manipulating array shapes
    Stacking arrays
    Splitting NumPy arrays
    NumPy array attributes
    Converting arrays
  Creating array views and copies
  Fancy indexing
  Indexing with a list of locations
  Indexing NumPy arrays with Booleans
  Broadcasting NumPy arrays
  Summary
Chapter 3: Statistics and Linear Algebra
  NumPy and SciPy modules
  Basic descriptive statistics with NumPy
  Linear algebra with NumPy
    Inverting matrices with NumPy
    Solving linear systems with NumPy
  Finding eigenvalues and eigenvectors with NumPy
  NumPy random numbers
    Gambling with the binomial distribution
    Sampling the normal distribution
    Performing a normality test with SciPy
  Creating a NumPy-masked array
    Disregarding negative and extreme values
  Summary
Chapter 4: pandas Primer
  Installing and exploring pandas
  pandas DataFrames
  pandas Series
  Querying data in pandas
  Statistics with pandas DataFrames
  Data aggregation with pandas DataFrames
  Concatenating and appending DataFrames
  Joining DataFrames
  Handling missing values
  Dealing with dates
  Pivot tables
  Remote data access
  Summary
Chapter 5: Retrieving, Processing, and Storing Data
  Writing CSV files with NumPy and pandas
  Comparing the NumPy .npy binary format and pickling pandas DataFrames
  Storing data with PyTables
  Reading and writing pandas DataFrames to HDF5 stores
  Reading and writing to Excel with pandas
  Using REST web services and JSON
  Reading and writing JSON with pandas
  Parsing RSS and Atom feeds
  Parsing HTML with Beautiful Soup
  Summary
Chapter 6: Data Visualization
  matplotlib subpackages
  Basic matplotlib plots
  Logarithmic plots
  Scatter plots
  Legends and annotations
  Three-dimensional plots
  Plotting in pandas
  Lag plots
  Autocorrelation plots
  Plot.ly
  Summary
Chapter 7: Signal Processing and Time Series
  statsmodels subpackages
  Moving averages
  Window functions
  Defining cointegration
  Autocorrelation
  Autoregressive models
  ARMA models
  Generating periodic signals
  Fourier analysis
  Spectral analysis
  Filtering
  Summary
Chapter 8: Working with Databases
  Lightweight access with sqlite3
  Accessing databases from pandas
  SQLAlchemy
  Installing and setting up SQLAlchemy
  Populating a database with SQLAlchemy
  Querying the database with SQLAlchemy
  Pony ORM
  Dataset - databases for lazy people
  PyMongo and MongoDB
  Storing data in Redis
  Apache Cassandra
  Summary
Chapter 9: Analyzing Textual Data and Social Media
  Installing NLTK
  Filtering out stopwords, names, and numbers
  The bag-of-words model
  Analyzing word frequencies
  Naive Bayes classification
  Sentiment analysis
  Creating word clouds
  Social network analysis
  Summary
Chapter 10: Predictive Analytics and Machine Learning
  A tour of scikit-learn
  Preprocessing
  Classification with logistic regression
  Classification with support vector machines
  Regression with ElasticNetCV
  Support vector regression
  Clustering with affinity propagation
  Mean Shift
  Genetic algorithms
  Neural networks
  Decision trees
  Summary
Chapter 11: Environments Outside the Python Ecosystem and Cloud Computing
  Exchanging information with MATLAB/Octave
  Installing rpy2
  Interfacing with R
  Sending NumPy arrays to Java
  Integrating SWIG and NumPy
  Integrating Boost and Python
  Using Fortran code through f2py
  Setting up Google App Engine
  Running programs on PythonAnywhere
  Working with Wakari
  Summary
Chapter 12: Performance Tuning, Profiling, and Concurrency
  Profiling the code
  Installing Cython
  Calling C code
  Creating a process pool with multiprocessing
  Speeding up embarrassingly parallel for loops with Joblib
  Comparing Bottleneck to NumPy functions
  Performing MapReduce with Jug
  Installing MPI for Python
  IPython Parallel
  Summary
Appendix A: Key Concepts
Appendix B: Useful Functions
  matplotlib
  NumPy
  pandas
  scikit-learn
  SciPy
  scipy.fftpack
  scipy.signal
  scipy.stats
Appendix C: Online Resources
Index