PYTHON IN DATA ANALYSIS
Python for Data Analysis
Data Wrangling with Pandas, NumPy, and IPython
Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.
Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.
- Use the IPython interactive shell as your primary development environment
- Learn basic and advanced NumPy (Numerical Python) features
- Get started with data analysis tools in the pandas library
- Use high-performance tools to load, clean, transform, merge, and reshape data
- Create scatter plots and static or interactive visualizations with matplotlib
- Apply the pandas groupby facility to slice, dice, and summarize datasets
- Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
- Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples
Table of Contents
Chapter 1 Preliminaries
What Is This Book About?
Why Python for Data Analysis?
Essential Python Libraries
Installation and Setup
Community and Conferences
Navigating This Book
Acknowledgements
Chapter 2 Introductory Examples
1.usa.gov data from bit.ly
MovieLens 1M Data Set
US Baby Names 1880-2010
Conclusions and The Path Ahead
Chapter 3 IPython: An Interactive Computing and Development Environment
IPython Basics
Using the Command History
Interacting with the Operating System
Software Development Tools
IPython HTML Notebook
Tips for Productive Code Development Using IPython
Advanced IPython Features
Credits
Chapter 4 NumPy Basics: Arrays and Vectorized Computation
The NumPy ndarray: A Multidimensional Array Object
Universal Functions: Fast Element-wise Array Functions
Data Processing Using Arrays
File Input and Output with Arrays
Linear Algebra
Random Number Generation
Example: Random Walks
Chapter 5 Getting Started with pandas
Introduction to pandas Data Structures
Essential Functionality
Summarizing and Computing Descriptive Statistics
Handling Missing Data
Hierarchical Indexing
Other pandas Topics
Chapter 6 Data Loading, Storage, and File Formats
Reading and Writing Data in Text Format
Binary Data Formats
Interacting with HTML and Web APIs
Interacting with Databases
Chapter 7 Data Wrangling: Clean, Transform, Merge, Reshape
Combining and Merging Data Sets
Reshaping and Pivoting
Data Transformation
String Manipulation
Example: USDA Food Database
Chapter 8 Plotting and Visualization
A Brief matplotlib API Primer
Plotting Functions in pandas
Plotting Maps: Visualizing Haiti Earthquake Crisis Data
Python Visualization Tool Ecosystem
Chapter 9 Data Aggregation and Group Operations
GroupBy Mechanics
Data Aggregation
Group-wise Operations and Transformations
Pivot Tables and Cross-Tabulation
Example: 2012 Federal Election Commission Database
Chapter 10 Time Series
Date and Time Data Types and Tools
Time Series Basics
Date Ranges, Frequencies, and Shifting
Time Zone Handling
Periods and Period Arithmetic
Resampling and Frequency Conversion
Time Series Plotting
Moving Window Functions
Performance and Memory Usage Notes
Chapter 11 Financial and Economic Data Applications
Data Munging Topics
Group Transforms and Analysis
More Example Applications
Chapter 12 Advanced NumPy
ndarray Object Internals
Advanced Array Manipulation
Broadcasting
Advanced ufunc Usage
Structured and Record Arrays
More About Sorting
NumPy Matrix Class
Advanced Array Input and Output
Performance Tips
Appendix Python Language Essentials
The Python Interpreter
The Basics
Data Structures and Sequences
Functions
Files and the Operating System
Colophon
GET THIS BOOK AT AMAZON