Cover
Copyright Page
Credits
About the Authors
About the Reviewers
www.PacktPub.com
eBooks discount offers and more
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Chapter 1. Getting Started
Computer science
Artificial intelligence
Machine learning
Statistics
Mathematics
Knowledge domain
Data, information, and knowledge
The data analysis process
Quantitative versus qualitative data analysis
Importance of data visualization
What about big data?
Quantified self
Tools and toys for this book
Summary
Chapter 2. Preprocessing Data
Data sources
Data scrubbing
Data formats
Data reduction methods
Getting started with OpenRefine
Summary
Chapter 3. Getting to Grips with Visualization
What is visualization?
Working with web-based visualization
Exploring scientific visualization
Visualization in art
The visualization life cycle
Visualizing different types of data
Getting started with D3.js
Interaction and animation
Data from social networks
An overview of visual analytics
Summary
Chapter 4. Text Classification
Learning and classification
Bayesian classification
E-mail subject line tester
The data
The algorithm
Classifier accuracy
Summary
Chapter 5. Similarity-Based Image Retrieval
Image similarity search
Dynamic time warping
Processing the image dataset
Implementing DTW
Analyzing the results
Summary
Chapter 6. Simulation of Stock Prices
Financial time series
Random Walk simulation
Monte Carlo methods
Generating random numbers
Implementation in D3.js
Quantitative analyst
Summary
Chapter 7. Predicting Gold Prices
Working with time series data
Smoothing time series
Linear regression
The data - historical gold prices
Nonlinear regressions
Summary
Chapter 8. Working with Support Vector Machines
Understanding the multivariate dataset
Dimensionality reduction
Getting started with SVM
Summary
Chapter 9. Modeling Infectious Diseases with Cellular Automata
Introduction to epidemiology
The epidemic models
Modeling with Cellular Automaton
Simulation of the SIRS model in CA with D3.js
Summary
Chapter 10. Working with Social Graphs
Structure of a graph
Social network analysis
Acquiring the Facebook graph
Working with graphs using Gephi
Statistical analysis
Degree distribution
Transforming GDF to JSON
Graph visualization with D3.js
Summary
Chapter 11. Working with Twitter Data
The anatomy of Twitter data
Using OAuth to access Twitter API
Getting started with Twython
Summary
Chapter 12. Data Processing and Aggregation with MongoDB
Getting started with MongoDB
Data preparation
Group
Aggregation framework
Summary
Chapter 13. Working with MapReduce
An overview of MapReduce
Programming model
Using MapReduce with MongoDB
Filtering the input collection
Grouping and aggregation
Counting the most common words in tweets
Summary
Chapter 14. Online Data Analysis with Jupyter and Wakari
Getting started with Wakari
Getting started with IPython notebook
Introduction to image processing with PIL
Getting started with pandas
Sharing your Notebook
Summary
Chapter 15. Understanding Data Processing using Apache Spark
Platform for data processing
An introduction to the distributed file system
An introduction to Apache Spark
Summary
Last updated: 2021-07-08 11:21:57