Data Analytics with Spark Using Python

Data Analytics with Spark Using Python, 1st edition

Data Analytics with Spark Using Python Vital Source e-book

Jeffrey Aven

(2018)

Pearson International

190.00 kr.

Delivered immediately after purchase

Data Analytics with Spark Using Python Vital Source e-book

Jeffrey Aven

(2018)

Pearson International

230.00 kr.

Delivered immediately after purchase

Data Analytics with Spark Using Python Vital Source e-book

Jeffrey Aven

(2018)

Pearson International

271.00 kr.

Delivered immediately after purchase

Jeffrey Aven

(2018)

Language: English

Pearson Education, Limited

380.00 kr.

This publication was planned but has been cancelled and will therefore not be released.

Postage may be added.

You can choose to have the item shipped or to collect it at one of our stores. If you choose shipping, the cost is:

Delivery to home address
1 item: 59.50 kr.
2-5 items: 59.50 kr.
6-10 items: 59.50 kr.
11-20 items: 59.50 kr.
21-49 items: 79.50 kr.
50-9999 items: 119.50 kr.

Delivery to business address
1 item: 39.50 kr.
2-5 items: 39.50 kr.
6-10 items: 39.50 kr.
11-20 items: 39.50 kr.
21-49 items: 79.50 kr.
50-9999 items: 119.50 kr.

Pickup at a post office or parcel locker
1 item: 39.50 kr.
2-5 items: 39.50 kr.
6-10 items: 39.50 kr.
11-20 items: 39.50 kr.
21-49 items: 59.50 kr.
50-9999 items: 99.50 kr.

Item details

1st edition
Vital Source 90-day rental (dynamic pages)
Publisher: Pearson International (June 2018)
ISBN: 9780134844879R90

Description
About the e-book

Solve Data Analytics Problems with Spark, PySpark, and Related Open Source Tools

Spark is at the heart of today’s Big Data revolution, helping data professionals supercharge efficiency and performance in a wide range of data processing and analytics tasks. In this guide, Big Data expert Jeffrey Aven covers all you need to know to leverage Spark, together with its extensions, subprojects, and wider ecosystem.

Aven combines a language-agnostic introduction to foundational Spark concepts with extensive programming examples utilizing the popular and intuitive PySpark development environment. This guide’s focus on Python makes it widely accessible to large audiences of data professionals, analysts, and developers, even those with little Hadoop or Spark experience.

Aven’s broad coverage ranges from basic to advanced Spark programming, and Spark SQL to machine learning. You’ll learn how to efficiently manage all forms of data with Spark: streaming, structured, semi-structured, and unstructured. Throughout, concise topic overviews quickly get you up to speed, and extensive hands-on exercises prepare you to solve real problems.

Coverage includes:
* Understand Spark’s evolving role in the Big Data and Hadoop ecosystems
* Create Spark clusters using various deployment modes
* Control and optimize the operation of Spark clusters and applications
* Master Spark Core RDD API programming techniques
* Extend, accelerate, and optimize Spark routines with advanced API platform constructs, including shared variables, RDD storage, and partitioning
* Efficiently integrate Spark with both SQL and nonrelational data stores
* Perform stream processing and messaging with Spark Streaming and Apache Kafka
* Implement predictive modeling with SparkR and Spark MLlib

License duration:
Bookshelf online: 90 days from date of purchase.
Bookshelf app: 90 days from date of purchase.

The publisher states that the following restrictions apply to this product:
Print: 2 pages can be printed at a time
Copy: at most 2 pages in total can be copied (copy/paste)

Item details

1st edition
Vital Source 180-day rental (dynamic pages)
Publisher: Pearson International (June 2018)
ISBN: 9780134844879R180

Description
About the e-book

License duration:
Bookshelf online: 180 days from date of purchase.
Bookshelf app: 180 days from date of purchase.

The publisher states that the following restrictions apply to this product:
Print: 2 pages can be printed at a time
Copy: at most 2 pages in total can be copied (copy/paste)

Item details

1st edition
Vital Source 365-day rental (dynamic pages)
Publisher: Pearson International (June 2018)
ISBN: 9780134844879R365

Description
About the e-book

License duration:
Bookshelf online: 5 years from date of purchase.
Bookshelf app: 5 years from date of purchase.

The publisher states that the following restrictions apply to this product:
Print: 2 pages can be printed at a time
Copy: at most 2 pages in total can be copied (copy/paste)

Item details

Hardback: 320 pages
Publisher: Pearson Education, Limited (June 2018)
ISBN: 9780134846019

Description
Contents

Preface
Introduction

PART I: SPARK FOUNDATIONS

Chapter 1 Introducing Big Data, Hadoop, and Spark
  Introduction to Big Data, Distributed Computing, and Hadoop
  A Brief History of Big Data and Hadoop
  Hadoop Explained
  Introduction to Apache Spark
  Apache Spark Background
  Uses for Spark
  Programming Interfaces to Spark
  Submission Types for Spark Programs
  Input/Output Types for Spark Applications
  The Spark RDD
  Spark and Hadoop
  Functional Programming Using Python
  Data Structures Used in Functional Python Programming
  Python Object Serialization
  Python Functional Programming Basics
  Summary

Chapter 2 Deploying Spark
  Spark Deployment Modes
  Local Mode
  Spark Standalone
  Spark on YARN
  Spark on Mesos
  Preparing to Install Spark
  Getting Spark
  Installing Spark on Linux or Mac OS X
  Installing Spark on Windows
  Exploring the Spark Installation
  Deploying a Multi-Node Spark Standalone Cluster
  Deploying Spark in the Cloud
  Amazon Web Services (AWS)
  Google Cloud Platform (GCP)
  Databricks
  Summary

Chapter 3 Understanding the Spark Cluster Architecture
  Anatomy of a Spark Application
  Spark Driver
  Spark Workers and Executors
  The Spark Master and Cluster Manager
  Spark Applications Using the Standalone Scheduler
  Spark Applications Running on YARN
  Deployment Modes for Spark Applications Running on YARN
  Client Mode
  Cluster Mode
  Local Mode Revisited
  Summary

Chapter 4 Learning Spark Programming Basics
  Introduction to RDDs
  Loading Data into RDDs
  Creating an RDD from a File or Files
  Methods for Creating RDDs from a Text File or Files
  Creating an RDD from an Object File
  Creating an RDD from a Data Source
  Creating RDDs from JSON Files
  Creating an RDD Programmatically
  Operations on RDDs
  Key RDD Concepts
  Basic RDD Transformations
  Basic RDD Actions
  Transformations on PairRDDs
  MapReduce and Word Count Exercise
  Join Transformations
  Joining Datasets in Spark
  Transformations on Sets
  Transformations on Numeric RDDs
  Summary

PART II: BEYOND THE BASICS

Chapter 5 Advanced Programming Using the Spark Core API
  Shared Variables in Spark
  Broadcast Variables
  Accumulators
  Exercise: Using Broadcast Variables and Accumulators
  Partitioning Data in Spark
  Partitioning Overview
  Controlling Partitions
  Repartitioning Functions
  Partition-Specific or Partition-Aware API Methods
  RDD Storage Options
  RDD Lineage Revisited
  RDD Storage Options
  RDD Caching
  Persisting RDDs
  Choosing When to Persist or Cache RDDs
  Checkpointing RDDs
  Exercise: Checkpointing RDDs
  Processing RDDs with External Programs
  Data Sampling with Spark
  Understanding Spark Application and Cluster Configuration
  Spark Environment Variables
  Spark Configuration Properties
  Optimizing Spark
  Filter Early, Filter Often
  Optimizing Associative Operations
  Understanding the Impact of Functions and Closures
  Considerations for Collecting Data
  Configuration Parameters for Tuning and Optimizing Applications
  Avoiding Inefficient Partitioning
  Diagnosing Application Performance Issues
  Summary

Chapter 6 SQL and NoSQL Programming with Spark
  Introduction to Spark SQL
  Introduction to Hive
  Spark SQL Architecture
  Getting Started with DataFrames
  Using DataFrames
  Caching, Persisting, and Repartitioning DataFrames
  Saving DataFrame Output
  Accessing Spark SQL
  Exercise: Using Spark SQL
  Using Spark with NoSQL Systems
  Introduction to NoSQL
  Using Spark with HBase
  Exercise: Using Spark with HBase
  Using Spark with Cassandra
  Using Spark with DynamoDB
  Other NoSQL Platforms
  Summary

Chapter 7 Stream Processing and Messaging Using Spark
  Introducing Spark Streaming
  Spark Streaming Architecture
  Introduction to DStreams
  Exercise: Getting Started with Spark Streaming
  State Operations
  Sliding Window Operations

Prices shown include VAT.

Polyteknisk Boghandel