
Showing: Data Analytics with Spark Using Python

Data Analytics with Spark Using Python, 1st edition

Data Analytics with Spark Using Python Vital Source e-book

Jeffrey Aven
(2018)
Pearson International
DKK 190.00
Delivered immediately after purchase
Data Analytics with Spark Using Python, 1st edition

Data Analytics with Spark Using Python Vital Source e-book

Jeffrey Aven
(2018)
Pearson International
DKK 230.00
Delivered immediately after purchase
Data Analytics with Spark Using Python, 1st edition

Data Analytics with Spark Using Python Vital Source e-book

Jeffrey Aven
(2018)
Pearson International
DKK 271.00
Delivered immediately after purchase
Data Analytics with Spark Using Python

Data Analytics with Spark Using Python

Jeffrey Aven
(2018)
Language: English
Pearson Education, Limited
DKK 380.00
This publication was planned but has since been cancelled and will not be released.

Product details

  • 1st edition
  • Vital Source 90 day rentals (dynamic pages)
  • Publisher: Pearson International (June 2018)
  • ISBN: 9780134844879R90
Solve Data Analytics Problems with Spark, PySpark, and Related Open Source Tools

Spark is at the heart of today’s Big Data revolution, helping data professionals supercharge efficiency and performance in a wide range of data processing and analytics tasks. In this guide, Big Data expert Jeffrey Aven covers all you need to know to leverage Spark, together with its extensions, subprojects, and wider ecosystem.

Aven combines a language-agnostic introduction to foundational Spark concepts with extensive programming examples utilizing the popular and intuitive PySpark development environment. This guide’s focus on Python makes it widely accessible to large audiences of data professionals, analysts, and developers, even those with little Hadoop or Spark experience.

Aven’s broad coverage ranges from basic to advanced Spark programming, and Spark SQL to machine learning. You’ll learn how to efficiently manage all forms of data with Spark: streaming, structured, semi-structured, and unstructured. Throughout, concise topic overviews quickly get you up to speed, and extensive hands-on exercises prepare you to solve real problems.

Coverage includes:
  • Understand Spark’s evolving role in the Big Data and Hadoop ecosystems
  • Create Spark clusters using various deployment modes
  • Control and optimize the operation of Spark clusters and applications
  • Master Spark Core RDD API programming techniques
  • Extend, accelerate, and optimize Spark routines with advanced API platform constructs, including shared variables, RDD storage, and partitioning
  • Efficiently integrate Spark with both SQL and nonrelational data stores
  • Perform stream processing and messaging with Spark Streaming and Apache Kafka
  • Implement predictive modeling with SparkR and Spark MLlib
License duration:
Bookshelf online: 90 days from the date of purchase.
Bookshelf app: 90 days from the date of purchase.

The publisher states that the following restrictions apply to this product:
Print: 2 pages can be printed at a time
Copy: at most 2 pages in total can be copied (copy/paste)

Product details

  • 1st edition
  • Vital Source 180 day rentals (dynamic pages)
  • Publisher: Pearson International (June 2018)
  • ISBN: 9780134844879R180
License duration:
Bookshelf online: 180 days from the date of purchase.
Bookshelf app: 180 days from the date of purchase.

The publisher states that the following restrictions apply to this product:
Print: 2 pages can be printed at a time
Copy: at most 2 pages in total can be copied (copy/paste)

Product details

  • 1st edition
  • Vital Source 365 day rentals (dynamic pages)
  • Publisher: Pearson International (June 2018)
  • ISBN: 9780134844879R365
License duration:
Bookshelf online: 5 years from the date of purchase.
Bookshelf app: 5 years from the date of purchase.

The publisher states that the following restrictions apply to this product:
Print: 2 pages can be printed at a time
Copy: at most 2 pages in total can be copied (copy/paste)

Product details

  • Hardback: 320 pages
  • Publisher: Pearson Education, Limited (June 2018)
  • ISBN: 9780134846019
Solve Data Analytics Problems with Spark, PySpark, and Related Open Source Tools

Spark is at the heart of today's Big Data revolution, helping data professionals supercharge efficiency and performance in a wide range of data processing and analytics tasks. In this guide, Big Data expert Jeffrey Aven covers all you need to know to leverage Spark, together with its extensions, subprojects, and wider ecosystem.

Aven combines a language-agnostic introduction to foundational Spark concepts with extensive programming examples utilizing the popular and intuitive PySpark development environment. This guide's focus on Python makes it widely accessible to large audiences of data professionals, analysts, and developers--even those with little Hadoop or Spark experience.

Aven's broad coverage ranges from basic to advanced Spark programming, and Spark SQL to machine learning. You'll learn how to efficiently manage all forms of data with Spark: streaming, structured, semi-structured, and unstructured. Throughout, concise topic overviews quickly get you up to speed, and extensive hands-on exercises prepare you to solve real problems.

Coverage includes:
* Understand Spark's evolving role in the Big Data and Hadoop ecosystems
* Create Spark clusters using various deployment modes
* Control and optimize the operation of Spark clusters and applications
* Master Spark Core RDD API programming techniques
* Extend, accelerate, and optimize Spark routines with advanced API platform constructs, including shared variables, RDD storage, and partitioning
* Efficiently integrate Spark with both SQL and nonrelational data stores
* Perform stream processing and messaging with Spark Streaming and Apache Kafka
* Implement predictive modeling with SparkR and Spark MLlib

Preface
Introduction

PART I: SPARK FOUNDATIONS

Chapter 1 Introducing Big Data, Hadoop, and Spark
* Introduction to Big Data, Distributed Computing, and Hadoop
* A Brief History of Big Data and Hadoop
* Hadoop Explained
* Introduction to Apache Spark
* Apache Spark Background
* Uses for Spark
* Programming Interfaces to Spark
* Submission Types for Spark Programs
* Input/Output Types for Spark Applications
* The Spark RDD
* Spark and Hadoop
* Functional Programming Using Python
* Data Structures Used in Functional Python Programming
* Python Object Serialization
* Python Functional Programming Basics
* Summary

Chapter 2 Deploying Spark
* Spark Deployment Modes
* Local Mode
* Spark Standalone
* Spark on YARN
* Spark on Mesos
* Preparing to Install Spark
* Getting Spark
* Installing Spark on Linux or Mac OS X
* Installing Spark on Windows
* Exploring the Spark Installation
* Deploying a Multi-Node Spark Standalone Cluster
* Deploying Spark in the Cloud
* Amazon Web Services (AWS)
* Google Cloud Platform (GCP)
* Databricks
* Summary

Chapter 3 Understanding the Spark Cluster Architecture
* Anatomy of a Spark Application
* Spark Driver
* Spark Workers and Executors
* The Spark Master and Cluster Manager
* Spark Applications Using the Standalone Scheduler
* Spark Applications Running on YARN
* Deployment Modes for Spark Applications Running on YARN
* Client Mode
* Cluster Mode
* Local Mode Revisited
* Summary

Chapter 4 Learning Spark Programming Basics
* Introduction to RDDs
* Loading Data into RDDs
* Creating an RDD from a File or Files
* Methods for Creating RDDs from a Text File or Files
* Creating an RDD from an Object File
* Creating an RDD from a Data Source
* Creating RDDs from JSON Files
* Creating an RDD Programmatically
* Operations on RDDs
* Key RDD Concepts
* Basic RDD Transformations
* Basic RDD Actions
* Transformations on PairRDDs
* MapReduce and Word Count Exercise
* Join Transformations
* Joining Datasets in Spark
* Transformations on Sets
* Transformations on Numeric RDDs
* Summary

PART II: BEYOND THE BASICS

Chapter 5 Advanced Programming Using the Spark Core API
* Shared Variables in Spark
* Broadcast Variables
* Accumulators
* Exercise: Using Broadcast Variables and Accumulators
* Partitioning Data in Spark
* Partitioning Overview
* Controlling Partitions
* Repartitioning Functions
* Partition-Specific or Partition-Aware API Methods
* RDD Storage Options
* RDD Lineage Revisited
* RDD Storage Options
* RDD Caching
* Persisting RDDs
* Choosing When to Persist or Cache RDDs
* Checkpointing RDDs
* Exercise: Checkpointing RDDs
* Processing RDDs with External Programs
* Data Sampling with Spark
* Understanding Spark Application and Cluster Configuration
* Spark Environment Variables
* Spark Configuration Properties
* Optimizing Spark
* Filter Early, Filter Often
* Optimizing Associative Operations
* Understanding the Impact of Functions and Closures
* Considerations for Collecting Data
* Configuration Parameters for Tuning and Optimizing Applications
* Avoiding Inefficient Partitioning
* Diagnosing Application Performance Issues
* Summary

Chapter 6 SQL and NoSQL Programming with Spark
* Introduction to Spark SQL
* Introduction to Hive
* Spark SQL Architecture
* Getting Started with DataFrames
* Using DataFrames
* Caching, Persisting, and Repartitioning DataFrames
* Saving DataFrame Output
* Accessing Spark SQL
* Exercise: Using Spark SQL
* Using Spark with NoSQL Systems
* Introduction to NoSQL
* Using Spark with HBase
* Exercise: Using Spark with HBase
* Using Spark with Cassandra
* Using Spark with DynamoDB
* Other NoSQL Platforms
* Summary

Chapter 7 Stream Processing and Messaging Using Spark
* Introducing Spark Streaming
* Spark Streaming Architecture
* Introduction to DStreams
* Exercise: Getting Started with Spark Streaming
* State Operations
* Sliding Window Operations
Prices shown include VAT.

Polyteknisk Boghandel

For more than 50 years, Polyteknisk Boghandel has been the student bookstore at DTU and one of Denmark's leading specialists in academic and professional literature.

We stock a wide selection of books, not only in science and engineering but also in areas such as management, IT, and much more.


Printed or digital book?

In addition to printed books, we offer three different types of digital books:

Vital Source Bookshelf: A well-functioning e-book platform where the book is downloaded to your computer and/or mobile device.

You need the free Bookshelf software to read the books. It has good built-in tools for searching, highlighting, note-taking, and more. In the vast majority of cases you will also have parallel online access for 1825 days.

Delivery: You create a login when you make the purchase. Once you have installed the Bookshelf software, simply log in and your book is downloaded automatically.

Adobe e-book: These are Adobe DRM e-books that are downloaded to your local computer or mobile device.

Reading the books requires special software that supports this type. The software is free, but you should make sure you have the rights to install software on the machine you intend to use it on.

Delivery: A download link is sent by email immediately after purchase.

Ibog: This is an online book that can be read on the publisher's website.

No special software is required; the book is read in an ordinary web browser.

Delivery: Our staff will send you an access key by email.

Please note that there is no right of return or withdrawal for digital products.