Learning Spark, 2nd Edition: New Book, Content Introduction

Learning Spark, 2nd Edition: Book Introduction

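Category: Book
Pages: 300
Translator: (none listed)
User rating: (none listed)
Year: 2020
Publisher: O'Reilly Media
Price: USD 35.99
Publication date: 2020-01-10 …
Last visited: 2020-05-27 …
Visit index: 8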
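Topics / genres / subjects / tags
Spark, computer science, distributed systems, software engineering, data analysis, big data, BigData
Author
Tathagata Das      ISBN: 9781492050049    Original title / alias: (none listed)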
Content and author introduction
Learning Spark, 2nd Edition: Summary

Data is getting bigger, arriving faster, and coming in varied formats—and it all needs to be processed at scale for analytics or machine learning. How can you process such varied data workloads efficiently? Enter Apache Spark.

Updated to emphasize new features in Spark 2.x, this second edition shows data engineers and scientists why structure and unification in Spark matter. Specifically, this book explains how to perform simple and complex data analytics and employ machine-learning algorithms. Through discourse, code snippets, and notebooks, you’ll be able to:

Learn Python, SQL, Scala, or Java high-level APIs: DataFrames and Datasets

Peek under the hood of the Spark SQL engine to understand Spark transformations and performance

Inspect, tune, and debug your Spark operations with Spark configurations and Spark UI

Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka

Perform analytics on batch and streaming data using Structured Streaming

Build reliable data pipelines with open source Delta Lake and Spark

Develop machine learning pipelines with MLlib and productionize models using MLflow

Use open source Pandas framework Koalas and Spark for data transformation and feature engineering

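About the authors

Holden Karau is a software development engineer at Databricks and is active in the open source community. She is also the author of Fast Data Processing with Spark.

Andy Konwinski is a co-founder of Databricks, a technical expert on the Apache Spark project, and a co-creator of the Apache Mesos project.

Patrick Wendell is a co-founder of Databricks and a technical expert on the Apache Spark project. He also maintains several subsystems of the Spark core engine.

Matei Zaharia is the CTO of Databricks, the creator of the Apache Spark project, and a vice president at the Apache Software Foundation.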

Later editions of this book
Not yet published or not yet listed
People who like "Learning Spark, 2nd Edition" also like:

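  • Introduction to Probability, 2nd Edition (tags: mathematics, probability theory, Probability, probability and mathematical statistics, probability, Mathematics, probability and statistics, introduction to probability) 2020-02-20 …
  • Python Algorithms 2nd edition (tags: algorithms, python, Python, software development, computing, algorithms, computer science, programming) 2020-02-20 …
  • The Craft of Research, 2nd edition (tags: writing, research, scientific research, academic paper writing, writing, thinking, research, recommended by Xiaolai) 2020-02-20 …
  • Building Microservices, 2nd Edition (tags: microservices, software engineering, computer science, distributed systems, go) 2021-07-13 …
  • The Elements of Programming Style, 2nd Edition (tags: programming, programming, computing, classic, program design, Programming, style, software engineering) 2020-02-20 …
  • The C++ Standard Library, 2nd Edition (tags: C++, STL, standard library, Programming, C/C++, computing, programming, program design) 2020-02-20 …
  • 3D Math Primer for Graphics and Game Development, (tags: mathematics, graphics, game development, computer graphics, Graphics, 3D, games, computer science) 2020-02-20 …
  • Learning Spark (tags: big data, spark, Spark, distributed systems, machine learning, computing, programming, technology) 2020-02-20 …
  • Learning Spark, 2nd Edition (tags: Spark, computer science, distributed systems, software engineering, data analysis, big data, BigData) 2020-01-10 …
  • Hands-on Machine Learning with Scikit-Learn, Keras (tags: machine learning, tensorflow, Python, computer science, AI, deeplearning, keras, MachineLearning) 2019-10-11 …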