Customer Service

infooverflow.org@gmail.com

Spark Question


Shuffling is the process of redistributing data across partitions, which may move data between executors. Spark implements the shuffle operation differently from Hadoop.
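The redistribution step can be illustrated with a small pure-Python sketch. It is not Spark code: the `shuffle` function, the integer keys, and the modulo rule for choosing a target partition are assumptions made for illustration, loosely modelled on hash partitioning.

```python
def shuffle(partitions, num_output_partitions):
    """Redistribute (key, value) records so that equal keys share a partition.

    Hypothetical helper: integer keys are assigned to a target partition
    with a simple modulo, standing in for a hash partitioner.
    """
    output = [[] for _ in range(num_output_partitions)]
    moved = 0  # records whose target partition differs from their source
    for source_index, partition in enumerate(partitions):
        for key, value in partition:
            target_index = key % num_output_partitions
            output[target_index].append((key, value))
            if target_index != source_index:
                moved += 1
    return output, moved


# Two input partitions; key 1 appears in both of them.
input_partitions = [
    [(1, "a"), (2, "b")],
    [(1, "c"), (3, "d")],
]
shuffled, moved = shuffle(input_partitions, 2)
print(shuffled)  # both records with key 1 now sit in the same partition
print(moved)     # number of records that changed partition
```

In a real cluster the partitions live on different executors, so each record counted in `moved` would roughly correspond to data sent over the network or written to shuffle files.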

Shuffling has two important compression parameters:

spark.shuffle.compress – controls whether the engine compresses shuffle output files
spark.shuffle.spill.compress – controls whether data spilled to disk during a shuffle is compressed
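One way to set these two parameters is on the command line, since `spark-submit` accepts repeated `--conf key=value` arguments. The helper below is a minimal sketch that renders a settings dictionary into such arguments. The function name `to_submit_args` is hypothetical, and the values shown are simply explicit choices for illustration (both properties default to true in Spark).

```python
def to_submit_args(settings):
    """Render a dict of Spark properties as spark-submit --conf arguments.

    Hypothetical helper: booleans are lowercased because Spark property
    values are written as "true" / "false".
    """
    args = []
    for key, value in sorted(settings.items()):
        rendered = str(value).lower() if isinstance(value, bool) else str(value)
        args += ["--conf", f"{key}={rendered}"]
    return args


shuffle_settings = {
    "spark.shuffle.compress": True,        # compress shuffle output files
    "spark.shuffle.spill.compress": True,  # compress data spilled during shuffles
}
submit_args = to_submit_args(shuffle_settings)
print(" ".join(["spark-submit"] + submit_args))
```

The same keys could equally be placed in a properties file or set programmatically. The command-line form is used here only because it needs no Spark installation to demonstrate.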

Shuffling occurs when joining two tables or when performing byKey operations such as groupByKey or reduceByKey.
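The two byKey operations differ in how much data they send through the shuffle: reduceByKey can combine values within each partition before shuffling, while groupByKey shuffles every record. The sketch below simulates that difference in plain Python. It is not Spark code, and `combine_locally` and `merge_after_shuffle` are hypothetical helper names.

```python
from collections import defaultdict


def combine_locally(partition):
    """Sum values per key inside one partition (a map-side combine)."""
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return list(totals.items())


def merge_after_shuffle(records):
    """Sum values per key once all records for a key are together."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return dict(totals)


partitions = [
    [("a", 1), ("a", 1), ("b", 1)],
    [("a", 1), ("b", 1), ("b", 1)],
]

# groupByKey style: every record crosses the shuffle unchanged.
group_style = [record for p in partitions for record in p]

# reduceByKey style: each partition is combined first, then shuffled.
reduce_style = [record for p in partitions for record in combine_locally(p)]

print(len(group_style), len(reduce_style))  # records sent through the shuffle
print(merge_after_shuffle(group_style) == merge_after_shuffle(reduce_style))
```

Both paths produce the same totals per key, but the pre-combined path shuffles fewer records, which is why reduceByKey is usually preferred for aggregations.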


© Domain. All Rights Reserved. Designed by info Over Flow

Last Updated On 27-Jan-2024
