Out of Memory (OOM) is the most common type of error you might face in Spark. It can occur in two places:
1. At the Driver, or
2. At the Executor level.
1. Driver
Collect Operation
If we have a large dataset and try to collect it on the Driver, the entire result must fit in the Driver's heap, so you will see a Driver OOM (i.e. Out of Memory error).
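A minimal sketch of the risky call and safer alternatives, assuming a spark-shell-style session and a hypothetical Parquet dataset (the paths are illustrative, not from any real cluster):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("collect-oom-demo").getOrCreate()

// Hypothetical input path, used only for illustration.
val df = spark.read.parquet("/data/large_events")

// Risky: collect() pulls every row into the Driver's JVM heap.
// On a big dataset, this is exactly what triggers a Driver OOM.
// val allRows = df.collect()

// Safer: inspect a bounded number of rows on the Driver ...
df.take(20).foreach(println)

// ... or keep the result distributed by writing it back to storage.
df.write.mode("overwrite").parquet("/data/output")
```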
Broadcast Join
In a broadcast join, the broadcast table is first collected to the Driver and then shipped to every Executor, so it must fit in memory on all of them. If we try to broadcast a large table, then again you will see an OOM error.
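A short sketch of where this bites, assuming two hypothetical tables (facts and dims) joined on an id column, run in a session where spark is already defined:

```scala
import org.apache.spark.sql.functions.broadcast

// Hypothetical tables; the paths and join key are illustrative.
val facts = spark.read.parquet("/data/facts")
val dims  = spark.read.parquet("/data/dims")

// broadcast() ships the ENTIRE dims table via the Driver to every
// Executor. If dims is large, this is where the OOM shows up.
val joined = facts.join(broadcast(dims), "id")

// Spark also auto-broadcasts tables below this size (default 10 MB).
// Keep the threshold modest, or set it to -1 to disable the behaviour
// and avoid accidental large broadcasts.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 10L * 1024 * 1024)
```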
2. Executor
High Concurrency
If we assign too many cores to a single executor, the executor's memory is shared across more concurrently running tasks, so each task (one per partition) gets a smaller slice of memory. If those tasks then need to process large amounts of data, we get an OOM error.
As a benchmark, use 3 to 5 cores per executor, not more than this.
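These sizes have to be fixed when the application is launched. A minimal sketch with illustrative numbers, not a recommendation for any particular cluster:

```scala
import org.apache.spark.sql.SparkSession

// Modest executors: ~5 cores each, so each concurrent task still gets
// a reasonable share (~4 GB of heap) of the executor's memory.
val spark = SparkSession.builder()
  .appName("sized-executors")
  .config("spark.executor.cores", "5")
  .config("spark.executor.memory", "20g")
  .getOrCreate()
```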
Large Partitions cause OOM
If a partition is too large (for example, after reading a huge unsplittable file or after a skewed shuffle), the single task processing it may not fit in the executor's memory, and chances are it fails with OOM.
As a benchmark, keep partitions appropriately sized, roughly 100-200 MB each, and repartition when they grow much larger.
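A small sketch of the relevant knobs, again assuming a session where spark is defined and a hypothetical input path; the partition counts are illustrative:

```scala
// Cap the size of partitions created at read time (default is 128 MB).
spark.conf.set("spark.sql.files.maxPartitionBytes", 128L * 1024 * 1024)

// Hypothetical input, used only for illustration.
val df = spark.read.parquet("/data/large_events")

// If partitions have become too large (e.g. after a skewed join),
// split them explicitly so no single task holds a huge partition.
val resized = df.repartition(400)

// For shuffle-heavy jobs, raise the shuffle partition count as well.
spark.conf.set("spark.sql.shuffle.partitions", "400")
```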
YARN Memory Overhead OOM
An executor container consists of the Executor memory (JVM heap) plus the YARN memory overhead.
Normally the YARN memory overhead (off-heap memory) defaults to 10% of Executor memory, with a minimum of 384 MB.
This overhead memory is used for Spark's internal objects and other off-heap allocations (JVM overheads, interned strings, and so on). Sometimes these need more space than the default allows, and YARN kills the container with an OOM error.
If you hit this error, try increasing the memory-overhead parameter.
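A minimal sketch, assuming a recent Spark version where the setting is called spark.executor.memoryOverhead (older YARN-only releases used spark.yarn.executor.memoryOverhead); the sizes are illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Overhead defaults to max(384 MB, 10% of executor memory).
// Raising it explicitly gives off-heap allocations more headroom.
val spark = SparkSession.builder()
  .appName("overhead-tuning")
  .config("spark.executor.memory", "8g")
  .config("spark.executor.memoryOverhead", "2g") // up from the ~800 MB default
  .getOrCreate()
```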
I hope you liked this blog.
Please subscribe and follow this blog for more information on the Big Data journey.