Hadoop Admin Interview Questions and Answers : Part 1

Although the big data ecosystem has come a long way since Hadoop, introducing many new frameworks such as Spark, Druid, Delta Lake, Hudi, governed tables, and Snowflake, a large number of projects still run on data stores built with Hadoop. So even today there are ample opportunities for Hadoop admins. Below is the list of questions I collated after reaching out to colleagues who received offers from multiple product companies as Hadoop admins. It covers HDFS, the MapReduce framework, YARN, filesystem commands, and more.

  1. How do you delete checkpoint files from the trash folder? In most cases, checkpoint files are created by Spark Streaming jobs in filesystems such as HDFS or S3, depending on the configuration. If these files are not cleaned up, they eventually cause disk space issues. They can be removed with hadoop fs -expunge -immediate -fs <hdfs/s3/other file systems path>
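As a sketch of how this looks in practice (paths and the NameNode address are illustrative; the -immediate and -fs options require a recent Hadoop release):

```shell
# Check how much space the trash is currently using:
hadoop fs -du -s -h /user/myuser/.Trash

# Empty the trash immediately, ignoring the fs.trash.interval retention window:
hadoop fs -expunge -immediate -fs hdfs://namenode:8020/

# Alternatively, bypass the trash entirely when deleting old checkpoint data
# (example path; point this at your job's checkpoint directory):
hadoop fs -rm -r -skipTrash /checkpoints/my-streaming-job
```

Note that -rm -skipTrash frees space at once but is unrecoverable, while -expunge respects whatever is already in the trash across all users' trash directories.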

Other useful Hadoop filesystem commands include copyFromLocal, copyToLocal, cp, put, get, du, chown, chmod, mkdir, ls, mv, rm, and text.
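A quick illustrative session with these commands (all paths, users, and groups here are made-up examples, and a configured Hadoop client is assumed):

```shell
hadoop fs -mkdir -p /data/input                        # create a directory
hadoop fs -put localfile.txt /data/input/              # upload from local FS
hadoop fs -ls /data/input                              # list contents
hadoop fs -du -h /data/input                           # show space usage
hadoop fs -chown myuser:mygroup /data/input/localfile.txt   # change owner
hadoop fs -chmod 640 /data/input/localfile.txt         # change permissions
hadoop fs -cp /data/input/localfile.txt /data/backup/  # copy within HDFS
hadoop fs -mv /data/backup/localfile.txt /data/archive/     # move/rename
hadoop fs -get /data/input/localfile.txt ./copy.txt    # download to local FS
hadoop fs -text /data/input/localfile.txt              # print, decompressing if needed
hadoop fs -rm /data/input/localfile.txt                # delete (goes to trash)
```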

Please stay tuned for my next set of interview questions related to big data and its related frameworks.
