Hadoop is a framework written in Java that is used for processing big-data applications. Hadoop has two basic components: HDFS (the Hadoop Distributed File System), which stores data, and MapReduce, which processes it.
1.1 Installing Hadoop on Windows
Step 1: Download and install Java 8 on your system.
Step 2: Download the Hadoop archive (Hadoop 3.1.0) zip file and extract it to the C:\Users\DELL location.
Step 3: Right-click This PC, click Properties, then click Advanced system settings and click the Environment Variables button.
Step 4: Create a new user variable named JAVA_HOME and set it to the Java installation directory path up to the bin directory.
Step 5: Create a new user variable named HADOOP_HOME and set it to the extracted Hadoop directory path.
Step 6: Edit the Path system variable. It should contain both the HADOOP_HOME and JAVA_HOME paths.
Step 7: Go to the extracted hadoop-3.1.0 directory, then into the etc/hadoop folder. Edit the core-site.xml file with any editor. Initially you will find an empty <configuration> </configuration> tag; put the required configuration inside it.
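A commonly used minimal core-site.xml for a single-node installation looks like the following (port 9000 is the conventional choice; adjust if something else on your machine already uses it):

```xml
<configuration>
  <!-- Default file system URI; the namenode listens on this address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```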
Step 8: Go to the main Hadoop directory and create a data folder inside it. Inside the data folder, create two subfolders named namenode and datanode. Then go back to the etc/hadoop directory and edit hdfs-site.xml as follows:
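A typical single-node hdfs-site.xml pointing at the two folders created above might look like this (the paths assume Hadoop was extracted to C:\Users\DELL as in Step 2; adjust them to match your machine):

```xml
<configuration>
  <!-- Single-node setup: keep only one copy of each block -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Folders created in Step 8 -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>C:\Users\DELL\hadoop-3.1.0\data\namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>C:\Users\DELL\hadoop-3.1.0\data\datanode</value>
  </property>
</configuration>
```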
Step 9: Edit mapred-site.xml file as follows:
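The standard setting here tells MapReduce to run on top of YARN:

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```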
Step 10: Edit yarn-site.xml file as follows:
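The usual single-node yarn-site.xml enables the shuffle service that MapReduce jobs need:

```xml
<configuration>
  <!-- Auxiliary shuffle service required by MapReduce on YARN -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```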
Step 11: Edit the hadoop-env.cmd file and set the JAVA_HOME variable inside it to the main Java installation directory path. If the path contains spaces, put it inside double quotes.
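For example, assuming Java 8 is installed under C:\Program Files\Java (the jdk1.8.0_202 folder name is only an example; use your actual folder), the line in hadoop-env.cmd would look like:

```cmd
set JAVA_HOME="C:\Program Files\Java\jdk1.8.0_202"
```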
Step 12: Download the Hadoop Windows-compatible binaries (winutils) from https://github.com/s911415/apache-hadoop-3.1.0-winutils.
Step 13: When you extract this archive, you will get a bin directory. Either copy all the files from this bin folder into the bin folder of the Hadoop directory, replacing the existing files, or rename the original bin folder in the Hadoop directory and paste the complete new bin folder in its place.
Step 14: Check the Java version using the command java -version.
Step 15: Verify that the Hadoop installation succeeded using the command: hadoop version
Step 16: Open a command prompt and run hdfs namenode -format to format the namenode. This is done only once, when Hadoop is first installed; running it again will delete all the data in HDFS.
Step 17: Give read/write permission on the namenode and datanode folders using the chmod command of the winutils utility installed in the bin folder of the Hadoop directory.
If you do not grant the appropriate permissions, the Hadoop daemons will throw a permission-denied exception when they run.
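Assuming the folder layout from the earlier steps, the commands would look something like this (the 755 mode is an example; a more permissive mode such as 777 also works for a local test setup):

```cmd
cd C:\Users\DELL\hadoop-3.1.0\bin
winutils.exe chmod -R 755 C:\Users\DELL\hadoop-3.1.0\data
```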
Step 18: Start HDFS and YARN by running the start-all.cmd command from the sbin directory of the Hadoop folder, or run the two separate commands start-dfs.cmd and start-yarn.cmd. start-dfs.cmd opens two new windows: one shows the namenode starting and the other shows the datanode starting. start-yarn.cmd opens two more windows: one shows the resource manager starting and the other shows the node manager starting.
Step 19: You can view the resource manager's current jobs, finished jobs, etc. by visiting the following link: http://localhost:8088/cluster
Step 20: You can view HDFS information using the following link: http://localhost:9870/
3.0 Integrating the HDFS Connector with MuleSoft
The HDFS connector in Mule 4 is used to connect Hadoop and Mule applications, integrating HDFS operations into Mule flows. To use this connector, you need to import it from Anypoint Exchange. The operations covered below are: Make Directories, Copy From Local, Get Metadata, Delete File, and Delete Directory.
Make directories connector configuration:
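In XML form, the global connector configuration and the flow might look roughly like the sketch below. The element and attribute names (hdfs:config, nameNodeUri, hdfs:make-directories, path) are assumptions based on the connector's naming conventions, not a verified configuration; confirm them against the connector reference in Anypoint Exchange.

```xml
<!-- Hypothetical sketch; verify element/attribute names in the connector docs -->
<hdfs:config name="HDFS_Config">
  <hdfs:connection nameNodeUri="hdfs://localhost:9000"/>
</hdfs:config>

<flow name="make-directories-flow">
  <http:listener config-ref="HTTP_Listener_config" path="/mkdir"/>
  <!-- Creates /abc/folder1 on HDFS -->
  <hdfs:make-directories config-ref="HDFS_Config" path="/abc/folder1"/>
</flow>
```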
You can test it using Postman and verify the result in the dashboard.
Copy from local connector configuration:
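A sketch of this operation is shown below. The operation and attribute names are assumptions (as is the HDFS_Config name, which refers to a global HDFS connector configuration); check the connector reference in Anypoint Exchange for the exact XML.

```xml
<!-- Hypothetical sketch; verify names against the connector reference -->
<hdfs:copy-from-local-file config-ref="HDFS_Config"
    source="C:\Users\DELL\Desktop\abc.txt"
    target="/abc/folder1/abc.txt"/>
```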
Using the above connector configuration, we can copy the abc.txt file from the desktop to the /abc/folder1 directory on HDFS. You can verify it using Postman and the web user interface.
Get metadata connector configuration:
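A sketch of this operation (element and attribute names are assumptions; HDFS_Config refers to a global HDFS connector configuration):

```xml
<!-- Hypothetical sketch; verify names against the connector reference -->
<hdfs:get-metadata config-ref="HDFS_Config" path="/abc/folder1/abc.txt"/>
```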
You will get the metadata of the file abc.txt. You can verify it using Postman; you will get a JSON object as the output.
Delete a file connector configuration:
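A sketch of this operation (element and attribute names are assumptions; HDFS_Config refers to a global HDFS connector configuration):

```xml
<!-- Hypothetical sketch; verify names against the connector reference -->
<hdfs:delete-file config-ref="HDFS_Config" path="/abc/folder1/abc.txt"/>
```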
Delete a file, Postman output:
You can verify this operation by viewing the dashboard; the abc.txt file will no longer be listed.
Delete directory connector configuration:
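A sketch of this operation (element and attribute names are assumptions; HDFS_Config refers to a global HDFS connector configuration):

```xml
<!-- Hypothetical sketch; verify names against the connector reference -->
<hdfs:delete-directory config-ref="HDFS_Config" path="/abc/folder1"/>
```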
Delete directory, Postman output:
You can verify whether the directory was deleted using the dashboard: folder1 is no longer present inside the abc folder.