3. Deploying Hadoop on Windows
3.1 Configuration files
core-site.xml:

```xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:19000</value>
  </property>
</configuration>
```
hadoop-env.cmd:

```
set HADOOP_PREFIX=E:\bigdata\hadoop
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
```
hdfs-site.xml:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```
mapred-site.xml:

```xml
<configuration>
  <property>
    <name>mapreduce.job.user.name</name>
    <value>wxl</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.apps.stagingDir</name>
    <value>/user/wxl/staging</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>local</value>
  </property>
</configuration>
```
yarn-site.xml:

```xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.server.resourcemanager.address</name>
    <value>0.0.0.0:8020</value>
  </property>
  <property>
    <name>yarn.server.resourcemanager.application.expiry.interval</name>
    <value>60000</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.address</name>
    <value>0.0.0.0:45454</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/dep/logs/userlogs</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.attempt-listener.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.client-service.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>-1</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>%HADOOP_CONF_DIR%,%HADOOP_COMMON_HOME%/share/hadoop/common/*,%HADOOP_COMMON_HOME%/share/hadoop/common/lib/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/lib/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
```
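Every *-site.xml file above uses the same flat layout: a `<configuration>` root containing `<property>` elements with a name/value pair each. As a quick sanity check that a hand-edited file is still well-formed, the pairs can be read back with a few lines of Python (Python here is only an illustration aid, not part of the Hadoop toolchain):

```python
import xml.etree.ElementTree as ET

# The core-site.xml content from section 3.1, as a string for illustration
xml_text = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:19000</value>
  </property>
</configuration>"""

root = ET.fromstring(xml_text)
# Collect every <name>/<value> pair under <configuration>
props = {p.findtext("name"): p.findtext("value") for p in root.findall("property")}
print(props)  # {'fs.default.name': 'hdfs://0.0.0.0:19000'}
```

A file that fails to parse here (for example, a missing closing tag) will also make Hadoop daemons fail at startup, so this is a cheap first check.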
slaves:

```
localhost
```
3.2 Initialize the environment variables by running hadoop-env.cmd (double-click it, or execute it in a cmd window):

```
E:\bigdata\hadoop\etc\hadoop\hadoop-env.cmd
```
3.3 Format the NameNode

```
%HADOOP_PREFIX%\bin\hdfs namenode -format
```

![successfully formatted](../Imgs/HadoopforWindows/successfully formatted.png)
3.4 Start the NameNode and DataNode

```
%HADOOP_PREFIX%\sbin\start-dfs.cmd
```

Two cmd windows will pop up, one for the DataNode and one for the NameNode. To verify that they started successfully, run jps in the original window, as shown in the figure.
3.5 Upload a file with the hdfs command. In the current cmd directory (for example, under bigdata), create a file named myfile.txt, then run:

```
%HADOOP_PREFIX%\bin\hdfs dfs -put myfile.txt /
```

To list the uploaded file, run:

```
%HADOOP_PREFIX%\bin\hdfs dfs -ls /
```
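The step above assumes myfile.txt already exists locally before the put; creating it is an ordinary local file write, sketched here in Python purely as an illustration (an echo redirect in cmd works just as well, and the file contents are arbitrary):

```python
# Create the sample file locally; hdfs dfs -put then copies it into HDFS root
with open("myfile.txt", "w") as f:
    f.write("hello hadoop\nhello yarn\nhello hadoop\n")
```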
4. Hadoop YARN
4.1 Start YARN

```
%HADOOP_PREFIX%\sbin\start-yarn.cmd
```

Run jps to see the daemons that are now running, as shown in the figure.
4.2 Run a small example: word count

```
%HADOOP_PREFIX%\bin\yarn jar %HADOOP_PREFIX%\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.6.4.jar wordcount /myfile.txt /out
```
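What the example job computes can be illustrated locally. This is only a sketch of the word-count logic in Python, not the distributed MapReduce implementation that the jar actually runs:

```python
from collections import Counter

# Map phase: each word becomes a (word, 1) pair.
# Reduce phase: the pairs are summed per word.
# Counter collapses both steps for a single-machine illustration.
text = "hello hadoop hello yarn"
counts = Counter(text.split())
print(counts.most_common())  # [('hello', 2), ('hadoop', 1), ('yarn', 1)]
```

The real job writes its per-word totals to files under /out in HDFS.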
4.3 View Hadoop jobs in the web UI at http://localhost:8088/cluster
5. Stopping Hadoop

```
%HADOOP_PREFIX%\sbin\stop-yarn.cmd
%HADOOP_PREFIX%\sbin\stop-dfs.cmd
```
6. Troubleshooting: if you hit problems, the solutions below cover all the pitfalls:
6.1 The following command fails:

```
mvn package -Pdist,native-win -DskipTests -Dtar
```

```
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (pre-dist) on project hadoop-project-dist: An Ant BuildException has occured: Execute failed: java.io.IOException: Cannot run program "sh" (in directory "E:\bigdata\hadoop\hadoop-project-dist\target"): CreateProcess error=2, The system cannot find the file specified.
```

The build invokes the Unix sh command, which Windows does not provide by default; installing Unix command-line tools (for example from Cygwin or GnuWin32) and adding them to PATH resolves it.
6.2 '-Xmx512m' is not recognized as an internal or external command
This usually means JAVA_HOME points to a path containing spaces (such as C:\Program Files\Java); set it to the 8.3 short form of the path (PROGRA~1), or install the JDK under a path without spaces.
6.3 A node fails to start with
org.apache.hadoop.io.nativeio.NativeIO$Windows.access0
![org.apache.hadoop.io.nativeio.NativeIO$Windows.access0](../Imgs/HadoopforWindows/log_org.apache.hadoop.io.nativeio.NativeIO$Windows.access0.png)
Fix: download the precompiled Hadoop 2.6.4 bin package and copy its contents into the bin directory of the Hadoop installation.
![org.apache.hadoop.io.nativeio.NativeIO$Windows.access0](../Imgs/HadoopforWindows/log_org解决apache.hadoop.io.nativeio.NativeIO$Windows.access0.png)
6.4 Running wordcount fails:

```
Application application_1467703817455_0002 failed 2 times due to AM Container for appattempt_1467703817455_0002_000002 exited with exitCode: -1000 For more detailed output, check application tracking page:http://DESKTOP-KV0K24Q:8088/proxy/application_1467703817455_0002/Then, click on links to logs of each attempt. Diagnostics: Failed to setup local dir /tmp/hadoop-wxl/nm-local-dir, which was marked as good. Failing this attempt. Failing the application.
```

log_Failed while trying to construct the redirect url to the log server.png
Fix: this is a permissions problem on the local directory; see "Mapreduce error: Failed to setup local dir".
More Hadoop-related information is available on the Hadoop topic page: http://www.linuxidc.com/topicnews.aspx?tid=13
Permanent link to this article: http://www.linuxidc.com/Linux/2016-08/134131.htm