I: HDFS User Guide
1. Standout features of HDFS
- Hadoop, including HDFS, is well suited for distributed storage and distributed processing using commodity hardware. It is fault tolerant, scalable, and extremely simple to expand. MapReduce, well known for its simplicity and applicability to a large set of distributed applications, is an integral part of Hadoop. (Distributed storage.)
- HDFS is highly configurable with a default configuration well suited for many installations. Most of the time, configuration needs to be tuned only for very large clusters. (Sensible default configuration.)
- Hadoop is written in Java and is supported on all major platforms. (Platform portability.)
- Hadoop supports shell-like commands to interact with HDFS directly. (Shell-like operation.)
- The NameNode and DataNodes have built-in web servers that make it easy to check the current status of the cluster. (Built-in web UI for checking the cluster.)
- New features and improvements are regularly implemented in HDFS. The following is a subset of useful features in HDFS:
  - File permissions and authentication. (File permission checks.)
  - Rack awareness: takes a node's physical location into account while scheduling tasks and allocating storage.
  - Safemode: an administrative mode for maintenance. (Safe mode, used for operations work.)
  - fsck: a utility to diagnose the health of the file system and to find missing files or blocks. (File system checker.)
  - fetchdt: a utility to fetch a DelegationToken and store it in a file on the local system.
  - Balancer: a tool to balance the cluster when the data is unevenly distributed among DataNodes.
  - Upgrade and rollback: after a software upgrade, it is possible to roll back to HDFS's state before the upgrade in case of unexpected problems.
  - Secondary NameNode: performs periodic checkpoints of the namespace and helps keep the size of the file containing the log of HDFS modifications within certain limits at the NameNode.
  - Checkpoint node: performs periodic checkpoints of the namespace and helps minimize the size of the log stored at the NameNode containing changes to HDFS. It replaces the role previously filled by the Secondary NameNode, though it is not yet battle hardened. The NameNode allows multiple Checkpoint nodes simultaneously, as long as there are no Backup nodes registered with the system.
  - Backup node: an extension to the Checkpoint node. In addition to checkpointing, it also receives a stream of edits from the NameNode and maintains its own in-memory copy of the namespace, which is always in sync with the active NameNode namespace state. Only one Backup node may be registered with the NameNode at once.
Source: http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
DFSAdmin command (bin/hdfs dfsadmin) options:
- -report: reports basic statistics of HDFS. Some of this information is also available on the NameNode front page. (Status report.)
- -safemode: though usually not required, an administrator can manually enter or leave Safemode. (Toggle Safemode by hand.)
- -finalizeUpgrade: removes the previous backup of the cluster made during the last upgrade. (Deletes the backup from the last cluster upgrade.)
- -refreshNodes: updates the NameNode with the set of DataNodes allowed to connect to it. The NameNode re-reads the DataNode hostnames from the files defined by dfs.hosts and dfs.hosts.exclude. Hosts defined in dfs.hosts are the DataNodes that are part of the cluster. If there are entries in dfs.hosts, only the hosts in it are allowed to register with the NameNode. Entries in dfs.hosts.exclude are DataNodes that need to be decommissioned. A DataNode completes decommissioning when all of its replicas have been replicated to other DataNodes. Decommissioned nodes are not automatically shut down and are not chosen for writing new replicas.
- -printTopology: prints the topology of the cluster. Displays a tree of racks and the DataNodes attached to the racks, as viewed by the NameNode. (Print topology.)
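The include/exclude rules described under -refreshNodes can be modelled in a few lines. This is only a rough sketch of the behaviour as the text states it, not NameNode code: the function name `classify_datanode` and the three status strings are made up for illustration, and the host lists are passed in as plain sets instead of being read from the files named by dfs.hosts and dfs.hosts.exclude.

```python
from typing import Set


def classify_datanode(host: str, include: Set[str], exclude: Set[str]) -> str:
    """Toy model of how a DataNode is treated after -refreshNodes.

    include: hostnames listed in the dfs.hosts file (empty = no restriction)
    exclude: hostnames listed in the dfs.hosts.exclude file
    The returned labels are hypothetical, chosen for this example only.
    """
    # If dfs.hosts has entries, only those hosts may register.
    if include and host not in include:
        return "not_allowed"
    # Hosts in dfs.hosts.exclude are to be decommissioned.
    if host in exclude:
        return "decommission"
    return "in_service"


if __name__ == "__main__":
    include = {"dn1", "dn2", "dn3"}
    exclude = {"dn3"}
    for host in ("dn1", "dn3", "dn9"):
        print(host, classify_datanode(host, include, exclude))
```

With an empty include set every host may register, so only the exclude list matters in that case.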
The start of the checkpoint process on the Secondary NameNode is controlled by two configuration parameters:
- dfs.namenode.checkpoint.period, set to 1 hour by default, specifies the maximum delay between two consecutive checkpoints, and
- dfs.namenode.checkpoint.txns, set to 1 million by default, defines the number of uncheckpointed transactions on the NameNode that will force an urgent checkpoint, even if the checkpoint period has not been reached.
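How the two parameters combine is easy to see in code: a checkpoint is due when either limit is reached. The sketch below is an illustration only, assuming the defaults quoted above (1 hour, 1 million transactions); `should_checkpoint` and its argument names are hypothetical and do not come from Hadoop.

```python
def should_checkpoint(
    seconds_since_last: float,
    uncheckpointed_txns: int,
    period_seconds: int = 3600,      # dfs.namenode.checkpoint.period default (1 hour)
    txn_limit: int = 1_000_000,      # dfs.namenode.checkpoint.txns default
) -> bool:
    """Return True when either checkpoint condition from the text is met."""
    period_elapsed = seconds_since_last >= period_seconds
    too_many_txns = uncheckpointed_txns >= txn_limit
    return period_elapsed or too_many_txns


if __name__ == "__main__":
    # 10 minutes since the last checkpoint, few edits: nothing to do yet.
    print(should_checkpoint(600, 5_000))
    # 10 minutes, but a burst of edits: the transaction limit forces a checkpoint.
    print(should_checkpoint(600, 1_200_000))
```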
When placing new blocks, the NameNode weighs several considerations before choosing the DataNodes that receive them:
- Policy to keep one of the replicas of a block on the same node as the node that is writing the block. (Keep one copy on the node doing the write.)
- Need to spread different replicas of a block across the racks so that the cluster can survive the loss of a whole rack. (Spread the data across racks so that losing an entire rack is tolerable.)
- One of the replicas is usually placed on the same rack as the node writing to the file so that cross-rack network I/O is reduced.
- Spread HDFS data uniformly across the DataNodes in the cluster.
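To make the trade-off between these considerations concrete, here is a toy placement routine. It is not the NameNode's real placement policy, just a sketch that honours the four points in order: writer's node first, then a different rack, then whichever remaining node is least full. The function name, the `rack_of` and `used_fraction` dictionaries, and the node and rack names in the example are all invented for illustration.

```python
from typing import Dict, List, Optional, Set


def choose_replica_nodes(
    writer: str,
    rack_of: Dict[str, str],          # DataNode name -> rack name
    used_fraction: Dict[str, float],  # DataNode name -> disk usage in [0, 1]
    replication: int = 3,
) -> List[str]:
    """Toy replica placement that follows the considerations listed above."""
    chosen: List[str] = []

    def least_used(skip_racks: Optional[Set[str]] = None) -> Optional[str]:
        # Least-full unchosen node, optionally avoiding some racks.
        skip = skip_racks or set()
        pool = [n for n in rack_of if n not in chosen and rack_of[n] not in skip]
        return min(pool, key=lambda n: (used_fraction[n], n)) if pool else None

    while len(chosen) < replication:
        if not chosen:
            # Keep one replica on the writing node when it is a DataNode.
            node = writer if writer in rack_of else least_used()
        elif len({rack_of[n] for n in chosen}) == 1:
            # Everything so far sits on one rack: go off-rack if possible.
            node = least_used({rack_of[chosen[0]]}) or least_used()
        else:
            # Racks are already spread: even out disk usage.
            node = least_used()
        if node is None:
            break  # fewer DataNodes than requested replicas
        chosen.append(node)
    return chosen


if __name__ == "__main__":
    racks = {"dn1": "/rack1", "dn2": "/rack1", "dn3": "/rack2", "dn4": "/rack2"}
    usage = {"dn1": 0.9, "dn2": 0.2, "dn3": 0.5, "dn4": 0.4}
    print(choose_replica_nodes("dn1", racks, usage))
```

In this example the first replica stays on the writer even though dn1 is the fullest node, which is exactly the kind of imbalance the Balancer exists to correct later.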
Source: http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer
Reposted from: https://www.cnblogs.com/skyrim/p/7455503.html