dfsadmin, fsck and balancer

dfsadmin

Runs an HDFS dfsadmin client. The hadoop dfsadmin command supports a number of HDFS administration operations. The bin/hadoop dfsadmin -help command lists all the commands currently supported. For example:

  • -report : reports basic statistics of HDFS. Some of this information is also available on the NameNode front page.
  • -safemode : though usually not required, an administrator can manually enter or leave safe mode.
  • -finalizeUpgrade : removes the backup of the cluster made during the last upgrade.
  • -refreshNodes : updates the set of hosts allowed to connect to the Namenode. Re-reads the config file to update the values defined by dfs.hosts and dfs.hosts.exclude and reads the entries (hostnames) in those files. Each entry not defined in dfs.hosts but present in dfs.hosts.exclude is decommissioned. Each entry defined in dfs.hosts and also in dfs.hosts.exclude is stopped from decommissioning if it has already been marked for decommission. Entries present in neither list are decommissioned.
  • -printTopology : prints the topology of the cluster: a tree of racks and the datanodes attached to those racks, as viewed by the NameNode.
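The read-only commands just listed can be tried safely against a running cluster (the output naturally depends on your cluster's state):

```shell
# List all dfsadmin commands supported by this Hadoop build
bin/hadoop dfsadmin -help

# Basic cluster statistics: capacity, DFS used, live/dead datanodes
bin/hadoop dfsadmin -report

# Check whether the NameNode is currently in safe mode
bin/hadoop dfsadmin -safemode get

# Print the rack / datanode tree as the NameNode sees it
bin/hadoop dfsadmin -printTopology
```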

In Hadoop 1,

Usage: hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>…<dirname>] [-clrQuota <dirname>…<dirname>] [-help [cmd]]

COMMAND_OPTION : Description

  -report : Reports basic filesystem information and statistics.
  -safemode enter | leave | get | wait : Safe mode maintenance command. Safe mode is a Namenode state in which it
    1. does not accept changes to the name space (read-only), and
    2. does not replicate or delete blocks.
    Safe mode is entered automatically at Namenode startup, and is left automatically when the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it can only be turned off manually as well.
  -refreshNodes : Re-reads the hosts and exclude files to update the set of Datanodes that are allowed to connect to the Namenode and those that should be decommissioned or recommissioned.
  -finalizeUpgrade : Finalizes an upgrade of HDFS. Datanodes delete their previous-version working directories, followed by the Namenode doing the same. This completes the upgrade process.
  -upgradeProgress status | details | force : Requests the current distributed upgrade status (basic or detailed), or forces the upgrade to proceed.
  -metasave <filename> : Saves the Namenode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. <filename> will contain one line for each of the following:
    1. Datanodes heartbeating with the Namenode
    2. Blocks waiting to be replicated
    3. Blocks currently being replicated
    4. Blocks waiting to be deleted
  -setQuota <quota> <dirname>…<dirname> : Sets the quota <quota> for each directory <dirname>. The directory quota is a long integer that puts a hard limit on the number of names in the directory tree. Best effort for each directory, with faults reported if
    1. <quota> is not a positive integer, or
    2. the user is not an administrator, or
    3. the directory does not exist or is a file, or
    4. the directory would immediately exceed the new quota.
  -clrQuota <dirname>…<dirname> : Clears the quota for each directory <dirname>. Best effort for each directory, with a fault reported if
    1. the directory does not exist or is a file, or
    2. the user is not an administrator.
    It does not fault if the directory has no quota.
  -help [cmd] : Displays help for the given command, or for all commands if none is specified.
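As an illustration, a few of the Hadoop 1 operations above might be combined like this. The directory paths and quota value are made-up examples, not defaults:

```shell
# Limit each (hypothetical) user tree to at most 10000 names
hadoop dfsadmin -setQuota 10000 /user/alice /user/bob

# Remove those quotas again
hadoop dfsadmin -clrQuota /user/alice /user/bob

# Dump the Namenode's primary data structures to
# ${hadoop.log.dir}/meta.log for offline inspection
hadoop dfsadmin -metasave meta.log
```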

In Hadoop 2,

COMMAND_OPTION : Description

  -report [-live] [-dead] [-decommissioning] : Reports basic filesystem information and statistics. Optional flags may be used to filter the list of displayed DataNodes.
  -safemode enter | leave | get | wait : Safe mode maintenance command. Safe mode is a Namenode state in which it
    1. does not accept changes to the name space (read-only), and
    2. does not replicate or delete blocks.
    Safe mode is entered automatically at Namenode startup, and is left automatically when the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it can only be turned off manually as well.
  -saveNamespace : Saves the current namespace into the storage directories and resets the edit log. Requires safe mode.
  -rollEdits : Rolls the edit log on the active NameNode.
  -restoreFailedStorage true | false | check : Turns on/off the automatic attempt to restore failed storage replicas. If a failed storage becomes available again, the system will attempt to restore edits and/or the fsimage during checkpoint. The 'check' option returns the current setting.
  -refreshNodes : Re-reads the hosts and exclude files to update the set of Datanodes that are allowed to connect to the Namenode and those that should be decommissioned or recommissioned.
  -setStoragePolicy <path> <policyName> : Sets a storage policy on a file or a directory.
  -getStoragePolicy <path> : Gets the storage policy of a file or a directory.
  -finalizeUpgrade : Finalizes an upgrade of HDFS. Datanodes delete their previous-version working directories, followed by the Namenode doing the same. This completes the upgrade process.
  -metasave <filename> : Saves the Namenode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. <filename> is overwritten if it exists, and will contain one line for each of the following:
    1. Datanodes heartbeating with the Namenode
    2. Blocks waiting to be replicated
    3. Blocks currently being replicated
    4. Blocks waiting to be deleted
  -refreshServiceAcl : Reloads the service-level authorization policy file.
  -refreshUserToGroupsMappings : Refreshes user-to-groups mappings.
  -refreshSuperUserGroupsConfiguration : Refreshes superuser proxy groups mappings.
  -refreshCallQueue : Reloads the call queue from config.
  -refresh <host:ipc_port> <key> [arg1..argn] : Triggers a runtime refresh of the resource specified by <key> on <host:ipc_port>. All further arguments are sent to the host.
  -reconfig <datanode |…> <host:ipc_port> <start|status> : Starts a reconfiguration or gets the status of an ongoing reconfiguration. The second parameter specifies the node type. Currently, only reloading a DataNode's configuration is supported.
  -printTopology : Prints a tree of the racks and their nodes as reported by the Namenode.
  -refreshNamenodes datanodehost:port : For the given datanode, reloads the configuration files, stops serving the removed block pools and starts serving new block pools.
  -deleteBlockPool datanode-host:port blockpoolId [force] : If force is passed, the block pool directory for the given block pool id on the given datanode is deleted along with its contents; otherwise the directory is deleted only if it is empty. The command fails if the datanode is still serving the block pool. Refer to refreshNamenodes to shut down a block pool service on a datanode.
  -setBalancerBandwidth <bandwidth in bytes per second> : Changes the network bandwidth used by each datanode during HDFS block balancing. <bandwidth> is the maximum number of bytes per second that will be used by each datanode. This value overrides the dfs.balance.bandwidthPerSec parameter. NOTE: The new value is not persistent on the DataNode.
  -allowSnapshot <snapshotDir> : Allows snapshots of a directory to be created. If the operation completes successfully, the directory becomes snapshottable.
  -disallowSnapshot <snapshotDir> : Disallows snapshots of a directory from being created. All snapshots of the directory must be deleted before disallowing snapshots.
  -fetchImage <local directory> : Downloads the most recent fsimage from the NameNode and saves it in the specified local directory.
  -shutdownDatanode <datanode_host:ipc_port> [upgrade] : Submits a shutdown request for the given datanode.
  -getDatanodeInfo <datanode_host:ipc_port> : Gets information about the given datanode.
  -triggerBlockReport [-incremental] <datanode_host:ipc_port> : Triggers a block report for the given datanode. If -incremental is specified, it will be an incremental block report; otherwise, it will be a full block report.
  -help [cmd] : Displays help for the given command, or for all commands if none is specified.
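A typical Hadoop 2 maintenance sequence, sketched under the assumption of an already-running cluster (the local backup directory is illustrative):

```shell
# Checkpoint the namespace by hand: enter safe mode, persist the
# current namespace, then leave safe mode again
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
hdfs dfsadmin -safemode leave

# Keep an off-node copy of the latest fsimage (directory is an example)
hdfs dfsadmin -fetchImage /tmp/fsimage-backup

# Cap balancer traffic at 10 MB/s per datanode (not persisted on restart)
hdfs dfsadmin -setBalancerBandwidth 10485760
```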

fsck

Runs the HDFS filesystem checking utility. HDFS supports the fsck command to check for various inconsistencies. It is designed to report problems with files, for example missing or under-replicated blocks. Unlike a traditional fsck utility for native file systems, this command does not correct the errors it detects; the NameNode automatically corrects most recoverable failures on its own. fsck is not a Hadoop shell command: it is run as 'bin/hadoop fsck', either on the whole file system or on a subset of files. By default, fsck does not operate on files still open for write by another client, but it provides an option to include them during reporting.

In Hadoop 1,

Usage: hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]

COMMAND_OPTION : Description

  <path> : Start checking from this path.
  -move : Move corrupted files to /lost+found.
  -delete : Delete corrupted files.
  -openforwrite : Print out files opened for write.
  -files : Print out files being checked.
  -blocks : Print out the block report.
  -locations : Print out locations for every block.
  -racks : Print out network topology for data-node locations.
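For example, the options above can be combined; as the usage line shows, -blocks, -locations and -racks only take effect together with -files:

```shell
# Check the whole filesystem, listing each file with its blocks
# and the locations of the datanodes holding them
bin/hadoop fsck / -files -blocks -locations

# List files currently open for write under a hypothetical user tree
bin/hadoop fsck /user/alice -openforwrite
```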

In Hadoop 2,

COMMAND_OPTION : Description

  <path> : Start checking from this path.
  -delete : Delete corrupted files.
  -files : Print out files being checked.
  -files -blocks : Print out the block report.
  -files -blocks -locations : Print out locations for every block.
  -files -blocks -racks : Print out network topology for data-node locations.
  -includeSnapshots : Include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it.
  -list-corruptfileblocks : Print out a list of missing blocks and the files they belong to.
  -move : Move corrupted files to /lost+found.
  -openforwrite : Print out files opened for write.
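In Hadoop 2 the same checks run through the hdfs entry point; for instance (the /data path is an example):

```shell
# Quickly list missing blocks and the files they belong to
hdfs fsck / -list-corruptfileblocks

# Deep check of one subtree, following snapshots as well
hdfs fsck /data -includeSnapshots -files -blocks -racks
```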

balancer

Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the rebalancing process.

HDFS data might not always be placed uniformly across the DataNodes. One common reason is the addition of new DataNodes to an existing cluster. While placing new blocks (data for a file is stored as a series of blocks), the NameNode considers various parameters before choosing the DataNodes to receive these blocks. Some of the considerations are:

  • Policy to keep one of the replicas of a block on the same node as the node that is writing the block.
  • Need to spread different replicas of a block across the racks so that the cluster can survive the loss of a whole rack.
  • One of the replicas is usually placed on the same rack as the node writing to the file so that cross-rack network I/O is reduced.
  • Spread HDFS data uniformly across the DataNodes in the cluster.

Due to multiple competing considerations, data might not be uniformly placed across the DataNodes. HDFS provides a tool for administrators that analyzes block placement and rebalances data across the DataNodes.

In Hadoop 1,

Usage: hadoop balancer [-threshold <threshold>]

COMMAND_OPTION : Description

  -threshold <threshold> : Percentage of disk capacity. This overwrites the default threshold.
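For example, to consider the cluster balanced only when every datanode's utilization is within 5 percentage points of the cluster average (10 is the usual default):

```shell
# Run until balanced to within 5%; press Ctrl-C to stop early
hadoop balancer -threshold 5
```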

In Hadoop 2,

Usage: hdfs balancer [-threshold <threshold>] [-policy <policy>] [-exclude [-f <hosts-file> | <comma-separated list of hosts>]] [-include [-f <hosts-file> | <comma-separated list of hosts>]] [-idleiterations <idleiterations>]

COMMAND_OPTION : Description

  -policy <policy> : datanode (default): the cluster is balanced if each datanode is balanced. blockpool: the cluster is balanced if each block pool in each datanode is balanced.
  -threshold <threshold> : Percentage of disk capacity. This overwrites the default threshold.
  -exclude [-f <hosts-file> | <comma-separated list of hosts>] : Excludes the specified datanodes from being balanced by the balancer.
  -include [-f <hosts-file> | <comma-separated list of hosts>] : Includes only the specified datanodes to be balanced by the balancer.
  -idleiterations <iterations> : Maximum number of idle iterations before exit. This overwrites the default of 5.
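Putting the Hadoop 2 options together (the hosts-file path is a made-up example):

```shell
# Balance each block pool to within 10%, skipping the datanodes
# listed in the (hypothetical) exclude file
hdfs balancer -threshold 10 -policy blockpool -exclude -f /tmp/excluded-hosts

# Stop after 3 consecutive iterations that move no data
hdfs balancer -idleiterations 3
```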