{"id":20958,"date":"2013-05-11T11:09:05","date_gmt":"2013-05-11T05:39:05","guid":{"rendered":"http:\/\/vskills.in\/certification\/tutorial\/?p=20958"},"modified":"2024-04-12T14:16:18","modified_gmt":"2024-04-12T08:46:18","slug":"debugging-and-profiling","status":"publish","type":"page","link":"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/","title":{"rendered":"Hadoop &#038; Mapreduce Tutorial | Debugging and Profiling"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\">Debugging and Profiling<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Profiling<\/h2>\n\n\n\n<p>Profiling is a utility for collecting a representative sample (2 or 3) of output from the built-in Java profiler for a sample of maps and reduces.<\/p>\n\n\n\n<p>In Hadoop 1, the user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property mapred.task.profile. The value can be set using the API JobConf.setProfileEnabled(boolean). If the value is set to true, task profiling is enabled. The profiler information is stored in the user log directory. By default, profiling is not enabled for the job.<\/p>\n\n\n\n<p>Once profiling is enabled, the user can use the configuration property mapred.task.profile.{maps|reduces} to set the ranges of MapReduce tasks to profile. The value can be set using the API JobConf.setProfileTaskRange(boolean,String). By default, the specified range is 0-2.<\/p>\n\n\n\n<p>The user can also specify the profiler configuration arguments by setting the configuration property mapred.task.profile.params. The value can be specified using the API JobConf.setProfileParams(String). If the string contains a %s, it is replaced with the name of the profiling output file when the task runs. These parameters are passed to the task child JVM on the command line. 
The default value for the profiling parameters is -agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s<\/p>\n\n\n\n<p>In Hadoop 2, the user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property mapreduce.task.profile. The value can be set using the API Configuration.setBoolean(MRJobConfig.TASK_PROFILE, boolean). If the value is set to true, task profiling is enabled. The profiler information is stored in the user log directory. By default, profiling is not enabled for the job.<\/p>\n\n\n\n<p>Once profiling is enabled, the user can use the configuration property mapreduce.task.profile.{maps|reduces} to set the ranges of MapReduce tasks to profile. The value can be set using the API Configuration.set(MRJobConfig.NUM_{MAP|REDUCE}_PROFILES, String). By default, the specified range is 0-2.<\/p>\n\n\n\n<p>The user can also specify the profiler configuration arguments by setting the configuration property mapreduce.task.profile.params. The value can be specified using the API Configuration.set(MRJobConfig.TASK_PROFILE_PARAMS, String). If the string contains a %s, it is replaced with the name of the profiling output file when the task runs. These parameters are passed to the task child JVM on the command line. The default value for the profiling parameters is -agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Debugging<\/h2>\n\n\n\n<p>The MapReduce framework provides a facility to run user-provided scripts for debugging. When a MapReduce task fails, the user can run a debug script, for example to process the task logs. The script is given access to the task&#8217;s stdout and stderr outputs, syslog and jobconf. 
The output from the debug script&#8217;s stdout and stderr is displayed on the console diagnostics and also as part of the job UI.<\/p>\n\n\n\n<p><span lang=\"EN-US\">The user needs to use DistributedCache to distribute and symlink the script file.<\/span><\/p>\n\n\n\n<p><span lang=\"EN-US\">In Hadoop 1, a quick way to submit the debug script is to set values for the properties mapred.map.task.debug.script and mapred.reduce.task.debug.script, for debugging map and reduce tasks respectively. These properties can also be set using the APIs JobConf.setMapDebugScript(String) and JobConf.setReduceDebugScript(String). In streaming mode, a debug script can be submitted with the command-line options -mapdebug and -reducedebug, for debugging map and reduce tasks respectively.<\/span><\/p>\n\n\n\n<p><span lang=\"EN-US\">In Hadoop 2, the debug script is submitted by setting values for the properties mapreduce.map.debug.script and mapreduce.reduce.debug.script, for debugging map and reduce tasks respectively. These properties can also be set using the APIs Configuration.set(MRJobConfig.MAP_DEBUG_SCRIPT, String) and Configuration.set(MRJobConfig.REDUCE_DEBUG_SCRIPT, String). In streaming mode, a debug script can be submitted with the command-line options -mapdebug and -reducedebug, for debugging map and reduce tasks respectively.<\/span><\/p>\n\n\n\n<p><span lang=\"EN-US\">The arguments to the script are the task\u2019s stdout, stderr, syslog and jobconf files. The debug command, run on the node where the MapReduce task failed, is:<\/span><\/p>\n\n\n\n<p><span lang=\"EN-US\">$script $stdout $stderr $syslog $jobconf<\/span><\/p>\n\n\n\n<p><span lang=\"EN-US\">Pipes programs have the C++ program name as a fifth argument for the command. 
Thus, for pipes programs, the command is:<\/span><\/p>\n\n\n\n<p><span lang=\"EN-US\">$script $stdout $stderr $syslog $jobconf $program<\/span><\/p>\n\n\n\n<p><span lang=\"EN-US\">For pipes, a default script is run that processes core dumps under gdb, prints the stack trace and gives information about running threads.<\/span><\/p>\n\n\n\n<p><span lang=\"EN-US\">Hadoop MapReduce provides facilities for the application writer to specify compression for both intermediate map outputs and the job outputs, i.e. the output of the reduces. It also comes bundled with a CompressionCodec implementation for the zlib compression algorithm. The gzip, bzip2, snappy, and lz4 file formats are also supported.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Debugging and Profiling Profiling Profiling is a utility to get a representative (2 or 3) sample of built-in java profiler for a sample of maps and reduces. In Hadoop 1, user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property mapred.task.profile. 
The&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"categories":[65],"tags":[],"class_list":["post-20958","page","type-page","status-publish","hentry","category-hadoop-and-mapreduce"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Hadoop &amp; Mapreduce Tutorial | Debugging and Profiling<\/title>\n<meta name=\"description\" content=\"In Hadoop 1, user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hadoop &amp; Mapreduce Tutorial | Debugging and Profiling\" \/>\n<meta property=\"og:description\" content=\"In Hadoop 1, user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/\" \/>\n<meta property=\"og:site_name\" content=\"Tutorial\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/vskills.in\/\" \/>\n<meta property=\"article:modified_time\" content=\"2024-04-12T08:46:18+00:00\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/\",\"url\":\"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/\",\"name\":\"Hadoop & Mapreduce Tutorial | Debugging and Profiling\",\"isPartOf\":{\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/#website\"},\"datePublished\":\"2013-05-11T05:39:05+00:00\",\"dateModified\":\"2024-04-12T08:46:18+00:00\",\"description\":\"In Hadoop 1, user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property\",\"breadcrumb\":{\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.vskills.in\/certification\/tutorial\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Hadoop &#038; Mapreduce Tutorial | Debugging and Profiling\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/#website\",\"url\":\"https:\/\/www.vskills.in\/certification\/tutorial\/\",\"name\":\"Tutorial\",\"description\":\"Vskills - A initiative in elearning and 
certification\",\"publisher\":{\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.vskills.in\/certification\/tutorial\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/#organization\",\"name\":\"Vskills\",\"url\":\"https:\/\/www.vskills.in\/certification\/tutorial\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.vskills.in\/certification\/tutorial\/wp-content\/uploads\/2017\/07\/vskills-min-logo.jpg\",\"contentUrl\":\"https:\/\/www.vskills.in\/certification\/tutorial\/wp-content\/uploads\/2017\/07\/vskills-min-logo.jpg\",\"width\":73,\"height\":55,\"caption\":\"Vskills\"},\"image\":{\"@id\":\"https:\/\/www.vskills.in\/certification\/tutorial\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/vskills.in\/\",\"https:\/\/x.com\/vskills_in\",\"https:\/\/www.linkedin.com\/company-beta\/1371554\/\",\"https:\/\/www.youtube.com\/channel\/UCMWnscxPwRF_PqXo9B7q_Tw\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Hadoop & Mapreduce Tutorial | Debugging and Profiling","description":"In Hadoop 1, user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/","og_locale":"en_US","og_type":"article","og_title":"Hadoop & Mapreduce Tutorial | Debugging and Profiling","og_description":"In Hadoop 1, user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property","og_url":"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/","og_site_name":"Tutorial","article_publisher":"https:\/\/www.facebook.com\/vskills.in\/","article_modified_time":"2024-04-12T08:46:18+00:00","twitter_misc":{"Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/","url":"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/","name":"Hadoop & Mapreduce Tutorial | Debugging and Profiling","isPartOf":{"@id":"https:\/\/www.vskills.in\/certification\/tutorial\/#website"},"datePublished":"2013-05-11T05:39:05+00:00","dateModified":"2024-04-12T08:46:18+00:00","description":"In Hadoop 1, user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property","breadcrumb":{"@id":"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.vskills.in\/certification\/tutorial\/debugging-and-profiling\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.vskills.in\/certification\/tutorial\/"},{"@type":"ListItem","position":2,"name":"Hadoop &#038; Mapreduce Tutorial | Debugging and Profiling"}]},{"@type":"WebSite","@id":"https:\/\/www.vskills.in\/certification\/tutorial\/#website","url":"https:\/\/www.vskills.in\/certification\/tutorial\/","name":"Tutorial","description":"Vskills - A initiative in elearning and 
certification","publisher":{"@id":"https:\/\/www.vskills.in\/certification\/tutorial\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.vskills.in\/certification\/tutorial\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.vskills.in\/certification\/tutorial\/#organization","name":"Vskills","url":"https:\/\/www.vskills.in\/certification\/tutorial\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.vskills.in\/certification\/tutorial\/#\/schema\/logo\/image\/","url":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-content\/uploads\/2017\/07\/vskills-min-logo.jpg","contentUrl":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-content\/uploads\/2017\/07\/vskills-min-logo.jpg","width":73,"height":55,"caption":"Vskills"},"image":{"@id":"https:\/\/www.vskills.in\/certification\/tutorial\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/vskills.in\/","https:\/\/x.com\/vskills_in","https:\/\/www.linkedin.com\/company-beta\/1371554\/","https:\/\/www.youtube.com\/channel\/UCMWnscxPwRF_PqXo9B7q_Tw"]}]}},"_links":{"self":[{"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/pages\/20958","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/comments?post=20958"}],"version-history":[{"count":7,"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/pages\/20958\/revisions"
}],"predecessor-version":[{"id":127242,"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/pages\/20958\/revisions\/127242"}],"wp:attachment":[{"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/media?parent=20958"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/categories?post=20958"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.vskills.in\/certification\/tutorial\/wp-json\/wp\/v2\/tags?post=20958"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}