Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default job name length for DistributedLzoIndexer #117

Merged
merged 8 commits into from
Sep 24, 2016

Conversation

zman0900
Copy link
Contributor

Truncates the job name to 200 characters by default and provides configuration options for overriding the name or length.

This fixes issue #25.

Truncates the job name to 200 characters by default and provides configuration options for overriding the name or length.
@zman0900 zman0900 changed the title Default job name lenght for DistributedLzoIndexer Default job name length for DistributedLzoIndexer Aug 24, 2016
Copy link
Collaborator

@sjlee sjlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. The change looks good. One nit: it would be great if you could write a small unit test to test setJobName().

@zman0900
Copy link
Contributor Author

I added some tests and also ability to restore the old behavior by setting max length <= 0.

Copy link
Collaborator

@sjlee sjlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks pretty good overall. Thanks for adding an option to disable it too.

My comments are mostly minor...

assertEquals(expected, job.getJobName());
}

public void testCustomeName() throws Exception {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: CustomeName -> CustomName

static void setJobName(Job job, String[] args) {
final Configuration conf = job.getConfiguration();

String name = conf.get(JOB_NAME_KEY, "Distributed Lzo Indexer " + Arrays.toString(args));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we define a constant for "Distributed Lzo Indexer" so that it can be reused by the test?

assertEquals(customName, job.getJobName());
}

public void testCustomeNameTruncation() throws Exception {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same typo

assertEquals("12345...", job.getJobName());
}

public void testDefaultLengthTruncation() throws Exception {
Copy link
Collaborator

@sjlee sjlee Sep 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really think it's important to test the default length. It should be sufficient between the custom name test and custom length test IMO. I'm OK with dropping this test. The fact that this test needs to rely on the default value (200) implicitly is also bit of a drawback (should the default change in the future).

Copy link
Collaborator

@sjlee sjlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! Let's use the constants in place of literals.

Job job = new Job(conf);
DistributedLzoIndexer.setJobName(job, args);

String expected = "Distributed Lzo Indexer [hdfs://cluster/user/test/...";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the constant here too.

Job job = new Job(conf);
DistributedLzoIndexer.setJobName(job, args);

String expected = "Distributed Lzo Indexer [hdfs://cluster/user/test/output/file-m-00000.lzo, hdfs://cluster/user/test/output/file-m-00001.lzo, hdfs://cluster/user/test/output/file-m-00002.lzo, hdfs://cluster/user/test/output/file-m-00003.lzo, hdfs://cluster/user/test/output/file-m-00003.lzo]";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

String customName = "-<Custom Job Name>-";

Configuration conf = new Configuration(false);
conf.set("lzo.indexer.distributed.job.name", customName);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the constant.

};

Configuration conf = new Configuration(false);
conf.set("lzo.indexer.distributed.job.name", "123456789");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the constants in these 2 lines.

};

Configuration conf = new Configuration(false);
conf.setInt("lzo.indexer.distributed.job.name.max.length", 50);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the constant.

};

Configuration conf = new Configuration(false);
conf.setInt("lzo.indexer.distributed.job.name.max.length", 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the constant.

Copy link
Collaborator

@sjlee sjlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for your contribution!

@sjlee sjlee merged commit 41400f3 into twitter:master Sep 24, 2016
@zman0900 zman0900 deleted the dist_indexer_job_name branch September 24, 2016 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants