TaskContextNotFoundException when specifying a single-valued array in a discriminator #48

Open
GoogleCodeExporter opened this issue May 14, 2015 · 3 comments

Comments

@GoogleCodeExporter

This works: 

Map<String, Object> dimReaders = new HashMap<String, Object>();
dimReaders.put(DIM_READER_TRAIN, TwentyNewsgroupsCorpusReader.class);
dimReaders.put(DIM_READER_TRAIN_PARAMS,
   Arrays.asList(new Object[] {...,
   TwentyNewsgroupsCorpusReader.PARAM_PATTERNS, new String[] { INCLUDE_PREFIX + "*.xml", INCLUDE_PREFIX + "*.xml.gz" } }));
Dimension.createBundle("readers", dimReaders);

This doesn't work:

Map<String, Object> dimReaders = new HashMap<String, Object>();
dimReaders.put(DIM_READER_TRAIN, TwentyNewsgroupsCorpusReader.class);
dimReaders.put(DIM_READER_TRAIN_PARAMS,
   Arrays.asList(new Object[] {...,
   TwentyNewsgroupsCorpusReader.PARAM_PATTERNS, new String[] { INCLUDE_PREFIX + "*/*.txt" } }));
Dimension.createBundle("readers", dimReaders);

To reproduce: go to 
de.tudarmstadt.ukp.dkpro.tc.examples.single.document.TwentyNewsgroupsDemo.java 
in the de.tudarmstadt.ukp.dkpro.tc.examples-gpl module of DKPro-TC, and add 
"new String[]" to the TwentyNewsgroupsCorpusReader.PARAM_PATTERNS parameter of 
the readers. Given the misuse of the PATTERNS parameter, this issue may not be a 
big deal, but I'm keeping it here for the record.

Original issue reported on code.google.com by [email protected] on 20 May 2014 at 7:15

@GoogleCodeExporter

The problem is the way the String array is converted to a String 
representation (see '[Ljava.lang.String;@14bea80e' in the log output below):

15:18:37,823  INFO BatchTaskCrossValidation:158 - [readerTrainParams]: 
[[sourceLocation, src/main/resources/data/twentynewsgroups/bydate-train, 
language, en, patterns, [Ljava.lang.String;@14bea80e]]

The whole thing works nicely if you use "asList(...)" instead of "String[] 
{...}":

TwentyNewsgroupsCorpusReader.PARAM_PATTERNS, asList(INCLUDE_PREFIX + "*/*.txt") }));
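
To make the difference between the two forms concrete, here is a minimal, self-contained sketch in plain Java. It does not use DKPro Lab at all; the class name NestedArrayToString is made up for illustration, and the literal "patterns" stands in for PARAM_PATTERNS. It only shows what List.toString() does with a nested array versus a nested list.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of DKPro Lab or DKPro TC.
final class NestedArrayToString {

    // A String[] nested inside a list: List.toString() falls back to
    // Object.toString() for the array, which prints the type and identity hash.
    static String withArray() {
        List<Object> params = Arrays.asList(
                new Object[] { "patterns", new String[] { "*/*.txt" } });
        return params.toString();
    }

    // The same value wrapped with Arrays.asList: the nested list prints its elements.
    static String withList() {
        List<Object> params = Arrays.asList(
                new Object[] { "patterns", Arrays.asList("*/*.txt") });
        return params.toString();
    }

    public static void main(String[] args) {
        // Prints something like [patterns, [Ljava.lang.String;@14bea80e];
        // the hash part differs from run to run.
        System.out.println(withArray());
        // Prints [patterns, [*/*.txt]]
        System.out.println(withList());
    }
}
```

Because the identity hash in the array form changes between JVM runs, that string is not a stable description of the parameter value, while the list form is.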

The method that the Lab uses to convert parameter values to Strings is 
lab.Util.toString(Object). It can handle arrays at the top level, but otherwise 
largely relies on the toString() method of the parameter values themselves.

I'm not sure how much can be done here. It may be possible to push this a bit 
further to support more nested parameter values, but in general I would prefer 
to live with the workaround (using asList).
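
For the record, "pushing this a bit further" could look roughly like the sketch below. This is not the code of lab.Util.toString(Object); DeepParamToString and render are hypothetical names, and the sketch only assumes that arrays and collections should be rendered element by element at any nesting depth.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not the actual Lab implementation.
final class DeepParamToString {

    // Renders arrays and collections recursively; everything else uses toString().
    static String render(Object value) {
        if (value == null) {
            return "null";
        }
        if (value.getClass().isArray()) {
            // java.lang.reflect.Array also handles primitive arrays such as int[].
            List<String> parts = new ArrayList<>();
            for (int i = 0; i < Array.getLength(value); i++) {
                parts.add(render(Array.get(value, i)));
            }
            return parts.toString();
        }
        if (value instanceof Collection) {
            List<String> parts = new ArrayList<>();
            for (Object element : (Collection<?>) value) {
                parts.add(render(element));
            }
            return parts.toString();
        }
        return value.toString();
    }
}
```

With this, a list containing new String[] { "*/*.txt" } and a list containing asList("*/*.txt") would both render as [patterns, [*/*.txt]]. It only addresses the string rendering, though, so it may not be sufficient on its own.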

What do you think?

Original comment by richard.eckart on 20 May 2014 at 1:31

@GoogleCodeExporter

Thanks for the hint. The first snippet in my original post was working by 
accident, i.e. it also created [Ljava.lang.String;@14bea80e] for PARAM_PATTERNS, 
but that didn't prevent the reader from reading the files it was supposed to read. 
In the second snippet (used in a different project), it apparently did cause a 
problem. However, the problem did not show up where I would have expected it 
(namely, in the reader), but only later on, finally resulting in a 
TaskContextNotFoundException. 

That said, it might be a good idea to catch such misconfigurations earlier and 
throw a nicer exception. But that is definitely something that can slip into a 
later milestone.
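
Just to illustrate what "catching it earlier" might mean: a check like the following could run when the dimension is created. ParamValidation and rejectNestedArrays are hypothetical names; nothing like this exists in the Lab as far as this issue shows, and the check deliberately only looks at the direct elements of the parameter list.

```java
import java.util.List;

// Hypothetical validation helper, not part of DKPro Lab or DKPro TC.
final class ParamValidation {

    // Fails fast if any direct element of the parameter list is an array,
    // so the user sees a clear message instead of a late TaskContextNotFoundException.
    static void rejectNestedArrays(String dimensionName, List<?> params) {
        for (int i = 0; i < params.size(); i++) {
            Object value = params.get(i);
            if (value != null && value.getClass().isArray()) {
                throw new IllegalArgumentException(
                        "Dimension [" + dimensionName + "] contains an array of type "
                        + value.getClass().getComponentType().getName()
                        + " at position " + i
                        + "; use Arrays.asList(...) instead");
            }
        }
    }
}
```

Called with the parameter list from the failing snippet, this would throw right away and name the offending position.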

Original comment by [email protected] on 20 May 2014 at 2:12

  • Changed state: WontFix

@GoogleCodeExporter

Further diagnosis shows that the problem is caused by the Spring type 
conversion.

We set the discriminator field using a Spring DirectFieldAccessor which 
internally invokes TypeConverterDelegate.convertIfNecessary (Spring 
3.1.2.RELEASE line 188). In that line, an array with a single value is 
converted into simply the first value (the array is removed):

if (convertedValue.getClass().isArray() && Array.getLength(convertedValue) == 1) {
  convertedValue = Array.get(convertedValue, 0);
}

As a result, the value in the discriminator field (which is written to the 
DISCRIMINATORS.txt file) and the value in the parameter space (which is later 
matched against the DISCRIMINATORS.txt) are different. Thus, the context cannot 
be found.
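
The effect can be reproduced without Spring. The sketch below copies the unwrapping rule quoted above into a plain Java method and compares string forms before and after. SingleValueUnwrapDemo, unwrapSingleElementArray and describe are hypothetical names; describe is only a stand-in for "some string rendering that handles top-level arrays" and is not the Lab's actual code path.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

// Hypothetical demo class; it mimics the quoted Spring rule, it does not call Spring.
final class SingleValueUnwrapDemo {

    // Same condition as the quoted snippet: a one-element array becomes its element.
    static Object unwrapSingleElementArray(Object convertedValue) {
        if (convertedValue.getClass().isArray() && Array.getLength(convertedValue) == 1) {
            convertedValue = Array.get(convertedValue, 0);
        }
        return convertedValue;
    }

    // Stand-in string rendering: object arrays print their elements.
    static String describe(Object value) {
        return value instanceof Object[]
                ? Arrays.toString((Object[]) value)
                : String.valueOf(value);
    }

    public static void main(String[] args) {
        String[] single = { "*/*.txt" };
        String[] two = { "*.xml", "*.xml.gz" };

        // Single-valued array: the two sides no longer match.
        System.out.println(describe(single));                           // [*/*.txt]
        System.out.println(describe(unwrapSingleElementArray(single))); // */*.txt

        // Two-valued array: left untouched, both sides still match.
        System.out.println(describe(two));                              // [*.xml, *.xml.gz]
        System.out.println(describe(unwrapSingleElementArray(two)));    // [*.xml, *.xml.gz]
    }
}
```

This is consistent with the original report: the two-pattern reader configuration went through, while the single-pattern one ended up with different values on the two sides of the comparison.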

A fix might involve using the Spring conversion service in the 
lab.Util.toString(Object) method too.

Original comment by richard.eckart on 20 May 2014 at 3:30

  • Changed state: Accepted
