I try to use an open source namely BlazingCache http://blazingcache.org/ to implement a coordinator cache idea for my application.
So I just use WordCount example https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v2.0 to test this cache library. Here is my whole code:
public class WordCount2 {
public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable>{
//...
private static Cache<String, String> cache;
@Override
public void setup(Context context) throws IOException,
InterruptedException {
//...
initCache();
}
private void initCache() {
CachingProvider provider = Caching.getCachingProvider();
Properties properties = new Properties();
properties.put("blazingcache.mode","clustered");
properties.put("blazingcache.zookeeper.connectstring","localhost:1281");
properties.put("blazingcache.zookeeper.sessiontimeout","40000");
properties.put("blazingcache.zookeeper.path","/blazingcache");
CacheManager cacheManager = provider.getCacheManager(provider.getDefaultURI(), provider.getDefaultClassLoader(), properties);
MutableConfiguration<String, String> cacheConfiguration = new MutableConfiguration<>();
cache = cacheManager.createCache("example", cacheConfiguration);
}
@Override
public void map(Object key, Text value, Context context
) throws IOException, InterruptedException {
//...
cache.put(word.toString(), one.toString());
}
}
}
//...
}
The problem is at line:
cache.put(word.toString(), one.toString());
in map function.
When this line is inserted to the code, the performance of whole job degrade suddenly. (I'm using Eclipse to run the WordCount example in local mode).
Why is this happening and how can I fix it?
Aucun commentaire:
Enregistrer un commentaire