public class HadoopWordCount extends Object
For more details about the word count pipeline itself, see the JavaDoc for the WordCount class in the wordcount sample.
HdfsSources.hdfs(JobConf, BiFunctionEx)
is a source that reads from HDFS, given a JobConf
with the input paths and input format. The files in the input folder
are split among Jet processors using InputSplits.
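The source setup can be sketched as follows. This is a minimal illustration, not the sample's actual code; it assumes the Jet 3.x `com.hazelcast.jet.hadoop.HdfsSources` API together with the standard Hadoop `org.apache.hadoop.mapred` classes, and the HDFS path is hypothetical:

```java
import com.hazelcast.jet.hadoop.HdfsSources;
import com.hazelcast.jet.pipeline.BatchSource;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

public class HdfsSourceSketch {

    static BatchSource<String> lineSource() {
        // The JobConf carries the input path and the input format;
        // Jet uses the format's InputSplits to parallelize reading
        // across processors.
        JobConf jobConf = new JobConf();
        jobConf.setInputFormat(TextInputFormat.class);
        FileInputFormat.addInputPath(jobConf, new Path("hdfs://namenode/input")); // hypothetical path

        // TextInputFormat produces (byte offset, line) pairs; the
        // projection function keeps only the line text.
        return HdfsSources.hdfs(jobConf, (offset, line) -> line.toString());
    }
}
```

The `BiFunctionEx` projection lets each processor map the raw key/value pair coming out of the input format into the item the rest of the pipeline works with.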
HdfsSinks.hdfs(JobConf, FunctionEx, FunctionEx)
writes the output to the given output path, with each
processor writing to a single file within the path. The files are
identified by the member ID and the local ID of the writing processor.
Unlike in MapReduce, the data in the output files is not sorted by key.
In this example, files are read using TextInputFormat
and written using TextOutputFormat,
but the example can be adjusted to work with any input/output format.
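Putting the pieces together, an end-to-end word count over HDFS might look like the sketch below. This is an illustration under stated assumptions, not the sample's verbatim source: it assumes the Jet 3.x pipeline API (`Pipeline.drawFrom`/`drainTo`), the `HdfsSources`/`HdfsSinks` factories named above, and hypothetical input/output paths:

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.Traversers;
import com.hazelcast.jet.hadoop.HdfsSinks;
import com.hazelcast.jet.hadoop.HdfsSources;
import com.hazelcast.jet.pipeline.Pipeline;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

import java.util.Map.Entry;
import java.util.regex.Pattern;

import static com.hazelcast.jet.aggregate.AggregateOperations.counting;

public class HadoopWordCountSketch {

    private static final Pattern DELIMITER = Pattern.compile("\\W+");

    public static void main(String[] args) {
        // One JobConf configures both sides: input path/format for the
        // source, output path/format for the sink.
        JobConf jobConf = new JobConf();
        jobConf.setInputFormat(TextInputFormat.class);
        jobConf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.addInputPath(jobConf, new Path("hdfs://namenode/input"));    // hypothetical
        FileOutputFormat.setOutputPath(jobConf, new Path("hdfs://namenode/output")); // hypothetical

        Pipeline p = Pipeline.create();
        p.drawFrom(HdfsSources.<LongWritable, Text>hdfs(jobConf))
         // Split each line into lowercase words.
         .flatMap(e -> Traversers.traverseArray(
                 DELIMITER.split(e.getValue().toString().toLowerCase())))
         .filter(word -> !word.isEmpty())
         // Count occurrences per distinct word.
         .groupingKey(word -> word)
         .aggregate(counting())
         // Each processor writes its share of (word, count) entries to
         // its own file under the output path; output is not sorted by key.
         .drainTo(HdfsSinks.hdfs(jobConf, Entry::getKey, Entry::getValue));

        JetInstance jet = Jet.newJetInstance();
        try {
            jet.newJob(p).join();
        } finally {
            Jet.shutdownAll();
        }
    }
}
```

Swapping `TextInputFormat`/`TextOutputFormat` for another Hadoop input/output format pair only changes the `JobConf` setup and the type parameters on the source; the rest of the pipeline is unaffected.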
| Constructor and Description |
|---|
| HadoopWordCount() |
Copyright © 2019 Hazelcast, Inc. All rights reserved.