public class TextDataConvertor extends AbstractDataConvertor
datetimeMatrix, preferenceMatrix, sparseTensor
PROGRESS_INTERVAL
Constructor and Description |
---|
TextDataConvertor(java.lang.String inputDataPath) Initializes a newly created TextDataConvertor object with the path of the input data file. |
TextDataConvertor(java.lang.String dataColumnFormat, java.lang.String inputDataPath) Initializes a newly created TextDataConvertor object with the path and format of the input data file. |
TextDataConvertor(java.lang.String dataColumnFormat, java.lang.String inputDataPath, double binThold) Initializes a newly created TextDataConvertor object with the path and format of the input data file. |
TextDataConvertor(java.lang.String dataColumnFormat, java.lang.String inputDataPath, double binThold, com.google.common.collect.BiMap<java.lang.String,java.lang.Integer> userIds, com.google.common.collect.BiMap<java.lang.String,java.lang.Integer> itemIds) Initializes a newly created TextDataConvertor object with the path and format of the input data file. |
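The constructors that take binThold binarize ratings against that threshold: a rating greater than the threshold becomes 1, any other rating becomes 0, and a negative threshold keeps the original value. The following is a minimal sketch of that rule only, not the library's implementation; `BinarizeSketch` and `binarize` are hypothetical names introduced for illustration.

```java
/** Hypothetical helper, not part of the documented class: mirrors the binThold rule. */
final class BinarizeSketch {

    /**
     * Applies the documented rule: a rating greater than the threshold becomes 1,
     * any other rating becomes 0. A negative threshold leaves the rating unchanged.
     */
    static double binarize(double rating, double binThold) {
        if (binThold < 0) {
            return rating; // binarization disabled, keep the original rating
        }
        return rating > binThold ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(binarize(4.0, 3.0));  // rating above the threshold
        System.out.println(binarize(3.0, 3.0));  // rating equal to the threshold, so not greater
        System.out.println(binarize(2.5, -1.0)); // negative threshold keeps the rating
    }
}
```

Note that a rating exactly equal to the threshold maps to 0, because the rule is "greater than", not "greater than or equal to".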
Modifier and Type | Method and Description |
---|---|
double | getDataFileRate() Return the ratio alreadyLoaded/allData for one file. |
double | getFilePathRate() Return the rate of loading files in the data directory. |
int | getItemId(java.lang.String rawId) Return an item's inner id given its raw id. |
com.google.common.collect.BiMap<java.lang.String,java.lang.Integer> | getItemIds() Return the item {raw id, inner id} mappings. |
double | getLoadAllFileRate() Return the ratio alreadyLoaded/allData for all files. |
int | getUserId(java.lang.String rawId) Return a user's inner id given the raw id. |
com.google.common.collect.BiMap<java.lang.String,java.lang.Integer> | getUserIds() Return the user {raw id, inner id} mappings. |
int | numItems() Return the number of items. |
int | numUsers() Return the number of users. |
void | processData() Process the input data. |
void | progress() Set the progress for job status. |
void | setTimeUnit(java.util.concurrent.TimeUnit timeUnit) Set the time unit of the data file. |
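setTimeUnit declares which unit the timestamps in the data file use. This page does not say how the unit is applied internally; the sketch below assumes the goal is to bring raw timestamps to a common unit (milliseconds here) and relies only on the standard `java.util.concurrent.TimeUnit.toMillis` method. `TimeUnitSketch` and its `toMillis` helper are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: normalizes raw timestamps to milliseconds (an assumption). */
final class TimeUnitSketch {

    /** Converts a raw timestamp expressed in the given unit to milliseconds. */
    static long toMillis(long rawTimestamp, TimeUnit unit) {
        return unit.toMillis(rawTimestamp);
    }

    public static void main(String[] args) {
        // A file that stores timestamps in seconds.
        System.out.println(toMillis(90L, TimeUnit.SECONDS));
        // A file that already stores milliseconds is left unchanged.
        System.out.println(toMillis(90L, TimeUnit.MILLISECONDS));
    }
}
```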
getDatetimeMatrix, getPreferenceMatrix, getSparseTensor
getJobStatus, progressx, run
public TextDataConvertor(java.lang.String inputDataPath)
Initializes a newly created TextDataConvertor object with the path of the input data file.
Parameters:
inputDataPath - the path of the input data file
public TextDataConvertor(java.lang.String dataColumnFormat, java.lang.String inputDataPath)
Initializes a newly created TextDataConvertor object with the path and format of the input data file.
Parameters:
dataColumnFormat - the format of the input data file
inputDataPath - the path of the input data file
public TextDataConvertor(java.lang.String dataColumnFormat, java.lang.String inputDataPath, double binThold)
Initializes a newly created TextDataConvertor object with the path and format of the input data file.
Parameters:
dataColumnFormat - the format of the input data file
inputDataPath - the path of the input data file
binThold - the threshold used to binarize a rating. If a rating is greater than the threshold, the value will be 1; otherwise 0. To disable binarization, i.e., keep the original rating value, set the threshold to a negative value
public TextDataConvertor(java.lang.String dataColumnFormat, java.lang.String inputDataPath, double binThold, com.google.common.collect.BiMap<java.lang.String,java.lang.Integer> userIds, com.google.common.collect.BiMap<java.lang.String,java.lang.Integer> itemIds)
Initializes a newly created TextDataConvertor object with the path and format of the input data file.
Parameters:
dataColumnFormat - the format of the input data file
inputDataPath - the path of the input data file
binThold - the threshold used to binarize a rating. If a rating is greater than the threshold, the value will be 1; otherwise 0. To disable binarization, i.e., keep the original rating value, set the threshold to a negative value
userIds - userId to userIndex map
itemIds - itemId to itemIndex map
public void processData() throws java.io.IOException
Throws:
java.io.IOException - if the inputDataPath is not valid.
public void progress()
public double getFilePathRate()
Returns:
loadFilePathRate
public double getDataFileRate()
Returns:
loadDataFileRate
public double getLoadAllFileRate()
Returns:
loadAllFileRate
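The three rate getters report loading progress at different granularities: within one file, across the files of a data directory, and across all data. The sketch below shows one way such ratios could be tracked. It assumes that progress is measured in bytes and that the file rate counts completed files, neither of which this page confirms; `LoadProgressSketch` and all of its members are hypothetical.

```java
/** Hypothetical progress tracker; units (bytes) and field names are assumptions. */
final class LoadProgressSketch {
    private final long[] fileSizes; // size of each file to load
    private final long totalSize;   // sum of all file sizes
    private int fileIndex = 0;      // index of the file currently being read
    private long loadedInFile = 0;  // amount read from the current file
    private long loadedInTotal = 0; // amount read from all files so far

    LoadProgressSketch(long... fileSizes) {
        this.fileSizes = fileSizes.clone();
        long sum = 0;
        for (long size : fileSizes) {
            sum += size;
        }
        this.totalSize = sum;
    }

    /** Records that some data was read from the current file. */
    void read(long amount) {
        loadedInFile += amount;
        loadedInTotal += amount;
    }

    /** Moves on to the next file. */
    void nextFile() {
        fileIndex++;
        loadedInFile = 0;
    }

    /** alreadyLoaded / allData within the current file. */
    double dataFileRate() {
        if (fileIndex >= fileSizes.length || fileSizes[fileIndex] == 0) {
            return 1.0; // nothing left to read
        }
        return (double) loadedInFile / fileSizes[fileIndex];
    }

    /** Completed files / total files. */
    double filePathRate() {
        return fileSizes.length == 0 ? 1.0 : (double) fileIndex / fileSizes.length;
    }

    /** alreadyLoaded / allData across all files. */
    double loadAllFileRate() {
        return totalSize == 0 ? 1.0 : (double) loadedInTotal / totalSize;
    }

    public static void main(String[] args) {
        LoadProgressSketch progress = new LoadProgressSketch(100, 300);
        progress.read(50);
        System.out.println(progress.dataFileRate());
        System.out.println(progress.filePathRate());
        System.out.println(progress.loadAllFileRate());
    }
}
```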
public int numUsers()
public int numItems()
public int getUserId(java.lang.String rawId)
Parameters:
rawId - raw user id as String
public int getItemId(java.lang.String rawId)
Parameters:
rawId - raw item id as String
public com.google.common.collect.BiMap<java.lang.String,java.lang.Integer> getUserIds()
Returns:
userIds
public com.google.common.collect.BiMap<java.lang.String,java.lang.Integer> getItemIds()
Returns:
itemIds
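getUserId, getItemId, getUserIds and getItemIds all revolve around a mapping from raw String ids to integer inner ids. The sketch below illustrates the idea with a plain `java.util.HashMap` so that it stays free of dependencies; the documented class exposes Guava `BiMap` instances instead. Assigning consecutive indices starting at 0 in order of first appearance is an assumption, and `IdMapSketch` with its methods is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical raw-id to inner-id index; stands in for the BiMap used by the real class. */
final class IdMapSketch {
    private final Map<String, Integer> rawToInner = new HashMap<>();

    /** Returns the inner id for rawId, assigning the next free index on first sight. */
    int innerId(String rawId) {
        Integer existing = rawToInner.get(rawId);
        if (existing != null) {
            return existing;
        }
        int next = rawToInner.size(); // indices run 0, 1, 2, ... (assumption)
        rawToInner.put(rawId, next);
        return next;
    }

    /** Number of distinct raw ids seen so far, analogous to numUsers() or numItems(). */
    int size() {
        return rawToInner.size();
    }

    public static void main(String[] args) {
        IdMapSketch users = new IdMapSketch();
        System.out.println(users.innerId("u42"));
        System.out.println(users.innerId("u7"));
        System.out.println(users.innerId("u42")); // same raw id, same inner id
        System.out.println(users.size());
    }
}
```

A `BiMap` additionally supports the reverse lookup from inner id back to raw id through its `inverse()` view, which a plain `HashMap` does not provide.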
public void setTimeUnit(java.util.concurrent.TimeUnit timeUnit)
Parameters:
timeUnit - the time unit to be set for the data file
Copyright © 2017. All Rights Reserved.