tvecs.model_generator package
Submodules
tvecs.model_generator.model_generator module
Generates Word2Vec models for individual languages after their corpora have been preprocessed.
- Preprocessing Corpus: implementation of the BasePreprocessor module
  - HcCorpusPreprocessor
- Word2Vec Model Building
  - Gensim Word2Vec skip-gram implementation
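As a rough illustration of what "skip-gram" refers to, the sketch below enumerates the (center word, context word) pairs that a skip-gram model learns from. It is a simplified, standalone illustration and not the Gensim implementation: the function name `skipgram_pairs` and the fixed `window` parameter are assumptions made for this example, and refinements found in real implementations (such as subsampling of frequent words) are left out.

```python
def skipgram_pairs(sentence, window=2):
    """Yield (center, context) pairs for one tokenized sentence.

    Hypothetical helper for illustration only; not part of tvecs or gensim.
    """
    for i, center in enumerate(sentence):
        # Context positions lie within `window` tokens on either side
        start = max(0, i - window)
        stop = min(len(sentence), i + window + 1)
        for j in range(start, stop):
            if j != i:
                yield center, sentence[j]


pairs = list(skipgram_pairs(["the", "cat", "sat"], window=1))
print(pairs)
```

Each pair is one training example: the model is asked to predict the context word from the center word.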
- tvecs.model_generator.model_generator.construct_model(preprocessed_corpus, language, output_dir_path='.', output_fname=None, iterations=5)
Construct a model from the given preprocessed corpus.
- API Documentation:
- param preprocessed_corpus
Instance of a subclass of BasePreprocessor.
- type preprocessed_corpus
Any class that inherits from tvecs.preprocessor.base_preprocessor
- param language
Language for which model is generated.
- type language
String
- param output_dir_path
Path of the output directory where the model is stored. [ Default: current directory ]
- type output_dir_path
String
- param output_fname
Name of the output file.
- type output_fname
String
- param iterations
Number of iterations for Word2Vec. [ Default value 5 ]
- type iterations
Integer
- return
Constructed Model based on the provided specifications.
- rtype
gensim.models.Word2Vec
See also
gensim.models.Word2Vec
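Because training runs for several iterations, the corpus object handed to construct_model presumably has to be iterable more than once; gensim's Word2Vec accepts any restartable iterable of token lists. The sketch below shows one way such an object could look. `TokenizedCorpus` is a hypothetical stand-in written for this example, not a tvecs class, and whether BasePreprocessor subclasses work exactly this way is an assumption.

```python
import os
import tempfile


class TokenizedCorpus:
    """Hypothetical stand-in for a preprocessed corpus.

    Yields one list of tokens per non-empty line of a text file.
    """

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Reopen the file on every pass so the corpus can be
        # iterated once per training iteration
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                tokens = line.split()
                if tokens:
                    yield tokens


with tempfile.TemporaryDirectory() as tmp:
    corpus_path = os.path.join(tmp, "corpus.txt")
    with open(corpus_path, "w", encoding="utf-8") as handle:
        handle.write("the cat sat\n\non the mat\n")

    corpus = TokenizedCorpus(corpus_path)
    first_pass = list(corpus)
    second_pass = list(corpus)

print(first_pass)
```

A plain generator would be exhausted after the first pass, which is why the sketch implements `__iter__` on a class instead.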
- tvecs.model_generator.model_generator.generate_model(preprocessor_type, language, corpus_fname, corpus_dir_path='.', output_fname=None, output_dir_path='data/models', need_preprocessing=True, iterations=5)
Preprocess the corpus and generate a model from it.
- API Documentation:
- param preprocessor_type
Class name of the preprocessor to use, for example HcCorpusPreprocessor.
- type preprocessor_type
String
- param language
Language for which model is generated.
- type language
String
- param corpus_fname
File name of the corpus.
- type corpus_fname
String
- param corpus_dir_path
Path of the directory containing the corpus. [ Default: current directory ]
- type corpus_dir_path
String
- param output_dir_path
Path of the output directory where the model is stored. [ Default: data/models ]
- type output_dir_path
String
- param output_fname
Name of the output file to be generated.
- type output_fname
String
- param need_preprocessing
Flag forwarded to the preprocessor, indicating whether the corpus still needs preprocessing. [ Default True ]
- type need_preprocessing
Boolean
- param iterations
Number of iterations for Word2Vec. [ Default value 5 ]
- type iterations
Integer
- return
Constructed Model based on the provided specifications.
- rtype
gensim.models.Word2Vec
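A call to generate_model might be assembled as in the sketch below, which collects the documented parameters with their documented defaults. The helper names `generate_model_arguments` and `train`, the corpus file name `corpus.txt`, and the language value `english` are assumptions made for this example; the accepted language strings are not listed here. The tvecs import is deferred into `train` so that the argument helper can be inspected without tvecs installed.

```python
def generate_model_arguments(corpus_fname, language, corpus_dir_path="."):
    """Collect keyword arguments for generate_model (hypothetical helper)."""
    return {
        # Preprocessor class name as a string, per the parameter list above
        "preprocessor_type": "HcCorpusPreprocessor",
        "language": language,
        "corpus_fname": corpus_fname,
        "corpus_dir_path": corpus_dir_path,
        # Remaining values mirror the defaults in the documented signature
        "output_fname": None,
        "output_dir_path": "data/models",
        "need_preprocessing": True,
        "iterations": 5,
    }


def train(arguments):
    """Run generate_model with the collected arguments (requires tvecs)."""
    # Imported lazily: only needed when a model is actually trained
    from tvecs.model_generator.model_generator import generate_model

    return generate_model(**arguments)


# "corpus.txt" and "english" are placeholder values for illustration
arguments = generate_model_arguments("corpus.txt", "english")
print(arguments)
```

With tvecs installed and a corpus file in place, `train(arguments)` would be expected to return the gensim.models.Word2Vec instance described above.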