tvecs.vector_space_mapper package
Submodules
tvecs.vector_space_mapper.vector_space_mapper module
Module to map two Vector Spaces using a bilingual dictionary.
class tvecs.vector_space_mapper.vector_space_mapper.VectorSpaceMapper(model_1, model_2, bilingual_dict, encoding='utf-8')
Bases: object
Class to map two vector spaces onto each other.
The vector spaces are obtained from two Word2Vec models.
A bilingual dictionary is used to map semantic embeddings between the vector spaces.
- The mapping uses linear regression from sklearn.linear_model.
- API Documentation:
- param model_1
Model for Language 1, built using tvecs.model_generator.model_generator.
- param model_2
Model for Language 2, built using tvecs.model_generator.model_generator.
- param bilingual_dict
Bilingual dictionary pairing Language 1 words with Language 2 words.
- param encoding
Encoding used in the corpora.
- type encoding
String
- type model_1
gensim.models.Word2Vec
- type model_2
gensim.models.Word2Vec
- type bilingual_dict
List of (lang1, lang2) tuples
See also
gensim.models.Word2Vec
sklearn.linear_model
scipy.spatial.distance
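The documented shape of bilingual_dict, a list of (lang1, lang2) pairs, suggests how aligned training vectors could be collected from the two models. The following is a minimal sketch under stated assumptions, not the package's implementation: the models are replaced by plain dicts, the helper name paired_vectors is hypothetical, and skipping pairs that are missing from either vocabulary is an assumption.

```python
# Toy stand-ins for the two Word2Vec models: plain dicts mapping word -> vector.
# (Hypothetical data; the real arguments are gensim.models.Word2Vec instances.)
model_1 = {"dog": [1.0, 0.0], "cat": [0.0, 1.0], "house": [1.0, 1.0]}
model_2 = {"perro": [0.0, 2.0], "gato": [-2.0, 0.0]}

# bilingual_dict follows the documented shape: a list of (lang1, lang2) pairs.
bilingual_dict = [("dog", "perro"), ("cat", "gato"), ("house", "casa")]


def paired_vectors(model_1, model_2, bilingual_dict):
    """Collect aligned vectors for pairs present in both vocabularies."""
    vectors_1, vectors_2 = [], []
    for word_1, word_2 in bilingual_dict:
        # Assumption: a pair missing from either vocabulary is skipped.
        if word_1 in model_1 and word_2 in model_2:
            vectors_1.append(model_1[word_1])
            vectors_2.append(model_2[word_2])
    return vectors_1, vectors_2


vectors_1, vectors_2 = paired_vectors(model_1, model_2, bilingual_dict)
print(len(vectors_1), len(vectors_2))
```

Keeping the two lists index-aligned matters here: row i of one list is the regression input and row i of the other is its target.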
get_recommendations_from_vec(vector, topn=10)
Get the topn most similar words from Model 2 [Language 2].
The vector of a word from Model 1 [Language 1] must be provided.
- API Documentation:
- param vector
Vector from Model 1; recommendations are drawn from Model 2.
- param topn
Number of recommendations to be provided.
- type vector
List, numpy.array
- type topn
Integer
- return
Topn recommendations from Model 2.
- rtype
List
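A top-n lookup of this kind can be sketched as a cosine-similarity ranking over the target vocabulary. This is an illustration only: the function name top_n_similar, the toy word list, and the matrix are hypothetical, and the actual method may delegate the ranking to the Word2Vec model instead.

```python
import numpy as np

# Hypothetical Model 2 vocabulary: one row of `matrix` per entry in `words`.
words = ["perro", "gato", "casa"]
matrix = np.array([[0.0, 2.0], [-2.0, 0.0], [-2.0, 2.0]])


def top_n_similar(vector, words, matrix, topn=10):
    """Rank vocabulary words by cosine similarity to `vector`."""
    # Accept either a list or a numpy array, as the documented type allows.
    vector = np.asarray(vector, dtype=float)
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(vector)
    similarities = matrix @ vector / norms
    # Sort descending and keep the first topn indices.
    order = np.argsort(-similarities)[:topn]
    return [(words[i], float(similarities[i])) for i in order]


recommendations = top_n_similar([-1.0, 3.0], words, matrix, topn=2)
print(recommendations)
```

Returning (word, score) tuples is a choice made for this sketch; the documented return type is only stated as a List.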
get_recommendations_from_word(word, topn=10, pretty_print=False)
Get the topn most similar words from Model 2 [Language 2].
A word from Model 1 [Language 1] must be provided.
- API Documentation:
- param word
Word from Model 1; recommendations are drawn from Model 2.
- param topn
Number of recommendations to be provided.
- param pretty_print
Whether to pretty-print the recommendations.
- type pretty_print
Boolean
- type word
String (unicode preferred)
- type topn
Integer
- return
Topn recommendations from Model 2.
- rtype
List
map_vector_spaces()
Perform linear regression on the semantic embeddings.
- For each pair of bilingual words, each word's semantic embedding is obtained from the vector space of its own language.
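The regression step can be illustrated with a small least-squares fit. Note the swap: this sketch uses numpy.linalg.lstsq rather than the sklearn.linear_model regression the class relies on, and it fits no intercept, whereas sklearn's LinearRegression fits one by default. The training data, the name true_map, and the predict helper are all hypothetical.

```python
import numpy as np

# Hypothetical aligned training data: row i of X (Language 1)
# corresponds to row i of Y (Language 2).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# For the toy data, Language 2 vectors are a rotated, scaled copy of X.
true_map = np.array([[0.0, 2.0], [-2.0, 0.0]])
Y = X @ true_map

# Least squares: find W that minimises ||X W - Y||.
W, residuals, rank, singular_values = np.linalg.lstsq(X, Y, rcond=None)


def predict(vector):
    """Map a Language 1 vector into the Language 2 space."""
    return np.asarray(vector, dtype=float) @ W


mapped = predict([2.0, 3.0])
print(mapped)
```

Because the toy targets are an exact linear function of the inputs, the fit recovers true_map; real embeddings would only be approximated.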
obtain_cosine_similarity(word_1, word_2)
Obtain cosine similarity.
Computes the cosine similarity between word_2 and the word predicted from word_1.
- API Documentation:
- param word_1
Word from Model 1, used to predict the corresponding vector in Model 2.
- param word_2
Word whose vector is compared against the prediction.
- type word_1
String
- type word_2
String
- return
Cosine similarity between predicted word and actual word.
- rtype
Float
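For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes it in plain Python for a hypothetical predicted vector and a hypothetical actual vector; it is not the package's code. If scipy.spatial.distance.cosine is used instead, keep in mind that it returns the cosine distance, which is 1 minus the similarity.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical values: a vector predicted from word_1 and the
# Model 2 vector of word_2. They point in the same direction.
predicted = [-6.0, 4.0]
actual = [-3.0, 2.0]

similarity = cosine_similarity(predicted, actual)
print(similarity)
```

The measure ignores vector length, so a prediction that is a scaled copy of the actual vector still scores as a perfect match.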