Archive for June 2019

The successes of deep learning for text analytics, also discussed in a recent post about sentiment analysis published here, are undeniable. Many other NLP tasks have also benefited from the superiority of deep learning methods over more traditional approaches. Such extraordinary results have been possible thanks to the ability of neural networks to learn meaningful character and word embeddings: representations in which semantically similar objects are mapped to nearby vectors.
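To make the nearby-vectors idea concrete, here is a minimal sketch with hand-picked toy vectors (the words, dimensions and values are invented for illustration; real embeddings are learned and have hundreds of dimensions):

```python
import numpy as np

# Hypothetical 3-d vectors chosen by hand purely for illustration.
embeddings = {
    "cat":   np.array([0.9, 0.8, 0.1]),
    "dog":   np.array([0.85, 0.75, 0.2]),
    "stock": np.array([0.1, 0.2, 0.95]),
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Semantically similar words end up closer in the embedding space.
sim_cat_dog = cosine(embeddings["cat"], embeddings["dog"])
sim_cat_stock = cosine(embeddings["cat"], embeddings["stock"])
```

With these toy values, "cat" is far more similar to "dog" than to "stock", which is exactly the geometric property that learned embeddings exhibit at scale.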
All this is closely related to a field one might initially find disconnected or off-topic: biology.

 


Don't forget to subscribe to our Newsletter at amethix.com and get the latest updates in AI and machine learning. We do not spam. Promise!


 

References

[1] Rives A., et al., “Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences”, bioRxiv, doi: https://doi.org/10.1101/622803

[2] Vaswani A., et al., “Attention is all you need”, Advances in Neural Information Processing Systems, pp. 5998–6008, 2017.

[3] Bahdanau D., et al., “Neural machine translation by jointly learning to align and translate”, arXiv, http://arxiv.org/abs/1409.0473.

Read Full Post »

The rapid diffusion of social media like Facebook and Twitter, and the massive use of forums like Reddit, Quora, etc., are producing an impressive amount of text data every day.

There is one specific activity that many business owners have been contemplating over the last five years: identifying the social sentiment around their brand by analysing their users' conversations.

In this episode I explain how to get the best shot at classifying sentences with deep learning and word embeddings.
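As a rough sketch of the simplest such pipeline (all embeddings and sentences below are invented for illustration), one common baseline averages a sentence's word vectors and trains a logistic classifier on top:

```python
import numpy as np

# Toy "pretrained" embeddings, invented for illustration.
emb = {
    "great": np.array([1.0, 0.1]),
    "love":  np.array([0.9, 0.2]),
    "awful": np.array([-1.0, 0.1]),
    "hate":  np.array([-0.9, 0.2]),
    "movie": np.array([0.0, 1.0]),
}

def featurize(sentence):
    """Average the embeddings of known words: a simple but strong baseline."""
    vecs = [emb[w] for w in sentence.split() if w in emb]
    return np.mean(vecs, axis=0)

# Tiny training set: 1 = positive sentiment, 0 = negative.
X = np.array([featurize(s) for s in
              ["great movie", "love movie", "awful movie", "hate movie"]])
y = np.array([1, 1, 0, 0])

# Logistic regression trained with plain gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))   # predicted probabilities
    w -= 0.5 * X.T @ (p - y) / len(y)    # gradient step on weights
    b -= 0.5 * np.mean(p - y)            # gradient step on bias

pred = 1 / (1 + np.exp(-(featurize("love great movie") @ w + b)))
```

In practice the averaging step is replaced by recurrent or convolutional layers, but this bag-of-embeddings baseline is a useful sanity check before reaching for deeper models.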

 

 

Additional material

Schematic representation of how to learn a word embedding matrix E by training a neural network that, given the previous M words, predicts the next word in a sentence. 
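The training scheme in the figure can be sketched in plain NumPy. Everything below (vocabulary size, context length M, learning rate, the toy corpus) is an invented minimal setup, not the exact architecture from the figure:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, M = 10, 4, 2          # vocabulary size, embedding dim, context length
E = rng.normal(scale=0.1, size=(V, d))      # the embedding matrix to learn
W = rng.normal(scale=0.1, size=(M * d, V))  # output layer: context -> logits

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def step(context, target, lr=0.5):
    """One SGD step: predict word id `target` from the M `context` word ids."""
    global E, W
    x = E[context].reshape(-1)        # look up and concatenate M embeddings
    p = softmax(x @ W)                # predicted next-word distribution
    loss = -np.log(p[target])         # cross-entropy on the true next word
    dlogits = p.copy()
    dlogits[target] -= 1.0            # gradient of cross-entropy w.r.t. logits
    dx = W @ dlogits
    W -= lr * np.outer(x, dlogits)
    E[context] -= lr * dx.reshape(M, d)  # update only the context rows of E
    return loss

# Toy corpus of word ids; train on (previous M words -> next word) pairs.
corpus = [0, 1, 2, 0, 1, 2, 0, 1, 2]
for _ in range(200):
    for i in range(len(corpus) - M):
        loss = step(corpus[i:i + M], corpus[i + M])
```

The rows of E are the learned word vectors: the network is never told what words mean, yet minimizing next-word prediction error forces words used in similar contexts toward similar rows.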

 


Word2Vec example source code

https://gist.github.com/rlangone/ded90673f65e932fd14ae53a26e89eee#file-word2vec_example-py

 

 

References

[1] Mikolov T., et al., “Distributed Representations of Words and Phrases and their Compositionality”, Advances in Neural Information Processing Systems 26, pp. 3111–3119, 2013.

[2] The Best Embedding Method for Sentiment Classification, https://medium.com/@bramblexu/blog-md-34c5d082a8c5

[3] The state of sentiment analysis: word, sub-word and character embedding, https://amethix.com/state-of-sentiment-analysis-embedding/

 

Read Full Post »

In this episode I speak with Alexandr Honchar, data scientist and author of the blog https://medium.com/@alexrachnog
Alexandr has written very interesting posts about time series analysis for financial data, and his blog is on my personal list of the best tutorial blogs.

We discuss financial time series and machine learning, what makes predicting the price of stocks such a challenging task, and why machine learning alone might not be enough.
As usual, I ask Alexandr how he sees machine learning in the next 10 years. His answer, in my opinion quite futuristic, makes perfect sense.

You can contact Alexandr on

 

Enjoy the show!

 

Read Full Post »