Archive for August 2018

The success of a machine learning model depends on several factors and events. True generalization to data that the model has never seen before is more a chimera than a reality. But under specific conditions a well trained machine learning model can generalize well and perform with testing accuracy that is similar to the one performed during training.

In this episode I explain when and why machine learning models fail from training to testing datasets.

Read Full Post »

In this episode I don't talk about data. In fact, I talk about metadata.

While many machine learning models rely on certain amounts of data eg. text, images, audio and video, it has been proved how powerful is the signal carried by metadata, that is all data that is invisible to the end user.
Behind a tweet of 140 characters there are more than 140 fields of data that draw a much more detailed profile of the sender and the content she is producing... without ever considering the tweet itself.

 

References
You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information https://www.ucl.ac.uk/~ucfamus/papers/icwsm18.pdf

Read Full Post »

Today’s episode is about text analysis with python.
Python is the de facto standard in machine learning. A large community, a generous choice in the set of libraries, at the price of less performant tasks, sometimes. But overall a decent language for typical data science tasks.

I am with Rebecca Bilbro, co-author of Applied Text Analysis with Python, with Benjamin Bengfort and Tony Ojeda.

We speak about the evolution of applied text analysis, tools and pipelines, chatbots.

 

Read Full Post »

Attacking deep learning models

Compromising AI for fun and profit

 

Deep learning models have shown very promising results in computer vision and sound recognition. As more and more deep learning based systems get integrated in disparate domains, they will keep affecting the life of people. Autonomous vehicles, medical imaging and banking applications, surveillance cameras and drones, digital assistants, are only a few real applications where deep learning plays a fundamental role. A malfunction in any of these applications will affect the quality of such integrated systems and compromise the security of the individuals who directly or indirectly use them.

In this episode, we explain how machine learning models can be attacked and what we can do to protect intelligent systems from being  compromised.

Read Full Post »