Welcome to the NLP and Social Sciences group! The aim of this group is to train participants on the most recent techniques in natural language processing, text analysis, scraping, network theory and machine learning on graphs.
Our ultimate goal is to apply this knowledge to real-world projects.
The program is divided into two parts. The first part, more technical and common to all participants, covers NLP techniques, scraping 101, and the fundamentals of network theory and learning on graph structures. We will approach these concepts through courses and materials available online, and we will discuss them together in online meetings.
The second part will be devoted to applying NLP techniques to topics of interest, and participants will be able to form groups around different topics. We plan to create at least one group that will apply NLP techniques to political texts and political communication in general. Other groups may be created as well.
We plan to meet once every two or three weeks. No previous knowledge of NLP or text analysis is required, although some familiarity with Python is preferable.
Main contacts: S. Azeglio, L. Bottero, M. Rizzi, M. Olocco, A. Borriero (simone.azeglio ‘at’ edu.unito.it)
LearningNLP: MLJC’s GitHub Repo
LearningNLP by MachineLearningJournalClub
Some tutorials and in-depth analysis of NLP algorithms, with an ethical flavour
MLJC’s NLP team joins HuggingFace’s 🤗 Flax/JAX community week
HuggingFace 🤗 has partnered with Google’s Flax, JAX, and Cloud teams to organize a new community week from July 7th to July 14th, 2021. During the week, HuggingFace 🤗 and Google engineers will teach how to use JAX/Flax effectively for Natural Language Processing (NLP) and Computer Vision (CV).
Free access to a TPUv3-8 VM (120 GB of memory!!) will kindly be provided by the Google Cloud team!
We are taking part in the following projects:
flax-sentence-embeddings by nreimers
Shared code for training sentence embeddings with Flax / JAX
MLJC’s NLP team is taking part in the CommonLit Readability Challenge on Kaggle
CommonLit, Inc., is a nonprofit education technology organization serving over 20 million teachers and students with free digital reading and writing lessons for grades 3-12. Together with Georgia State University, an R1 public research university in Atlanta, they are challenging Kagglers to improve readability rating methods.
In this competition, you’ll build algorithms to rate the complexity of reading passages for grade 3-12 classroom use. To accomplish this, you’ll pair your machine learning skills with a dataset that includes readers from a wide variety of age groups and a large collection of texts taken from various domains. Winning models will be sure to incorporate text cohesion and semantics.
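To make the task concrete, here is a minimal sketch of a classic surface-level readability score, the Flesch Reading Ease formula, in plain Python. It is only an illustrative baseline and not the method used in our notebooks or in the competition: the syllable counter is a rough vowel-group heuristic, the function names are our own, and the score ignores the text cohesion and semantics that stronger models are expected to capture.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


easy = "The cat sat on the mat. It was a sunny day."
hard = (
    "Notwithstanding considerable methodological heterogeneity, longitudinal "
    "investigations consistently demonstrate measurable improvements."
)

# The short, simple passage should score higher (easier) than the dense one.
print(flesch_reading_ease(easy) > flesch_reading_ease(hard))
```

A score like this depends only on sentence and word length, which is why it is best treated as a starting point for comparison rather than a competitive model.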
Some of our notebooks: