Related papers: Text-to-hashtag Generation using Seq2seq Learning
The large amount of data available in social media, forums and websites motivates research in several areas of Natural Language Processing, such as sentiment analysis. The popularity of the area due to its subjective and semantic…
Hate speech and toxic comments are a common concern of social media platform users. Although these comments are, fortunately, the minority in these platforms, they are still capable of causing harm. Therefore, identifying these comments is…
In the era of rapid technological advancement, social media platforms such as Twitter (X) have emerged as indispensable tools for gathering consumer insights, capturing diverse opinions, and understanding public attitudes. This research…
Regressions trained to predict the future activity of social media users need rich features for accurate predictions. Many advanced models exist to generate such features; however, the time complexities of their computations are often…
2018's Brazilian presidential elections highlighted the influence of alternative media and social networks, such as Twitter. In this work, we perform an analysis covering politically motivated discourses related to the second round in…
In this paper, we describe a methodology to predict sentiment in code-mixed tweets (Hindi-English). Our team, called verissimo.manoel in CodaLab, developed an approach based on an ensemble of four models (MultiFiT, BERT, ALBERT, and XLNET).…
This paper describes a preliminary study for producing and distributing a large-scale database of embeddings from the Portuguese Twitter stream. We start by experimenting with a relatively small sample and focusing on three challenges:…
A huge volume of user-generated content is produced daily on social media. To facilitate automatic language understanding, we study keyphrase prediction, distilling salient information from massive posts. While most existing methods extract…
Recent developments in online communication and their usage in everyday life have caused an explosion in the amount of a new genre of text data, short text. Thus, the need to classify this type of text based on its content has a significant…
We widely use emojis in social networking to heighten, mitigate or negate the sentiment of the text. Emoji suggestions already exist in many cross-platform applications but an emoji is predicted solely based on a few prominent words instead of…
Hashtags are often employed on social media and beyond to add metadata to a textual utterance with the goal of increasing discoverability, aiding search, or providing additional semantics. However, the semantic content of hashtags is not…
Text classification is a natural language processing (NLP) task relevant to many commercial applications, like e-commerce and customer service. Naturally, classifying such excerpts accurately often represents a challenge, due to intrinsic…
With the rise of Social Media, people obtain and share information almost instantly on a 24/7 basis. Many research areas have tried to gain valuable insights from these large volumes of freely available user-generated content. With the goal…
Word embeddings (e.g., word2vec) have been applied successfully to eCommerce products through prod2vec. Inspired by the recent performance improvements on several NLP tasks brought by contextualized embeddings, we propose to…
Legal Judgment Prediction is one of the most acclaimed fields for the combined area of NLP, AI, and Law. By legal prediction we mean intelligent systems capable of predicting specific judicial characteristics, such as judicial outcome, a…
This article investigates applying advanced machine learning models, specifically LSTM and BERT, for text classification to predict multiple categories in the retail sector. The study demonstrates how applying data augmentation techniques…
Lexicon-based sentiment analysis usually relies on the identification of various words to which a numerical value corresponding to sentiment can be assigned. In principle, classifiers can be obtained from these algorithms by comparison with…
Access to sign language data is far from adequate. We show that it is possible to collect the data from social networking services such as TikTok, Instagram, and YouTube by applying data filtering to enforce quality standards and by…
The Twitter social network contains a large amount of information generated by its users. That information is composed of opinions and comments that may reflect trends in social behavior. A trend is said to exist when it is possible to identify…
Hashtags have become a powerful tool in social platforms such as Twitter to categorize and search for content, and to spread short messages across members of the social network. In this paper, we study temporal hashtag usage practices in…