Related papers: Text-to-hashtag Generation using Seq2seq Learning

Building a Sentiment Corpus of Tweets in Brazilian Portuguese

The large amount of data available in social media, forums and websites motivates research in several areas of Natural Language Processing, such as sentiment analysis. The popularity of the area due to its subjective and semantic…

Computation and Language · Computer Science 2017-12-27 Henrico Bertini Brum, Maria das Graças Volpe Nunes

Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis

Hate speech and toxic comments are a common concern of social media platform users. Although these comments are, fortunately, the minority on these platforms, they are still capable of causing harm. Therefore, identifying these comments is…

Computation and Language · Computer Science 2020-10-12 João A. Leite, Diego F. Silva, Kalina Bontcheva, Carolina Scarton

Sentiment Analysis of Social Media Data for Predicting Consumer Behavior Trends Using Machine Learning

In the era of rapid technological advancement, social media platforms such as Twitter (X) have emerged as indispensable tools for gathering consumer insights, capturing diverse opinions, and understanding public attitudes. This research…

Human-Computer Interaction · Computer Science 2025-10-23 S M Rakib Ul Karim, Rownak Ara Rasul, Tunazzina Sultana

Analyzing Trendy Twitter Hashtags in the 2022 French Election

Regressions trained to predict the future activity of social media users need rich features for accurate predictions. Many advanced models exist to generate such features; however, the time complexities of their computations are often…

Social and Information Networks · Computer Science 2024-03-01 Aamir Mandviwalla, Lake Yin, Boleslaw K. Szymanski

Detecting Group Beliefs Related to 2018's Brazilian Elections in Tweets: A Combined Study on Modeling Topics and Sentiment Analysis

2018's Brazilian presidential elections highlighted the influence of alternative media and social networks, such as Twitter. In this work, we perform an analysis covering politically motivated discourses related to the second round in…

Computation and Language · Computer Science 2020-06-02 Brenda Salenave Santana, Aline Aver Vanin

Deep Learning Brasil -- NLP at SemEval-2020 Task 9: Overview of Sentiment Analysis of Code-Mixed Tweets

In this paper, we describe a methodology to predict sentiment in code-mixed tweets (Hindi-English). Our team, called verissimo.manoel in CodaLab, developed an approach based on an ensemble of four models (MultiFiT, BERT, ALBERT, and XLNET).…

Computation and Language · Computer Science 2020-08-05 Manoel Veríssimo dos Santos Neto, Ayrton Denner da Silva Amaral, Nádia Félix Felipe da Silva, Anderson da Silva Soares

Learning Word Embeddings from the Portuguese Twitter Stream: A Study of some Practical Aspects

This paper describes a preliminary study for producing and distributing a large-scale database of embeddings from the Portuguese Twitter stream. We start by experimenting with a relatively small sample and focusing on three challenges:…

Computation and Language · Computer Science 2017-09-05 Pedro Saleiro, Luís Sarmento, Eduarda Mendes Rodrigues, Carlos Soares, Eugénio Oliveira

Topic-Aware Neural Keyphrase Generation for Social Media Language

A huge volume of user-generated content is produced daily on social media. To facilitate automatic language understanding, we study keyphrase prediction, distilling salient information from massive posts. While most existing methods extract…

Computation and Language · Computer Science 2019-06-11 Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, Shuming Shi

QiBERT -- Classifying Online Conversations Messages with BERT as a Feature

Recent developments in online communication and their usage in everyday life have caused an explosion in the amount of a new genre of text data, short text. Thus, the need to classify this type of text based on its content has a significant…

Computation and Language · Computer Science 2024-09-10 Bruno D. Ferreira-Saraiva, Zuil Pirola, João P. Matos-Carvalho, Manuel Marques-Pita

A 'Sourceful' Twist: Emoji Prediction Based on Sentiment, Hashtags and Application Source

We widely use emojis in social networking to heighten, mitigate or negate the sentiment of the text. Emoji suggestions already exist in many cross-platform applications, but an emoji is predicted solely based on a few prominent words instead of…

Computation and Language · Computer Science 2021-03-16 Pranav Venkit, Zeba Karishma, Chi-Yang Hsu, Rahul Katiki, Kenneth Huang, Shomir Wilson, Patrick Dudas

Multi-task Pairwise Neural Ranking for Hashtag Segmentation

Hashtags are often employed on social media and beyond to add metadata to a textual utterance with the goal of increasing discoverability, aiding search, or providing additional semantics. However, the semantic content of hashtags is not…

Computation and Language · Computer Science 2019-06-17 Mounica Maddela, Wei Xu, Daniel Preoţiuc-Pietro

Embedding generation for text classification of Brazilian Portuguese user reviews: from bag-of-words to transformers

Text classification is a natural language processing (NLP) task relevant to many commercial applications, like e-commerce and customer service. Naturally, classifying such excerpts accurately often represents a challenge, due to intrinsic…

Computation and Language · Computer Science 2022-12-02 Frederico Dias Souza, João Baptista de Oliveira e Souza Filho

Social Media Text Processing and Semantic Analysis for Smart Cities

With the rise of Social Media, people obtain and share information almost instantly on a 24/7 basis. Many research areas have tried to gain valuable insights from these large volumes of freely available user generated content. With the goal…

Social and Information Networks · Computer Science 2017-09-12 João Filipe Figueiredo Pereira

BERT Goes Shopping: Comparing Distributional Models for Product Representations

Word embeddings (e.g., word2vec) have been applied successfully to eCommerce products through prod2vec. Inspired by the recent performance improvements on several NLP tasks brought by contextualized embeddings, we propose to…

Computation and Language · Computer Science 2021-06-24 Federico Bianchi, Bingqing Yu, Jacopo Tagliabue

Using attention methods to predict judicial outcomes

Legal Judgment Prediction is one of the most acclaimed fields for the combined area of NLP, AI, and Law. By legal prediction we mean intelligent systems capable of predicting specific judicial characteristics, such as judicial outcome, a…

Machine Learning · Computer Science 2022-12-29 Vithor Gomes Ferreira Bertalan, Evandro Eduardo Seron Ruiz

Multi-level Product Category Prediction through Text Classification

This article investigates applying advanced machine learning models, specifically LSTM and BERT, for text classification to predict multiple categories in the retail sector. The study demonstrates how applying data augmentation techniques…

Computation and Language · Computer Science 2024-03-05 Wesley Ferreira Maia, Angelo Carmignani, Gabriel Bortoli, Lucas Maretti, David Luz, Daniel Camilo Fuentes Guzman, Marcos Jardel Henriques, Francisco Louzada Neto

A New Statistical Approach for Comparing Algorithms for Lexicon Based Sentiment Analysis

Lexicon based sentiment analysis usually relies on the identification of various words to which a numerical value corresponding to sentiment can be assigned. In principle, classifiers can be obtained from these algorithms by comparison with…

Computation and Language · Computer Science 2019-06-21 Mateus Machado, Evandro Ruiz, Kuruvilla Joseph Abraham

Towards Large-Scale Data Mining for Data-Driven Analysis of Sign Languages

Access to sign language data is far from adequate. We show that it is possible to collect the data from social networking services such as TikTok, Instagram, and YouTube by applying data filtering to enforce quality standards and by…

Computation and Language · Computer Science 2020-06-04 Boris Mocialov, Graham Turner, Helen Hastie

Balotage in Argentina 2015, a sentiment analysis of tweets

The Twitter social network contains a large amount of information generated by its users. That information is composed of opinions and comments that may reflect trends in social behavior. We speak of a trend when it is possible to identify…

Information Retrieval · Computer Science 2016-11-09 Daniel Robins, Fernando Emmanuel Frati, Jonatan Alvarez, Jose Texier

Temporal Effects on Hashtag Reuse in Twitter: A Cognitive-Inspired Hashtag Recommendation Approach

Hashtags have become a powerful tool in social platforms such as Twitter to categorize and search for content, and to spread short messages across members of the social network. In this paper, we study temporal hashtag usage practices in…

Information Retrieval · Computer Science 2017-01-06 Dominik Kowald, Subhash Pujari, Elisabeth Lex