Saturday, January 7, 2012

Twitter + Sentiment Analysis (Including Applications to Finance)

Twitter mood predicts the stock market

http://arxiv.org/abs/1010.3003

Behavioral economics tells us that emotions can profoundly affect individual behavior and decision-making. Does this also apply to societies at large, i.e., can societies experience mood states that affect their collective decision making? By extension is the public mood correlated or even predictive of economic indicators? Here we investigate whether measurements of collective mood states derived from large-scale Twitter feeds are correlated to the value of the Dow Jones Industrial Average (DJIA) over time. We analyze the text content of daily Twitter feeds by two mood tracking tools, namely OpinionFinder that measures positive vs. negative mood and Google-Profile of Mood States (GPOMS) that measures mood in terms of 6 dimensions (Calm, Alert, Sure, Vital, Kind, and Happy). We cross-validate the resulting mood time series by comparing their ability to detect the public's response to the presidential election and Thanksgiving day in 2008. A Granger causality analysis and a Self-Organizing Fuzzy Neural Network are then used to investigate the hypothesis that public mood states, as measured by the OpinionFinder and GPOMS mood time series, are predictive of changes in DJIA closing values. Our results indicate that the accuracy of DJIA predictions can be significantly improved by the inclusion of specific public mood dimensions but not others. We find an accuracy of 87.6% in predicting the daily up and down changes in the closing values of the DJIA and a reduction of the Mean Average Percentage Error by more than 6%.
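The two figures quoted at the end of the abstract (accuracy on daily up/down changes, and a percentage error) can be illustrated with a minimal Python sketch. The function names and sample numbers are illustrative assumptions, not taken from the paper. The sketch also assumes the percentage error is the usual mean absolute percentage error, and that "direction" compares each day's value against the previous day's actual close; the paper's exact definitions may differ.

```python
def direction_accuracy(actual, predicted):
    """Fraction of days where the predicted move (up or not) matches the actual move.

    Both moves are measured against the previous day's actual value
    (an assumption of this sketch). Needs at least two observations.
    """
    hits = 0
    for t in range(1, len(actual)):
        actual_up = actual[t] > actual[t - 1]
        predicted_up = predicted[t] > actual[t - 1]
        hits += actual_up == predicted_up
    return hits / (len(actual) - 1)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Made-up closing values, for illustration only.
closes = [100.0, 102.0, 101.0, 103.0]
forecast = [100.0, 101.0, 102.5, 104.0]
print(direction_accuracy(closes, forecast))
print(mape(closes, forecast))
```

Direction accuracy ignores how large the error is, while the percentage error ignores whether the sign of the move was right, which is presumably why the abstract reports both.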



Dmitry Davidov , Oren Tsur , Ari Rappoport, Enhanced sentiment learning using Twitter hashtags and smileys, Proceedings of the 23rd International Conference on Computational Linguistics: Posters, p.241-249, August 23-27, 2010, Beijing, China
http://dl.acm.org/citation.cfm?id=1944594&CFID=76892041&CFTOKEN=90820401

Automated identification of diverse sentiment types can be beneficial for many NLP systems such as review summarization and public media analysis. In some of these systems there is an option of assigning a sentiment value to a single sentence or a very short text. In this paper we propose a supervised sentiment classification framework which is based on data from Twitter, a popular microblogging service. By utilizing 50 Twitter tags and 15 smileys as sentiment labels, this framework avoids the need for labor intensive manual annotation, allowing identification and classification of diverse sentiment types of short texts. We evaluate the contribution of different feature types for sentiment classification and show that our framework successfully identifies sentiment types of untagged sentences. The quality of the sentiment identification was also confirmed by human judges. We also explore dependencies and overlap between different sentiment types represented by smileys and Twitter hashtags.
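The idea of using hashtags and smileys as labels instead of manual annotation can be sketched roughly as follows. This is only an illustration of the labeling step: the label sets below are hypothetical (the abstract does not list the 50 tags and 15 smileys), and the function name is made up for this note.

```python
# Hypothetical label sets; the actual tags and smileys are not given in the abstract.
HASHTAG_LABELS = {"#happy", "#sad", "#angry"}
SMILEY_LABELS = {":)", ":(", ";)"}


def extract_labeled_example(tweet):
    """Return (text_without_labels, label) if the tweet carries a known label, else None.

    The first matching hashtag or smiley becomes the label, and all matching
    tokens are stripped from the text so a classifier cannot simply read them back.
    """
    tokens = tweet.split()
    labels = [t for t in tokens if t.lower() in HASHTAG_LABELS or t in SMILEY_LABELS]
    if not labels:
        return None
    text = " ".join(t for t in tokens if t not in labels)
    return text, labels[0].lower()


print(extract_labeled_example("Great game tonight #happy"))
print(extract_labeled_example("Just landed"))
```

Tweets that yield a pair become training data for a supervised classifier; tweets that yield `None` are the untagged sentences such a classifier would later be applied to.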


Effective sentiment stream analysis with self-augmenting training and demand-driven projection
http://dl.acm.org/citation.cfm?id=2009981&CFID=76892041&CFTOKEN=90820401
How do we analyze sentiments over a set of opinionated Twitter messages? This issue has been widely studied in recent years, with a prominent approach being based on the application of classification techniques. Basically, messages are classified according to the implicit attitude of the writer with respect to a query term. A major concern, however, is that Twitter (and other media channels) follows the data stream model, and thus the classifier must operate with limited resources, including labeled data for training classification models. This imposes serious challenges for current classification techniques, since they need to be constantly fed with fresh training messages, in order to track sentiment drift and to provide up-to-date sentiment analysis.

We propose solutions to this problem. The heart of our approach is a training augmentation procedure which takes as input a small training seed, and then it automatically incorporates new relevant messages to the training data. Classification models are produced on-the-fly using association rules, which are kept up-to-date in an incremental fashion, so that at any given time the model properly reflects the sentiments in the event being analyzed. In order to track sentiment drift, training messages are projected on a demand driven basis, according to the content of the message being classified. Projecting the training data offers a series of advantages, including the ability to quickly detect trending information emerging in the stream. We performed the analysis of major events in 2010, and we show that the prediction performance remains about the same, or even increases, as the stream passes and new training messages are acquired. This result holds for different languages, even in cases where sentiment distribution changes over time, or in cases where the initial training seed is rather small. We derive lower-bounds for prediction performance, and we show that our approach is extremely effective under diverse learning scenarios, providing gains that range from 7% to 58%.
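The self-augmenting loop described above (start from a small seed, classify incoming messages, and fold confident ones back into the training data) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the authors' method: plain word/label co-occurrence counts stand in for the association rules, and the class name and the `threshold` parameter are hypothetical.

```python
from collections import Counter, defaultdict


class SelfAugmentingClassifier:
    """Toy stream classifier; word/label counts stand in for association rules."""

    def __init__(self, seed, threshold=0.8):
        self.counts = defaultdict(Counter)  # word -> Counter of labels seen with it
        self.threshold = threshold  # hypothetical confidence cutoff for self-training
        for text, label in seed:
            self._learn(text, label)

    def _learn(self, text, label):
        for word in set(text.lower().split()):
            self.counts[word][label] += 1

    def predict(self, text):
        """Return (label, confidence), or (None, 0.0) if no word is known."""
        votes = Counter()
        # Only statistics for words in this message are consulted.
        for word in set(text.lower().split()):
            votes.update(self.counts.get(word, {}))
        total = sum(votes.values())
        if total == 0:
            return None, 0.0
        label, count = votes.most_common(1)[0]
        return label, count / total

    def process(self, text):
        """Classify a stream message and add it to the training data when confident."""
        label, confidence = self.predict(text)
        if label is not None and confidence >= self.threshold:
            self._learn(text, label)
        return label


clf = SelfAugmentingClassifier([("great win", "pos"), ("terrible loss", "neg")])
print(clf.process("great goal"))  # "goal" is learned as a side effect
print(clf.predict("goal"))
```

Restricting each prediction to the words of the message at hand is loosely in the spirit of the demand-driven projection mentioned in the abstract, though the real procedure is more involved.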


Taketoshi Ushiama , Tomoya Eguchi, An information recommendation agent on microblogging service, Proceedings of the 5th KES international conference on Agent and multi-agent systems: technologies and applications, June 29-July 01, 2011, Manchester, UK

Bernard J. Jansen , Kate Sobel , Geoff Cook, Gen X and Ys attitudes on using social media platforms for opinion sharing, Proceedings of the 28th of the international conference extended abstracts on Human factors in computing systems, April 10-15, 2010, Atlanta, Georgia, USA

Tiffany C. Chao, Data repositories: a home for microblog archives?, Proceedings of the 2011 iConference, p.655-656, February 08-11, 2011, Seattle, Washington

Wei Wu , Bin Zhang , Mari Ostendorf, Automatic generation of personalized annotation tags for Twitter users, Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, p.689-692, June 02-04, 2010, Los Angeles, California

Daniel M. Romero , Wojciech Galuba , Sitaram Asur , Bernardo A. Huberman, Influence and passivity in social media, Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases, September 05-09, 2011, Athens, Greece

Younggue Bae , Hongchul Lee, A sentiment analysis of audiences on twitter: who is the positive or negative audience of popular twitterers?, Proceedings of the 5th international conference on Convergence and hybrid information technology, September 22-24, 2011, Daejeon, Korea

Kamran Massoudi , Manos Tsagkias , Maarten de Rijke , Wouter Weerkamp, Incorporating query expansion and quality indicators in searching microblog posts, Proceedings of the 33rd European conference on Advances in information retrieval, April 18-21, 2011, Dublin, Ireland

Bernard J. Jansen , Kate Sobel , Geoff Cook, Being networked and being engaged: the impact of social networking on ecommerce information behavior, Proceedings of the 2011 iConference, p.130-136, February 08-11, 2011, Seattle, Washington

Dmitry Davidov , Oren Tsur , Ari Rappoport, Enhanced sentiment learning using Twitter hashtags and smileys, Proceedings of the 23rd International Conference on Computational Linguistics: Posters, p.241-249, August 23-27, 2010, Beijing, China

Claudia Wagner , Markus Strohmaier, The wisdom in tweetonomies: acquiring latent conceptual structures from social awareness streams, Proceedings of the 3rd International Semantic Search Workshop, p.1-10, April 26, 2010, Raleigh, North Carolina

Matthew Michelson , Sofus A. Macskassy, Discovering users' topics of interest on twitter: a first look, Proceedings of the fourth workshop on Analytics for noisy unstructured text data, October 26, 2010, Toronto, ON, Canada

Surender Reddy Yerva , Zoltán Miklós , Karl Aberer, What have fruits to do with technology?: the case of Orange, Blackberry and Apple, Proceedings of the International Conference on Web Intelligence, Mining and Semantics, May 25-27, 2011, Sogndal, Norway

Jingtao Wang , Shumin Zhai , John Canny, SHRIMP: solving collision and out of vocabulary problems in mobile predictive input with motion gesture, Proceedings of the 28th international conference on Human factors in computing systems, April 10-15, 2010, Atlanta, Georgia, USA

Takeshi Sakaki , Makoto Okazaki , Yutaka Matsuo, Earthquake shakes Twitter users: real-time event detection by social sensors, Proceedings of the 19th international conference on World wide web, April 26-30, 2010, Raleigh, North Carolina, USA

Sheila Kinsella , Mengjiao Wang , John G. Breslin , Conor Hayes, Improving categorisation in social media using hyperlinks to structured data sources, Proceedings of the 8th extended semantic web conference on The semantic web: research and applications, May 29-June 02, 2011, Heraklion, Crete, Greece

Jennifer Golbeck , Justin M. Grimes , Anthony Rogers, Twitter use by the U.S. Congress, Journal of the American Society for Information Science and Technology, v.61 n.8, p.1612-1621, August 2010

Jun Huang , Mizuho Iwaihara, Realtime social sensing of support rate for microblogging, Proceedings of the 16th international conference on Database systems for advanced applications, April 22-25, 2011, Hong Kong, China

Makoto Okazaki , Yutaka Matsuo, Semantic twitter: analyzing tweets for real-time event notification, Proceedings of the 2008/2009 international conference on Social software: recent trends and developments in social software, p.63-74, March 03-04, 2008, Cork, Ireland

Thomas Heverin , Lisl Zach, Twitter for city police department information sharing, Proceedings of the 73rd ASIS&T Annual Meeting on Navigating Streams in an Information Ecosystem, October 22-27, 2010, Pittsburgh, Pennsylvania

Pedro Henrique Calais Guerra , Adriano Veloso , Wagner Meira, Jr. , Virgílio Almeida, From bias to opinion: a transfer-learning approach to real-time sentiment analysis, Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, August 21-24, 2011, San Diego, California, USA

Miles Efron , Megan Winget, Questions are content: a taxonomy of questions in a microblogging environment, Proceedings of the 73rd ASIS&T Annual Meeting on Navigating Streams in an Information Ecosystem, October 22-27, 2010, Pittsburgh, Pennsylvania

Anlei Dong , Ruiqiang Zhang , Pranam Kolari , Jing Bai , Fernando Diaz , Yi Chang , Zhaohui Zheng , Hongyuan Zha, Time is of the essence: improving recency ranking using Twitter data, Proceedings of the 19th international conference on World wide web, April 26-30, 2010, Raleigh, North Carolina, USA

Yegin Genc , Yasuaki Sakamoto , Jeffrey V. Nickerson, Discovering context: classifying tweets through a semantic transform based on wikipedia, Proceedings of the 6th international conference on Foundations of augmented cognition: directing the future of adaptive systems, July 09-14, 2011, Orlando, FL

Evangelos Kalampokis , Michael Hausenblas , Konstantinos Tarabanis, Combining social and government open data for participatory decision-making, Proceedings of the Third IFIP WG 8.5 international conference on Electronic participation, August 29-September 01, 2011, Delft, The Netherlands

Ismael Santana Silva , Janaína Gomide , Adriano Veloso , Wagner Meira, Jr. , Renato Ferreira, Effective sentiment stream analysis with self-augmenting training and demand-driven projection, Proceedings of the 34th international ACM SIGIR conference on Research and development in Information, July 24-28, 2011, Beijing, China

Bernard J. Jansen , Zhe Liu , Courtney Weaver , Gerry Campbell , Matthew Gregg, Real time search on the web: Queries, topics, and economic value, Information Processing and Management: an International Journal, v.47 n.4, p.491-506, July, 2011

Marti Motoyama , Brendan Meeder , Kirill Levchenko , Geoffrey M. Voelker , Stefan Savage, Measuring online service availability using twitter, Proceedings of the 3rd conference on Online social networks, p.13, June 22-25, 2010, Boston, MA

Zi Chu , Steven Gianvecchio , Haining Wang , Sushil Jajodia, Who is tweeting on Twitter: human, bot, or cyborg?, Proceedings of the 26th Annual Computer Security Applications Conference, December 06-10, 2010, Austin, Texas

Mike Thelwall , Kevan Buckley , Georgios Paltoglou, Sentiment in Twitter events, Journal of the American Society for Information Science and Technology, v.62 n.2, p.406-418, February 2011

Laurens De Vocht , Selver Softic , Martin Ebner , Herbert Mühlburger, Semantically driven social data aggregation interfaces for Research 2.0, Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies, September 07-09, 2011, Graz, Austria
