Earnings calls datasets v1.0
04/11/2014
----------------------
The datasets were used in the following paper:

William Yang Wang and Zhenhao Hua,
"A Semiparametric Gaussian Copula Regression Model for Predicting Financial Risks from Earnings Calls",
in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014),
Baltimore, MD, June 22-27, ACL.

http://www.cs.cmu.edu/~yww/papers/acl2014_copula.pdf

pre2009: earnings calls prior to 2009.
2009: earnings calls from 2009.
post2009: earnings calls after 2009.

Feature indices:

1-100: unigrams.
101-200: bigrams.
201-300: named entities.
301-400: POS tags.
401-500: frame-semantics features.

The details of the features and datasets can be found in the paper.
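
As an illustration only, the index ranges listed above can be written as a
small Python lookup. This sketch is not part of the dataset distribution: the
helper name feature_group is hypothetical, and it assumes the indices are
1-based and inclusive, exactly as listed. It makes no assumption about the
file format of the datasets.

```python
# Feature groups as listed in this README: (first index, last index, name).
# Assumption: indices are 1-based and both ends are inclusive.
FEATURE_GROUPS = [
    (1, 100, "unigrams"),
    (101, 200, "bigrams"),
    (201, 300, "named entities"),
    (301, 400, "POS tags"),
    (401, 500, "frame-semantics features"),
]


def feature_group(index):
    """Return the group name for a 1-based feature index (hypothetical helper)."""
    for first, last, name in FEATURE_GROUPS:
        if first <= index <= last:
            return name
    raise ValueError(f"feature index outside 1-500: {index}")
```

For example, feature_group(150) returns "bigrams", and any index outside
1-500 raises ValueError.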

DISCLAIMER
----------------------
We aim to provide an accurate and clean dataset for
academic research purposes, but there are no guarantees
whatsoever about the quality or content of the data.
If you use this dataset for your research,
you are solely responsible for any conclusions you draw,
and they do not reflect the opinions of the authors of the dataset.

USAGE
----------------------
Note that the datasets are provided for research purposes only,
and no commercial use is allowed.

CONTACT
----------------------
William Wang
yww@cs.cmu.edu