LIWCDictionary

recognizer
Class LIWCDictionary

java.lang.Object
  recognizer.LIWCDictionary

public class LIWCDictionary
extends java.lang.Object

Interface to the LIWC dictionary, implementing patterns for each LIWC category based on the LIWC.CAT file (not included).

Version:: 1.01
Author:: Francois Mairesse

Constructor Summary
`LIWCDictionary(java.io.File catFile)` Loads the dictionary from a tab-delimited LIWC dictionary text file (with variable names in the first row).

Method Summary
`java.util.Map<java.lang.String,java.lang.Double>`	`getCounts(java.lang.String text, boolean absoluteCounts)` Returns a map associating each LIWC category with its number of occurrences in the input text.
`static java.lang.String[]`	`splitSentences(java.lang.String text)` Splits a text into sentences separated by a dot, exclamation point or question mark.
`static java.lang.String[]`	`tokenize(java.lang.String text)` Splits a text into words separated by non-word characters.

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

LIWCDictionary

public LIWCDictionary(java.io.File catFile)

Loads the dictionary from a tab-delimited LIWC dictionary text file (with variable names in the first row). Each word category is converted into a regular expression that is a disjunction of all its members.

Parameters:: catFile - dictionary file; it should point to the LIWC.CAT file of the Linguistic Inquiry and Word Count software (Pennebaker & Francis, 2001).
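The conversion of a category into a disjunction can be illustrated with a minimal sketch. It is not the class's actual implementation: the class name `CategoryPatternSketch`, the method `toPattern`, the example words, case-insensitive matching, and the treatment of a trailing `*` as a stem wildcard are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the recognizer package.
final class CategoryPatternSketch {

    // Builds one pattern that matches any single member of a category.
    // Assumption: a trailing '*' means "any word starting with this stem".
    static Pattern toPattern(List<String> members) {
        List<String> alternatives = new ArrayList<>();
        for (String member : members) {
            if (member.endsWith("*")) {
                String stem = member.substring(0, member.length() - 1);
                // Quote the stem literally, then allow any word characters after it.
                alternatives.add(Pattern.quote(stem) + "\\w*");
            } else {
                alternatives.add(Pattern.quote(member));
            }
        }
        // Join the alternatives with '|' to form the disjunction.
        String regex = "(?:" + String.join("|", alternatives) + ")";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        // Example members are made up; real members come from LIWC.CAT.
        Pattern category = toPattern(Arrays.asList("happy", "joy*", "love"));
        System.out.println(category.matcher("joyful").matches());
        System.out.println(category.matcher("sad").matches());
    }
}
```

Quoting each member with `Pattern.quote` keeps any regex metacharacters in dictionary entries from being interpreted as syntax.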

Method Detail

getCounts

public java.util.Map<java.lang.String,java.lang.Double> getCounts(java.lang.String text, boolean absoluteCounts)

Returns a map associating each LIWC category with its number of occurrences in the input text. The counts are computed by matching the loaded patterns against the text. Punctuation counts are not produced.

Parameters:: text - input text; absoluteCounts - if true, includes counts that are not relative to the total word count (e.g. the actual word count).

Returns:: hashtable associating each LIWC category with the percentage of words in the text belonging to it.
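As a rough illustration of relative counts, the sketch below computes, for each category, the share of words that match its pattern. It is a sketch under stated assumptions, not the method's real code: `CategoryCountSketch`, `relativeCounts`, the category names, and the 0 to 100 scaling are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the recognizer package.
final class CategoryCountSketch {

    // Returns, per category, the percentage of words matching its pattern.
    // Assumption: percentages are scaled to the range 0 to 100.
    static Map<String, Double> relativeCounts(String[] words, Map<String, Pattern> categories) {
        Map<String, Double> counts = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> category : categories.entrySet()) {
            int hits = 0;
            for (String word : words) {
                // A word counts only if the whole word matches the pattern.
                if (category.getValue().matcher(word).matches()) {
                    hits++;
                }
            }
            // Guard against division by zero for empty input.
            double percentage = words.length == 0 ? 0.0 : 100.0 * hits / words.length;
            counts.put(category.getKey(), percentage);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Category names and members are made up for this example.
        Map<String, Pattern> categories = new LinkedHashMap<>();
        categories.put("POSEMO", Pattern.compile("love|happy"));
        categories.put("PRONOUN", Pattern.compile("i|my|you"));
        String[] words = {"i", "love", "happy", "dogs"};
        System.out.println(relativeCounts(words, categories));
    }
}
```

Dividing by the total word count is what makes the values comparable across texts of different lengths.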

splitSentences

public static java.lang.String[] splitSentences(java.lang.String text)

Splits a text into sentences separated by a dot, exclamation point or question mark.

Parameters:: text - text to split into sentences.
Returns:: an array of sentences.
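One way to split on these terminators is with `String.split`, as in the following sketch. The class `SentenceSplitSketch` and the exact regular expression are assumptions; the actual method may keep the punctuation or treat whitespace differently.

```java
// Hypothetical helper, not part of the recognizer package.
final class SentenceSplitSketch {

    // Splits on runs of '.', '!' or '?', dropping the terminators
    // and any whitespace around them.
    static String[] splitSentences(String text) {
        return text.trim().split("\\s*[.!?]+\\s*");
    }

    public static void main(String[] args) {
        for (String sentence : splitSentences("Hello world. How are you? Fine!")) {
            System.out.println(sentence);
        }
    }
}
```

A rule this simple also splits abbreviations such as "e.g.", so the resulting pieces are approximate sentences at best.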

tokenize

public static java.lang.String[] tokenize(java.lang.String text)

Splits a text into words separated by non-word characters.

Parameters:: text - text to tokenize.
Returns:: an array of words.
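Splitting on non-word characters can be sketched with the `\W+` regular expression. The class `TokenizeSketch` and the handling of leading separators and empty input are assumptions, not the documented behavior.

```java
// Hypothetical helper, not part of the recognizer package.
final class TokenizeSketch {

    // Splits on runs of non-word characters (anything other than
    // letters, digits and underscore).
    static String[] tokenize(String text) {
        // Strip leading separators so split() yields no empty first token.
        String trimmed = text.replaceAll("^\\W+", "");
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\W+");
    }

    public static void main(String[] args) {
        for (String word : tokenize("Well, this is fun!")) {
            System.out.println(word);
        }
    }
}
```

Because an apostrophe is a non-word character, a contraction such as "It's" comes out as two tokens under this rule.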
