BOW

1. Language models have a history of over one hundred years

Past:

  • 𝑛-gram Language Model

Present:

  • Neural Language Model

  • Pretrained Language Model

Future:

  • Brain-Inspired Language Model

2. Brain-Inspired Language Model

(Figure: two very similar sentences that evoke different mental images.)

  • The human language system in the brain can be divided into three regions: one for storing language, one for sentiment, and one for representation.
  • When people read a sentence, they picture it in their mind. So even though the two sentences above are very similar, the mental images they evoke may be different.

3. Text Classification

  • Assigning subject categories, topics, or genres
  • Spam detection
  • Authorship identification
  • Age/gender identification
  • Language identification
  • Sentiment analysis

3.1 Input

  • a document d
  • a fixed set of classes C = {c₁, c₂, …, c_J}

3.2 Output

  • a predicted class c ∈ C

3.3 Methods

Rules based on combinations of words or other features

  • spam: blacklisted sender address OR (“discount” AND “price cut”), sketched in code below

Accuracy can be high

  • If the rules are carefully refined by an expert

  • But building and maintaining these rules is expensive

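As a minimal sketch of such a hand-written rule in Python (the blacklist address and the two keywords are made-up examples, not from the original rule set):

```python
# Hypothetical blacklist and keyword rule for spam detection.
BLACKLIST = {"spam@bad.example"}

def is_spam(sender: str, text: str) -> bool:
    """Rule: blacklisted sender OR ("discount" AND "price cut") in the text."""
    keyword_hit = "discount" in text.lower() and "price cut" in text.lower()
    return sender in BLACKLIST or keyword_hit

print(is_spam("friend@mail.example", "Huge discount and a big price cut!"))  # True
print(is_spam("friend@mail.example", "See you at lunch?"))                   # False
```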

3.4 The Bag of Words Representation


  • Count the words in the document to build a vocabulary, and record the frequency of each word.
  • Counting every word is often unnecessary, so in practice we usually count only the useful (informative) words of the document, as in the sketch below.

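A minimal bag-of-words sketch in Python (the whitespace tokenizer and the tiny stop-word list are simplifying assumptions):

```python
from collections import Counter

STOP_WORDS = {"the", "a", "an", "it", "is", "and", "but"}  # illustrative only

def bag_of_words(text: str) -> Counter:
    """Map a document to a word -> frequency dictionary, keeping only useful words."""
    tokens = text.lower().split()
    return Counter(t for t in tokens if t not in STOP_WORDS)

print(bag_of_words("it is sweet and the plot is great but the acting is great"))
# Counter({'great': 2, 'sweet': 1, 'plot': 1, 'acting': 1})
```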

4. How to learn the classifier

4.1 Let’s start with Naive Bayes

(Simple “naïve” classification method based on Bayes rule)

4.1.1 Imagine two people, Alice and Bob, whose word-usage patterns you know:

Alice often uses words: love, great, wonderful

Bob often uses words: dog, ball, wonderful

Alice’s word probabilities: love(0.1), great(0.8), wonderful(0.1)

Bob’s word probabilities: love(0.3), ball(0.2), wonderful(0.5)

Can you guess who sent “wonderful love”?
  • Bob: assuming equal priors, his score is P(wonderful) × P(love) = 0.5 × 0.3 = 0.15, versus Alice’s 0.1 × 0.1 = 0.01, so Bob is the more likely sender.

4.1.2 Suppose there are two bowls of cookies:

  • Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies.

  • Bowl 2 contains 20 of each.


Now suppose you choose one of the bowls at random and, without looking, select a cookie at random. The cookie is vanilla.

What is the probability that it came from Bowl 1?

By Bayes’ rule, P(c|x) = P(x|c) · P(c) / P(x), where:

  • P(c|x) is the posterior probability of class c given features x.
  • P(c) is the prior probability of the class.
  • P(x|c) is the likelihood: the probability of the features given the class.
  • P(x) is the evidence: the overall probability of the features.
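Working the answer out with the numbers above (Bowl 1 is 30/40 = 0.75 vanilla, Bowl 2 is 20/40 = 0.5 vanilla, and each bowl is chosen with probability 0.5):

$$P(\text{Bowl 1} \mid \text{vanilla}) = \frac{P(\text{vanilla} \mid \text{Bowl 1})\,P(\text{Bowl 1})}{P(\text{vanilla})} = \frac{0.75 \times 0.5}{0.75 \times 0.5 + 0.5 \times 0.5} = \frac{0.375}{0.625} = 0.6$$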

4.1.3 A fruit example

| Fruit  | Long | Sweet | Yellow | Total |
| ------ | ---- | ----- | ------ | ----- |
| Banana | 400  | 350   | 450    | 500   |
| Orange | 0    | 150   | 300    | 300   |
| Other  | 100  | 150   | 50     | 200   |
| Total  | 500  | 650   | 800    | 1000  |

Based on our training set we can also say the following:

  • Of the 500 bananas, 400 (0.8) are Long, 350 (0.7) are Sweet, and 450 (0.9) are Yellow

  • Of the 300 oranges, 0 are Long, 150 (0.5) are Sweet, and 300 (1.0) are Yellow

  • Of the remaining 200 fruits, 100 (0.5) are Long, 150 (0.75) are Sweet, and 50 (0.25) are Yellow

  • For a fruit that is Long, Sweet, and Yellow: the banana score is 0.5 × 0.8 × 0.7 × 0.9 = 0.252, versus 0.3 × 0 × 0.5 × 1 = 0 for orange and 0.2 × 0.5 × 0.75 × 0.25 ≈ 0.019 for other fruit, so it’s a banana (see the sketch below)
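A short sketch of this computation in Python, using the priors and likelihoods read off the table above:

```python
# Naive Bayes scores for a fruit that is Long, Sweet, and Yellow.
priors = {"banana": 500 / 1000, "orange": 300 / 1000, "other": 200 / 1000}
likelihoods = {
    "banana": {"long": 0.8, "sweet": 0.7,  "yellow": 0.9},
    "orange": {"long": 0.0, "sweet": 0.5,  "yellow": 1.0},
    "other":  {"long": 0.5, "sweet": 0.75, "yellow": 0.25},
}

scores = {}
for fruit, prior in priors.items():
    score = prior
    for feature in ("long", "sweet", "yellow"):
        score *= likelihoods[fruit][feature]
    scores[fruit] = score

print(scores)                       # banana ≈ 0.252, orange = 0.0, other ≈ 0.01875
print(max(scores, key=scores.get))  # banana
```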

4.2 Naive Bayes Classifier

Example: given that a review is positive, what is the probability of the corresponding review text, P(d | c)? Does that feel backwards?

  • Modeling P(x₁, …, xₙ | c) directly would require far too many parameters.
  • So assume the features are conditionally independent given the class:

$$P(x_1, \dots, x_n \mid c) = P(x_1 \mid c) \cdot P(x_2 \mid c) \cdots P(x_n \mid c)$$

5. Learning the Naive Bayes Model

$$\hat{P}(c_j) = \frac{N_{c_j}}{N_{doc}} \qquad \hat{P}(w_i \mid c_j) = \frac{\mathrm{count}(w_i, c_j)}{\sum_{w \in V} \mathrm{count}(w, c_j)}$$

  • Simply use the frequencies in the data (maximum likelihood estimates)
  • $\hat{P}(c_j)$ is the fraction of training documents that belong to class $c_j$
  • $\hat{P}(w_i \mid c_j)$ is the fraction of tokens in class $c_j$ that are the word $w_i$

  • Create a mega-document for topic $j$ by concatenating all the docs in this topic

    • Use the frequency of $w$ in the mega-document

5.1 Laplace (add-1) smoothing

$$\hat{P}(w_i \mid c) = \frac{\mathrm{count}(w_i, c) + 1}{\left(\sum_{w \in V} \mathrm{count}(w, c)\right) + |V|}$$

5.2 Unknown words

Add one extra word to the vocabulary, the “unknown word” ⟨UNK⟩, and map any test word never seen in training to it.

6. Try again with a textual example

This is the standard worked example from the Jurafsky & Martin slides: the training set is three documents of class c (“Chinese Beijing Chinese”, “Chinese Chinese Shanghai”, “Chinese Macao”) and one of class j (“Tokyo Japan Chinese”); the test document is “Chinese Chinese Chinese Tokyo Japan”.

Priors:

  • P(c) = 3/4, P(j) = 1/4

Conditional probabilities (with add-1 smoothing; |V| = 6):

  • P(Chinese | c) = (5 + 1) / (8 + 6) = 3/7
  • P(Tokyo | c) = P(Japan | c) = (0 + 1) / (8 + 6) = 1/14
  • P(Chinese | j) = (1 + 1) / (3 + 6) = 2/9
  • P(Tokyo | j) = P(Japan | j) = (1 + 1) / (3 + 6) = 2/9

Choosing a class:

  • P(c | d) ∝ 3/4 × (3/7)³ × 1/14 × 1/14 ≈ 0.0003
  • P(j | d) ∝ 1/4 × (2/9)³ × 2/9 × 2/9 ≈ 0.0001
  • So we choose class c
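A compact Python sketch that reproduces these numbers, combining the MLE priors from Section 5 with the add-1 smoothing from Section 5.1 (variable names are my own):

```python
import math
from collections import Counter

# Training documents (class, text) and the test document.
train = [("c", "Chinese Beijing Chinese"),
         ("c", "Chinese Chinese Shanghai"),
         ("c", "Chinese Macao"),
         ("j", "Tokyo Japan Chinese")]
test = "Chinese Chinese Chinese Tokyo Japan"

vocab = {w for _, doc in train for w in doc.split()}  # |V| = 6
class_docs = Counter(c for c, _ in train)             # document counts per class
word_counts = {c: Counter() for c in class_docs}
for c, doc in train:
    word_counts[c].update(doc.split())

def score(c, doc):
    """P(c) * product over words of P(w | c), with Laplace (add-1) smoothing."""
    total = sum(word_counts[c].values())
    logp = math.log(class_docs[c] / len(train))       # MLE prior
    for w in doc.split():
        logp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
    return math.exp(logp)

for c in word_counts:
    print(c, score(c, test))  # c ≈ 0.0003, j ≈ 0.0001, so choose c
```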

7. Sentiment Classification: Dealing with Negation

  • I really like this movie

  • I really don’t like this movie

Negation changes the meaning of “like” to negative.

Negation can also change negative to positive-ish

  • Don’t dismiss this film
  • Doesn’t let us get bored

7.1 Sentiment Classification: Dealing with Negation

Das, Sanjiv and Mike Chen. 2001. Yahoo! for Amazon: Extracting market sentiment from stock message boards. In Proceedings of the Asia Pacific Finance Association Annual Conference (APFA).

Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79–86.

The simple baseline from these papers: add the prefix NOT_ to every word between a negation token (n’t, not, no, never) and the next punctuation mark, so “didn’t like this movie, but I” becomes “didn’t NOT_like NOT_this NOT_movie, but I”.
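A rough sketch of that trick in Python (the regex tokenizer and the negation-word list are simplifying assumptions):

```python
import re

NEGATIONS = {"not", "no", "never", "didn't", "don't", "doesn't", "isn't"}

def mark_negation(text: str) -> list:
    """Prefix NOT_ to every word between a negation word and the next punctuation."""
    tokens = re.findall(r"[\w']+|[.,!?;]", text.lower())
    out, negating = [], False
    for tok in tokens:
        if tok in NEGATIONS:
            out.append(tok)
            negating = True
        elif tok in {".", ",", "!", "?", ";"}:
            out.append(tok)
            negating = False
        else:
            out.append("NOT_" + tok if negating else tok)
    return out

print(mark_negation("didn't like this movie, but I"))
# ["didn't", 'NOT_like', 'NOT_this', 'NOT_movie', ',', 'but', 'i']
```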

8. Naïve Bayes and Language Modeling

Naïve Bayes classifiers can use any sort of feature

  • URL, email address, dictionaries, network features

But if, as in the previous slides,

  • We use only word features
  • We use all of the words in the text (not a subset)

Then

  • Naive Bayes has an important similarity to language modeling.

Each class = a unigram language model

Assigning each word: P(word | c)

Assigning each sentence: P(s|c)=Π P(word|c)
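A tiny illustration of this view, with made-up unigram probabilities for one class (none of these numbers are from the original slides):

```python
import math

# Hypothetical unigram language model for the positive class.
p_word_given_pos = {"i": 0.1, "love": 0.1, "this": 0.05, "film": 0.1}

sentence = "i love this film"
log_p = sum(math.log(p_word_given_pos[w]) for w in sentence.split())
print(math.exp(log_p))  # P(s | pos) = 0.1 * 0.1 * 0.05 * 0.1 = 5e-05
```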


9. Evaluation

The 2-by-2 contingency table: true positives (tp) are items that are selected and correct, false positives (fp) are selected but not correct, false negatives (fn) are correct but not selected, and true negatives (tn) are neither. Precision = tp / (tp + fp); recall = tp / (tp + fn).

Precision only measures what fraction of the examples the model labels positive are truly positive; it says nothing about the positives the model misses, so high precision by itself is not evidence that the model is good.

To balance the two we combine them into the F-measure, F = (β² + 1)PR / (β²P + R); with β = 1 this is the harmonic mean, F1 = 2PR / (P + R).
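A small helper for these metrics (the counts in the usage line are made up for illustration):

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall, and their harmonic mean (F1) from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

print(precision_recall_f1(tp=8, fp=2, fn=4))  # (0.8, 0.666..., 0.727...)
```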

10. New Generation


Author: Smurf
Link: http://example.com/2021/08/15/nlp%20learning/Chapter4_BOW/
License: this post is licensed under CC BY-NC-SA 3.0 CN