Section 1: Natural language, grammar, and parsing
Commentary
Section Goals
- To introduce the concepts of language and communication in the context of agents and AI.
- To discuss formal grammars and augmented grammars for representing natural language.
- To introduce natural language parsing, or syntactic analysis, which is one of the most fundamental tasks of NLP.
Learning Objectives
Learning Objective 1
- Outline the component steps of communication.
- Explain what a speech act is, and name some speech acts in the context of a multiagent world.
- Define formal language and identify its main aspects.
- Describe the lexicon and grammar of a sample formal language for a fragment of English.
- Outline the principles and types of parsing algorithms.
- Describe chart-parsing algorithms (a runnable sketch follows this list).
- Discuss the techniques used to augment grammars to represent detailed constraints or additional information in natural language.
- Explain the following concepts or terms:
- Speech act
- Formal language
- Grammar
- Semantics
- Pragmatics
- Phrase structure
- Rewrite rules
- Nonterminal symbol
- Lexicon
- Top-down parsing
- Bottom-up parsing
- Chart parser
- Definite clause grammar (DCG)
- Verb subcategorization
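To make the chart-parsing objective concrete, here is a minimal sketch of a bottom-up chart parser, the CYK algorithm, written in Python for this guide. The RULES, LEXICON, and sample sentence are illustrative toy choices, not the grammar from AIMA3ed, and the grammar is assumed to be in Chomsky normal form (every rule is either A -> B C or A -> word):

```python
# A minimal bottom-up chart (CYK) parser for a toy fragment of English.
# Grammar, lexicon, and sentence are illustrative, not from the textbook.

from collections import defaultdict

# Binary rules: (B, C) -> set of parent categories A
RULES = {
    ("NP", "VP"): {"S"},
    ("Article", "Noun"): {"NP"},
    ("Verb", "NP"): {"VP"},
}

# Lexicon: word -> set of preterminal categories
LEXICON = {
    "the": {"Article"},
    "wumpus": {"Noun"},
    "gold": {"Noun"},
    "grabs": {"Verb"},
}

def cyk_parse(words):
    """Return the chart: chart[(i, j)] holds categories spanning words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words):          # length-1 spans from the lexicon
        chart[(i, i + 1)] = set(LEXICON.get(word, ()))
    for length in range(2, n + 1):            # longer spans, bottom up
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):         # every split point of the span
                for b in chart[(i, k)]:
                    for c in chart[(k, j)]:
                        chart[(i, j)] |= RULES.get((b, c), set())
    return chart

words = "the wumpus grabs the gold".split()
chart = cyk_parse(words)
print("S" in chart[(0, len(words))])          # True: the sentence is grammatical
```

The chart caches every category found for every span of the input, so subphrases are never re-derived and the whole parse runs in O(n^3) time; this memoization is the central idea behind chart parsing.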
Objective Readings
Required readings:
Reading topics:
Communication as Action, Formal Grammar, Syntactic Analysis or Parsing, Augmented Grammars (see Sections 23.1-23.3 of AIMA3ed)
Aït-Mokhtar, S., Chanod, J. P., and Roux, C. (2002). Robustness beyond shallowness: Incremental deep parsing. Natural Language Engineering, 8(2-3), 121-144. DOI: 10.1017/S1351324902002887
Basili, R., and Zanzotto, F. M. (2002). Parsing engineering and empirical robustness. Natural Language Engineering, 8(2-3), 97-120. DOI: 10.1017/S1351324902002875
Supplemental Readings
Hammerton, J., Osborne, M., Armstrong, S., and Daelemans, W. (eds.) (2002). Special issue on machine learning approaches to shallow parsing. Journal of Machine Learning Research, 2(3).
Hammerton, J., Osborne, M., Armstrong, S., and Daelemans, W. (2002). Introduction to special issue on machine learning approaches to shallow parsing. Journal of Machine Learning Research, 2(3), 551-558.
Tjong Kim Sang, E. F. (2002). Memory-based shallow parsing. Journal of Machine Learning Research, 2(3), 559-594.
Molina, A., and Pla, F. (2002). Shallow parsing using specialized HMMs. Journal of Machine Learning Research, 2(3), 595-613.
Zhang, T., Damerau, F., and Johnson, D. (2002). Text chunking based on a generalization of Winnow. Journal of Machine Learning Research, 2(3), 615-637.
Megyesi, B. (2002). Shallow parsing with PoS taggers and linguistic features. Journal of Machine Learning Research, 2(3), 639-668.
Déjean, H. (2002). Learning rules and their exceptions. Journal of Machine Learning Research, 2(3), 669-693.
Osborne, M. (2002). Shallow parsing using noisy and non-stationary training material. Journal of Machine Learning Research, 2(3), 695-719.
Objective Questions
- What are the differences between speech acts and other acts, such as moving an object?
- Why should we still use a formal language to represent natural language, which is far more complex and flexible than any formal language?
- What techniques must be adopted to implement parsing algorithms efficiently?
- What kind of extra information or constraints in natural language can be represented in augmented grammars? (A minimal agreement example follows these questions.)
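As a concrete illustration of the last question, here is a minimal sketch, written for this guide, of one kind of augmentation: subject-verb agreement carried as a feature on categories, in the spirit of a definite clause grammar. The tiny lexicon and the parse_sentence helper are illustrative assumptions, not from the textbook:

```python
# A minimal sketch of an augmented grammar: subject-verb agreement
# encoded as a feature on categories. Lexicon is illustrative.

# Lexicon: word -> list of (category, agreement) pairs
LEXICON = {
    "I":      [("Pronoun", "1sg")],
    "he":     [("Pronoun", "3sg")],
    "smell":  [("Verb", "1sg")],
    "smells": [("Verb", "3sg")],
}

def parse_sentence(words):
    """Accept 'Pronoun Verb' only when the agreement features match.

    An unaugmented rule S -> Pronoun Verb would wrongly accept
    '*I smells'; the augmented rule S(agr) -> Pronoun(agr) Verb(agr)
    rejects it by requiring the same feature value on both children.
    """
    if len(words) != 2:
        return False
    subj_entries = LEXICON.get(words[0], [])
    verb_entries = LEXICON.get(words[1], [])
    return any(
        sc == "Pronoun" and vc == "Verb" and sa == va
        for sc, sa in subj_entries
        for vc, va in verb_entries
    )

print(parse_sentence("he smells".split()))  # True
print(parse_sentence("I smells".split()))   # False: agreement fails
```

In a real DCG the feature is an argument of the nonterminal, and the same thread-a-variable idea extends to case constraints and verb subcategorization.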
Objective Activities
- Explore the Internet to find and read one or two papers related to speech acts and how they are applied in agent communication. Share your findings in the course conference.
- Explore parsing algorithms related to this section of the textbook (a top-down recognizer sketch follows these activities).
- Explore other parsing techniques and algorithms, such as dependency grammar parsing, shallow parsing, and Minipar, on the Web and in journal papers.
- Complete Exercise 23.9 of AIMA3ed.
- Complete Exercise 23.12 of AIMA3ed.
- Complete Exercise 23.4 of AIMA3ed.
- Complete Exercise 23.6 of AIMA3ed.
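To complement the bottom-up chart parser sketched earlier, here is a minimal top-down (recursive-descent) recognizer, again written in Python for this guide with an illustrative toy grammar and lexicon. Naive top-down parsing only terminates here because the grammar has no left-recursive rules:

```python
# A minimal top-down (recursive-descent) recognizer for a toy grammar,
# to contrast with the bottom-up CYK sketch above. Grammar and lexicon
# are illustrative, not the textbook's grammar.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Article", "Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
}

LEXICON = {
    "Article": {"the", "a"},
    "Noun":    {"wumpus", "gold"},
    "Verb":    {"smells", "grabs"},
}

def expand(symbol, words, pos):
    """Yield every position reachable after deriving `symbol` at `pos`."""
    if symbol in LEXICON:                      # preterminal: match one word
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            yield pos + 1
    else:                                      # nonterminal: try each rule
        for rhs in GRAMMAR[symbol]:
            yield from expand_seq(rhs, words, pos)

def expand_seq(symbols, words, pos):
    """Yield every position reachable after deriving the whole sequence."""
    if not symbols:
        yield pos
    else:
        for mid in expand(symbols[0], words, pos):
            yield from expand_seq(symbols[1:], words, mid)

def recognize(sentence):
    words = sentence.split()
    return any(end == len(words) for end in expand("S", words, 0))

print(recognize("the wumpus smells"))          # True
print(recognize("the wumpus grabs the gold"))  # True
print(recognize("wumpus the smells"))          # False
```

Comparing this with the CYK sketch makes the top-down versus bottom-up distinction from the learning objectives concrete: this recognizer starts from S and predicts downward toward the words, while CYK builds categories upward from the words.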