Templates are the easiest way to implement surface NLG. A template for describing a flight noun phrase in the air travel domain might be

flight departing from $city-fr at $time-dep and arriving in $city-to at $time-arr

where the words starting with "$" are variables (representing the departure city, the departure time, the arrival city, and the arrival time, respectively) whose values will be extracted from the environment in which the template is used. The approach of writing individual templates is convenient, but it may not scale to complex domains in which hundreds or thousands of templates would be necessary, and it may have shortcomings in maintainability and text quality (e.g., see (Reiter, 1995) for a discussion).

There are more sophisticated surface generation packages, such as FUF/SURGE (Elhadad and Robin, 1996), KPML (Bateman, 1996), MUMBLE (Meteer et al., 1987), and RealPro (Lavoie and Rambow, 1997), which produce natural language text from an abstract semantic representation. These packages require linguistic sophistication in order to write the abstract semantic representation, but they are flexible because minor changes to the input can accomplish major changes to the generated text.

The only trainable approaches (known to the author) to surface generation are the purely statistical machine translation (MT) systems such as (Berger et al., 1996) and the corpus-based generation system described in (Langkilde and Knight, 1998). The MT systems of (Berger et al., 1996) learn to generate text in the target language straight from the source language, without the aid of an explicit semantic representation. In contrast, (Langkilde and Knight, 1998) uses corpus-derived statistical knowledge to rank plausible hypotheses from a grammar-based surface generation component.
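As an illustration of the template approach described at the start of this section, the flight template can be filled in with a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of any system cited here: the function name `fill_template`, the dictionary standing in for the "environment", and the sample city and time values are all hypothetical. A small regular expression is used because the variable names in the template contain hyphens.

```python
import re

# The flight template from the text; words starting with "$" are variables.
FLIGHT_TEMPLATE = (
    "flight departing from $city-fr at $time-dep "
    "and arriving in $city-to at $time-arr"
)

# Matches "$" followed by a lowercase name that may contain internal hyphens,
# e.g. "$city-fr". (Assumed naming convention, inferred from the example.)
_VARIABLE = re.compile(r"\$([a-z]+(?:-[a-z]+)*)")


def fill_template(template, env):
    """Replace each $variable in `template` with its value from `env`.

    `env` is a plain dict standing in for the environment in which the
    template is used (a hypothetical representation).
    """
    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"no value for template variable ${name}")
        return str(env[name])

    return _VARIABLE.sub(lookup, template)


# Hypothetical attribute values for one flight.
example_env = {
    "city-fr": "Boston",
    "time-dep": "9 am",
    "city-to": "Denver",
    "time-arr": "noon",
}

print(fill_template(FLIGHT_TEMPLATE, example_env))
```

Each distinct phrasing needs its own hand-written template string, which is where the scaling and maintainability concerns noted above come from.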
