Java Natural Language Processing (Reprint Edition) PDF Download

Editor's Recommendation

None available

About the Book

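Natural language processing (NLP) is an important area of application development, and its relevance to solving contemporary problems will only grow. Demand for natural-language-accessible applications supported by NLP tasks has increased significantly. This book by Reese, the reprint (English-language) edition of Java Natural Language Processing, shows how to organize text automatically using techniques such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. It introduces NLP concepts in a way that readers with no background in statistical natural language processing can follow.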

About the Author

None available

Table of Contents

Preface

Chapter 1: Introduction to NLP

 What is NLP?

 Why use NLP?

 Why is NLP so hard?

 Survey of NLP tools

 Apache OpenNLP

 Stanford NLP

 LingPipe

 GATE

 UIMA

 Overview of text processing tasks

 Finding parts of text

 Finding sentences

 Finding people and things

 Detecting Parts of Speech

 Classifying text and documents

 Extracting relationships

 Using combined approaches

 Understanding NLP models

 Identifying the task

 Selecting a model

 Building and training the model

 Verifying the model

 Using the model

 Preparing data

 Summary

Chapter 2: Finding Parts of Text

 Understanding the parts of text

 What is tokenization?

 Uses of tokenizers

 Simple Java tokenizers

 Using the Scanner class

 Specifying the delimiter

 Using the split method

 Using the BreakIterator class

 Using the StreamTokenizer class

 Using the StringTokenizer class

 Performance considerations with Java core tokenization

 NLP tokenizer APIs

 Using the OpenNLPTokenizer class

 Using the SimpleTokenizer class

 Using the WhitespaceTokenizer class

 Using the TokenizerME class

 Using the Stanford tokenizer

 Using the PTBTokenizer class

 Using the DocumentPreprocessor class

 Using a pipeline

 Using LingPipe tokenizers

 Training a tokenizer to find parts of text

 Comparing tokenizers

 Understanding normalization

 Converting to lowercase

 Removing stopwords

 Creating a StopWords class

 Using LingPipe to remove stopwords

 Using stemming

 Using the Porter Stemmer

 Stemming with LingPipe

 Using lemmatization

 Using the StanfordLemmatizer class

 Using lemmatization in OpenNLP

 Normalizing using a pipeline

 Summary

Chapter 3: Finding Sentences

 The SBD process

 What makes SBD difficult?

 Understanding SBD rules of LingPipe's HeuristicSentenceModel class

 Simple Java SBDs

 Using regular expressions

 Using the BreakIterator class

 Using NLP APIs

 Using OpenNLP

 Using the SentenceDetectorME class

 Using the sentPosDetect method

 Using the Stanford API

 Using the PTBTokenizer class

 Using the DocumentPreprocessor class

 Using the StanfordCoreNLP class

 Using LingPipe

 Using the IndoEuropeanSentenceModel class

 Using the SentenceChunker class

 Using the MedlineSentenceModel class

 Training a Sentence Detector model

 Using the Trained model

 Evaluating the model using the SentenceDetectorEvaluator class

 Summary

Chapter 4: Finding People and Things

 Why NER is difficult?

 Techniques for name recognition

 Lists and regular expressions

 Statistical classifiers

 Using regular expressions for NER

 Using Java's regular expressions to find entities

 Using LingPipe's RegExChunker class

 Using NLP APIs

 Using OpenNLP for NER

 Determining the accuracy of the entity

 Using other entity types

 Processing multiple entity types

 Using the Stanford API for NER

 Using LingPipe for NER

 Using LingPipe's name entity models

 Using the ExactDictionaryChunker class

 Training a model

 Evaluating a model

 Summary

Chapter 5: Detecting Parts of Speech

 The tagging process

 Importance of POS taggers

 What makes POS difficult?

 Using the NLP APIs

 Using OpenNLP POS taggers

 Using the OpenNLP POSTaggerME class for POS taggers

 Using OpenNLP chunking

 Using the POSDictionary class

 Using Stanford POS taggers

 Using Stanford MaxentTagger

 Using the MaxentTagger class to tag textese

 Using Stanford pipeline to perform tagging

 Using LingPipe POS taggers

 Using the HmmDecoder class with BestFirst tags

 Using the HmmDecoder class with NBest tags

 Determining tag confidence with the HmmDecoder class

 Training the OpenNLP POSModel

 Summary

Chapter 6: Classifying Texts and Documents

 How classification is used

 Understanding sentiment analysis

 Text classifying techniques

 Using APIs to classify text

 Using OpenNLP

 Training an OpenNLP classification model

 Using DocumentCategorizerME to classify text

 Using Stanford API

 Using the ColumnDataClassifier class for classification

 Using the Stanford pipeline to perform sentiment analysis

 Using LingPipe to classify text

 Training text using the Classified class

 Using other training categories

 Classifying text using LingPipe

 Sentiment analysis using LingPipe

 Language identification using LingPipe

 Summary

Chapter 7: Using Parser to Extract Relationships

 Relationship types

 Understanding parse trees

 Using extracted relationships

 Extracting relationships

 Using NLP APIs

 Using OpenNLP

 Using the Stanford API

 Using the LexicalizedParser class

 Using the TreePrint class

 Finding word dependencies using the GrammaticalStructure class

 Finding coreference resolution entities

 Extracting relationships for a question-answer system

 Finding the word dependencies

 Determining the question type

 Searching for the answer

 Summary

Chapter 8: Combined Approaches

 Preparing data

 Using Boilerpipe to extract text from HTML

 Using POI to extract text from Word documents

 Using PDFBox to extract text from PDF documents

 Pipelines

 Using the Stanford pipeline

 Using multiple cores with the Stanford pipeline

 Creating a pipeline to search text

 Summary

Index
