Computer ScienceAI & Machine LearningMedium

Natural Language Processing

Also known as:NLPComputational LinguisticsLanguage AI

Natural language processing (NLP) is a field of artificial intelligence focused on enabling computers to understand, interpret, and generate human language in a useful way. It combines computational linguistics with machine learning and deep learning to process text and speech data. Core tasks include tokenisation, named entity recognition, sentiment analysis, machine translation, and question answering.

Key NLP Tasks, Techniques, and Applications

TaskTechniqueExample InputExample OutputApplication
TokenisationRule-based / BPE"Hello world!"["Hello", "world", "!"]All NLP pipelines
Sentiment AnalysisBERT fine-tuning"Great product!"Positive (0.95)Product reviews
Named Entity RecognitionBiLSTM-CRF"Apple was founded by Steve Jobs"ORG: Apple; PER: Steve JobsInformation extraction
Machine TranslationTransformer (seq2seq)"Namaste" (Hindi)"Hello" (English)Google Translate
Text SummarisationAbstractive (T5/BART)Long articleShort paragraphNews digests
Question AnsweringRetrieval-augmented"What is NLP?"Factual answerChatbots, search

Interactive Tools

NLTK — Natural Language Toolkit

Open Tool

Hugging Face NLP Course

Open Tool

Stanford NLP CoreNLP

Open Tool
Diagram of a natural language processing pipeline from raw text to structured output

Wikimedia Commons, CC BY-SA

Related Terms

Computer Science

Transformer (AI)

The Transformer is a deep learning architecture introduced by Vaswani et al. in 2017 that relies entirely on self-attention mechanisms rather than recurrence or convolutions to model relationships between all positions in a sequence in parallel. It consists of an encoder–decoder structure with multi-head attention, positional encodings, and feed-forward layers. Transformers are the foundation of modern large language models including BERT, GPT, T5, and PaLM, and have also been applied to vision, audio, and multimodal tasks.

Computer Science

Convolutional Neural Network

A convolutional neural network (CNN) is a deep learning architecture designed for processing structured grid data such as images, using learnable convolutional filters that detect spatial features like edges, textures, and shapes. The network stacks convolutional layers (feature extraction) with pooling layers (spatial downsampling) and fully connected layers (classification). CNNs revolutionised computer vision after AlexNet won the ImageNet competition in 2012 with significantly lower error rates than prior methods.

Computer Science

Feature Engineering

Feature engineering is the process of using domain knowledge to select, transform, or create input variables (features) from raw data to improve the performance of machine learning models. It bridges raw data and predictive algorithms by producing representations that algorithms can learn from more effectively. Techniques include normalization, one-hot encoding, polynomial feature creation, and dimensionality reduction.

The term "natural language processing" emerged in the 1950s–1960s, with early work by Alan Turing (1950) and the Georgetown–IBM experiment (1954). "Natural language" (as opposed to formal programming languages) derives from Latin naturalis (by birth) and lingua (tongue).

nlptext-analysislanguage-modeldeep-learningcomputational-linguistics