Natural language processing (NLP) is a field of artificial intelligence concerned with enabling computers to understand, interpret, and generate human language. It combines computational linguistics with machine learning and deep learning to process text and speech data. Core tasks include tokenisation, named entity recognition, sentiment analysis, machine translation, text summarisation, and question answering.
| Task | Technique | Example Input | Example Output | Application |
|---|---|---|---|---|
| Tokenisation | Rule-based / BPE | "Hello world!" | ["Hello", "world", "!"] | All NLP pipelines |
| Sentiment Analysis | BERT fine-tuning | "Great product!" | Positive (0.95) | Product reviews |
| Named Entity Recognition | BiLSTM-CRF | "Apple was founded by Steve Jobs" | ORG: Apple; PER: Steve Jobs | Information extraction |
| Machine Translation | Transformer (seq2seq) | "Namaste" (Hindi) | "Hello" (English) | Translation services (e.g. Google Translate) |
| Text Summarisation | Abstractive (T5/BART) | Long article | Short paragraph | News digests |
| Question Answering | Retrieval-augmented | "What is NLP?" | Factual answer | Chatbots, search |
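
The rule-based tokenisation row in the table can be illustrated with a short sketch. The regular expression and the function name `tokenise` are assumptions chosen for illustration, not a standard tokeniser; subword schemes such as BPE work differently and are not shown here.

```python
import re

# Hypothetical rule: a token is either a run of word characters
# or a single character that is neither a word character nor whitespace.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenise(text: str) -> list[str]:
    """Split text into word tokens and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenise("Hello world!"))  # ['Hello', 'world', '!']
```

A rule this simple splits contractions and hyphenated words into several tokens, which is one reason production pipelines tend to use more elaborate rules or learned subword vocabularies.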
The Transformer is a deep learning architecture introduced by Vaswani et al. in 2017. It relies entirely on self-attention, rather than recurrence or convolution, to model relationships between all positions in a sequence in parallel. The original design is an encoder–decoder structure built from multi-head attention, positional encodings, and position-wise feed-forward layers. Transformers are the foundation of modern large language models, including BERT (encoder-only), GPT (decoder-only), T5 (encoder–decoder), and PaLM (decoder-only), and have also been applied to vision, audio, and multimodal tasks.
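The self-attention step can be sketched as scaled dot-product attention, softmax(QKᵀ/√d_k)V. The version below is a minimal single-head illustration in plain Python; it assumes the queries, keys, and values are all the raw input vectors, so it omits the learned projection matrices, multiple heads, masking, and positional encodings of a real Transformer. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one output vector per query: a weighted average of the values.

    Each weight is softmax(q . k / sqrt(d_k)) over all keys, so every
    position can attend to every other position.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Toy sequence of three 2-dimensional token vectors (assumed values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Self-attention: queries, keys, and values all come from the same sequence.
attended = scaled_dot_product_attention(x, x, x)
```

The explicit loop is for readability only. Practical implementations express the same computation as batched matrix multiplications, which is what allows all positions to be processed in parallel.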
A convolutional neural network (CNN) is a deep learning architecture designed for processing grid-structured data such as images, using learnable convolutional filters that detect spatial features like edges, textures, and shapes. The network stacks convolutional layers (feature extraction) with pooling layers (spatial downsampling) and fully connected layers (classification). CNNs revolutionised computer vision after AlexNet won the ImageNet competition in 2012 with a top-5 error rate of 15.3%, compared with 26.2% for the runner-up.
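The convolution and pooling stages can be sketched in plain Python. This is a minimal illustration under stated assumptions: a single channel, stride 1, no padding, and a hand-written vertical-edge filter standing in for a filter that a real CNN would learn. The names `conv2d`, `max_pool2d`, and `edge_kernel` are hypothetical.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    As in most deep learning libraries, this is cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Downsample by taking the maximum of each non-overlapping size x size block."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# 5x5 image with a dark-to-bright vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written filter that responds to left-to-right brightness increases.
edge_kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, edge_kernel)  # 4x4 map, each row [0, 2, 0, 0]
pooled = max_pool2d(features)          # [[2, 0], [2, 0]]
```

The feature map is non-zero only where the filter overlaps the edge, and pooling keeps that response while halving each spatial dimension.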
Feature engineering is the process of using domain knowledge to select, transform, or create input variables (features) from raw data to improve the performance of machine learning models. It bridges raw data and predictive algorithms by producing representations that the algorithms can learn from more effectively. Techniques include normalisation, one-hot encoding, polynomial feature creation, and dimensionality reduction.
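Three of the listed techniques can be sketched in a few lines of plain Python. The function names and sample values below are assumptions for illustration; min-max scaling is only one of several normalisation schemes, and dimensionality reduction is not covered.

```python
def min_max_normalise(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Map each categorical label to a binary indicator vector.

    Returns the sorted category list and one vector per input label.
    """
    categories = sorted(set(labels))
    index = {category: i for i, category in enumerate(categories)}
    vectors = [
        [1 if index[label] == i else 0 for i in range(len(categories))]
        for label in labels
    ]
    return categories, vectors


def polynomial_features(x, degree=2):
    """Expand a single numeric feature into [x, x^2, ..., x^degree]."""
    return [x ** d for d in range(1, degree + 1)]


print(min_max_normalise([10, 20, 30]))          # [0.0, 0.5, 1.0]
print(one_hot_encode(["red", "green", "red"]))  # (['green', 'red'], [[0, 1], [1, 0], [0, 1]])
print(polynomial_features(3, degree=3))         # [3, 9, 27]
```

In practice the scaling range and category list would be computed on the training data only and then reused for new data, so that the model sees a consistent representation.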
The term "natural language processing" emerged in the 1950s–1960s, following early work in the field by Alan Turing (1950) and the Georgetown–IBM machine translation experiment (1954). "Natural language", as opposed to formal languages such as programming languages, derives from Latin naturalis ("by birth") and lingua ("tongue").