
Commit a13101f

Update NER (#9)
1 parent e9b0779 commit a13101f

1 file changed

+29
-4
lines changed
  • src/content/docs/reference/preparation

Original file line number, diff line number, diff line change
@@ -1,11 +1,36 @@
11
---
22
title: Named Entity Recognition
3-
description: A guide in my new Starlight docs site.
3+
description: A concise guide to Named Entity Recognition methods, applications, and challenges.
44
---
55

6-
Guides lead a user through a specific task they want to accomplish, often with a sequence of steps.
7-
Writing a good guide requires thinking about what your users are trying to do.
6+
Named Entity Recognition (NER) is the process of automatically identifying and categorizing key elements in text, such as the names of people, organizations, and locations.
7+
It is a foundational technique in natural language processing (NLP) that facilitates a range of applications including information extraction, content classification, and search optimization.
8+
9+
## Traditional NER Methods
10+
11+
Early NER systems relied on rule-based and statistical approaches.
12+
Tools like [spaCy](https://spacy.io/) combine hand-crafted rules with machine learning algorithms to recognize entities in text efficiently.
13+
However, traditional NER models are typically effective only for a set of predefined entity types, limiting their flexibility.
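
To make the limitation concrete, here is a minimal sketch of a rule-based recognizer built only on Python's standard library. It is not how spaCy works internally; the gazetteer entries, the date pattern, and the `rule_based_ner` name are assumptions chosen for illustration.

```python
import re

# Hypothetical gazetteer: a real system would use much larger curated lists.
GAZETTEER = {
    "ORG": ["Acme Corp", "United Nations"],
    "LOC": ["Paris", "New York"],
}

# Hypothetical rule: ISO-style dates such as 2024-03-01.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def rule_based_ner(text):
    """Return (start, end, label, surface) tuples sorted by position."""
    entities = []
    for label, names in GAZETTEER.items():
        for name in names:
            # Match each gazetteer entry as a whole phrase.
            pattern = r"\b" + re.escape(name) + r"\b"
            for match in re.finditer(pattern, text):
                entities.append((match.start(), match.end(), label, match.group()))
    for match in DATE_PATTERN.finditer(text):
        entities.append((match.start(), match.end(), "DATE", match.group()))
    return sorted(entities)


if __name__ == "__main__":
    for entity in rule_based_ner("Acme Corp opened an office in Paris on 2024-03-01."):
        print(entity)
```

Anything outside the predefined lists and patterns, such as a person's name here, is simply not found, which is the inflexibility described above.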
14+
15+
### GLiNER: A Compact, Flexible Alternative
16+
17+
In contrast to conventional systems, GLiNER is a compact NER model designed to identify any type of entity.
18+
Because it uses a bidirectional transformer encoder, GLiNER can extract entities in parallel, an advantage over the slow sequential token generation characteristic of many large language models (LLMs).
19+
According to the GLiNER paper, comprehensive testing shows that GLiNER outperforms both ChatGPT and fine-tuned LLMs in zero-shot evaluations on various NER benchmarks, addressing limitations of traditional models while remaining resource-efficient.
20+
21+
## NER with Large Language Models
22+
23+
Modern large language models (LLMs) have changed how NER tasks can be approached.
24+
LLMs can perform NER in a zero-shot or few-shot learning setting, extracting arbitrary entities through natural language instructions.
25+
This offers greater flexibility compared to traditional methods, although the size and cost of LLMs can be impractical in resource-limited scenarios.
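
As a rough sketch of instruction-based extraction, the helpers below build a zero-shot or few-shot prompt and parse a JSON answer. No model is called here: `build_ner_prompt`, `parse_ner_response`, the prompt wording, and the JSON output format are all assumptions made for illustration, and the raw response would come from whichever LLM client you use.

```python
import json


def build_ner_prompt(text, entity_types, examples=()):
    """Build an NER prompt; passing examples turns zero-shot into few-shot."""
    lines = [
        "Extract named entities from the text.",
        "Allowed entity types: " + ", ".join(entity_types) + ".",
        'Answer with a JSON list of objects with keys "text" and "type".',
    ]
    for example_text, example_entities in examples:
        lines.append("Text: " + example_text)
        lines.append("Entities: " + json.dumps(example_entities))
    lines.append("Text: " + text)
    lines.append("Entities:")
    return "\n".join(lines)


def parse_ner_response(raw, entity_types):
    """Parse the model's answer, dropping malformed items and unknown types."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    allowed = set(entity_types)
    entities = []
    for item in items:
        if not isinstance(item, dict):
            continue
        surface, label = item.get("text"), item.get("type")
        if isinstance(surface, str) and isinstance(label, str) and label in allowed:
            entities.append((surface, label))
    return entities


if __name__ == "__main__":
    print(build_ner_prompt("Ada Lovelace worked in London.", ["PERSON", "CITY"]))
```

Because the entity types are just strings in the prompt, new types can be requested without retraining; the parser is defensive because model output is not guaranteed to be valid JSON.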
26+
27+
## Entity Linking and Entity Disambiguation
28+
29+
Beyond identifying entities, many applications need to associate them with specific, unique identifiers, a process known as entity linking.
30+
This often involves resolving ambiguities where the same name might refer to multiple real-world entities, a challenge known as entity disambiguation.
31+
Typical solutions combine NER with external knowledge bases and context-aware algorithms so that each recognized entity is matched to its correct reference.
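
A toy sketch of context-based disambiguation follows. It scores each candidate by keyword overlap with the surrounding text; the knowledge base, its `kb:` identifiers, and the `link_entity` function are invented for illustration, and production systems generally rely on much richer signals than word overlap.

```python
import re

# Hypothetical mini knowledge base: identifiers and keywords are made up.
KNOWLEDGE_BASE = {
    "Paris": [
        {"id": "kb:paris_france", "keywords": {"france", "seine", "capital"}},
        {"id": "kb:paris_texas", "keywords": {"texas", "usa", "county"}},
    ],
}


def link_entity(mention, context):
    """Return the id of the candidate whose keywords best match the context."""
    candidates = KNOWLEDGE_BASE.get(mention, [])
    if not candidates:
        return None  # The mention is not in the knowledge base.
    context_words = set(re.findall(r"\w+", context.lower()))
    # On a tie, max() keeps the first candidate listed in the knowledge base.
    best = max(candidates, key=lambda c: len(c["keywords"] & context_words))
    return best["id"]


if __name__ == "__main__":
    print(link_entity("Paris", "The Seine flows through Paris, the capital of France."))
    print(link_entity("Paris", "Paris is a city in Lamar County, Texas."))
```

The same surface form resolves to different identifiers depending on context, which is the disambiguation step described above.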
832

933
## Further reading
1034

11-
- Read [about how-to guides](https://diataxis.fr/how-to-guides/) in the Diátaxis framework
35+
- [GLiNER paper](https://arxiv.org/abs/2311.08526)
36+
- [spaCy](https://spacy.io/)
