
[LangExtract](developers.googleblog.com/en/i) has me curious, but I don't see what makes it different from a [spacy-llm/prodigy](prodi.gy/docs/large-language-m) setup. Is it just that I'm spared the effort of chunking long input and/or constructing output JSON from entities and offsets by writing the corresponding Python code myself?

Ah, one more difference is that langextract is #OpenSource whereas prodigy is not (?). (On the other hand, prodigy has a better integration with a correction+training workflow.)

developers.googleblog.com: "Introducing LangExtract: A Gemini-powered information extraction library" (Google Developers Blog). Explore LangExtract: a Gemini-powered, open-source Python library for reliable, structured information extraction from unstructured text with precise source grounding.
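For context, roughly the kind of glue code the question refers to: a hedged sketch (all function names are hypothetical, not from either library) of chunking a long input and shifting chunk-local entity offsets back to document coordinates.

```python
def chunk_text(text, max_len=1000):
    """Split text into fixed-size chunks, tracking each chunk's start offset."""
    return [(i, text[i:i + max_len]) for i in range(0, len(text), max_len)]

def to_document_offsets(chunk_start, entities):
    """Shift chunk-local entity offsets back to whole-document coordinates."""
    return [
        {"label": e["label"],
         "start": chunk_start + e["start"],
         "end": chunk_start + e["end"]}
        for e in entities
    ]

text = "A" * 1500 + "Berlin"
chunks = chunk_text(text, max_len=1000)
# Suppose an NER model found "Berlin" at 500..506 inside the second chunk:
local = [{"label": "LOC", "start": 500, "end": 506}]
mapped = to_document_offsets(chunks[1][0], local)
# mapped[0] now points at the span in the full document
```

LangExtract's pitch, as I read it, is that this bookkeeping (plus schema-constrained JSON output) is handled for you.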

We've been working on a little library that might be useful if you work with #TEI and NER or text analysis:

• Extract plaintext from TEI
• Run your NER/NLP tools
• Map results back into the original TEI—without breaking anything!

Perfect for adding automated annotations to existing markup.

👉 github.com/recogito/tei-stando

github.com: recogito/tei-standoffconverter-js. Converts between an XML tree and a flat plaintext plus standoff (position-based table) representation.
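The standoff idea can be illustrated in a few lines. This is a hedged Python sketch of the general technique, not the API of tei-standoffconverter-js (which is a JavaScript library): flatten the XML to plaintext while recording each element's character span, so NER results on the plaintext can later be re-anchored in the tree.

```python
import xml.etree.ElementTree as ET

def tei_to_standoff(xml_string):
    """Flatten XML to plaintext plus a standoff table of (start, end, tag)."""
    root = ET.fromstring(xml_string)
    plaintext = ""
    table = []

    def walk(el):
        nonlocal plaintext
        start = len(plaintext)
        if el.text:
            plaintext += el.text
        for child in el:
            walk(child)
            if child.tail:
                plaintext += child.tail
        table.append((start, len(plaintext), el.tag))

    walk(root)
    return plaintext, table

tei = "<p>Visit to <placeName>Karlsruhe</placeName> in 1843.</p>"
text, standoff = tei_to_standoff(tei)
# text == "Visit to Karlsruhe in 1843."
```

An NER hit on the plaintext, say (9, 18, "LOC"), can then be matched against the standoff table to find the enclosing element, which is how annotations get mapped back without touching the original markup.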

🔠 Panel: More than Chatbots: Multimodal Large Language Models in Humanities Workflows

At #DHd2025, Nina Rastinger explores how well #AI handles abbreviations & NER:

✅ NER works well, even with small, low-cost models
❌ Abbreviations are tricky—costs & resource demands skyrocket
🚀 GPT o1 improves performance, even on abbreviations, but remains resource-intensive
Balancing accuracy & efficiency in text processing remains a challenge! ⚖️

🥁 We are happy to announce that we just published our first preprint on arXiv: "NER4all or Context is All You Need: Using LLMs for low-effort, high-performance NER on historical texts. A humanities informed approach".🎉

👉 arxiv.org/abs/2502.04351 👈

It is also our first endeavour into collaborative work with such a large number of collaborators & contributors from the Chair of Digital History, NFDI4Memory's Methods Innovation Lab, & AI-Skills.

arxiv.org: "NER4all or Context is All You Need: Using LLMs for low-effort, high-performance NER on historical texts. A humanities informed approach". Abstract: Named entity recognition (NER) is a core task for historical research in automatically establishing all references to people, places, events and the like. Yet, due to the high linguistic and genre diversity of sources, only limited canonisation of spellings, the level of required historical domain knowledge, and the scarcity of annotated training data, established approaches to natural language processing (NLP) have been both extremely expensive and have yielded unsatisfactory results in terms of recall and precision. Our paper introduces a new approach. We demonstrate how readily-available, state-of-the-art LLMs significantly outperform two leading NLP frameworks, spaCy and flair, for NER in historical documents, with seven to twenty-two percent higher F1 scores. Our ablation study shows how providing historical context to the task and a bit of persona modelling that turns focus away from a purely linguistic approach are core to a successful prompting strategy. We also demonstrate that, contrary to our expectations, providing increasing numbers of examples in few-shot approaches does not improve recall or precision below a threshold of 16-shot. In consequence, our approach democratises access to NER for all historians by removing the barrier of scripting languages and computational skills required for established NLP tools, instead leveraging natural language prompts and consumer-grade tools and frontends.
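A minimal sketch of the kind of prompting strategy the abstract describes (persona plus historical context plus few-shot examples). The prompt wording, function name, and example data are illustrative assumptions, not the paper's actual prompt.

```python
import json

def build_ner_prompt(text, context, persona, examples):
    """Assemble an NER prompt from a persona, historical context,
    and a handful of few-shot examples. Purely illustrative."""
    shots = "\n".join(
        f"Text: {ex['text']}\nEntities: {json.dumps(ex['entities'])}"
        for ex in examples
    )
    return (
        f"{persona}\n"
        f"{context}\n"
        "List all persons, places, and organisations as JSON "
        '[{"text": ..., "label": ...}].\n'
        f"{shots}\n"
        f"Text: {text}\nEntities:"
    )

prompt = build_ner_prompt(
    text="Der Baedeker empfiehlt ein Hotel am Potsdamer Platz.",
    context="The source is a German travel guide printed in 1921.",
    persona="You are a historian annotating early 20th-century travel literature.",
    examples=[{"text": "Ankunft in Leipzig.",
               "entities": [{"text": "Leipzig", "label": "LOC"}]}],
)
```

The paper's point is that the context and persona lines, not ever-longer example lists, do most of the work.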

ReadMe2KG: GitHub ReadMe to Knowledge Graph #Challenge has been published as part of the Natural Scientific Language Processing and Research Knowledge Graphs #NSLP2025 workshop co-located with #eswc2025. This #NER task aims to complement the NFDI4DataScience KG via information extraction from GitHub README files.

task description: nfdi4ds.github.io/nslp2025/doc
website: codabench.org/competitions/539

@eswc_conf @GenAsefa @shufan @NFDI4DS #NFDIrocks #knowledgegraphs #semanticweb #nlp #informationextraction
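For illustration only, a toy sketch of the task's general shape: pulling mentions out of README text and emitting knowledge-graph triples. The pattern, predicate name, and example repo are invented placeholders, not the challenge's actual schema.

```python
import re

def readme_to_triples(repo, readme_text):
    """Naively extract tool mentions from README prose and emit
    (subject, predicate, object) triples. Placeholder logic only."""
    triples = []
    for match in re.finditer(r"built with (\w+)", readme_text, re.IGNORECASE):
        triples.append((repo, "usesLibrary", match.group(1)))
    return triples

triples = readme_to_triples(
    "example/repo",
    "This tool is built with spaCy and built with Flask.",
)
```

The real task would replace the regex with an NER model and map entities to the KG's actual vocabulary.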

As part of a project, we digitised the papers of Joseph von #Laßberg, produced full texts with #eScriptorium, and also ran #NER with spaCy (as research data) and Google's NL. An exciting project; we often found that open-source alternatives are not there yet and require many conversion steps. Nevertheless, it was completed successfully and is now publicly available.

digital.blb-karlsruhe.de/lassb

digital.blb-karlsruhe.de: Joseph von Laßberg [1770-1855]

#NER, but prompto! 🤖

In tomorrow's #DigitalHistoryOFK, Torsten Hiltmann, Martin Dröge & Nicole Dresselhaus (HU Berlin, #4Memory) demonstrate the potential of #LargeLanguageModels & prompt-based approaches for #NamedEntityRecognition in historical text sources, using the 1921 Baedeker travel guide as an example.

Open to all!

🔜 When? Wed., 26.06., 4-6 pm, Zoom
ℹ️ Abstract: dhistory.hypotheses.org/7870
____
#DigitalHistory #promptoNER #LLM #genAI @nfdi4memory @histodons

Next week, the #DigitalHistoryOFK starts again 🎉

We are pleased to present a varied programme again for the 2024 summer semester.
It includes talks on #NFDI4Memory, #DataFeminism, #NER with #LLMs, #MedievalHistory, #DataLiteracy, #MediaHistory & much more!

👉 Programme: dhistory.hypotheses.org/digita

The colloquium takes place via Zoom & is open to anyone interested in #DigitalHistory & #DigitalHumanities.

___
@histodons #digiGW