Pretraining and Finetuning a Transformer Model for Location Resolution


TYPE: Lightning Talk
TRACK: Generative AI
INDUSTRY: Enterprise Technology, Financial Services
TECHNOLOGIES: Databricks Experience (DBX), AI/Machine Learning, GenAI/LLMs
SKILL LEVEL: Intermediate
Extracting locations from unstructured text and linking them to their entries in a geographic knowledge base is a critical building block for many tasks that unlock greater location intelligence, including risk management and knowledge graph construction. While deep learning methods have tackled entity linking as a generic task, they primarily use Wikipedia as the target knowledge base. Wikipedia lacks comprehensive coverage of location entities, so these methods struggle to resolve fine-grained locations. Using Databricks and the Hugging Face Transformers library, we pretrained and finetuned a custom model tailored to location resolution. In this session, we will share what we learned from building this model, including how we created our training datasets and what we considered for deployment. We hope it will provide insights for those interested in processing locations in text data at scale or adapting entity linking methods to domain-specific knowledge bases.


Evelyn Wang

Data Scientist
Balyasny Asset Management