Yuan He (何源)
Research Associate in Computer Science

I am currently a postdoctoral Research Associate in the Knowledge Representation and Reasoning (KRR) group at the Department of Computer Science, University of Oxford.

My research interests revolve around Natural Language Processing, Knowledge Engineering, and Deep Learning, currently focusing on the following topics:

  • Large Language Models for Knowledge Engineering
  • Large Language Model Hallucinations and Interpretability
  • Retrieval-Augmented Generation with Large Language Models
  • Geometric Language Models

I am also the main contributor to several packages and resources, notably: DeepOnto, OAEI Bio-ML, and HierarchyTransformers.


News
2024
  • Sep: Paper accepted at NeurIPS 2024!
  • Sep: Got an Applied Scientist offer from Amazon Rufus! [Career]
  • Sep: Officially became a PhD with leave to supplicate! [Degree]
  • Aug: Presented at the GenAI BootCamp by GetSeen Ventures.
  • Apr: Became a Research Associate at Oxford. [Career]
  • Mar: Paper accepted in the Semantic Web journal.
  • Feb: Paper accepted at ESWC 2024.
2023
  • Oct: Won the Best Resource Paper Runner-Up award at CIKM 2023. [Award]
  • Sep: Poster paper accepted at ISWC 2023.
  • Sep: Two papers accepted at CIKM 2023.
  • Aug: Joined the Program Committee for AAAI 2024.
  • Jun: Joined the Program Committee for CIKM 2023.
  • May: Paper accepted in Findings of ACL 2023.
  • Mar: Paper accepted in the WWW journal.
  • Jan: Started teaching KRR classes.
  • Jan: Rebuilt DeepOnto.
2022
  • Oct: Nominated as a Best Resource Paper candidate at ISWC 2022. [Award]
  • Oct: Started organising the Bio-ML track of the OAEI.
  • Sep: Paper accepted at the DL4KG@ISWC 2022 workshop.
  • Aug: Joined the Program Committee for AAAI 2023.
  • Jul: Joined the Program Committee for ISWC 2022.
  • Jul: Paper accepted at ISWC 2022.
  • Jul: Attended the International Semantic Web Summer School (ISWS 2022) and won the Best Research Report Award. [Award]
  • Feb: Presented our accepted paper at AAAI 2022.
  • Jan: Started teaching Knowledge Representation & Reasoning (KRR) classes at Oxford.
2021
  • Dec: Paper accepted at AAAI 2022.
  • Dec: Passed the Transfer of Status for the Oxford PhD. [Degree]
  • Oct: Presented our accepted paper at the OM@ISWC 2021 workshop.
  • Aug: Paper accepted at the OM@ISWC 2021 workshop.
  • Jan: Started the Oxford-SRUK Ontology Alignment project.
2020
  • Dec: Presented our accepted paper at AACL 2020.
  • Nov: Provided teaching support for MAT marking at Oxford.
Education
  • University of Oxford
    Oct 2020 - Sep 2024
    DPhil (PhD) in Computer Science
    Thesis: Language Models for Ontology Engineering [Link]
    Supervisors: Prof. Ian Horrocks, Prof. Bernardo Cuenca Grau, Dr. Jiaoyan Chen
    Funding: Fully funded by Samsung Research UK
  • University of Edinburgh
    Sep 2016 - May 2020
    BSc (Hons) Artificial Intelligence and Mathematics
    Thesis: Incorporating Phonetic Information in Model Design for Machine Transliteration
    Supervisor: Dr. Shay Cohen
    Grade: First Class Honours, ranked top of the class
Experience
  • University of Oxford
    Research Associate, Apr 2024 - present
    Teaching Assistant (Structured Data), Jun 2024
    Class Tutor (KRR), Jan 2022 - May 2023
  • University of Edinburgh
    Lab Demonstrator (Reasoning and Agents), Jan 2019 - May 2019
    Research Intern (Multi-lingual Machine Transliteration), Jun 2018 - Aug 2018
    Research Assistant (NLP for Finance), May 2017 - Dec 2019
Honors & Awards
  • Best Resource Paper Runner-Up at the ACM International Conference on Information and Knowledge Management (CIKM) [Certificate]
    2023
  • Best Resource Paper Candidate at the International Semantic Web Conference (ISWC) [Nomination]
    2022
  • Best Research Report Award from the International Semantic Web Summer School (ISWS) [Certificate]
    2022
  • PhD Scholarship from Samsung Research UK
    2021
  • The 2020 Joint Class Prize (ranked first) for the degree of BSc in Artificial Intelligence and Mathematics [Certificate]
    2020
Professional Services
  • Organiser & Chair: OAEI Bio-ML Track @ ISWC, ELMKE Workshop @ EKAW
  • Program Committee Member: CIKM, AAAI, ISWC, ESWC
  • Conference Reviewer: ICLR, NeurIPS, ARR (ACL, EMNLP, NAACL, etc.), ECML PKDD
  • Journal Reviewer: Journal of Bioinformatics, Journal of Biomedical Semantics, Journal of Web Semantics, Semantic Web Journal, Data Mining and Knowledge Discovery
Selected Publications
Language Models as Hierarchy Encoders

Yuan He, Zhangdie Yuan, Jiaoyan Chen, Ian Horrocks

NeurIPS 2024

TL;DR: We introduce a novel approach to re-train transformer encoder-based language models as Hierarchy Transformer encoders (HiTs), leveraging the expansive nature of hyperbolic space.

Abstract: Interpreting hierarchical structures latent in language is a key limitation of current language models (LMs). While previous research has implicitly leveraged these hierarchies to enhance LMs, approaches for their explicit encoding are yet to be explored. To address this, we introduce a novel approach to re-train transformer encoder-based LMs as Hierarchy Transformer encoders (HiTs), harnessing the expansive nature of hyperbolic space. Our method situates the output embedding space of pre-trained LMs within a Poincaré ball with a curvature that adapts to the embedding dimension, followed by training on hyperbolic clustering and centripetal losses. These losses are designed to effectively cluster related entities (input as texts) and organise them hierarchically. We evaluate HiTs against pre-trained LMs, standard fine-tuned LMs, and several hyperbolic embedding baselines, focusing on their capabilities in simulating transitive inference, predicting subsumptions, and transferring knowledge across hierarchies. The results demonstrate that HiTs consistently outperform all baselines in these tasks, underscoring the effectiveness and transferability of our re-trained hierarchy encoders.
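
To make the objective concrete, below is a minimal PyTorch sketch of the two losses described above, using the standard closed-form Poincaré distance; the margins and function names are illustrative assumptions, not the paper's exact configuration.

    # A minimal sketch (assumed hyperparameters) of the two HiT losses.
    import torch

    def poincare_dist(x: torch.Tensor, y: torch.Tensor, c: float = 1.0) -> torch.Tensor:
        """Standard closed-form distance in a Poincare ball of curvature -c."""
        diff2 = (x - y).pow(2).sum(-1)
        x2 = x.pow(2).sum(-1)
        y2 = y.pow(2).sum(-1)
        arg = 1 + 2 * c * diff2 / ((1 - c * x2) * (1 - c * y2))
        return torch.acosh(arg.clamp(min=1 + 1e-7)) / c ** 0.5

    def hit_losses(child, parent, negative, m_cluster=1.0, m_centri=0.1):
        # Clustering loss: pull a child embedding towards its parent and push
        # it away from a non-ancestor negative, triplet-style, in hyperbolic
        # distance.
        l_cluster = torch.relu(
            poincare_dist(child, parent) - poincare_dist(child, negative) + m_cluster
        )
        # Centripetal loss: a parent should sit closer to the ball's origin
        # than its child (the Euclidean norm is a monotone proxy for the
        # hyperbolic distance to the origin, so the ordering is equivalent).
        l_centri = torch.relu(parent.norm(dim=-1) - child.norm(dim=-1) + m_centri)
        return (l_cluster + l_centri).mean()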

Language Models for Ontology Engineering

Yuan He

University of Oxford (PhD Thesis) 2024

TL;DR: My PhD Thesis.

Abstract: Ontology, originally a philosophical term, refers to the study of being and existence. The concept was introduced to Artificial Intelligence (AI) as a knowledge-based system that can model and share knowledge about entities and their relationships in a machine-readable format. Ontologies offer a structured and logical formalism of human knowledge, enabling expressive representations and reliable reasoning within defined domains. Meanwhile, modern deep learning-based language models (LMs) represent a significant milestone in the field of Natural Language Processing (NLP), as they incorporate substantial background knowledge from the vast and complex distribution of textual data. This thesis explores the synergy between these two paradigms, focusing primarily on the use of LMs in ontology engineering and, more broadly, in knowledge engineering. The goal is to automate or semi-automate the process of ontology construction and curation. Ontology engineering includes a wide array of tasks within the life cycle of ontology development. This thesis concentrates on three key aspects: (i) ontology alignment, which seeks to align equivalent concepts across different ontologies to achieve data integration; (ii) ontology completion, which focuses on filling in missing subsumption relationships between ontology concepts; and (iii) hierarchy embedding, which aims to develop versatile and interpretable neural representations for hierarchical structures derived not only from ontologies but also from other forms of hierarchical data. These representations can facilitate a broad spectrum of downstream ontology engineering tasks, such as (i) and (ii), and are adaptable for more general applications in hierarchy-aware contexts.

This thesis is organised into three parts. The first part establishes the foundations necessary for understanding ontologies and LMs. The chapter on ontologies opens with a basic overview of computational ontologies, then provides an introduction to the description logic formalisms that underpin them. It concludes with the formal definitions of the three ontology engineering tasks this thesis focuses on. Transitioning to LMs, the subsequent chapter begins with a chronological overview of their evolution, followed by a detailed exposition of various typical LMs along this evolution. The discussion then proceeds to contemporary transformer-based LMs, elaborating on their architecture and the different learning paradigms they adopt. The chapter concludes with a review of how LMs and knowledge bases (including ontologies) interact and influence each other, highlighting the mutual benefits of this integration for both fields of study.

With the comprehensive background provided in the first part, the second part of the thesis delves into the specific methodologies that have been developed. This part comprises three chapters, each corresponding to the application of LMs in ontology alignment, ontology completion, and hierarchy embedding, respectively. In the chapter on LMs for ontology alignment, we introduce BERTMap, a novel pipeline system that employs LM fine-tuning for improved alignment prediction and ontology semantics for alignment refinement. We also present the Bio-ML track of the Ontology Alignment Evaluation Initiative (OAEI), which has emerged as a benchmarking platform for a variety of ontology alignment systems over the past two years. The chapter on LMs for ontology completion presents OntoLAMA, a collection of LM probing datasets and a prompt-based LM probing approach that effectively predicts subsumptions, even with limited training resources. Lastly, the chapter on LMs for hierarchy embedding discusses the re-training of LMs as Hierarchy Transformer encoders (HiTs), addressing the limitations of LMs in explicitly interpreting and encoding hierarchies, including those extracted from ontologies.

The third part of the thesis details the practical implementations. We mainly present DeepOnto, a Python package designed for ontology engineering utilising deep learning, with an emphasis on LMs. DeepOnto offers a range of basic to advanced ontology processing functionalities to support deep learning-based ontology engineering development. The package also includes polished implementations of the systems and resources presented in Part II. In summary, this thesis advocates for a more holistic approach to AI development, where the integration of LMs and ontologies can lead to a more advanced, explainable, and useful paradigm in knowledge engineering and beyond.

DyGMamba: Efficiently Modeling Long-Term Temporal Dependency on Continuous-Time Dynamic Graphs with State Space Models

Zifeng Ding, Yifeng Li, Yuan He, Antonio Norelli, Jingcheng Wu, Volker Tresp, Yunpu Ma, Michael Bronstein

arXiv 2024

TL;DR: We introduce DyGMamba, a model that utilizes state space models (SSMs) for continuous-time dynamic graph (CTDG) representation learning.

Abstract: Learning useful representations for continuous-time dynamic graphs (CTDGs) is challenging, due to the concurrent need to span long node interaction histories and grasp nuanced temporal details. In particular, two problems emerge: (1) Encoding longer histories requires more computational resources, making it crucial for CTDG models to maintain low computational complexity to ensure efficiency; (2) Meanwhile, more powerful models are needed to identify and select the most critical temporal information within the extended context provided by longer histories. To address these problems, we propose a CTDG representation learning model named DyGMamba, originating from the popular Mamba state space model (SSM). DyGMamba first leverages a node-level SSM to encode the sequence of historical node interactions. Another time-level SSM is then employed to exploit the temporal patterns hidden in the historical graph, where its output is used to dynamically select the critical information from the interaction history. We validate DyGMamba experimentally on the dynamic link prediction task. The results show that our model achieves state-of-the-art in most cases. DyGMamba also maintains high efficiency in terms of computational resources, making it possible to capture long temporal dependencies with a limited computation budget.
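
For readers unfamiliar with SSMs, the sketch below shows the generic discretised state-space recurrence that Mamba-style models build on; it is not the DyGMamba architecture itself, and all tensor names and sizes are illustrative.

    import torch

    def ssm_scan(x, A_bar, B_bar, C):
        """h_t = A_bar * h_{t-1} + B_bar @ x_t ;  y_t = C @ h_t over a sequence.

        x:     (seq_len, d_in)   input sequence, e.g. node interaction features
        A_bar: (d_state,)        diagonal discretised state matrix
        B_bar: (d_state, d_in)   discretised input projection
        C:     (d_out, d_state)  output projection
        """
        h = torch.zeros(A_bar.shape[0])
        ys = []
        for x_t in x:  # sequential scan; real implementations parallelise this
            h = A_bar * h + B_bar @ x_t  # elementwise update for diagonal A
            ys.append(C @ h)
        return torch.stack(ys)

    # Toy run: 16 interaction events with 8 features each, 32 hidden states.
    y = ssm_scan(torch.randn(16, 8), torch.rand(32) * 0.9,
                 torch.randn(32, 8) * 0.1, torch.randn(4, 32))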

DeepOnto: A Package for Ontology Engineering with Deep Learning

Yuan He, Jiaoyan Chen, Hang Dong, Ian Horrocks, Carlo Allocca, Taehun Kim, Brahmananda Sapkota

Semantic Web 2024

TL;DR: A Python package for ontology engineering with deep learning and language models.

Abstract: Integrating deep learning techniques, particularly language models (LMs), with knowledge representation techniques like ontologies has raised widespread attention, urging the need of a platform that supports both paradigms. Although packages such as OWL API and Jena offer robust support for basic ontology processing features, they lack the capability to transform various types of information within ontologies into formats suitable for downstream deep learning-based applications. Moreover, widely-used ontology APIs are primarily Java-based while deep learning frameworks like PyTorch and Tensorflow are mainly for Python programming. To address the needs, we present DeepOnto, a Python package designed for ontology engineering with deep learning. The package encompasses a core ontology processing module founded on the widely-recognised and reliable OWL API, encapsulating its fundamental features in a more "Pythonic" manner and extending its capabilities to incorporate other essential components including reasoning, verbalisation, normalisation, taxonomy, projection, and more. Building on this module, DeepOnto offers a suite of tools, resources, and algorithms that support various ontology engineering tasks, such as ontology alignment and completion, by harnessing deep learning methods, primarily pre-trained LMs. In this paper, we also demonstrate the practical utility of DeepOnto through two use-cases: the Digital Health Coaching in Samsung Research UK and the Bio-ML track of the Ontology Alignment Evaluation Initiative (OAEI).
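
For a flavour of the package, here is a minimal loading sketch following DeepOnto's documented interface; attribute names beyond the Ontology constructor may vary between versions, so treat it as illustrative and consult the documentation.

    from deeponto.onto import Ontology  # importing deeponto starts a JVM,
                                        # since the core wraps the OWL API

    # Load an ontology from a local OWL file (the path is a placeholder).
    onto = Ontology("path/to/ontology.owl")

    # The core module exposes parsed entities for downstream pipelines;
    # here we peek at a few named classes (attribute name per the docs,
    # but verify against the version you install).
    for class_iri in list(onto.owl_classes)[:5]:
        print(class_iri)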

Language Model Analysis for Ontology Subsumption Inference

Yuan He, Jiaoyan Chen, Ernesto Jiménez-Ruiz, Hang Dong, Ian Horrocks

ACL (Findings) 2023

TL;DR: Probing the conceptual (ontological) knowledge in pre-trained language models.

Abstract: Investigating whether pre-trained language models (LMs) can function as knowledge bases (KBs) has raised wide research interests recently. However, existing works focus on simple, triple-based, relational KBs, but omit more sophisticated, logic-based, conceptualised KBs such as OWL ontologies. To investigate an LM's knowledge of ontologies, we propose OntoLAMA, a set of inference-based probing tasks and datasets from ontology subsumption axioms involving both atomic and complex concepts. We conduct extensive experiments on ontologies of different domains and scales, and our results demonstrate that LMs encode relatively less background knowledge of Subsumption Inference (SI) than traditional Natural Language Inference (NLI) but can improve on SI significantly when a small number of samples are given. We will open-source our code and datasets.
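
The core idea can be sketched as follows: phrase a candidate subsumption as a premise-hypothesis pair and read off an NLI model's entailment score. The template and model choice below are illustrative assumptions rather than the paper's exact prompt-based setup.

    from transformers import pipeline

    # An off-the-shelf NLI model stands in for the paper's probed LMs.
    nli = pipeline("text-classification", model="roberta-large-mnli")

    def subsumption_score(sub_concept: str, super_concept: str) -> float:
        """Score 'sub_concept SubClassOf super_concept' as textual entailment."""
        result = nli({"text": f"X is {sub_concept}.",
                      "text_pair": f"X is {super_concept}."}, top_k=None)
        return next(r["score"] for r in result if r["label"] == "ENTAILMENT")

    print(subsumption_score("an aspirin tablet", "a drug product"))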

BERTMap: A BERT-based Ontology Alignment System

Yuan He, Jiaoyan Chen, Denvar Antonyrajah, Ian Horrocks

AAAI 2022

TL;DR: We introduce BERTMap, a pipeline ontology alignment system that leverages textual information from input ontologies to fine-tune BERT for lexical matching, structural and logical information to further refine the output mappings.

Abstract: Ontology alignment (a.k.a ontology matching (OM)) plays a critical role in knowledge integration. Owing to the success of machine learning in many domains, it has been applied in OM. However, the existing methods, which often adopt ad-hoc feature engineering or non-contextual word embeddings, have not yet outperformed rule-based systems especially in an unsupervised setting. In this paper, we propose a novel OM system named BERTMap which can support both unsupervised and semi-supervised settings. It first predicts mappings using a classifier based on fine-tuning the contextual embedding model BERT on text semantics corpora extracted from ontologies, and then refines the mappings through extension and repair by utilizing the ontology structure and logic. Our evaluation with three alignment tasks on biomedical ontologies demonstrates that BERTMap can often perform better than the leading OM systems LogMap and AML.
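
As a rough illustration of the fine-tuning step at the heart of the pipeline, the sketch below trains a BERT cross-encoder to classify whether two class labels are synonymous; the corpus construction, mapping extension, and repair stages are omitted, and the example pairs are placeholders.

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)  # synonym vs. non-synonym

    # Placeholder label pairs; BERTMap mines these from ontology annotations.
    pairs = [("myocardial infarction", "heart attack", 1),
             ("myocardial infarction", "bone fracture", 0)]

    batch = tok([p[0] for p in pairs], [p[1] for p in pairs],
                padding=True, truncation=True, return_tensors="pt")
    out = model(**batch, labels=torch.tensor([p[2] for p in pairs]))
    out.loss.backward()  # one illustrative training step (optimiser not shown)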

Hobbies
  • Literature: I have been an amateur writer of Chinese novels, poetry, and prose since a very young age.
  • Music: I am an amateur composer (having taken a university-level composition class), singer, and pianist (with an amateur Level 8 certificate).
  • Sport: I was a member of the university badminton squad (OuBaC) at Oxford. [Photo]