Enhancing Third Party Patent Monitoring with Machine Learning and Natural Language Processing

Applying state-of-the-art NLP models increases the efficiency of third-party patent monitoring in the nutrition and bioscience industry.

Objectives

Identifying relevant third-party patents using transformer-based classification models.

Background

Every year, millions of patents are published worldwide, covering a vast variety of topics. Patent applications average roughly 10,000 words and use unique, highly context-dependent, meticulously wordsmithed language (also known as “legalese” or “attornish”). Monitoring third-party patents is a crucial element of business development and innovation for many companies.

Keyword-based search strategies can help to reduce the screening effort required from subject matter experts (SMEs). However, even with a highly customized framework of rules, it is challenging to produce a selection that contains mainly relevant patents. As a result, SMEs still spend substantial time manually screening irrelevant patent documents.
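The limitation of keyword rules can be illustrated with a minimal sketch. The rule set, patent identifiers, and abstracts below are hypothetical and invented for illustration only; they are not the rules or data used in the project.

```python
# Hypothetical rule set: a patent is selected if ALL terms of at least
# one rule occur in its text (simple substring match, case-insensitive).
RULES = [
    ["vitamin", "encapsulat"],
    ["enzyme", "feed"],
]

def matches_any_rule(text, rules=RULES):
    """Return True if every term of at least one rule occurs in the text."""
    lowered = text.lower()
    return any(all(term in lowered for term in rule) for rule in rules)

# Invented example abstracts (not real patents).
abstracts = {
    "P1": "Encapsulated vitamin A formulation for infant nutrition.",
    "P2": "Packaging film with encapsulated pigments and a vitamin-free coating.",
    "P3": "Gear assembly for a bicycle transmission.",
}

selected = [pid for pid, text in abstracts.items() if matches_any_rule(text)]
print(selected)
```

In this toy example, P2 is selected because it contains both terms of the first rule, even though a packaging film is unlikely to be relevant. An SME would still have to read and discard it, which is the kind of screening effort described above.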

Results

The FHNW Institute for Data Science successfully developed an ensemble of transformer-based classification models trained on third-party patents annotated by DSM SMEs. A field study showed that this ensemble makes patent screening more efficient and substantially reduces labor costs. Moreover, it allows the pool of patents screened for relevance to be expanded, enabling the identification of additional potentially relevant patents. Based on the success of the proof of concept (PoC), DSM intends to implement the solution on-premises as a next step.
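How the ensemble combines its members is not described here. One common option is soft voting, in which the relevance probabilities of the individual models are averaged and patents are ranked by the averaged score. The following is a minimal sketch under that assumption; the function name, the threshold of 0.5, and the scores are hypothetical.

```python
from statistics import mean

def rank_for_screening(scores_by_patent, threshold=0.5):
    """Average per-model relevance probabilities (soft voting, an assumed
    combination rule) and return shortlisted patents, most relevant first."""
    averaged = {pid: mean(probs) for pid, probs in scores_by_patent.items()}
    shortlisted = [(pid, s) for pid, s in averaged.items() if s >= threshold]
    return sorted(shortlisted, key=lambda item: item[1], reverse=True)

# Invented relevance probabilities from three hypothetical ensemble members.
scores = {
    "P1": [0.9, 0.8, 0.7],
    "P2": [0.2, 0.4, 0.3],
    "P3": [0.6, 0.5, 0.7],
}

ranking = rank_for_screening(scores)
for patent_id, score in ranking:
    print(patent_id, round(score, 2))
```

Ranking by averaged score lets SMEs review the most likely relevant patents first, and lowering the hypothetical threshold widens the pool of patents passed on for screening.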

Information

Client	DSM Nutritional Products Ltd.
Execution	FHNW Institute for Data Science
Duration	6 months
Team	Prof. Dr. Daniel Perruchoud, Dr. Fernando Benites, Dominik Frefel, Joshua Meier
