30–31 Oct 2023
Europe/Prague timezone

The first CACHE challenge: searching for hit molecules in ultra-large chemical databases

30 Oct 2023, 11:30
atrium (IT4Innovations)



Studentská 6231/1B 708 00 Ostrava-Poruba
Keynote Users' talks Keynote I


Pavel Polishchuk (IMTM, Palacky University)


The CACHE initiative (Critical Assessment of Computational Hit-finding Experiments) was created to improve and accelerate the development of approaches for primary hit finding. The first competition involved 25 leading groups in computational chemistry and chemoinformatics from all over the world. Their task was to find promising hit molecules for the WD40 repeat (WDR) domain of leucine-rich repeat kinase 2 (LRRK2), the most commonly mutated gene in familial Parkinson's disease. The goal was to find hits among compounds supplied by Enamine, which maintains a database of about 2.5 million synthesized compounds and the Enamine REAL Space, which includes more than 10 billion virtually enumerated, synthetically accessible molecules. The 3D structure of the protein was resolved recently; however, no highly active hits were known for it, which created an additional challenge.
We developed and used a multi-step pipeline to enable fast searching for potential hits in a database of billions of molecules. It included de novo generation of query molecules, similarity searching in a large database, and a consensus scoring approach that incorporated molecular docking, calculation of binding free energy by MM-GBSA, and other methods. After the first stage of the CACHE challenge we identified 8 experimentally confirmed hits, which placed us among the top 5 teams. These hits were further optimized during the second stage. The final outcomes of the challenge are due to be released in September. In the talk we will describe the challenges and opportunities in mining ultra-large chemical libraries and the lessons we learned.
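The consensus scoring step can be illustrated with a minimal sketch. The abstract does not say how the individual scores were combined, so the mean-rank aggregation below is an assumption, not the authors' method. The function names, molecule IDs, and score values are all hypothetical, and the sketch assumes that lower (more negative) values are better for both the docking score and the MM-GBSA energy.

```python
from typing import Dict, List


def rank_positions(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each molecule ID to its 1-based rank; the lowest score ranks first."""
    ordered = sorted(scores, key=lambda mol_id: scores[mol_id])
    return {mol_id: position for position, mol_id in enumerate(ordered, start=1)}


def consensus_ranking(docking: Dict[str, float], mmgbsa: Dict[str, float]) -> List[str]:
    """Order molecules scored by both methods by their mean rank, best first.

    Mean-rank aggregation is an assumed scheme chosen for illustration only.
    """
    shared = set(docking) & set(mmgbsa)
    docking_rank = rank_positions({m: docking[m] for m in shared})
    mmgbsa_rank = rank_positions({m: mmgbsa[m] for m in shared})
    mean_rank = {m: (docking_rank[m] + mmgbsa_rank[m]) / 2 for m in shared}
    # Ties in mean rank are broken by molecule ID so the output is deterministic.
    return sorted(shared, key=lambda m: (mean_rank[m], m))


# Made-up scores (kcal/mol) for four hypothetical molecules.
docking_scores = {"mol_a": -9.1, "mol_b": -7.4, "mol_c": -8.3, "mol_d": -6.0}
mmgbsa_energies = {"mol_a": -50.0, "mol_b": -41.0, "mol_c": -47.2, "mol_d": -52.5}

ranked = consensus_ranking(docking_scores, mmgbsa_energies)
print(ranked)
```

Working on ranks rather than raw values avoids having to put docking scores and MM-GBSA energies on a common scale, which is one reason rank-based consensus is a common choice when combining heterogeneous scoring methods.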

The work was supported by the Ministry of Education, Youth and Sports of the Czech Republic through the INTER-EXCELLENCE II grant LUAUS23262, the e-INFRA CZ project (ID:90254), and the projects ELIXIR-CZ (LM2023055) and CZ-OPENSCREEN (LM2023052). We also acknowledge the contribution of the project ENOCH (CZ.02.1.01/0.0/0.0/16_019/0000868).

Primary author

Pavel Polishchuk (IMTM, Palacky University)


Co-authors

Aleksandra Ivanova (IMTM, Palacky University)
Alina Kutlushina (IMTM, Palacky University)
Guzel Minibaeva (IMTM, Palacky University)
