Controllable generative AI has the potential to transform molecule design by enabling models that do not merely sample plausible candidates but reliably steer generation toward desired properties, activities, and biological constraints. Current approaches, however, remain limited by weak controllability, distorted exploration of molecular spaces, architectural difficulties in jointly modeling molecules and properties, and low experimental hit rates that undermine real-world impact. This talk will address these gaps through a set of complementary frameworks for antimicrobial peptide and protein design, outlined below; depending on time, not all of them may be covered in full detail.
HydrAMP presents a conditional variational autoencoder that disentangles peptide representations from antimicrobial conditions and supports parameter-controlled creativity, achieving strong performance across unconstrained and analogue generation tasks and validating diverse, potent peptides in wet-lab experiments. OmegAMP introduces a diffusion-based generative model with a novel conditioning mechanism and biologically informed encoding space, enabling fine-grained control over physicochemical properties and species-specific activity, while a synthetic data augmentation strategy for classifier training drastically reduces false positives, leading to an unprecedented 96% wet-lab success rate across 25 tested peptides. Hyformer tackles the challenge of joint generation and property prediction by proposing a transformer-based model with alternating attention and joint pre-training, yielding synergistic benefits in conditional sampling, out-of-distribution property prediction, and representation learning, demonstrated in an antimicrobial peptide discovery setting. PepCompass reframes peptide optimization as a geometry-aware problem, defining a Union of κ-Stable Riemannian Manifolds induced by the decoder and introducing novel local exploration and optimization methods that, together with property-enriched geodesic search, enable efficient discovery of highly active, experimentally validated peptides. Finally, ProSpero extends controllable design to proteins through an active learning framework that guides a frozen generative model with surrogate feedback and biologically constrained sampling, enabling exploration beyond wild-type neighborhoods while maintaining plausibility and robustness. Together, these works illustrate how advances in controllability, representation, geometry, joint modeling, and learning from feedback can substantially narrow the gap between generative AI promise and experimental success in molecule design.