Abstract
Epik is a computer program for predicting pKa values for drug-like molecules. Epik can use this capability, in combination with technology for tautomerization, to adjust the protonation state of small drug-like molecules and automatically generate one or more of the most probable forms for use in further molecular modeling studies. Many medicinal chemicals can exchange protons with their environment, resulting in various ionization and tautomeric states, collectively known as protonation states. The protonation state of a drug can affect its solubility and membrane permeability. In modeling, the protonation state of a ligand also affects which conformations are predicted for the molecule, as well as predictions of binding modes and ligand affinities based upon protein–ligand interactions. Despite the importance of the protonation state, many databases of candidate molecules used in drug development do not store reliable information on the most probable protonation states. Epik is sufficiently rapid and accurate to process large databases of drug-like molecules to provide this information. Several new technologies are employed. Extensions to the well-established Hammett and Taft approaches are used for pKa prediction, namely mesomer standardization, charge cancellation, and charge spreading, so that the predicted results reflect the nature of the molecule itself rather than just the particular Lewis structure used on input. In addition, a new iterative technology for generating, ranking and culling protonation states is employed.
Electronic supplementary material
Below is the link to the electronic supplementary material.
10822_2007_9133_MOESM1_ESM.doc
Comparison between experimental pKa values and those predicted by Epik for molecules listed in DrugBank with pKa values and SMILES patterns. Only predicted pKa values between 0.0 and 14 are reported, except where the best matching pKa value exceeds 14 (nitrofurazone, mitomycin, and ethanol)
Appendix: Prioritizing SMARTS patterns for ABGs
All SMARTS patterns for ABGs are assigned numeric priorities. The pattern with the highest priority that matches a particular ABG is selected, and the parameters (e.g. pKa0 and ρ) associated with that pattern are used in the HT calculations. These priorities are assigned in one of three ways:
1. The pattern was manually assigned a negative priority and is thus matched only if no more specific pattern is found. This was done only for very general patterns (e.g. primary, secondary or tertiary amines) that would match many functionalities, most of which are better described by more specific patterns (e.g. amides and anilines). Roughly 5% of the patterns in the database are assigned negative priorities.
2. The priority was calculated from the SMARTS pattern.
3. In a couple of cases the priority was calculated as in item 2, except that a manually assigned shift was added to distinguish very closely related patterns.
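The selection rule described above (take the matching pattern with the highest priority, so that negative-priority patterns act only as fallbacks) can be sketched in a few lines of Python. This is an illustration only, not Epik's implementation: the `AbgPattern` class, the `matches` callables standing in for a real SMARTS matcher, and every numeric value are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class AbgPattern:
    """One hypothetical database entry: a pattern plus its parameters."""
    name: str
    priority: float                 # negative for very general fallback patterns
    pka0: float                     # illustrative value, not from the Epik database
    rho: float                      # illustrative value, not from the Epik database
    matches: Callable[[str], bool]  # stand-in for a real SMARTS matcher


def select_pattern(group: str, patterns: Sequence[AbgPattern]) -> Optional[AbgPattern]:
    """Return the matching pattern with the highest priority, or None."""
    candidates = [p for p in patterns if p.matches(group)]
    return max(candidates, key=lambda p: p.priority, default=None)


# Hypothetical database: a general amine fallback and a more specific aniline.
PATTERNS = [
    AbgPattern("general amine", -1.0, 10.0, 1.0, lambda g: "amine" in g),
    AbgPattern("aniline", 5.0, 4.6, 2.9, lambda g: g == "aromatic amine"),
]

best = select_pattern("aromatic amine", PATTERNS)
print(best.name)  # both patterns match; the specific one outranks the fallback
```

Because the general pattern carries a negative priority, it is chosen only when nothing more specific matches, for example for the string `"aliphatic amine"` in this toy database.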
The procedure for calculating the priority from the SMARTS pattern will be outlined in detail below.
All SMARTS patterns for ABGs are recorded in the acidic form and begin with the acidic hydrogen, followed immediately by the atom to which the acidic hydrogen is bound. We will refer to that atom as the first heavy atom.
The SMARTS pattern is translated into a list of atoms and a list of bonds. The type of each bond is noted or inferred from the SMARTS pattern, consistent with the SMARTS standard. Each atom is classified as SP3-like unless it meets one of two conditions:
1. Any of the bonds involving the atom is double, triple or aromatic.
2. The atom is an O, S or N and is bonded to an aromatic atom.
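The two conditions can be expressed as a small classification function. The sketch below is not taken from Epik: the `Bond` class and the dictionary layout are hypothetical, and it assumes that an atom counts as aromatic when it takes part in at least one aromatic bond.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Bond:
    a: int      # index of the first atom
    b: int      # index of the second atom
    order: str  # "single", "double", "triple" or "aromatic"


def is_sp3_like(atom: int, elements: Dict[int, str], bonds: List[Bond]) -> bool:
    """Apply the two exclusion conditions; anything left is SP3-like."""
    own = [bd for bd in bonds if atom in (bd.a, bd.b)]
    # Condition 1: the atom has a double, triple or aromatic bond.
    if any(bd.order != "single" for bd in own):
        return False
    # Condition 2: the atom is O, S or N and is bonded to an aromatic atom.
    # Assumption: an atom is aromatic if it has at least one aromatic bond.
    if elements[atom] in {"O", "S", "N"}:
        aromatic_atoms = {x for bd in bonds if bd.order == "aromatic"
                          for x in (bd.a, bd.b)}
        neighbours = {bd.b if bd.a == atom else bd.a for bd in own}
        if neighbours & aromatic_atoms:
            return False
    return True


# Phenol-like fragment: H(1)-O(2)-c(3), with c(3) aromatic-bonded to c(4), c(5).
phenol_elements = {1: "H", 2: "O", 3: "C", 4: "C", 5: "C"}
phenol_bonds = [Bond(1, 2, "single"), Bond(2, 3, "single"),
                Bond(3, 4, "aromatic"), Bond(3, 5, "aromatic")]

# Methanol-like fragment: H(1)-O(2)-C(3), single bonds only.
methanol_elements = {1: "H", 2: "O", 3: "C"}
methanol_bonds = [Bond(1, 2, "single"), Bond(2, 3, "single")]

print(is_sp3_like(2, phenol_elements, phenol_bonds))      # oxygen on an aromatic ring
print(is_sp3_like(2, methanol_elements, methanol_bonds))  # oxygen on a saturated carbon
```

In this sketch the phenol-like oxygen is excluded by the second condition, while the methanol-like oxygen remains SP3-like.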
The priority, P, of a SMARTS pattern for an ABG is calculated using the equation:

$$P = \sum_{i \geq 2} a_i \, p_i$$

where a_i is the weighting for atom i in the SMARTS pattern and p_i is an attenuation factor that depends on the shortest topological path from atom 2, the first heavy atom, to atom i. All atoms in the SMARTS pattern are included in the sum except the acidic hydrogen atom (atom 1). The a_i values were determined by trial and error and are given in Table 3. The p_i values were calculated using the equation:

$$p_i = \prod_{j} s_j$$

where s_j is an attenuation factor corresponding to a portion of the shortest path from atom 2 to atom i. Each non-aromatic bond has its own attenuation factor, while each set of consecutive aromatic bonds gets a single factor. Aromatic bonds are treated differently because the influence of atoms in aromatic systems does not decrease monotonically with the number of bonds. The attenuation factors for non-aromatic bonds are given in Table 4, while those for aromatic bonds are given in Table 5.
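The following Python sketch shows one way the priority calculation could be organized. It assumes that P is the sum of a_i p_i over all atoms except the acidic hydrogen and that p_i is the product of the s_j factors along the shortest path. All numeric values are hypothetical placeholders, since Tables 3 to 5 are not reproduced here. Keying the atom weights by element, the non-aromatic factors by bond type, and the aromatic factors by the length of the aromatic run is likewise an assumption about how those tables are organized.

```python
from collections import deque
from itertools import groupby
from typing import Dict, List, Tuple

# Hypothetical placeholders; the real values are in Tables 3-5 of the paper.
ATOM_WEIGHT = {"C": 1.0, "N": 2.0, "O": 2.0}    # a_i, keyed by element
BOND_FACTOR = {"single": 0.5, "double": 0.8}    # s_j per non-aromatic bond
AROMATIC_RUN_FACTOR = {1: 0.6, 2: 0.4, 3: 0.5}  # s_j per aromatic run, by length

Bonds = Dict[Tuple[int, int], str]  # (atom, atom) -> bond type


def shortest_path_bonds(bonds: Bonds, start: int, goal: int) -> List[str]:
    """Bond types along one shortest path from start to goal (breadth-first)."""
    adj: Dict[int, List[Tuple[int, str]]] = {}
    for (a, b), kind in bonds.items():
        adj.setdefault(a, []).append((b, kind))
        adj.setdefault(b, []).append((a, kind))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        atom, path = queue.popleft()
        if atom == goal:
            return path
        for nxt, kind in adj.get(atom, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [kind]))
    raise ValueError(f"atom {goal} is not connected to atom {start}")


def attenuation(path: List[str]) -> float:
    """p_i: one factor per non-aromatic bond, one per run of aromatic bonds."""
    p = 1.0
    for kind, run in groupby(path):
        n = len(list(run))
        if kind == "aromatic":
            p *= AROMATIC_RUN_FACTOR[n]  # whole run contributes a single factor
        else:
            p *= BOND_FACTOR[kind] ** n  # each bond contributes its own factor
    return p


def priority(elements: Dict[int, str], bonds: Bonds) -> float:
    """P: sum of a_i * p_i over all atoms except the acidic hydrogen (atom 1)."""
    return sum(
        ATOM_WEIGHT[el] * attenuation(shortest_path_bonds(bonds, 2, i))
        for i, el in elements.items()
        if i != 1
    )


# Phenol-like fragment: H(1)-O(2)-c(3), with c(3) aromatic-bonded to c(4), c(5).
elements = {1: "H", 2: "O", 3: "C", 4: "C", 5: "C"}
bonds = {(1, 2): "single", (2, 3): "single",
         (3, 4): "aromatic", (3, 5): "aromatic"}
print(priority(elements, bonds))
```

For the first heavy atom the path is empty, so p_2 = 1 and its weight enters the sum unattenuated. When several shortest paths exist (as in rings), this sketch simply uses the first one the search finds; the source does not say how such ties are resolved.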
Cite this article
Shelley, J.C., Cholleti, A., Frye, L.L. et al. Epik: a software program for pKa prediction and protonation state generation for drug-like molecules. J Comput Aided Mol Des 21, 681–691 (2007). https://doi.org/10.1007/s10822-007-9133-z
DOI: https://doi.org/10.1007/s10822-007-9133-z